scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-26 19:35:12 +00:00

Author	SHA1	Message	Date
Pavel Emelyanov	df6d471e08	hasher: More picky noexcept marking of feed_hash() Commit `5adb8e555c` marked the ::feed_hash() and a visitor lambda of digester::feed_hash() as noexcept. This was quite reckless as the appending_hash<>::operator()s called by ::feed_hash() are not all marked noexcept. In particular, the appending_hash<row>() is not such and seems to throw. The original intent of the mentioned commit was to facilitate the partition_hasher in repair/ code. The hasher itself had been removed by `0af7a22c21`, so it no longer needs the feed_hash-s to be noexcept. The fix is to inherit noexcept from the called hashers, but for the digester::feed_hash part the noexcept is just removed until clang compilation bug #50994 is fixed. fixes: #8983 tests: unit(dev) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20210706153608.4299-1-xemul@scylladb.com> (cherry picked from commit `63a2fed585`)	2021-07-07 18:36:18 +03:00
Raphael S. Carvalho	92b85da380	LCS: reshape: Fix overlapping check when determining if a sstable set is disjoint Wrong comparison operator is used when checking for overlapping. It would miss overlapping when last key of a sstable is equal to the first key of another sstable that comes next in the set, which is sorted by first key. Fixes #8531. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> (cherry picked from commit `39ecddbd34`)	2021-07-07 14:04:22 +03:00
Juliusz Stasiewicz	d214d91a09	tests: Adjusted tests for DC checking in NTS CQL test relied on quietly accepting non-existing DCs, so it had to be removed. Also, one boost-test referred to nonexisting `datacenter2` and had to be removed. (cherry picked from commit `97bb15b2f2`)	2021-06-21 17:53:47 +03:00
Nadav Har'El	a7a1e59594	cql-pytest: remove "xfail" tag from two passing tests Issue #7595 was already fixed last week, in commit `b6fb5ee912`, so the two tests which failed because of this issue no longer fail and their "xfail" tag can be removed. Refs #7595. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210216160606.1172855-1-nyh@scylladb.com> (cherry picked from commit `946e63ee6e`)	2021-06-20 19:37:28 +03:00
Juliusz Stasiewicz	856aeb5ddb	locator: Check DC names in NTS The same trick is used as in C*: `79e693e16e/src/java/org/apache/cassandra/locator/NetworkTopologyStrategy.java (L241)` Fixes #7595 (cherry picked from commit `b6fb5ee912`)	2021-06-20 19:24:29 +03:00
Piotr Sarna	61659fdbdb	Merge 'view: fix use-after-move when handling view update failures' Backport of `6726fe79b6`. The code was susceptible to use-after-move if both local and remote updates were going to be sent. The whole routine for sending view updates is now rewritten to avoid use-after-move. Fixes #8830 Tests: unit(release), dtest(secondary_indexes_test.py:TestSecondaryIndexes.test_remove_node_during_index_build) Closes #8834 * backport-6726fe7-4.4: view: fix use-after-move when handling view update failures db,view: explicitly move the mutation to its helper function db,view: pass base token by value to mutate_MV	2021-06-16 14:15:12 +02:00
Piotr Sarna	b06e9447b1	view: fix use-after-move when handling view update failures The code was susceptible to use-after-move if both local and remote updates were going to be sent. The whole routine for sending view updates is now rewritten to avoid use-after-move. Refs #8830 Tests: unit(release), dtest(secondary_indexes_test.py:TestSecondaryIndexes.test_remove_node_during_index_build) (cherry picked from commit `8a049c9116`)	2021-06-16 13:40:57 +02:00
Piotr Sarna	6a407984d8	db,view: explicitly move the mutation to its helper function The `apply_to_remote_endpoints` helper function used to take its `mut` parameter by reference, but then moved the value from it, which is confusing and prone to errors. Since the value is moved-from, let's pass it to the helper function as rvalue ref explicitly. (cherry picked from commit `7cdbb7951a`)	2021-06-16 13:38:39 +02:00
Piotr Sarna	74df68c67f	db,view: pass base token by value to mutate_MV The base token is passed cross-continuations, so the current way of passing it by const reference probably only works because the token copying is cheap enough to optimize the reference out. Fix by explicitly taking the token by value. (cherry picked from commit `88d4a66e90`)	2021-06-16 13:38:01 +02:00
Raphael S. Carvalho	a6b3a2b945	LCS: Fix terrible write amplification when reshaping level 0 LCS reshape is basically 'major compacting' level 0 until it contains less than N sstables. That produces terrible write amplification, because any given byte will be compacted (initial # of sstables / max_threshold (32)) times. So if L0 initially contained 256 ssts, there would be a WA of about 8. This terrible write amplification can be reduced by performing STCS instead on L0, which will leave L0 in a good shape without hurting WA as it happens now. Fixes #8345. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210322150655.27011-1-raphaelsc@scylladb.com> (cherry picked from commit `bcbb39999b`)	2021-06-14 20:27:41 +03:00
Michał Chojnowski	2cf998e418	cdc: log: fix use-after-free in process_bytes_visitor Due to small value optimization used in `bytes`, views to `bytes` stored in `vector` can be invalidated when the vector resizes, resulting in use-after-free and data corruption. Fix that. Fixes #8117 (cherry picked from commit `8cc4f39472`)	2021-06-13 19:06:25 +03:00
Botond Dénes	6a23208ce4	mutation_test: test_mutation_diff_with_random_generator: compact input mutations This test checks that `mutation_partition::difference()` works correctly. One of the checks it does is: m1 + m2 == m1 + (m2 - m1). If the two mutations are identical but have compactable data, e.g. a shadowable tombstone shadowed by a row marker, the apply will collapse these, causing the above equality check to fail (as m2 - m1 is null). To prevent this, compact the two input mutations. Fixes: #8221 Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20210310141118.212538-1-bdenes@scylladb.com> (cherry picked from commit `cf28552357`)	2021-06-13 18:30:43 +03:00
Takuya ASADA	94d73d2d26	scylla_coredump_setup: avoid coredump failure when hard limit of coredump is set to zero In environments where the hard limit of coredump is set to zero, the coredump test script will fail since the system does not generate a coredump. To avoid this issue, set ulimit -c 0 before generating SEGV in the script. Note that scylla-server.service can generate a coredump even with ulimit -c 0 because we set LimitCORE=infinity in its systemd unit file. Fixes #8238 Closes #8245 (cherry picked from commit `af8eae317b`)	2021-06-13 18:27:03 +03:00
Avi Kivity	033d56234b	Update seastar submodule (nested exception logging) * seastar 61939b5b8a...4b7d434965 (2): > utils/log.cc: fix nested_exception logging (again) > log: skip on unknown nested mixing instead of stopping the logging Fixes #8327.	2021-06-13 18:22:59 +03:00
Benny Halevy	35d89298da	test: commitlog_test: test_allocation_failure: fill memory using smaller allocations commitlog was changed to use fragmented_temporary_buffer::ostream (db::commitlog::output). So if there are discontiguous small memory blocks, they can be used to satisfy an allocation even if no contiguous memory blocks are available. To prevent that, as Avi suggested, this change allocates in 128K blocks and frees the last one to succeed (so that we won't fail on allocating continuations). Fixes #8028 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20210203100333.862036-1-bhalevy@scylladb.com> (cherry picked from commit `ca6f5cb0bc`)	2021-06-10 19:35:40 +03:00
Dejan Mircevski	8011d181b5	cql3: Skip indexed column for CK restrictions When querying an index table, we assemble clustering-column restrictions for that query by going over the base table token, partition columns, and clustering columns. But if one of those columns is the indexed column, there is a problem; the indexed column is the index table's partition key, not clustering key. We end up with invalid clustering slice, which can cause problems downstream. Fix this by skipping the indexed column when assembling the clustering restrictions. Tests: unit (dev) Fixes #7888 Signed-off-by: Dejan Mircevski <dejan@scylladb.com> Closes #8320 (cherry picked from commit `0bd201d3ca`)	2021-06-10 10:43:14 +03:00
Hagit Segev	bfafb84567	release: prepare for 4.4.3 scylla-4.4.3	2021-06-09 19:51:57 +03:00
Yaron Kaikov	0d9c09ed04	install.sh: Setup aio-max-nr upon installation This is a follow up change to #8512. Let's add aio conf file during scylla installation process and make sure we also remove this file when uninstall Scylla As per Avi Kivity's suggestion, let's set aio value as static configuration, and make it large enough to work with 500 cpus. Closes #8650 Refs: #8713 (cherry picked from commit `dd453ffe6a`)	2021-06-07 16:30:00 +03:00
Yaron Kaikov	36a4eba22e	scylla_io_setup: configure "aio-max-nr" before iotune On several instance types in AWS and Azure, we get the following failure during scylla_io_setup process: ``` ERROR 2021-04-14 07:50:35,666 [shard 5] seastar - Could not setup Async I/O: Resource temporarily unavailable. The most common cause is not enough request capacity in /proc/sys/fs/aio-max-nr. Try increasing that number or reducing the amount of logical CPUs available for your application ``` We have scylla_prepare:configure_io_slots() running before scylla-server.service starts, but scylla_io_setup takes place before that. 1) Let's move configure_io_slots() to scylla_util.py since both scylla_io_setup and scylla_prepare import functions from it 2) cleanup scylla_prepare since we don't need the same function twice 3) Let's use configure_io_slots() during scylla_io_setup to avoid such failure Fixes: #8587 Closes #8512 Refs: #8713 (cherry picked from commit `588a065304`)	2021-06-07 16:29:38 +03:00
Nadav Har'El	e9b1f10654	Update tools/java submodule with backported patches * tools/java 6ca351c221...aab793d9f5 (2): > nodetool: alternate way to specify table name which includes a dot > nodetool: do no treat table name with dot as a secondary index Fixes #6521 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2021-06-07 09:38:53 +03:00
Nadav Har'El	6057be3f42	alternator: fix equality check of nested document containing a set In issue #5021 we noticed that the equality check in Alternator's condition expressions needs to handle sets differently - we need to compare the set's elements ignoring their order. But the implementation we added to fix that issue was only correct when the entire attribute was a set... In the general case, an attribute can be a nested document, with only some inner set. The equality-checking function needs to traverse this nested document, and compare the sets inside it as appropriate. This is what we do in this patch. This patch also adds a new test comparing equality of a nested document with some inner sets. This test passes on DynamoDB, failed on Alternator before this patch, and passes with this patch. Refs #5021 Fixes #8514 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210419184840.471858-1-nyh@scylladb.com> (cherry picked from commit `dae7528fe5`)	2021-06-07 09:10:08 +03:00
Nadav Har'El	673f823d8b	alternator: fix inequality check of two sets In issue #5021 we noted that Alternator's equality operator needs to be fixed for the case of comparing two sets, because the equality check needs to take into account the possibility of different element order. Unfortunately, we fixed only the equality check operator, but forgot there is also an inequality operator! So in this patch we fix the inequality operator, and also add a test for it that was previously missing. The implementation of the inequality operator is trivial - it's just the negation of the equality test. Our pre-existing tests verify that this is the correct implementation (e.g., if attribute x doesn't exist, then "x = 3" is false but "x <> 3" is true). Refs #5021 Fixes #8513 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210419141450.464968-1-nyh@scylladb.com> (cherry picked from commit `50f3201ee2`)	2021-06-07 08:45:54 +03:00
Nadav Har'El	0082968bd8	alternator: fix equality check of two unset attributes When a condition expression (ConditionExpression, FilterExpression, etc.) checks for equality of two item attributes, i.e., "x = y", and when one of these attributes was missing we correctly returned false. However, we also need to return false when both attributes are missing in the item, because this is what DynamoDB does in this case. In other words an unset attribute is never equal to anything - not even to another unset attribute. This was not happening before this patch: When x and y were both missing attributes, Alternator incorrectly returned true for "x = y", and this patch fixes this case. It also fixes "x <> y" which should be true when both x and y are unset (but was false before this patch). The other comparison operators - <, <=, >, >=, BETWEEN, were all implemented correctly even before this patch. This patch also includes tests for all the two-unset-attribute cases of all the operators listed above. As usual, we check that these tests pass on both DynamoDB and Alternator to confirm our new behavior is the correct one - before this patch, two of the new tests failed on Alternator and passed on DynamoDB. Fixes #8511 Signed-off-by: Nadav Har'El <nyh@scylladb.com> Message-Id: <20210419123911.462579-1-nyh@scylladb.com> (cherry picked from commit `46448b0983`)	2021-06-06 16:28:27 +03:00
Takuya ASADA	542cd7aff1	scylla_raid_setup: use /dev/disk/by-uuid to specify filesystem Currently, var-lib-scylla.mount may fail because it can start before the MDRAID volume is initialized. We may be able to add "After=dev-disk-by\x2duuid-<uuid>.device" to wait for the device to become available, but the systemd manual says it automatically configures the dependency for a mount unit when we specify the filesystem path by "absolute path of a device node". So we need to replace What=UUID=<uuid> with What=/dev/disk/by-uuid/<uuid>. Fixes #8279 Closes #8681 (cherry picked from commit `3d307919c3`)	2021-05-24 17:24:07 +03:00
Raphael S. Carvalho	2b29568bf4	sstables/mp_row_consumer: Fix unbounded memory usage when consuming a large run of partition tombstones mp_row_consumer will not stop consuming large run of partition tombstones, until a live row is found which will allow the consumer to stop proceeding. So partition tombstones, from a large run, are all accumulated in memory, leading to OOM and stalls. The fix is about stopping the consumer if buffer is full, to allow the produced fragments to be consumed by sstable writer. Fixes #8071. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210514202640.346594-1-raphaelsc@scylladb.com> Upstream fix: `db4b9215dd` scylla-4.4.2	2021-05-20 21:26:07 +03:00
Hagit Segev	93457807b8	release: prepare for 4.4.2	2021-05-20 00:02:31 +03:00
Takuya ASADA	cee62ab41b	install.sh: apply correct file security context when copying files Currently, unified installer does not apply correct file security context while copying files, it causes permission error on scylla-server.service. We should apply default file security context while copying files, using '-Z' option on /usr/bin/install. Also, because install -Z requires normalized path to apply correct security context, use 'realpath -m <PATH>' on path variables on the script. Fixes #8589 Closes #8602 (cherry picked from commit `60c0b37a4c`)	2021-05-19 12:41:20 +03:00
Takuya ASADA	728a5e433f	install.sh: fix not such file or directory on nonroot Since we have added scylla-node-exporter, we needed to do 'install -d' for systemd directory and sysconfig directory before copying files. Fixes #8663 Closes #8664 (cherry picked from commit `6faa8b97ec`)	2021-05-19 12:41:20 +03:00
Avi Kivity	9a2d4a7cc7	Merge 'Fix type checking in index paging' from Piotr Sarna When recreating the paging state from an indexed query, a bunch of panic checks were introduced to make sure that the code is correct. However, one of the checks is too eager - namely, it throws an error if the base column type is not equal to the view column type. It usually works correctly, unless the base column type is a clustering key with DESC clustering order, in which case the type is actually "reversed". From the point of view of the paging state generation it's not important, because both types deserialize in the same way, so the check should be less strict and allow the base type to be reversed. Tests: unit(release), along with the additional test case introduced in this series; the test also passes on Cassandra Fixes #8666 Closes #8667 * github.com:scylladb/scylla: test: add a test case for paging with desc clustering order cql3: relax a type check for index paging (cherry picked from commit `593ad4de1e`)	2021-05-19 12:41:05 +03:00
Takuya ASADA	cc050fd499	dist/redhat: stop using systemd macros, call systemctl directly Fedora version of systemd macros does not work correctly on CentOS7, since CentOS7 does not support "file trigger" feature. To fix the issue we need to stop using systemd macros, call systemctl directly. See scylladb/scylla-jmx#94 Closes #8005 (cherry picked from commit `7b310c591e`)	2021-05-18 13:50:07 +03:00
Raphael S. Carvalho	61145af5d9	compaction_manager: Don't swallow exception in procedure used by reshape and resharding run_custom_job() was swallowing all exceptions, which is definitely wrong because failure in a resharding or reshape would be incorrectly interpreted as success, which means upper layer will continue as if everything is ok. For example, ignoring a failure in resharding could result in a shared sstable being left unresharded, so when that sstable reaches a table, scylla would abort as shared ssts are no longer accepted in the main sstable set. Let's allow the exception to be propagated, so failure will be communicated, and resharding and reshape will be all or nothing, as originally intended. Fixes #8657. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210515015721.384667-1-raphaelsc@scylladb.com> (cherry picked from commit `10ae77966c`)	2021-05-18 13:00:20 +03:00
Avi Kivity	11bd83e319	Update tools/jmx (rpm systemd macros) * tools/jmx c510a56...7a101a0 (1): > dist/redhat: stop using systemd macros, call systemctl directly Ref scylladb/jmx#94.	2021-05-13 18:24:52 +03:00
Raphael S. Carvalho	b58305d919	compaction_manager: Redefine weight for better control of parallel compactions Compaction manager allows compaction of different weights to proceed in parallel. For example, a small-sized compaction job can happen in parallel to a large-sized one, but similar-sized jobs are serialized. The problem is the current definition of weight, which is the log (base 4) of total size (size of all sstables) of a job. This is what we get with the current weight definition: weight=5 for sizes=[1K, 3K] weight=6 for sizes=[4K, 15K] weight=7 for sizes=[16K, 63K] weight=8 for sizes=[64K, 255K] weight=9 for sizes=[258K, 1019K] weight=10 for sizes=[1M, 3M] weight=11 for sizes=[4M, 15M] weight=12 for sizes=[16M, 63M] weight=13 for sizes=[64M, 254M] weight=14 for sizes=[256M, 1022M] weight=15 for sizes=[1033M, 4078M] weight=16 for sizes=[4119M, 10188M] total weights: 12 Note that for jobs smaller than 1MB, we have 5 different weights, meaning 5 jobs smaller than 1MB could proceed in parallel. High number of parallel compactions can be observed after repair, which potentially produces tons of small sstables of varying sizes. That causes compaction to use a significant amount of resources. To fix this problem, let's add a fixed tax to the size before taking the log, so that jobs smaller than 1M will all have the same weight. Look at what we get with the new weight definition: weight=10 for sizes=[1K, 2M] weight=11 for sizes=[3M, 14M] weight=12 for sizes=[15M, 62M] weight=13 for sizes=[63M, 254M] weight=14 for sizes=[256M, 1022M] weight=15 for sizes=[1033M, 4078M] weight=16 for sizes=[4119M, 10188M] total weights: 7 Fixes #8124. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210217123022.241724-1-raphaelsc@scylladb.com> (cherry picked from commit `81d773e5d8`) Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210512224405.68925-1-raphaelsc@scylladb.com>	2021-05-13 08:38:40 +03:00
Lauro Ramos Venancio	065111b42b	TWCS: initialize _highest_window_seen The timestamp_type is an int64_t. So, it has to be explicitly initialized before using it. This missing initialization prevented the major compaction from happening when a time window finishes, as described in #8569. Fixes #8569 Signed-off-by: Lauro Ramos Venancio <lauro.venancio@incognia.com> Closes #8590 (cherry picked from commit `15f72f7c9e`)	2021-05-06 08:52:15 +03:00
Nadav Har'El	ebd2c9bab0	Update tools/java submodule Backport sstableloader fix in tools/java submodule. Fixes #8230. * tools/java a3e010ee4f...6ca351c221 (1): > sstableloader: Handle non-prepared batches with ":" in identifier names Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2021-05-03 10:08:54 +03:00
Avi Kivity	bf9e1f6d2e	Merge '[branch 4.4] Backport reader_permit: always forward resources to the semaphore ' from Botond Dénes This is a backport of `8aaa3a7` to branch-4.4. The main conflicts were around Benny's reader close series (`fa43d76`), but it also turned out that an additional patch (2f1d65c) also has to backported to make sure admission on signaling resources doesn't deadlock. Refs: #8493 Closes #8571 * github.com:scylladb/scylla: test: mutation_reader_test: add test_reader_concurrency_semaphore_forward_progress test: mutation_reader_test: add test_reader_concurrency_semaphore_readmission_preserves_units reader_concurrency_semaphore: add dump_diagnostics() reader_permit: always forward resources test: multishard_mutation_query_test: fuzzy-test: don't consume resource up-front reader_concurrency_semaphore: make admission conditions consistent	2021-04-30 22:02:46 +03:00
Botond Dénes	a710866235	test: mutation_reader_test: add test_reader_concurrency_semaphore_forward_progress This unit test checks that the semaphore doesn't get into a deadlock when contended, in the presence of many memory-only reads (that don't wait for admission). This is tested by simulating the 3 kind of reads we currently have in the system: * memory-only: reads that don't pass admission and only own memory. * admitted: reads that pass admission. * evictable: admitted reads that are furthermore evictable. The test creates and runs a large number of these reads in parallel, read kinds being selected randomly, then creates a watchdog which kills the test if no progress is being made. (cherry picked from commit `45d580f056`)	2021-04-30 11:03:09 +03:00
Botond Dénes	3c3fc18777	test: mutation_reader_test: add test_reader_concurrency_semaphore_readmission_preserves_units This unit test passes a read through admission again-and-again, just like an evictable reader would be during its lifetime. When readmitted the read sometimes has to wait and sometimes not. This is to check that the readmitting a previously admitted reader doesn't leak any units. (cherry picked from commit `cadc26de38`)	2021-04-30 11:03:09 +03:00
Botond Dénes	960f93383b	reader_concurrency_semaphore: add dump_diagnostics() Allow semaphore related tests to include a diagnostics printout in error messages to help determine why the test failed. (cherry picked from commit `d246e2df0a`)	2021-04-30 09:08:18 +03:00
Botond Dénes	1c0557c638	reader_permit: always forward resources This commit conceptually reverts `4c8ab10`. Said commit was meant to prevent the scenario where memory-only permits -- those that don't pass admission but still consume memory -- completely prevent the admission of reads, possibly even causing a deadlock because a permit might even blocks its own admission. The protection introduced by said commit however proved to be very problematic. It made the status of resources on the permit very hard to reason about and created loopholes via which permits could accumulate without tracking or they could even leak resources. Instead of continuing to patch this broken system, this commit does away with this "protection" based on the observation that deadlocks are now prevented anyway by the admission criteria introduced by `0fe75571d9`, which admits a read anyway when all the initial count resources are available (meaning no admitted reader is alive), regardless of availability of memory. The benefits of this revert is that the semaphore now knows about all the resources and is able to do its job better as it is not "lied to" about resource by the permits. Furthermore the status of a permit's resources is much simpler to reason about, there are no more loopholes in unexpected state transitions to swallow/leak resources. To prove that this revert is indeed safe, in the next commit we add robust tests that stress test admission on a highly contested semaphore. This patch also does away with the registered/admitted differentiation of permits, as this doesn't make much sense anymore, instead these two are unified into a single "active" state. One can always tell whether a permit was admitted or not from whether it owns count resources anyway. (cherry picked from commit `caaa8ef59a`)	2021-04-30 09:08:17 +03:00
Botond Dénes	f23052ae64	test: multishard_mutation_query_test: fuzzy-test: don't consume resource up-front The fuzzy test consumes a large chunk of resource from the semaphore up-front to simulate a contested semaphore. This isn't an accurate simulation, because no permit will have more than 1 units in reality. Furthermore this can even cause a deadlock since `8aaa3a7` as now we rely on all count units being available to make forward progress when memory is scarce. This patch just cuts out this part of the test, we now have a dedicated unit test for checking a heavily contested semaphore, that does it properly, so no need to try to fix this clumsy attempt that is just making trouble at this point. Refs: #8493 Tests: release(multishard_mutation_query_test:fuzzy_test) Signed-off-by: Botond Dénes <bdenes@scylladb.com> Message-Id: <20210429084458.40406-1-bdenes@scylladb.com> (cherry picked from commit `26ae9555d1`)	2021-04-30 08:57:12 +03:00
Botond Dénes	15a157611a	reader_concurrency_semaphore: make admission conditions consistent Currently there are two places where we check admission conditions: `do_wait_admission()` and `signal()`. Both use `has_available_units()` to check resource availability, but the former has some additional resource related conditions on top (in `may_proceed()`), which lead to the two paths working with slightly different conditions. To fix, push down all resource availability related checks to `has_available_units()` to ensure admission conditions are consistent across all paths. (cherry picked from commit `d90cd6402c`)	2021-04-30 08:57:12 +03:00
Eliran Sinvani	d0b82e1e68	Materialized views: fix possibly old views coming from other nodes Migration manager has a function to get a schema (for read or write), this function queries a peer node and retrieves the schema from it. One scenario where it can happen is if an old node queries an old, unfixed index. This makes a hole through which views that are only adjusted for reading can slip through. Here we plug the hole by fixing such views before they are registered. Closes #8509 (cherry picked from commit `480a12d7b3`) Fixes #8554.	2021-04-29 14:03:03 +03:00
Botond Dénes	840ca41393	database: clear inactive reads in stop() If any inactive read is left in the semaphore, it can block `database::stop()` from shutting down, as sstables pinned by these reads will prevent `sstables::sstables_manager::close()` from finishing. This causes a deadlock. It is not clear how inactive reads can be left in the semaphore, as all users are supposed to clean up after themselves. Post 4.4 releases don't have this problem anymore as the inactive read handle was made a RAII object, removing the associated inactive read when destroyed. In 4.4 and earlier release this wasn't so, so errors could be made. Normally this is not a big issue, as these orphaned inactive reads are just evicted when the resources they own are needed, but it does become a serious issue during shutdown. To prevent a deadlock, clear the inactive reads earlier, in `database::stop()` (currently they are cleared in the destructor). This is a simple and foolproof way of ensuring any leftover inactive reads don't cause problems. Fixes: #8561 Tests: unit(dev) Closes #8562	2021-04-28 19:32:46 +03:00
Takuya ASADA	07051f25f2	dist: increase fs.aio-max-nr value for other apps The current fs.aio-max-nr value, cpu_count() * 11026, is exactly what Scylla uses; if other apps in the environment also try to use aio, aio slots will run out. So increase the value by 65536 for other apps. Related #8133 Closes #8228 (cherry picked from commit `53c7600da8`)	2021-04-25 16:15:25 +03:00
Takuya ASADA	8437f71b1b	dist: tune fs.aio-max-nr based on the number of cpus Current aio-max-nr is set up statically to 1048576 in /etc/sysctl.d/99-scylla-aio.conf. This is sufficient for most use cases, but falls short on larger machines such as i3en.24xlarge on AWS that has 96 vCPUs. We need to tune the parameter based on the number of cpus, instead of static setting. Fixes #8133 Signed-off-by: Takuya ASADA <syuu@scylladb.com> Closes #8188 (cherry picked from commit `d0297c599a`)	2021-04-25 16:15:12 +03:00
Avi Kivity	9f32f5a60c	Update seastar submodule (io_queue request size) * seastar 37eb6022fc...61939b5b8a (1): > io_queue: Double max request size Fixes #8496	2021-04-25 12:35:34 +03:00
Avi Kivity	910bc2417a	Update seastar submodule (low bandwidth disks) * seastar a75171fc89...37eb6022fc (2): > io_queue: Honor disks with tiny request rate > io_queue: Shuffle fair_group creation Fixes #8378.	2021-04-21 14:02:15 +03:00
Piotr Jastrzebski	7790beb655	row_cache: remove redundant check in make_reader This check is always true because a dummy entry is added at the end of each cache entry. If that wasn't true, the check in else-if would be an UB. Refs #8435. Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> (cherry picked from commit `cb3dbb1a4b`)	2021-04-20 13:53:23 +02:00
Piotr Jastrzebski	1379f141c2	cache_flat_mutation_reader: fix do_fill_buffer Make sure that when a partition does not exist in underlying, do_fill_buffer does not try to fast forward within this nonexistent partition. Test: unit(dev) Fixes #8435 Fixes #8411 Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com> (cherry picked from commit `1f644df09d`)	2021-04-20 13:53:17 +02:00
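
The two `fs.aio-max-nr` commits listed above (`8437f71b1b` and `07051f25f2`) describe a small calculation: a per-CPU slot count of 11026, plus a reserve of 65536 for other applications, replacing the old static value of 1048576. The following is only a minimal sketch of that arithmetic, not the code from the Scylla scripts. The constants are taken from the commit messages, while the function and constant names are hypothetical.

```python
# Constants quoted in the commit messages; all names here are hypothetical.
AIO_SLOTS_PER_CPU = 11026        # "cpu_count() * 11026" from commit 07051f25f2
RESERVE_FOR_OTHER_APPS = 65536   # "+65536 for other apps" from commit 07051f25f2
OLD_STATIC_LIMIT = 1048576       # previous static value from commit 8437f71b1b


def aio_max_nr(cpu_count: int) -> int:
    """Return a CPU-scaled fs.aio-max-nr value with headroom for other apps."""
    return cpu_count * AIO_SLOTS_PER_CPU + RESERVE_FOR_OTHER_APPS


if __name__ == "__main__":
    for cpus in (8, 96):
        scylla_only = cpus * AIO_SLOTS_PER_CPU
        print(cpus, scylla_only, aio_max_nr(cpus), scylla_only > OLD_STATIC_LIMIT)
```

With 96 vCPUs (the i3en.24xlarge case mentioned in the commit), Scylla's own requirement already exceeds the old static limit, which is why a fixed value fell short on larger machines.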

24962 Commits
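
Commit `b58305d919` in the log above defines a compaction job's weight as the log (base 4) of its total size and then adds a fixed tax before taking the log. The sketch below reproduces the weight ranges quoted in that commit message under two assumptions that the message does not state outright: the weight is the floor of the logarithm, and the tax is 1 MiB (chosen because it matches the listed ranges). All function names are hypothetical and this is not the compaction manager's actual code.

```python
FIXED_SIZE_TAX = 1 << 20  # assumption: 1 MiB, inferred from the listed ranges


def floor_log4(n: int) -> int:
    """Integer floor(log4(n)) for n >= 1, computed without floating point."""
    # floor(log2(n)) is bit_length - 1; halving and flooring gives floor(log4(n)).
    return (n.bit_length() - 1) // 2


def old_weight(total_size: int) -> int:
    """Weight before the change: log base 4 of the job's total size."""
    return floor_log4(total_size)


def new_weight(total_size: int) -> int:
    """Weight after the change: the tax is added before taking the log."""
    return floor_log4(total_size + FIXED_SIZE_TAX)


if __name__ == "__main__":
    K, M = 1 << 10, 1 << 20
    small_jobs = [1 * K, 4 * K, 16 * K, 64 * K, 256 * K]
    print(sorted({old_weight(s) for s in small_jobs}))  # distinct old weights
    print(sorted({new_weight(s) for s in small_jobs}))  # distinct new weights
    print(new_weight(2 * M), new_weight(3 * M), new_weight(15 * M))
```

Under these assumptions, jobs smaller than 1 MiB span five distinct weights with the old definition and a single weight with the new one, which is the effect the commit message describes.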