scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-28 20:27:03 +00:00

Author	SHA1	Message	Date
Botond Dénes	2ee026f26f	test/manual/sstable_scan_footprint_test: run test body in statement sched group So that queries are processed in said scheduling group and thus they use the user read concurrency semaphore.	2020-09-28 11:27:49 +03:00
Botond Dénes	272a54b81c	test/manual/sstable_scan_footprint_test: move test main code into separate function	2020-09-28 11:27:49 +03:00
Botond Dénes	29861b068e	test/manual/sstable_scan_footprint_test: sprinkle some thread::maybe_yield():s To avoid stalls.	2020-09-28 11:27:49 +03:00
Botond Dénes	daa9fa72f1	test/manual/sstable_scan_footprint_test: make clustering row size configurable So that large-row workloads can be simulated too.	2020-09-28 11:27:49 +03:00
Botond Dénes	2ff326a41a	test/manual/sstable_scan_footprint_test: document sstable related command line arguments	2020-09-28 11:27:49 +03:00
Botond Dénes	ceb308411c	mutation_fragment_test: add exception safety test for mutation_fragment::mutate_as_*()	2020-09-28 11:27:49 +03:00
Botond Dénes	ceb0b02ee8	test: simple_schema: add make_static_row()	2020-09-28 11:27:49 +03:00
Botond Dénes	63578bf0a7	reader_permit: reader_resources: add operator==	2020-09-28 11:27:49 +03:00
Botond Dénes	256140a033	mutation_fragment: memory_usage(): remove unused schema parameter The memory usage is now maintained and updated on each change to the mutation fragment, so it needs not be recalculated on a call to `memory_usage()`, hence the schema parameter is unused and can be removed.	2020-09-28 11:27:47 +03:00
Botond Dénes	041d71bd6f	mutation_fragment: track memory usage through the reader_permit The memory usage of mutation fragments is now tracked through its lifetime through a reader permit. This was the last major (to my current knowledge) untracked piece of the reader pipeline.	2020-09-28 11:27:29 +03:00
Botond Dénes	52662f17ea	reader_permit: resource_units: add permit() and resources() accessors	2020-09-28 11:27:29 +03:00
Botond Dénes	6ca0464af5	mutation_fragment: add schema and permit We want to start tracking the memory consumption of mutation fragments. For this we need schema and permit during construction, and on each modification, so the memory consumption can be recalculated and passed to the permit. In this patch we just add the new parameters and go through the insane churn of updating all call sites. They will be used in the next patch.	2020-09-28 11:27:23 +03:00
Botond Dénes	54357221f0	partition_snapshot_row_cursor: row(): return clustering_row instead of mutation_fragment It is what its callers want anyway.	2020-09-28 10:53:56 +03:00
Botond Dénes	1e6285d776	mutation_fragment: remove as_mutable_end_of_partition() There is nothing to mutate on a partition_end fragment.	2020-09-28 10:53:56 +03:00
Botond Dénes	5079b9ccf1	mutation_fragment: s/as_mutable_partition_start/mutate_as_partition_start/ We will soon want to update the memory consumption of mutation fragment after each modification done to it, to do that safely we have to forbid direct access to the underlying data and instead have callers pass a lambda doing their modifications. Uses where this method was just used to move the fragment away are converted to use `as_partition_start() &&`.	2020-09-28 10:53:56 +03:00
Botond Dénes	72a88e0257	mutation_fragment: s/as_mutable_range_tombstone/mutate_as_range_tombstone/ We will soon want to update the memory consumption of mutation fragment after each modification done to it, to do that safely we have to forbid direct access to the underlying data and instead have callers pass a lambda doing their modifications. Uses where this method was just used to move the fragment away are converted to use `as_range_tombstone() &&`.	2020-09-28 10:53:56 +03:00
Botond Dénes	4f5ccf82cb	mutation_fragment: s/as_mutable_clustering_row/mutate_as_clustering_row/ We will soon want to update the memory consumption of mutation fragment after each modification done to it, to do that safely we have to forbid direct access to the underlying data and instead have callers pass a lambda doing their modifications. Uses where this method was just used to move the fragment away are converted to use `as_clustering_row() &&`.	2020-09-28 10:53:56 +03:00
Botond Dénes	f2b9cad4c6	mutation_fragment: s/as_mutable_static_row/mutate_as_static_row/ We will soon want to update the memory consumption of mutation fragment after each modification done to it, to do that safely we have to forbid direct access to the underlying data and instead have callers pass a lambda doing their modifications. Uses where this method was just used to move the fragment away are converted to use `as_static_row() &&`.	2020-09-28 10:53:56 +03:00
Botond Dénes	0518571e56	flat_mutation_reader: make _buffer a tracked buffer Via a tracked_allocator. Although the memory allocations made by the _buffer shouldn't dominate the memory consumption of the read itself, they can still be a significant portion that scales with the number of readers in the read.	2020-09-28 10:53:56 +03:00
Botond Dénes	77ea44cb73	mutation_reader: extract the two fill_buffer_result into a single one Currently we have two, nearly identical definitions of said struct. Extract it to a common definition and rename it to `remote_fill_buffer_result`.	2020-09-28 10:53:56 +03:00
Botond Dénes	3fab83b3a1	flat_mutation_reader: impl: add reader_permit parameter Not used yet, this patch does all the churn of propagating a permit to each impl. In the next patch we will use it to track the memory consumption of `_buffer`.	2020-09-28 10:53:48 +03:00
Botond Dénes	c1215592da	reader_permit: introduce tracking_allocator This can be used with standard containers and other containers that use the std::allocator interface to track the allocations made by them via a reader_permit.	2020-09-28 08:46:22 +03:00
Botond Dénes	f10abf6e35	reader_permit: reader_resources: add with_memory() factory function To make creating reader resource with just memory more convenient and more readable at the same time.	2020-09-28 08:46:22 +03:00
Botond Dénes	4c8ab10563	reader_permit: only forward resource consumption to semaphore after admission In the next patches we plan to start tracking the memory consumption of the actual allocations made by the circular_buffer<mutation_fragment>, as well as the memory consumed by the mutation fragments. This means that readers will start consuming memory off the permit right after being constructed. Ironically this can prevent the reader from being admitted, due to its own pre-admission memory consumption. To prevent this, hold off on forwarding the memory consumption to the semaphore until the permit is actually admitted.	2020-09-28 08:46:22 +03:00
Botond Dénes	e1eee0dc34	reader_permit: track resource consumed through permit Track all resources consumed through the permit inside the permit. This allows querying how much memory each read is consuming (as there should be one read per permit). Although this might be interesting, especially when debugging OOM cores, the real reason we are doing this is to be able to forward resource consumption to the semaphore only post-admission. More on this in the patch introducing this. Another advantage of tracking resources consumed through the permit is that now we can detect resource leaks in the permit destructor and report them. Even if it is just a case of the holder of the resources wanting to release the resources later, with the permit destroyed it will cause use-after-free.	2020-09-28 08:46:22 +03:00
Botond Dénes	cd953a36fd	reader_permit: move internals to impl In the next patches the reader permit will gain members that are shared across all instances of the same permit. To facilitate this move all internals into an impl class, of which the permit stores a shared pointer. We use a shared_ptr to avoid defining `impl` in the header. This is how the reader permit started in the beginning. We've done a full circle. :)	2020-09-28 08:46:22 +03:00
Botond Dénes	12372731cb	reader_permit: add consume()/signal() And do all consuming and signalling through these methods. These operations will soon be more involved than the simple forwarding they do today, so we want to centralize them to a single method pair.	2020-09-28 08:46:22 +03:00
Botond Dénes	375815e650	reader_permit::resource_units: store permit instead of semaphore In the next patches we want to introduce per-permit resource tracking -- that is, have each permit track the amount of resource consumed through it. For this, we need all consumption to happen through a permit, and not directly with the semaphore.	2020-09-28 08:46:22 +03:00
Botond Dénes	04d83f6678	reader_permit: move resource_units declaration outside the reader_permit class In the next patch we want to store a `reader_permit` instance inside `resource_units` so a full definition of the former must be available.	2020-09-28 08:46:22 +03:00
Botond Dénes	0fe75571d9	reader_concurrency_semaphore: admit one read if no reader is active To ensure progress at all times. This is due to evictable readers, who still hold on to a buffer even when their underlying reader is evicted. As we are introducing buffer and mutation fragment tracking in the next patches, these readers will hold on to memory even in this state, so it may theoretically happen that even though no readers are admitted (all count resources are available) no reader can be admitted due to lack of memory. To prevent such deadlocks we now always admit one reader if all count resources are available.	2020-09-28 08:46:22 +03:00
Botond Dénes	ef0b279c80	reader_concurrency_semaphore: move may_proceed() out-of-line They are only used in the .cc anyway.	2020-09-28 08:46:22 +03:00
Botond Dénes	d692993bdc	mutation_reader_test: test_multishard_combining_reader_non_strictly_monotonic_positions: reset size between buffer fills Current code uses a single counter to produce multiple buffers' worth of data. This uses carry-over from one buffer to the other, which happens to work with the current memory accounting but is very fragile. Account each buffer separately, resetting the counter between them.	2020-09-28 08:46:22 +03:00
Botond Dénes	7e909671f4	view_build_test: test_view_update_generator_deadlock: release semaphore resources The test consumes all resources off the semaphore, leaving just enough to admit a single reader. However this amount is calculated based on the base cost of readers, but as we are going to track reader buffers as well, the amount of memory consumed will be much less predictable. So to make sure background readers can finish during shutdown, release all the consumed resources before leaving scope.	2020-09-28 08:46:22 +03:00
Botond Dénes	122ab1aabd	view_build_test: test_view_update_generator_buffering: fail the test early on exceptions No point in continuing processing the entire buffer once a failure was found. Especially that an early failure might introduce conditions that are not handled in the normal flow-path. We could handle these but there is no point in this added complexity, at this point the test is failed anyway.	2020-09-28 08:46:22 +03:00
Botond Dénes	99388590da	querier_cache_test: test_resources_based_cache_eviction: use semaphore::consume() to drain semaphore It is much more reliable and simple this way, than playing with `reader_permit::wait_for_admission()`.	2020-09-28 08:46:22 +03:00
Botond Dénes	3c73cc2a4e	tests: prepare for permit forwarding consumption post admission Some tests rely on `consume*()` calls on the permit to take effect immediately. Soon this will only be true once the permit has been admitted, so make sure the permit is admitted in these tests.	2020-09-28 08:46:22 +03:00
Botond Dénes	5e5c94b064	test/lib/reader_lifecycle_policy: don't destroy reader context eagerly Currently per-shard reader contexts are cleaned up as soon as the reader itself is destroyed. This causes two problems: * Continuations attached to the reader destroy future might rely on stuff in the context being kept alive -- like the semaphore. * Shard 0's semaphore is special as it will be used to account buffers allocated by the multishard reader itself, so it has to be alive until after all readers are destroyed. This patch changes this so that contexts are destroyed only when the lifecycle policy itself is destroyed.	2020-09-28 08:46:22 +03:00
Takuya ASADA	8366d2231d	scylla_ntp_setup: use chrony on all distributions To simplify scylla_ntp_setup, use chrony on all distributions.	2020-09-27 12:30:02 +03:00
Rafael Ávila de Espíndola	2093efceab	build: Upgrade to seastar API level 5 Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com> Message-Id: <20200923202424.216444-1-espindola@scylladb.com>	2020-09-26 11:07:49 +03:00
Avi Kivity	36d93f586a	Update seastar submodule * seastar e215023c7...292ba734b (4): > future: Fix move of futures of reference type > doc: fix hyper link to tutorial.html > tutorial: fix formatting of code block > README.md: fix the formatting of table	2020-09-25 21:54:44 +03:00
Tomasz Grabiec	97c99ea9f3	Merge "evictable_reader: validate buffer on reader recreation" from Botond The reader recreation mechanism is a very delicate and error-prone one, as proven by the countless bugs it had. Most of these bugs were related to the recreated reader not continuing the read from the expected position, inserting out-of-order fragments into the stream. This patch adds a defense mechanism against such bugs by validating the start position of the recreated reader. The intent is to prevent corrupt data from getting into the system as well as to help catch these bugs as close to the source as possible. Fixes: #7208 Tests: unit(dev), mutation_reader_test:debug (v4) * botond/evictable-reader-validate-buffer/v5: mutation_reader_test: add unit test for evictable reader self-validation evictable_reader: validate buffer after recreation the underlying evictable_reader: update_next_position(): only use peek'd position on partition boundary mutation_reader_test: add unit test for evictable reader range tombstone trimming evictable_reader: trim range tombstones to the read clustering range position_in_partition_view: add position_in_partition_view before_key() overload flat_mutation_reader: add buffer() accessor	2020-09-25 17:02:51 +02:00
Takuya ASADA	eae2aa58fa	dist/common/scripts: move back get_set_nic_and_disks_config_value to scylla_util.py The function was mistakenly moved to scylla_sysconfig_setup, but it is also referenced from scylla_prepare, so move it back to scylla_util.py Fixes #7276 Closes #7280	2020-09-25 13:05:43 +03:00
Botond Dénes	076c27318b	mutation_reader_test: add unit test for evictable reader self-validation Add both positive (where the validation should succeed) and negative (where the validation should fail) tests, covering all validation cases.	2020-09-25 12:09:01 +03:00
Botond Dénes	0b0ae18a14	evictable_reader: validate buffer after recreation the underlying The reader recreation mechanism is a very delicate and error-prone one, as proven by the countless bugs it had. Most of these bugs were related to the recreated reader not continuing the read from the expected position, inserting out-of-order fragments into the stream. This patch adds a defense mechanism against such bugs by validating the start position of the recreated reader. Several things are checked: * The partition is the expected one -- the one we were in the middle of or the next if we stopped at partition boundaries. * The partition is in the read range. * The first fragment in the partition is the expected one -- has an equal or larger position than the next expected fragment. * The fragment is in the clustering range as defined by the slice. As these validations are only done on the slow-path of recreating an evicted reader, no performance impact is expected.	2020-09-25 12:09:00 +03:00
Botond Dénes	91020eef73	evictable_reader: update_next_position(): only use peek'd position on partition boundary `evictable_reader::update_next_position()` is used to record the position the reader will continue from, in the next buffer fill. This position is used to create the partition slice when the underlying reader is evicted and has to be recreated. There is an optimization in this method -- if the underlying's buffer is not empty we peek at the first fragment in it and use it as the next position. This is however problematic for buffer validation on reader recreation (introduced in the next patch), because using the next row's position as the next pos will allow for range tombstones to be emitted with before_key(next_pos.key()), which will trigger the validation. Instead of working around this, just drop this optimization for mid-partition positions, it is inconsequential anyway. We keep it for where it is important, when we detect that we are at a partition boundary. In this case we can avoid reading the current partition altogether when recreating the reader.	2020-09-25 12:09:00 +03:00
Botond Dénes	d1b0573e1c	mutation_reader_test: add unit test for evictable reader range tombstone trimming	2020-09-25 12:09:00 +03:00
Botond Dénes	4f2e7a18e2	evictable_reader: trim range tombstones to the read clustering range Currently mutation sources are allowed to emit range tombstones that are out-of the clustering read range if they are relevant to it. For example a read of a clustering range [ck100, +inf), might start with: range_tombstone{start={ck1, -1}, end={ck200, 1}}, clustering_row{ck100} The range tombstone is relevant to the range and the first row of the range so it is emitted as first, but its position (start) is outside the read range. This is normally fine, but it poses a problem for evictable reader. When the underlying reader is evicted and has to be recreated from a certain clustering position, this results in out-of-order mutation fragments being inserted into the middle of the stream. This is not fine anymore as the monotonicity guarantee of the stream is violated. The real solution would be to require all mutation sources to trim range tombstones to their read range, but this is a lot of work. Until that is done, as a workaround we do this trimming in the evictable reader itself.	2020-09-25 12:09:00 +03:00
Botond Dénes	d7d93aef49	position_in_partition_view: add position_in_partition_view before_key() overload	2020-09-25 12:09:00 +03:00
Avi Kivity	f1fcf4f139	Update seastar submodule * seastar 9ae33e67e1...e215023c78 (4): > future: Make futures non variadic > on_internal_error: add noexcept variant > Convert another std::result_of to std::invoke_result > reactor: remove unused declaration abort_on_error()	2020-09-24 20:04:03 +03:00
Tomasz Grabiec	14fdd2f501	Merge "Gossip echo message improvement" from Asias This series improves gossip echo message handling in a loaded cluster. Refs: #7197 * git://github.com/asias/scylla.git gossip_echo_improve_7197: gossiper: Handle echo message on any shard gossiper: Increase echo message timeout gossiper: Remove unused _last_processed_message_at	2020-09-24 15:13:55 +02:00


23755 Commits