The memory usage is now maintained and updated on each change to the
mutation fragment, so it need not be recalculated on a call to
`memory_usage()`; hence the schema parameter is unused and can be
removed.
The memory usage of mutation fragments is now tracked throughout their
lifetime via a reader permit. This was the last major (to my current
knowledge) untracked piece of the reader pipeline.
We want to start tracking the memory consumption of mutation fragments.
For this we need the schema and permit during construction, and on each
modification, so the memory consumption can be recalculated and passed
to the permit.
In this patch we just add the new parameters and go through the insane
churn of updating all call sites. They will be used in the next patch.
We will soon want to update the memory consumption of a mutation
fragment after each modification done to it. To do that safely we have
to forbid direct access to the underlying data and instead have callers
pass a lambda doing their modifications.
Call sites that only used this method to move the fragment away are
converted to use `as_mutation_start() &&`.
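As a rough illustration of the pattern (not the actual Scylla
interface; the fragment and member names below are invented), a plain
mutable accessor can be split into an rvalue-qualified consuming
overload plus a lambda-taking mutator that recomputes the tracked size
right after the caller's modifications:

    // Hypothetical sketch of the accessor split described above.
    #include <cstddef>
    #include <utility>

    struct clustering_row {
        std::size_t external_memory_usage() const { return 0; } // placeholder
    };

    class fragment {
        clustering_row _row;
        std::size_t _memory = 0; // kept up to date on every change

    public:
        // Consuming accessor: only callable on rvalues, so callers can
        // still move the payload out but cannot mutate it in place.
        clustering_row as_clustering_row() && { return std::move(_row); }

        // Mutation goes through a lambda, so the tracked size can be
        // recalculated immediately after the modification.
        template <typename Func>
        void mutate_as_clustering_row(Func&& func) {
            std::forward<Func>(func)(_row);
            _memory = sizeof(*this) + _row.external_memory_usage();
        }
    };

The move-away case then reads as `std::move(frag).as_clustering_row()`,
which is what the `as_*() &&` conversions in these patches amount to.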
We will soon want to update the memory consumption of a mutation
fragment after each modification done to it. To do that safely we have
to forbid direct access to the underlying data and instead have callers
pass a lambda doing their modifications.
Call sites that only used this method to move the fragment away are
converted to use `as_range_tombstone() &&`.
We will soon want to update the memory consumption of a mutation
fragment after each modification done to it. To do that safely we have
to forbid direct access to the underlying data and instead have callers
pass a lambda doing their modifications.
Call sites that only used this method to move the fragment away are
converted to use `as_clustering_row() &&`.
We will soon want to update the memory consumption of a mutation
fragment after each modification done to it. To do that safely we have
to forbid direct access to the underlying data and instead have callers
pass a lambda doing their modifications.
Call sites that only used this method to move the fragment away are
converted to use `as_static_row() &&`.
Via a tracked_allocator. Although the memory allocations made by the
_buffer shouldn't dominate the memory consumption of the read itself,
they can still be a significant portion that scales with the number of
readers in the read.
Not used yet, this patch does all the churn of propagating a permit
to each impl.
In the next patch we will use it to track the memory consumption of
`_buffer`.
This can be used with standard containers, and with any other container
using the std::allocator interface, to track the allocations they make
via a reader_permit.
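A minimal sketch of the idea, assuming C++20 and a permit-like object
exposing consume/signal calls for memory (the names below are
illustrative, not the actual reader_permit API):

    #include <cstddef>
    #include <memory>
    #include <vector>

    // Stand-in for the real permit; only what the allocator needs.
    struct permit {
        std::size_t consumed = 0;
        void consume_memory(std::size_t n) { consumed += n; }
        void signal_memory(std::size_t n) { consumed -= n; }
    };

    // std::allocator-compatible adaptor reporting every (de)allocation.
    template <typename T>
    class tracking_allocator {
        permit* _permit;
        template <typename U> friend class tracking_allocator;
    public:
        using value_type = T;

        explicit tracking_allocator(permit& p) noexcept : _permit(&p) {}
        template <typename U>
        tracking_allocator(const tracking_allocator<U>& o) noexcept : _permit(o._permit) {}

        T* allocate(std::size_t n) {
            T* p = std::allocator<T>{}.allocate(n);
            _permit->consume_memory(n * sizeof(T));
            return p;
        }
        void deallocate(T* p, std::size_t n) noexcept {
            std::allocator<T>{}.deallocate(p, n);
            _permit->signal_memory(n * sizeof(T));
        }

        friend bool operator==(const tracking_allocator&, const tracking_allocator&) = default;
    };

    // Usage: a container whose allocations are charged to the permit.
    // permit p;
    // std::vector<int, tracking_allocator<int>> v{tracking_allocator<int>(p)};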
In the next patches we plan to start tracking the memory consumption of
the actual allocations made by the circular_buffer<mutation_fragment>,
as well as the memory consumed by the mutation fragments.
This means that readers will start consuming memory off the permit right
after being constructed. Ironically this can prevent the reader from
being admitted, due to its own pre-admission memory consumption. To
prevent this, hold off on forwarding the memory consumption to the
semaphore until the permit is actually admitted.
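The mechanism could look roughly like this sketch (hypothetical names,
not the actual reader_permit implementation): consumption registered
before admission is accumulated locally and only flushed to the
semaphore once the permit is admitted.

    #include <cstddef>

    // Stand-in for the semaphore's memory accounting.
    struct semaphore {
        std::size_t used = 0;
        void consume(std::size_t n) { used += n; }
        void signal(std::size_t n) { used -= n; }
    };

    // Hypothetical permit that defers forwarding until admission.
    class permit {
        semaphore& _sem;
        bool _admitted = false;
        std::size_t _pending = 0;   // consumed before admission
    public:
        explicit permit(semaphore& sem) : _sem(sem) {}

        void consume_memory(std::size_t n) {
            if (_admitted) {
                _sem.consume(n);    // forward immediately once admitted
            } else {
                _pending += n;      // otherwise just remember it locally
            }
        }

        void on_admission() {
            _admitted = true;
            _sem.consume(_pending); // flush the pre-admission consumption
            _pending = 0;
        }
    };

This way a reader's own pre-admission buffer cannot count against its
admission.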
Track all resources consumed through the permit inside the permit. This
allows querying how much memory each read is consuming (as there should
be one read per permit). Although this might be interesting, especially
when debugging OOM cores, the real reason we are doing this is to be
able to forward resource consumption to the semaphore only
post-admission. More on this in the patch introducing it.
Another advantage of tracking resources consumed through the permit is
that we can now detect resource leaks in the permit destructor and
report them. Even if it is just a case of the holder of the resources
wanting to release them later, releasing them after the permit is
destroyed would be a use-after-free.
In the next patches the reader permit will gain members that are shared
across all instances of the same permit. To facilitate this, move all
internals into an impl class, of which the permit stores a shared
pointer. We use a shared_ptr to avoid defining `impl` in the header.
This is how the reader permit started out in the beginning. We've come
full circle. :)
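The shape of the change is roughly the following (an illustrative
sketch, not the exact Scylla declarations):

    // reader_permit.hh -- `impl` is only forward-declared here, so its
    // definition can stay in the .cc file.
    #include <cstddef>
    #include <memory>

    class reader_permit {
        class impl;                   // defined in reader_permit.cc
        std::shared_ptr<impl> _impl;  // shared by all copies of the permit
    public:
        explicit reader_permit(std::shared_ptr<impl> i) : _impl(std::move(i)) {}
        void consume_memory(std::size_t n);
    };

    // reader_permit.cc
    // class reader_permit::impl { ... state shared by all copies ... };
    // void reader_permit::consume_memory(std::size_t n) { /* _impl->... */ }

Unlike unique_ptr, shared_ptr type-erases its deleter at construction
time, so the header compiles and the permit can be destroyed without
the definition of `impl` being visible.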
And do all consuming and signalling through these methods. These
operations will soon be more involved than the simple forwarding they do
today, so we want to centralize them in a single method pair.
In the next patches we want to introduce per-permit resource tracking --
that is, have each permit track the amount of resources consumed through
it. For this, we need all consumption to happen through a permit, and
not directly with the semaphore.
To ensure progress at all times. This is due to evictable readers, which
still hold on to a buffer even when their underlying reader is evicted.
As we are introducing buffer and mutation fragment tracking in the next
patches, these readers will hold on to memory even in this state, so it
may theoretically happen that even though no readers are admitted (all
count resources are available), no reader can be admitted due to lack of
memory. To prevent such deadlocks we now always admit one reader if all
count resources are available.
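A sketch of the resulting admission rule (field names invented for
illustration): admit when the request fits, or when every count unit is
free, regardless of how much memory is still held.

    #include <cstddef>

    struct resources {
        int count;
        std::size_t memory;
    };

    struct semaphore_state {
        resources total;      // configured limits
        resources available;  // currently free

        // Admit if the request fits, or -- to guarantee progress -- if
        // nothing is admitted at all (all count units are free), even
        // when evicted-but-buffering readers still hold memory.
        bool can_admit(const resources& requested) const {
            bool fits = requested.count <= available.count
                     && requested.memory <= available.memory;
            bool nothing_admitted = available.count == total.count;
            return fits || nothing_admitted;
        }
    };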
Current code uses a single counter to produce multiple buffers' worth of
data. This relies on carry-over from one buffer to the other, which
happens to work with the current memory accounting but is very fragile.
Account for each buffer separately, resetting the counter between them.
The test consumes all resources off the semaphore, leaving just enough
to admit a single reader. However, this amount is calculated based on
the base cost of readers, and as we are going to track reader buffers as
well, the amount of memory consumed will become much less predictable.
So, to make sure background readers can finish during shutdown, release
all the consumed resources before leaving scope.
There is no point in continuing to process the entire buffer once a
failure was found. Especially since an early failure might introduce
conditions that are not handled in the normal flow path. We could handle
these, but there is no point in the added complexity; at this point the
test has failed anyway.
Some tests rely on `consume*()` calls on the permit taking effect
immediately. Soon this will only be true once the permit has been
admitted, so make sure the permit is admitted in these tests.
Currently per-shard reader contexts are cleaned up as soon as the reader
itself is destroyed. This causes two problems:
* Continuations attached to the reader's destroy future might rely on
parts of the context being kept alive -- like the semaphore.
* Shard 0's semaphore is special, as it will be used to account for
buffers allocated by the multishard reader itself, so it has to stay
alive until after all readers are destroyed.
This patch changes this so that contexts are destroyed only when the
lifecycle policy itself is destroyed.
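In sketch form (invented names, not the actual lifecycle policy
interface), the policy rather than the readers now owns the per-shard
contexts, so they live until the policy itself is destroyed:

    #include <memory>
    #include <vector>

    struct reader_context_sketch { /* semaphore, ... */ };

    class lifecycle_policy_sketch {
        std::vector<std::shared_ptr<reader_context_sketch>> _contexts;
    public:
        std::shared_ptr<reader_context_sketch> get_or_create(unsigned shard) {
            if (_contexts.size() <= shard) {
                _contexts.resize(shard + 1);
            }
            if (!_contexts[shard]) {
                _contexts[shard] = std::make_shared<reader_context_sketch>();
            }
            return _contexts[shard];
        }
        // The contexts are destroyed together with the policy, not when
        // the readers using them are destroyed.
    };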
* seastar e215023c7...292ba734b (4):
> future: Fix move of futures of reference type
> doc: fix hyper link to tutorial.html
> tutorial: fix formatting of code block
> README.md: fix the formatting of table
The reader recreation mechanism is a very delicate and error-prone one,
as proven by the countless bugs it had. Most of these bugs were related
to the recreated reader not continuing the read from the expected
position, inserting out-of-order fragments into the stream.
This patch adds a defense mechanism against such bugs by validating the
start position of the recreated reader.
The intent is to prevent corrupt data from getting into the system as
well as to help catch these bugs as close to the source as possible.
Fixes: #7208
Tests: unit(dev), mutation_reader_test:debug (v4)
* botond/evictable-reader-validate-buffer/v5:
mutation_reader_test: add unit test for evictable reader self-validation
evictable_reader: validate buffer after recreation the underlying
evictable_reader: update_next_position(): only use peek'd position on partition boundary
mutation_reader_test: add unit test for evictable reader range tombstone trimming
evictable_reader: trim range tombstones to the read clustering range
position_in_partition_view: add position_in_partition_view before_key() overload
flat_mutation_reader: add buffer() accessor
The reader recreation mechanism is a very delicate and error-prone one,
as proven by the countless bugs it had. Most of these bugs were related
to the recreated reader not continuing the read from the expected
position, inserting out-of-order fragments into the stream.
This patch adds a defense mechanism against such bugs by validating the
start position of the recreated reader. Several things are checked:
* The partition is the expected one -- the one we were in the middle of
or the next if we stopped at partition boundaries.
* The partition is in the read range.
* The first fragment in the partition is the expected one -- it has an
equal or larger position than the next expected fragment.
* The fragment is in the clustering range as defined by the slice.
As these validations are only done on the slow-path of recreating an
evicted reader, no performance impact is expected.
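A much simplified sketch of such a check (the position and fragment
types below are placeholders, not the actual position_in_partition and
mutation_fragment interfaces):

    #include <stdexcept>

    struct position { long value; };
    struct fragment { position pos; };

    // Validate that the recreated reader resumes at or after the
    // position we expect, failing the read instead of letting an
    // out-of-order fragment into the stream.
    inline void validate_resume_position(const fragment& first,
                                         const position& expected_next) {
        if (first.pos.value < expected_next.value) {
            throw std::runtime_error(
                "evictable reader restarted from an unexpected position");
        }
    }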
`evictable_reader::update_next_position()` is used to record the position the
reader will continue from in the next buffer fill. This position is used to
create the partition slice when the underlying reader is evicted and has
to be recreated. There is an optimization in this method -- if the
underlying reader's buffer is not empty we peek at the first fragment in
it and use its position as the next position. This is however
problematic for buffer validation on reader recreation (introduced in
the next patch), because using the next row's position as the next
position allows range tombstones to be emitted with
before_key(next_pos.key()), which would trigger a validation failure.
Instead of working around this, just drop this optimization for
mid-partition positions; it is inconsequential anyway.
We keep it where it is important: when we detect that we are at a
partition boundary. In this case we can avoid reading the current
partition altogether when recreating the reader.
Currently mutation sources are allowed to emit range tombstones that are
outside the clustering read range if they are relevant to it. For
example, a read of the clustering range [ck100, +inf) might start with:
range_tombstone{start={ck1, -1}, end={ck200, 1}},
clustering_row{ck100}
The range tombstone is relevant to the range and to its first row, so it
is emitted first, but its position (its start) is outside the read
range. This is normally fine, but it poses a problem for the evictable
reader. When the underlying reader is evicted and has to be recreated
from a certain clustering position, this results in out-of-order
mutation fragments being inserted into the middle of the stream. That is
not fine anymore, as the monotonicity guarantee of the stream is
violated. The real solution would be to require all mutation sources to
trim range tombstones to their read range, but that is a lot of work.
Until that is done, as a workaround, we do this trimming in the
evictable reader itself.
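The workaround amounts to clamping the tombstone's bounds to the read
range before emitting it; roughly (positions are modelled as plain
integers here, while the real code compares position_in_partition
values with a schema-aware comparator):

    struct range_tombstone_sketch {
        int start;
        int end;
    };

    struct clustering_range_sketch {
        int start;
        int end;
    };

    // Clamp the tombstone so its positions never fall outside the read
    // range; the deletion it represents within the range is unchanged.
    inline range_tombstone_sketch trim_to_range(range_tombstone_sketch rt,
                                                const clustering_range_sketch& range) {
        if (rt.start < range.start) {
            rt.start = range.start;
        }
        if (rt.end > range.end) {
            rt.end = range.end;
        }
        return rt;
    }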
The "mode" variable name is used everywhere, usually in a loop.
Therefore, rename the global "mode" to "checkheaders_mode" so that if
your code block happens to be outside of a loop, you don't accidentally
use the globally visible "mode" and spend hours debugging why it's
always "dev".
Spotted by Yaron Kaikov.
Message-Id: <20200924112237.315817-1-penberg@scylladb.com>
The script pull_github_pr.sh uses git merge's "--log" option to put in
the merge commit the list of titles of the individual patches being
merged in. This list is useful when later searching the log for the merge
which introduced a specific feature.
Unfortunately, "--log" defaults to cutting off the list of commit titles
at 20 lines. For most merges involving fewer than 20 commits, this makes
no difference. But some merges include more than 20 commits, and get
a truncated list, for no good reason. If someone worked hard to create a
patch set with 40 patches, the last thing we should be worried about is
that the merge commit message will be 20 lines longer.
Unfortunately, there appears to be no way to tell "--log" to not limit
the length at all. So I chose an arbitrary limit of 1000. I don't think
we ever had a patch set in Scylla which exceeded that limit. Yet :-)
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Message-Id: <20200924114403.817893-1-nyh@scylladb.com>
This patch fixes a race between two methods in hints manager: drain_for
and store_hint.
The first method is called when a node leaves the cluster, and it
'drains' end point hints manager for that node (sends out all hints for
that node). If this method is called when the local node is being
decommissioned or removed, it instead drains hints managers for all
endpoints.
In the case of decommission/remove, drain_for first calls
parallel_for_each on all current ep managers and tells them to drain
their hints. Then, after all of them complete, _ep_managers.clear() is
called.
End point hints managers are created lazily and inserted into
_ep_managers map the first time a hint is stored for that node. If
this happens between parallel_for_each and _ep_managers.clear()
described above, the clear operation will destroy the new ep manager
without draining it first. This is a bug and will trigger an assert in
ep manager's destructor.
To solve this, a new flag for the hints manager is added which is set
when it drains all ep managers on removenode/decommission, and prevents
further hints from being written.
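The fix boils down to a check like the following (a sketch with made-up
names, not the actual hints manager code): once draining has started,
store_hint refuses to lazily create new per-endpoint managers.

    #include <map>
    #include <string>

    // Greatly simplified sketch of the idea behind the fix.
    class hints_manager_sketch {
        std::map<std::string, int> _ep_managers; // endpoint -> state
        bool _draining_all = false;  // set on decommission/removenode

    public:
        bool store_hint(const std::string& endpoint) {
            if (_draining_all) {
                // Refuse to create a per-endpoint manager that the
                // ongoing drain would destroy without draining it.
                return false;
            }
            _ep_managers.try_emplace(endpoint, 0);
            return true;
        }

        void drain_all() {
            _draining_all = true;
            // ... drain all existing per-endpoint managers, then:
            _ep_managers.clear();
        }
    };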
Fixes #7257
Closes #7278
Currently, sstable_manager is used to create sstables, but it loses track
of them immediately afterwards. This series makes an sstable's life fully
contained within its sstable_manager.
The first practical impact (implemented in this series) is that file removal
stops being a background job; instead it is tracked by the sstable_manager,
so when the sstable_manager is stopped, you know that all of its sstable
activity is complete.
Later, we can make use of this to track the data size on disk, but this is not
implemented here.
Closes #7253
* github.com:scylladb/scylla:
sstables: remove background_jobs(), await_background_jobs()
sstables: make sstables_manager take charge of closing sstables
test: test_env: hold sstables_manager with a unique_ptr
test: drop test_sstable_manager
test: sstables::test_env: take ownership of manager
test: broken_sstable_test: prepare for asynchronously closed sstables_manager
test: sstable_utils: close test_env after use
test: sstable_test: dont leak shared_sstable outside its test_env's lifetime
test: sstables::test_env: close self in do_with helpers
test: perf/perf_sstable.hh: prepare for asynchronously closed sstables_manager
test: view_build_test: prepare for asynchronously closed sstables_manager
test: sstable_resharding_test: prepare for asynchronously closed sstables_manager
test: sstable_mutation_test: prepare for asynchronously closed sstables_manager
test: sstable_directory_test: prepare for asynchronously closed sstables_manager
test: sstable_datafile_test: prepare for asynchronously closed sstables_manager
test: sstable_conforms_to_mutation_source_test: remove references to test_sstables_manager
test: sstable_3_x_test: remove test_sstables_manager references
test: schema_changes_test: drop use of test_sstables_manager
mutation_test: adjust for column_family_test_config accepting an sstables_manager
test: lib: sstable_utils: stop using test_sstables_manager
test: sstables test_env: introduce manager() accessor
test: sstables test_env: introduce do_with_async_sharded()
test: sstables test_env: introduce do_with_async_returning()
test: lib: sstable test_env: prepare for life as a sharded<> service
test: schema_changes_test: properly close sstables::test_env
test: sstable_mutation_test: avoid constructing temporary sstables::test_env
test: mutation_reader_test: avoid constructing temporary sstables::test_env
test: sstable_3_x_test: avoid constructing temporary sstables::test_env
test: lib: test_services: pass sstables_manager to column_family_test_config
test: lib: sstables test_env: implement tests_env::manager()
test: sstable_test: detemplate write_and_validate_sst()
test: sstable_test_env: detemplate do_with_async()
test: sstable_datafile_test: drop bad 'return'
table: clear sstable set when stopping
table: prevent table::stop() race with table::query()
database: close sstable_manager:s
sstables_manager: introduce a stub close()
sstable_directory_test: fix threading confusion in make_sstable_directory_for*() functions
test: sstable_datafile_test: reorder table stop in compaction_manager_test
test: view_build_test: test_view_update_generator_register_semaphore_unit_leak: do not discard future in timer
test: view_build_test: fix threading in test_view_update_generator_register_semaphore_unit_leak
view: view_update_generator: drop references to sstables when stopping