scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-06-01 12:36:56 +00:00

Author	SHA1	Message	Date
Raphael S. Carvalho	cebe6e22cb	compaction_manager: scrub: switch to table_state Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-07-16 21:35:06 -03:00
Raphael S. Carvalho	c2678ca661	compaction: table_state: add get_sstables_manager() That will be needed for retrieving sstable manager in perform_sstable_upgrade(). Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-07-16 21:35:06 -03:00
Raphael S. Carvalho	7c1d178f4e	compaction_manager: make submit(T) switch to table_state Now that submit() switched to table_state, compaction_reenabler and friends can switch to table_state too. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-07-16 21:35:06 -03:00
Raphael S. Carvalho	43136a3ca7	compaction: table_state: Add is_auto_compaction_disabled_by_user() auto_compaction_disabled_by_user is a configuration that can be enabled or disabled on a particular table. We're adding this interface to avoid having to push the configuration for every compaction_state, which would result in redundant information as the configuration value is the same for all table states. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-07-16 21:35:06 -03:00
Raphael S. Carvalho	1deeeff825	compaction: table_state: Add on_compaction_completion() The idea is that we'll have a single on-completion interface for both "in-strategy" and off-strategy compactions, so as not to pollute table_state with one interface for each. replica::table::on_compaction_completion is being moved into private namespace. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-07-16 21:35:06 -03:00
Raphael S. Carvalho	1520580212	compaction: table_state: Add make_sstable() compaction_manager needs this interface when setting the sstable creation lambda in compaction_descriptor, which is then forwarded into the actual compaction procedure. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-07-16 21:35:06 -03:00
Raphael S. Carvalho	cb05142d58	compaction: Move table::in_strategy_sstables() and switch to table_state in_strategy_sstables() doesn't have to be implemented in table, as it's simply about main set with maintenance and staging files filtered out. Also, let's make it switch to table_state as part of ongoing work. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-07-16 21:35:06 -03:00
Raphael S. Carvalho	23e21ed5bc	compaction: table_state: Add maintenance sstable set Needed for off-strategy compaction. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-07-16 21:35:06 -03:00
Raphael S. Carvalho	f52ad722f3	compaction_manager: rename table_state's get_sstable_set to main_sstable_set With compaction_manager switching to table_state, we'll need to introduce a method in table_state to return maintenance set. So better to have a descriptive name for main set. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-07-13 11:12:33 -03:00
Benny Halevy	6332816ccf	compaction_manager: always register descriptor with fully expired sstables for compaction If the compaction_descriptor returned by time_window_compaction_strategy::get_sstables_for_compaction is marked with has_only_fully_expired::yes it should always be compacted since time_window_compaction_strategy::get_sstables_for_compaction is not idempotent. It sets _last_expired_check and if compaction is postponed and retried before expired_sstable_check_frequency has passed, it will not look for those fully-expired sstables again. Plus, compacting them is the cheapest possible as it does not require reading anything, just deleting the input sstables, so there's no reason to postpone it. Fixes #10989 Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-12 12:04:04 +03:00
Benny Halevy	cfc7a5065a	test: max_ongoing_compaction_test: test serialization of regular compaction with same weight Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-12 12:04:03 +03:00
Benny Halevy	65a5e0a7bb	test: max_ongoing_compaction_test: reindent refactored code Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-12 12:03:32 +03:00
Benny Halevy	5212e81475	test: max_ongoing_compaction_test: define compact_all_tables lambda To test both expired and non-expired sstables scenarios we need to pass this helper function the expected number of sstables before compaction and after compaction. When compacting a set of fully-expired sstables, we expect none to remain, while when the set of sstables is not fully expired, we'll expect 1 output sstable after compaction. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-12 12:00:27 +03:00
Benny Halevy	fe4a59372e	test: max_ongoing_compaction_test: refactor make_table_with_single_fully_expired_sstable So we can use the lower-level building blocks to test compaction serialization of both fully-expired and non-fully-expired sstables scenarios in the following patches. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-12 11:56:41 +03:00
Benny Halevy	d18fc6a7ed	test: max_ongoing_compaction_test: reduce number of tables There is no need to test 100 tables. 10 tables are enough so make the test complete faster. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-12 11:53:01 +03:00
Benny Halevy	e3f561db31	compaction_manager: major_compaction_task: run in maintenance scheduling group We should separate the scheduling groups used for major compaction from the regular compaction scheduling group so that the latter can be affected by the backlog tracker in case backlog accumulates during a long running major compaction. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-07-06 18:18:45 +03:00
Botond Dénes	6c818f8625	Merge 'sstables: generation_type tidy-up' from Michael Livshin - Use `sstables::generation_type` in more places - Enforce conceptual separation of `sstables::generation_type` and `int64_t` - Fix `extremum_tracker` so that `sstables::generation_type` can be non-default-constructible Fixes #10796. Closes #10844 * github.com:scylladb/scylla: sstables: make generation_type an actual separate type sstables: use generation_type more soundly extremum_tracker: do not require default-constructible value types	2022-06-28 08:50:12 +03:00
Benny Halevy	34e9391587	test: sstable_compaction: compaction_manager_for_testing Make the compaction manager for testing using this class. Makes sure to enable the compaction manager and to stop it before it's destroyed. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-06-23 08:02:44 +03:00
Botond Dénes	121900e377	Merge "Sanitize compaction manager construction and stopping" from Pavel Emelyanov " In order to wire-in the compaction_throughput_mb_per_sec the compaction creation and stopping will need to be patched. Right now both places are quite hairy, this set coroutinizes stop() for simpler adding of stopping bits, unifies all the compaction manager constructors and adds the compaction_manager::config for simpler future extending. As a side effect the backlog_controller class gets an "abstract" sched group it controls which in turn will facilitate seastar sched groups unification some day. " * 'br-compaction-manager-start-stop-cleanup' of https://github.com/xemul/scylla: compaction_manager: Introduce compaction_manager::config backlog_controller: Generalize scheduling groups database: Keep compound flushing sched group compaction_manager: Swap groups and controller compaction_manager: Keep compaction_sg on board compaction_manager: Unify scheduling_group structures compaction_manager: Merge static/dynamic constructors compaction_manager: Coroutinuze really_do_stop() compaction_manager: Shuffle really_do_stop() compaction_manager: Remove try-catch around logger	2022-06-21 11:58:13 +03:00
Raphael S. Carvalho	aa667e590e	sstable_set: Fix partitioned_sstable_set constructor The sstable set param isn't being used anywhere, and it's also buggy as sstable run list isn't being updated accordingly. So it could happen that set contains sstables but run list is empty, introducing inconsistency. We're fortunate that the bug wasn't activated as it would've been a hard one to catch. Found this while auditing the code. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220617203438.74336-1-raphaelsc@scylladb.com>	2022-06-21 11:58:13 +03:00
Michael Livshin	ab13127761	sstables: use generation_type more soundly `generation_type` is (supposed to be) conceptually different from `int64_t` (even if physically they are the same), but at present Scylla code still largely treats them interchangeably. In addition to using `generation_type` in more places, we provide (no-op) `generation_value()` and `generation_from_value()` operations to make the smoke-and-mirrors more believable. The churn is considerable, but all mechanical. To avoid even more (way, way more) churn, unit test code is left untreated for now, except where it uses the affected core APIs directly. Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-06-20 19:37:31 +03:00
Pavel Emelyanov	0c8abca75e	compaction_manager: Introduce compaction_manager::config This is to make it constructible in a way most other services are -- all the "scalar" parameters are passed via a config. With this it will be much shorter to add compaction bandwidth throttling option by just extending the config itself, not the list of constructor arguments (and all its callers). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-06-16 17:40:19 +03:00
Pavel Emelyanov	0662036d27	compaction_manager: Unify scheduling_group structures There are two of them with identical content and meaning Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-06-16 17:40:19 +03:00
Pavel Emelyanov	41f1044d3c	compaction_manager: Merge static/dynamic constructors The only difference between those two are in the way backlog controller is created. It's much simpler to have the controller construction logic in compaction manager instead. Similar "trick" is used to construct flush controller for the database. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-06-16 17:40:19 +03:00
Raphael S. Carvalho	079283193a	tests: sstable_compaction_test: Adjust controller unit test for LCS The controller unit test for LCS was only creating level 0 SSTables. As level 0 falls back to STCS controller, it means that we weren't actually testing LCS controller. So let's adjust the unit test to account for LCS fan_out, which is 10 instead of 4, and also allow creation of SSTables on higher levels. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-06-09 14:21:40 -03:00
Pavel Emelyanov	3dab0bfc8d	tests: Remove sstables_manager& from column_family_test_config() It's unused arg in there after last patch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-27 16:37:21 +03:00
Pavel Emelyanov	50e6810536	table, db, tests: Pass sstables_manager& into table constructor In core code there's only one place that constructs table -- in database.cc -- and this place currently has the sstables_manager pointer sitting on table config (although it's a pointer, it's always non-null). All the tests always use the manager from one of _env's out there. For now the new constructor arg is unused. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-05-27 16:27:44 +03:00
Eliran Sinvani	c5e5692a01	compaction_manager: Make invoking the empty constructor more explicit The compaction manager's empty constructor is supposed to be invoked only in testing environment, however, it is easy to invoke it by mistake from production code. Here we add a more verbose constructor and make the default constructor private; the verbose constructor needs to be invoked with a tag, for_testing_tag, which ensures that this constructor is invoked only when intended. The unit tests were changed according to this new paradigm. Tests: unit (dev) Signed-off-by: Eliran Sinvani <eliransin@scylladb.com>	2022-05-18 14:57:10 +03:00
Michael Livshin	3cc2343775	tests: trivial flat_reader_assertions{,_v2} conversions (Which entails temporary cut-and-pasting some utility functions) Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>	2022-05-10 22:10:40 +03:00
Raphael S. Carvalho	48e3117ebc	compaction: move propagate_replacement() into private namespace propagate_replacement() is an internal function that shouldn't be in the public interface. No one besides a unit test for incremental compaction needs it. In the future, I want to revisit incremental compaction unit test to stop using it and only rely on public interfaces. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220506171647.81063-1-raphaelsc@scylladb.com>	2022-05-09 16:49:50 +03:00
Raphael S. Carvalho	736c96cc6f	compaction: LCS: avoid needless work post major compaction completion That's done by picking the ideal level for the input, such that LCS won't have to either promote or demote data, because the output level is not the best candidate for having the size of the output data. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-04-28 20:19:28 -03:00
Pavel Emelyanov	9066224cf4	table: Don't export compaction manager reference There's a public call on replica::table to get back the compaction manager reference. It's not needed, actually. The users of the call are distributed loader which already has database at hand, and a test that creates its own instance of compaction manager for its testing tables and thus also has it available. tests: unit(dev) Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Message-Id: <20220406171351.3050-1-xemul@scylladb.com>	2022-04-07 09:27:45 +03:00
Raphael S. Carvalho	840500fc4d	compaction: Make cleanup for Leveled strategy bucket-aware Bucket awareness in cleanup was introduced in `a69d98c3d0`. STCS and TWCS already support it, and now LCS will receive it. The goal of bucket awareness is to reduce writeamp in cleanup, therefore reducing operation time. Additionally, garbage collection becomes more efficient as shadowed data can now be potentially compacted with the data that shadows it, assuming they're on the same level. The implementation for LCS is simple. Will reuse the procedure for STCS for returning jobs in level 0. And one job will be returned for each non-empty level > 0. What allows us to do it is our incremental selection approach used in compaction, that sets a limit on memory usage and disk space requirement. Fixes #10097. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220331173417.211257-1-raphaelsc@scylladb.com>	2022-04-05 09:10:21 +03:00
Botond Dénes	f8015d9c26	readers: move combined reader into readers/ Since the combined reader family weighs more than 1K SLOC, it gets its own .cc file.	2022-03-30 15:42:51 +03:00
Raphael S. Carvalho	a1fd9c1ee8	tests: sstable_compaction_test: Add test for TWCS' bucket-aware cleanup Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-03-29 09:52:11 -03:00
Raphael S. Carvalho	5312526e5e	test: sstable_compaction_test: Add test for strategy cleanup method Stresses default and STCS implementations of cleanup method Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-03-25 11:23:29 -03:00
Raphael S. Carvalho	c25d8f6770	compaction: Move decision of garbage collection from strategy to task type For compaction to be able to purge expired data, like tombstones, a sstable set snapshot is set in the compaction descriptor. That's a decision that belongs to task type. For example, all regular compactions enable GC, whereas scrub doesn't, for safety reasons. The problem is that the decision is being made by every instantiation of compaction_descriptor in the strategies, which is both unnecessary and also adds lots of boilerplate to the code, making it hard to understand and work with. As sstable set snapshot is an implementation detail, a new method is being added to compaction_descriptor to make the intention clearer, making the interface easier to understand. can_purge_tombstones, used previously by rewrite task only, is being reused for communicating GC intention into task::compact_sstables(). The boilerplate was a pain when adding a new strategy method for the ongoing work on cleanup, described by issue #10097. Another benefit is that we'll now only create a set snapshot when compaction will really run. Before, it could happen that the snapshot would be discarded if the compaction attempt had to be postponed, which is a waste of cpu cycles. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-03-21 12:14:04 -03:00
Raphael S. Carvalho	58e520ab1d	compaction: Move run_off_strategy_compaction() into compaction manager Compaction manager is calling back the table to run off-strategy compaction, but the logic clearly belongs to manager which should perform the operation independently and only call table to update its state with the result. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220315174504.107926-2-raphaelsc@scylladb.com>	2022-03-16 09:55:52 +02:00
Raphael S. Carvalho	fce9d869b4	compaction: Move table::compact_sstables() into compaction manager Table submits compaction request into manager, which in turn calls back table to run the compaction when the time has come, i.e.: table -> compaction manager -> table -> execute compaction But manager should not rely on table to run compaction, as compaction execution procedure sits one layer below the manager and should be accessed directly by it, i.e: table -> compaction manager -> execute compaction This makes code easier to understand and update_compaction_history() can now be noop for unit tests using table_state. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220311023410.250149-1-raphaelsc@scylladb.com>	2022-03-14 15:39:23 +02:00
Mikołaj Sielużycki	1d84a254c0	flat_mutation_reader: Split readers by file and remove unnecessary includes. The flat_mutation_reader files were conflated and contained multiple readers, which were not strictly necessary. Splitting optimizes both iterative compilation times, as touching rarely used readers doesn't recompile large chunks of codebase. Total compilation times are also improved, as the sizes of flat_mutation_reader.hh and flat_mutation_reader_v2.hh have been reduced and those files are included by many files in the codebase. With changes real 29m14.051s user 168m39.071s sys 5m13.443s Without changes real 30m36.203s user 175m43.354s sys 5m26.376s Closes #10194	2022-03-14 13:20:25 +02:00
Botond Dénes	7e0b51ff23	Merge 'Overhaul compaction_manager::task' from Benny Halevy The series overhauls the compaction_manager::task design and implementation by properly layering the functionality between the compaction_manager that deals with generic task execution, and the per-task business logic that is defined in a set of classes derived from the generic task class. While at it, the series introduces `task::state` and a set of helper functions to manage it to prevent leaks in the statistics, fixing #9974. Two more stats counters were exposed: `completed_tasks` and a new `postponed_tasks`. Test: sstable_compaction_test Dtest: compaction_test.py compaction_additional_test.py Fixes #9974 Closes #10122 * github.com:scylladb/scylla: compaction_manager: use coroutine::switch_to compaction_manager::task: drop _compaction_running compaction_manager: move per-type logic to derived task compaction_manager: task: add state enum compaction_manager: task: add maybe_retry compaction_manager: reevaluate_postponed_compactions: mark as noexcept compaction_manager: define derived task types compaction_manager: register_metrics: expose postponed_compactions compaction_manager: register_metrics: expose failed_compactions compaction_manager: register_metrics: expose _stats.completed_tasks compaction: add documentation for compaction_type to string conversions compaction: expose to_string(compaction_type) compaction_manager: task: standardize task description in log messages compaction_manager: refactor can_proceed compaction_manager: pass compaction_manager& to task ctor compaction_manager: use shared_ptr<task> rather than lw_shared_ptr compaction_manager: rewrite_sstables: acquire _maintenance_ops_sem once compaction_manager: use compaction_state::lock only to synchronize major and regular compaction	2022-03-10 13:33:56 +02:00
Benny Halevy	a2a5e530f0	compaction_manager: move per-type logic to derived task Move the business logic into the task specific classes. Separating initialization during task construction, from the compaction_done task, moved into a do_run() method, and in some cases moving a lambda function that was called per table (as in rewrite_sstables) into a private method of the derived class. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 12:20:01 +02:00
Botond Dénes	959483a2dc	test: migrate to the v2 variant of the sstable writer API	2022-03-10 09:16:33 +02:00
Botond Dénes	105bf8888a	sstables: convert mx writer to v2 The sstables::sstable class has two methods for writing sstables: 1) sstable_writer get_writer(...); 2) future<> write_components(flat_mutation_reader, ...); (1) directly exposes the writer type, so we have to update all users of it (there are not that many) in this same patch. We defer updating users of (2) to follow-up commits.	2022-03-10 07:03:49 +02:00
Raphael S. Carvalho	2dba0670ad	compaction: Fix time_window_backlog_tracker::replace_sstables() Introduced in commit: `ddd693c6d7` We're not emplacing newer windows in the tracker, causing std::out_of_range when replacing sstables for windows. Let's fix the logic and add a unit test to cover this. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220301194944.95096-1-raphaelsc@scylladb.com>	2022-03-02 10:08:40 +02:00
Raphael S. Carvalho	2a7939ee4d	tests: Add compaction controller test There's no automated test for controller, it's time to have one. Let's start with a basic one that verifies the assumption that perfectly compacted tiers should produce 0 backlog. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-02-24 18:57:45 -03:00
Raphael S. Carvalho	ddd693c6d7	compaction_backlog_tracker: Batch changes through a new replacement interface This new interface allows table to communicate multiple changes in the SSTable set with a single call, which is useful on compaction completion for example. With this new interface, the size tiered backlog tracker will be able to know when compaction completed, which will allow it to recompute tiers and their backlog contribution, if any. Without it, tiered tracker would have to recompute tiers for every change, which would be terribly expensive. The old remove/add interface are being removed in favor of the new one. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-02-24 15:34:16 -03:00
Avi Kivity	cbba80914d	memtable: move to replica module and namespace Memtables are a replica-side entity, and so are moved to the replica module and namespace. Memtables are also used outside the replica, in two places: - in some virtual tables; this is also in some way inside the replica, (virtual readers are installed at the replica level, not the coordinator), so I don't consider it a layering violation - in many sstable unit tests, as a convenient way to create sstables with known input. This is a layering violation. We could make memtables their own module, but I think this is wrong. Memtables are deeply tied into replica memory management, and trying to make them a low-level primitive (at a lower level than sstables) will be difficult. Not least because memtables use sstables. Instead, we should have a memtable-like thing that doesn't support merging and doesn't have all other funky memtable stuff, and instead replace the uses of memtables in sstable tests with some kind of make_flat_mutation_reader_from_unsorted_mutations() that does the sorting that is the reason for the use of memtables in tests (and live with the layering violation meanwhile). Test: unit (dev) Closes #10120	2022-02-23 09:05:16 +02:00
Avi Kivity	adc08d0ab9	Merge "Drop v1 input support for mutation compactor" from Botond " Currently the mutation compactor supports v1 and v2 input and has a v1 output. The next step is to add a v2 output but this would lead to a full conversion matrix which we want to avoid. So in preparation we drop the v1 input support. Most inputs were already v2, but there were some notable exceptions: tests, the compacting reader and the multishard query code. The former two were a simple mechanical update but the latter required some further work because it turned out the v2 version of evictable reader wasn't used yet and thus it managed to hide some bugs and dropped features. While at it, we migrate all evictable and multishard reader users to the v2 variant of the respective readers and drop the v1 variant completely. With this the road is open to a v2 compactor output and therefore to a v2 sstable writer. Tests: unit(dev, release), dtest(paging_additional_test.py) " * 'compact-mutation-v2-only-input/v5' of https://github.com/denesb/scylla: test/lib/test_utils: return OK from check() variants repair/row_level: use evictable reader v2 db/view/view_updating_consumer: migrate to v2 test/boost/mutation_reader_test: add v2 specific evictable reader tests test: migrate to evictable reader v2 and multishard combining reader v2 compact_mutation: drop support for v1 input test: pass v2 input to mutation_compaction test/boost/mutation_test: simplify test_compaction_data_stream_split test mutation_partition: do_compact(): do drop row tombstones covered by higher order tombstones multishard_mutation_query: migrate to v2 mutation_fragment_v2: range_tombstone_change: add memory_usage() evictable_reader_v2: terminate active range tombstones on reader recreation evictable_reader_v2: restore handling of non-monotonically increasing positions evictable_reader_v2: simplify handling of reader recreation mutation: counter_write_query: use v2 reader mutation: migrate consume() to v2 mutation_fragment_v2,flat_mutation_reader_v2: mirror v1 concept organization mutation_reader: compacting_reader: require a v2 input reader db/view/view_builder: use v2 reader test/lib/flat_mutation_reader_assertions: adjust has_monotonic_positions() to v2 spec	2022-02-21 14:32:55 +02:00
Botond Dénes	284ed9154f	test: pass v2 input to mutation_compaction	2022-02-21 12:29:24 +02:00

120 Commits