scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-01 21:55:50 +00:00

Author	SHA1	Message	Date
Avi Kivity	bfc521ee9c	Merge "Activate compaction_throughput_mb_per_sec option" from Pavel E " The option controls the IO bandwidth of the compaction sched class. It's now set to be 16MB/s, but is unused. This set makes it 0 by default (which means unlimited), live-updateable and plugs it to the seastar sched group IO throttling. branch: https://github.com/xemul/scylla/tree/br-compaction-throttling-3 tests: unit(dev), v2: https://jenkins.scylladb.com/job/releng/job/Scylla-CI/1010/ , v2: manual config update " * 'br-compaction-throttling-3-a' of https://github.com/xemul/scylla: compaction_manager: Add compaction throughput limit updateable_value: Support dummy observing serialized_action: Allow being observer for updateable_value config: Tune the config option	2022-07-07 13:14:07 +03:00
Pavel Emelyanov	b112a98318	compaction_manager: Add compaction throughput limit Re-use existing compaction_throughput_mb_per_sec option, push it down to compaction manager via config and update the underlying compaction sched class when the option is (live)updated. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-07-06 08:17:08 +03:00
Pavel Emelyanov	af026e423e	compaction_manager: Add logging around drain Now we know when it starts and whe^w if it finishes Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-07-01 17:17:53 +03:00
Pavel Emelyanov	a9d6e5cfb6	compaction_manager: Coroutinize drain It's short enough to fix indentation right at once Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-07-01 17:17:53 +03:00
Benny Halevy	8bccd5e9c5	compaction_manager: task: acquire_semaphore: handle abort_requested_exception Change `8f39547d89` added `handle_exception_type([] (const semaphore_aborted& e) {})`, but it turned out that `named_semaphore_aborted` isn't derived from `semaphore_aborted`, but rather from `abort_requested_exception` so handle the base exception instead. Fixes #10666 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #10881	2022-06-27 09:47:48 +03:00
Benny Halevy	a65ed19edc	table: perform_offstrategy_compaction: move off-strategy logic to compaction_manager compaction_manager needs to decide about running off-strategy compaction or not based on the maintenance_set, not partly in table::trigger_offstrategy_compaction and partly in the compaction_manager layer as it is done today. So move the logic down to perform_offstrategy that now returns future<bool> to return true iff it performed offstrategy compaction. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-06-23 08:18:17 +03:00
Benny Halevy	9079c98db0	compaction_manager: offstrategy_compaction_task: refactor log printouts Move logging from run_offstrategy_compaction to do_run so that in the next patch we can skip run_offstrategy_compaction if the maintenance set is empty (but still log it, for the sake of dtests). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-06-23 08:02:44 +03:00
Pavel Emelyanov	0c8abca75e	compaction_manager: Introduce compaction_manager::config This is to make it constructible in a way most other services are -- all the "scalar" parameters are passed via a config. With this it will be much shorter to add compaction bandwidth throttling option by just extending the config itself, not the list of constructor arguments (and all its callers). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-06-16 17:40:19 +03:00
Pavel Emelyanov	997a34bf8c	backlog_controller: Generalize scheduling groups Make struct scheduling_group be sub-class of the backlog controller. Its new meaning is now -- the group under controller maintenance. Both database and compaction manager derive their sched groups from this one. This makes backlog controller construction simpler, prepares the ground for sched groups unification in seastar and facilitates next patch. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-06-16 17:40:19 +03:00
Pavel Emelyanov	0fef2e0273	compaction_manager: Swap groups and controller To have groups initialized before controller. Makes next patch shorter Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-06-16 17:40:19 +03:00
Pavel Emelyanov	fbb59fc920	compaction_manager: Keep compaction_sg on board This is mainly to make next patch simpler. Also this makes the backlog controller API smaller by removing its sg() method. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-06-16 17:40:19 +03:00
Pavel Emelyanov	0662036d27	compaction_manager: Unify scheduling_group structures There are two of them with identical content and meaning Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-06-16 17:40:19 +03:00
Pavel Emelyanov	41f1044d3c	compaction_manager: Merge static/dynamic constructors The only difference between those two are in the way backlog controller is created. It's much simpler to have the controller construction logic in compaction manager instead. Similar "trick" is used to construct flush controller for the database. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-06-16 17:40:19 +03:00
Pavel Emelyanov	2dbf0b5248	compaction_manager: Coroutinize really_do_stop() This way it's more compact and easier to extend. Also it's small enough to fix indentation right at once. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-06-16 17:40:19 +03:00
Pavel Emelyanov	bbd9fc26cd	compaction_manager: Shuffle really_do_stop() Make it the future-returning method and setup the _stop_future in its only caller. Makes next patch much simpler Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-06-16 17:40:19 +03:00
Pavel Emelyanov	b19b8c9e5b	compaction_manager: Remove try-catch around logger Logging functions are all noexcept already Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2022-06-16 17:40:19 +03:00
Mikołaj Sielużycki	db5b05948b	compaction: Clarify comment. Closes #10799	2022-06-15 15:09:44 +03:00
Benny Halevy	8f39547d89	compaction_manager: task: convert semaphore_aborted to compaction_stopped exception Fixes #10666 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes #10686	2022-06-13 16:20:39 +03:00
Mikołaj Sielużycki	4143878558	compaction: Release compaction weight before updating history. update_history can take a long time compared to compaction, as a call issued on shard S1 can be handled on shard S2. If the other shard is under heavy load, we may unnecessarily block kicking off a new compaction. Normally it isn't a problem, as compactions aren't super frequent, but there were edge cases where the described behaviour caused compaction to fail to keep up with excessive flushing, leading to too many sstables on disk and OOM during a read. There is no need to wait with next compaction until history is updated, so release the weight earlier to remove unnecessary serialization. Compaction is marked as finished as soon as sstables are compacted (without waiting for history update).	2022-06-07 12:55:28 +02:00
Mikołaj Sielużycki	5ce1fd1574	compaction: Inline compact_sstables_and_update_history call. This commit introduces no functional changes and exists solely for clarity of the change in the subsequent commit.	2022-06-07 12:55:28 +02:00
Mikołaj Sielużycki	533552273a	compaction: Extract compact_sstables function	2022-06-07 12:55:28 +02:00
Mikołaj Sielużycki	33c5802957	compaction: Rename compact_sstables to compact_sstables_and_update_history	2022-06-07 12:55:28 +02:00
Mikołaj Sielużycki	9572520d0d	compaction: Extract update_history function	2022-06-07 12:55:28 +02:00
Mikołaj Sielużycki	537819b7f8	compaction: Extract should_update_history function.	2022-06-07 12:55:28 +02:00
Mikołaj Sielużycki	447bd8a2e0	compaction: Fetch start_size from compaction_result The start size is calculated during compaction and returned from sstables::compact_sstables, so there is no need to do it twice.	2022-06-07 12:55:28 +02:00
Avi Kivity	4b53af0bd5	treewide: replace parallel_for_each with coroutine::parallel_for_each in coroutines coroutine::parallel_for_each avoids an allocation and is therefore preferred. The lifetime of the function object is less ambiguous, and so it is safer. Replace all eligible occurrences (i.e. caller is a coroutine). One case (storage_service::node_ops_cmd_heartbeat_updater()) needed a little extra attention since there was a handle_exception() continuation attached. It is converted to a try/catch. Closes #10699	2022-05-31 09:06:24 +03:00
Raphael S. Carvalho	b120cacdd1	compaction_manager: Allow off-strategy to proceed in parallel to in-strategy compactions Off-strategy works on maintenance sstable set using maintenance scheduling group, whereas "in-strategy" works on main sstable set and uses compaction group. Today, it can happen that off-strategy has to wait for an "in-strategy" maintenance compaction, e.g. cleanup, to complete before getting a chance to run. But that's not desired behavior as off-strategy uses maintenance group, and its candidates don't add to the backlog that influences "in-strategy" bandwidth. Therefore, "in-strategy" and off-strategy should be decoupled, with off-strategy having its own semaphore for guaranteeing serialization across tables. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #10595	2022-05-19 17:37:11 +03:00
Raphael S. Carvalho	ca322fb7c2	compaction_manager: Quickly abort maintenance compaction waiting for its turn Today, aborting a maintenance compaction like major, which is waiting for its turn to run, can take lots of time because compaction manager will only be able to bail out the task once it gets the "permit" from the serialization mechanism, i.e. semaphore. Meaning that the command that started the task will only complete after all this time waiting for the "permit". To allow a pending maintenance compaction to be quickly aborted, we can use the abortable variant of get_units(). So when user submits an abortion request, get_units() will be able to return earlier through the abort exception. Refs #10485. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #10581	2022-05-17 13:14:51 +03:00
Avi Kivity	528ab5a502	treewide: change metric calls from make_derive to make_counter make_derive was recently deprecated in favor of make_counter, so make the change throughout the codebase. Closes #10564	2022-05-14 12:53:55 +02:00
Raphael S. Carvalho	5682393693	compaction: Fix use-after-move when retrying maintenance compaction SSTable was moved into descriptor, so on failure, it couldn't be used without resulting in a segfault. Fix it by not moving sst, and changing signature to make it explicit we don't want to move the content. Fixes #10505. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #10506	2022-05-08 11:16:55 +03:00
Raphael S. Carvalho	20a1ef3bee	compaction_backlog_tracker: Raise logging level to error when disabling tracker on exception If exception is caught while updating backlog tracker, the backlog tracker will be disabled for the underlying table, potentially causing compaction to fall behind. That being said, let's raise the log level to error, to give it its due importance and allow tests to detect the problem. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220330151421.49054-1-raphaelsc@scylladb.com>	2022-03-31 07:04:00 +03:00
Botond Dénes	0c3d4091a4	Merge "Make TWCS' cleanup bucket aware" from Raphael S. Carvalho " Quoting patch 3/4: "This continues the work in `a69d98c3d0`, by implementing the cleanup method in TWCS to make it bucket aware. Till now, the default impl was used which cleans up one file at a time, starting from the smallest. The cleanup strategy for TWCS is simple. It's simply calling the size tiered cleanup method for each bucket, so there will be one job for each tier in each window. The next strategies to receive this improvement are LCS and ICS (the latter one being only available in enterprise). Refs #10097." Simply put, the goal is to reduce writeamp when performing cleanup on a TWCS table, therefore reducing the operation time. tests: unit(dev). " * 'twcs_cleanup_bucket_aware/v1' of https://github.com/raphaelsc/scylla: tests: sstable_compaction_test: Add test for TWCS' bucket-aware cleanup compaction: TWCS: Implement cleanup method for bucket awareness compaction: TWCS: change get_buckets() signature to work with const qualified functions compaction_strategy: get_cleanup_compaction_jobs: accept candidates by value	2022-03-30 11:45:28 +03:00
Raphael S. Carvalho	177a8e8259	compaction_manager: allow sstable to be moved into rewrite_sstable() Caller was already trying to move sstable, but rewrite_sstable() signature was incorrect. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220329022149.250655-1-raphaelsc@scylladb.com>	2022-03-30 11:42:52 +03:00
Raphael S. Carvalho	2a9bfa3e3f	compaction_strategy: get_cleanup_compaction_jobs: accept candidates by value Then caller can decide whether to copy or move candidate set into the function. cleanup_sstables_compaction_task can move candidates as it's no longer needed once it retrieves all descriptors. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-03-29 09:49:13 -03:00
Raphael S. Carvalho	c7826aa910	compaction_manager: Wire cleanup task into the strategy cleanup method As the cleanup process can now be driven by the compaction strategy, let's move cleanup into a new task type that uses the new compaction_strategy::get_cleanup_compaction_jobs(). For the time being all strategies are using the default method that returns one descriptor for each sstable that needs clean up. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-03-25 11:23:26 -03:00
Raphael S. Carvalho	25be958ab9	compaction: Introduce compaction_descriptor::sstables_size This method can be reused in manager, and will be useful for upcoming cleanup task. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-03-21 12:55:10 -03:00
Raphael S. Carvalho	c25d8f6770	compaction: Move decision of garbage collection from strategy to task type For compaction to be able to purge expired data, like tombstones, a sstable set snapshot is set in the compaction descriptor. That's a decision that belongs to task type. For example, all regular compaction enable GC, whereas scrub for example doesn't for safety reasons. The problem is that the decision is being made by every instantiation of compaction_descriptor in the strategies, which is both unnecessary and also adds lots of boilerplate to the code, making it hard to understand and work with. As sstable set snapshot is an implementation detail, a new method is being added to compaction_descriptor to make the intention clearer, making the interface easier to understand. can_purge_tombstones, used previously by rewrite task only, is being reused for communicating GC intention into task::compact_sstables(). The boilerplate was a pain when adding a new strategy method for the ongoing work on cleanup, described by issue #10097. Another benefit is that we'll now only create a set snapshot when compaction will really run. Before, it could happen that the snapshot would be discarded if the compaction attempt had to be postponed, which is a waste of cpu cycles. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-03-21 12:14:04 -03:00
Avi Kivity	aab052c0d5	Merge 'replica/database: truncate: temporarily disable compaction on table and views before flush' from Benny Halevy Flushing the base table triggers view building and corresponding compactions on the view tables. Temporarily disable compaction on both the base table and all its view before flush and snapshot since those flushed sstables are about to be truncated anyway right after the snapshot is taken. This should make truncate go faster. In the process, this series also embeds `database::truncate_views` into `truncate` and coroutinizes both Refs #6309 Test: unit(dev) Closes #10203 * github.com:scylladb/scylla: replica/database: truncate: fixup indentation replica/database: truncate: temporarily disable compaction on table and views before flush replica/database: truncate: coroutinize per-view logic replica/database: open-code truncate_view in truncate replica/database: truncate: coroutinize run_with_compaction_disabled lambda replica/database: coroutinize truncate compaction_manager: add disable_compaction method	2022-03-17 17:24:20 +02:00
Raphael S. Carvalho	0cc717ee86	compaction_manager: Retrieve and register files in rewrite_sstables() atomically The atomicity was lost in commit `a2a5e530f0`. Registration of compacting SSTables now happens in rewrite_sstables_compaction_task ctor, but that's risky because a regular compaction could pick those same files if run_with_compaction_disabled() defers after the callback passed to it returns, and before run__w__c__d() caller has a chance to run. The deferring point is very much possible, because submit() (submits a regular job) is called when run__w__c__d() reenables compaction internally. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220315182857.121479-1-raphaelsc@scylladb.com>	2022-03-16 09:58:16 +02:00
Raphael S. Carvalho	58e520ab1d	compaction: Move run_off_strategy_compaction() into compaction manager Compaction manager is calling back the table to run off-strategy compaction, but the logic clearly belongs to manager which should perform the operation independently and only call table to update its state with the result. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220315174504.107926-2-raphaelsc@scylladb.com>	2022-03-16 09:55:52 +02:00
Benny Halevy	297a37f640	compaction_manager: add disable_compaction method Returns a RAII class compaction_reenabler that conditionally reenables compaction for the given table when destroyed. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-15 11:00:49 +02:00
Raphael S. Carvalho	1a2332a0ba	compaction: Move release_exhausted out of the compaction descriptor With compact_sstables() now living in compaction_manager::task, release_exhausted no longer has to live inside compaction_descriptor, which is a good direction because implementation detail is being removed from the interface. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220311023410.250149-2-raphaelsc@scylladb.com>	2022-03-14 15:39:23 +02:00
Raphael S. Carvalho	fce9d869b4	compaction: Move table::compact_sstables() into compaction manager Table submits compaction request into manager, which in turn calls back table to run the compaction when the time has come, i.e.: table -> compaction manager -> table -> execute compaction But manager should not rely on table to run compaction, as compaction execution procedure sits one layer below the manager and should be accessed directly by it, i.e: table -> compaction manager -> execute compaction This makes code easier to understand and update_compaction_history() can now be noop for unit tests using table_state. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220311023410.250149-1-raphaelsc@scylladb.com>	2022-03-14 15:39:23 +02:00
Benny Halevy	5e1fda7e1d	compaction_manager: use coroutine::switch_to Saving an allocation for running the functor as a task in the switched-to scheduling group. Also, switch to the desired scheduling group at the beginning of the task so that the higher level logic, like getting the list of sstables to compact will be performed under the desired scheduling group, not only the compaction code itself. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 12:20:01 +02:00
Benny Halevy	8c66916652	compaction_manager::task: drop _compaction_running Replace the _compaction_running boolean member by calculating _state == state::active now that setup_new_compaction switches state to `active` Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 12:20:01 +02:00
Benny Halevy	a2a5e530f0	compaction_manager: move per-type logic to derived task Move the business logic into the task specific classes. Separating initialization during task construction, from the compaction_done task, moved into a do_run() method, and in some cases moving a lambda function that was called per table (as in rewrite_sstables) into a private method of the derived class. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 12:20:01 +02:00
Benny Halevy	2e6ce43a97	compaction_manager: task: add state enum Add an enum class representing the task state machine and a switch_state function to transition between the states and update the corresponding compaction_manager stats counters. Refs #9974 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 12:19:59 +02:00
Benny Halevy	9c59d66b7e	compaction_manager: task: add maybe_retry Replacing and combining compaction_manager methods: maybe_stop_on_error and put_task_to_sleep. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 11:35:37 +02:00
Benny Halevy	ee32be3aa5	compaction_manager: reevaluate_postponed_compactions: mark as noexcept To simplify error handling in following patches that will coroutinize task logic. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 11:35:37 +02:00
Benny Halevy	72162ed653	compaction_manager: define derived task types Turn task into a class, defining a clear hierarchy of private, protected, and public methods. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-03-10 11:35:35 +02:00


194 Commits