scylladb

Author	SHA1	Message	Date
Raphael S. Carvalho	ef72075920	compaction: wire storage free space into reshape procedure After this, TWCS reshape procedure can be changed to limit job to 10% of available space. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> (cherry picked from commit `0ce8ee03f1`)	2024-06-20 20:41:41 +00:00
Raphael S. Carvalho	56f551f740	replica: don't expose compaction_group to reshape task compaction_group sits in replica layer and compaction layer is supposed to talk to it through compaction::table_state only. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> (cherry picked from commit `b8bd4c51c2`)	2024-06-20 20:41:41 +00:00
Aleksandra Martyniuk	532653f118	replica: replace table::as_table_state Replace table::as_table_state with table::try_get_table_state_with_static_sharding which throws if a table does not use static sharding.	2024-05-10 14:56:38 +02:00
Aleksandra Martyniuk	cf9913b0b7	compaction: pass compaction group id to reshape_compaction_group Pass compaction group id to shard_reshaping_compaction_task_impl::reshape_compaction_group. Modify table::as_table_state to return table_state of the given compaction group.	2024-05-10 14:56:38 +02:00
Pavel Emelyanov	1f44a374b8	error_injection: Overload inject() instead of inject_with_handler() The inject_with_handler() method accepts a coroutine that can be called with injection_handler. With such a function as an argument, there's no need for a distinct inject_with_handler() method name; it can be an overload of the existing inject()-s Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2024-03-11 19:30:19 +03:00
Lakshmi Narayanan Sreethar	83fecc2f1f	compaction: reshape sstables within compaction groups For tables using tablet based replication strategies, the sstables should be reshaped only within the compaction groups they belong to. Updated shard_reshaping_compaction_task_impl to group the sstables based on their compaction groups before reshaping them within the groups. Fixes #16966 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2024-02-23 18:43:39 +05:30
Lakshmi Narayanan Sreethar	9fffd8905f	compaction: reshape: update total reshaped size only on success The total reshaped size should only be updated on reshape success and not after reshape has failed due to some exception. Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2024-02-23 01:07:54 +05:30
Lakshmi Narayanan Sreethar	4fb099659a	compaction: simplify exception handling in shard_reshaping_compaction_task_impl::run Catch and handle the exceptions directly instead of rethrowing and catching again. Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2024-02-23 01:07:54 +05:30
Avi Kivity	eedb997568	Merge 'compaction: upgrade: handle keyspaces that use tablets' from Lakshmi Narayanan Sreethar Tables in keyspaces governed by replication strategy that uses tablets, have separate effective_replication_maps. Update the upgrade compaction task to handle this when getting owned key ranges for a keyspace. Fixes #16848 Closes scylladb/scylladb#17335 * github.com:scylladb/scylladb: compaction: upgrade: handle keyspaces that use tablets replica/database: add an optional variant to get_keyspace_local_ranges	2024-02-15 21:31:54 +02:00
Lakshmi Narayanan Sreethar	7a98877798	compaction: upgrade: handle keyspaces that use tablets Tables in keyspaces governed by replication strategy that uses tablets, have separate effective_replication_maps. Update the upgrade compaction task to handle this when getting owned key ranges for a keyspace. Fixes #16848 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2024-02-15 17:47:39 +05:30
Kefu Chai	caa20c491f	storage_service: pass non-empty keyspace when performing cleanup_all this change addresses the regression introduced by `5e0b3671`, which falls back to local cleanup in cleanup_all. but `5e0b3671` failed to pass the keyspace to the `shard_cleanup_keyspace_compaction_task_impl` as its constructor parameter, that's why the test fails like ``` error executing POST request to http://localhost:10000/storage_service/cleanup_all with parameters {}: remote replied with status code 400 Bad Request: Can't find a keyspace ``` where the string after "Can't find a keyspace" is empty. in this change, the keyspace name of the keyspace to be cleaned is passed to `shard_cleanup_keyspace_compaction_task_impl`. we always enable the topology coordinator when performing testing, that's why this issue does not pop up until the longevity test. Fixes #17302 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#17320	2024-02-15 13:17:45 +02:00
Kefu Chai	5e0b3671d3	storage_service: fall back to local cleanup in cleanup_all before this change, if no keyspaces are specified, scylla-nodetool just enumerates all non-local keyspaces, and calls "/storage_service/keyspace_cleanup" on them one after another. this is not quite efficient, as each such RESTful API call forces a new active commitlog segment, and flushes all tables. so, if the target node of this command has N non-local keyspaces, it would repeat the steps above N times. this is not necessary. and after a topology change, we would like to run a global "nodetool cleanup" without specifying the keyspace, so this is a typical use case which we do care about. to address this performance issue, in this change, we improve an existing RESTful API call "/storage_service/cleanup_all", so if the topology coordinator is not enabled, we fall back to a local cleanup to clean up all non-local keyspaces. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-02-01 11:25:53 +08:00
Kefu Chai	4f90a875f6	compaction: format flush_mode without the helper since flush_mode is moved out of major_compaction_task_impl, let's drop the helper hosted in that class as well, and implement the formatter without it. please note, the `__builtin_unreachable()` is dropped. it should not change the behavior of the formatter. we don't put it in the `default` branch in hope that `-Wswitch` can warn us in the case when another enum of `flush_mode` is added, but we fail to handle it somehow. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-02-01 11:25:53 +08:00
Kefu Chai	b39cc01bb3	compaction_manager: flush all tables before cleanup according to the document "nodetool cleanup" > Triggers removal of data that the node no longer owns currently, scylla performs cleanup by rewriting the sstables. but commitlog segments may still contain the mutations to the tables which are dropped during sstable rewriting. when scylla server restarts, the dirty mutations are replayed to the memtable. if any of these dirty mutations changes the tables cleaned up, the stale data are reapplied. this would lead to data resurrection. so, in this change we follow the same model of major compaction: 1. force new active segment, 2. flush all tables 3. perform cleanup using compaction, which rewrites the sstables of specified tables because we already `flush()` all tables in `cleanup_keyspace_compaction_task_impl::run()`, there is no need to call `flush()` again, in `table::perform_cleanup_compaction()`, so the `flush()` call is dropped in this function, and the tests using this function are updated to call `flush()` manually to preserve the existing behavior. there are two callers of `cleanup_keyspace_compaction_task_impl`, * one is `storage_service::sstable_cleanup_fiber()`, which listens for the events fired by topology_state_machine, which is in turn driven by, for instance, "/storage_service/cleanup_all" API. which cleans up all keyspaces one after another. * another is "/storage_service/keyspace_cleanup", which cleans up the specified keyspace. in the first use case, we can force a new active segment for a single time, so another parameter to the ctor of `cleanup_keyspace_compaction_task_impl` is introduced to specify if the `db.flush_all_tables()` call should be skipped. please note, there are two possible optimizations, 1. force new active segment only if the mutations in it touch the tables being cleaned up 2. after forcing new active segment, only flush the (mem)tables mutated by the non-active segments but let's leave them for following-up changes. this change is a minimal fix for data resurrection issue. Fixes #16757 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-02-01 11:25:53 +08:00
Botond Dénes	493b6bc65f	Merge 'Guard tables in compaction tasks' from Benny Halevy Currently, if a compaction function enters the table or compaction_group async_gate, we can't stop it on the table/compaction_group stop path as they co_await their respective async_gate.close(). This series introduces a table_ptr smart pointer that guards the table object by entering its async_gate, and it also defers awaiting the gate.close future till after stopping ongoing compaction so that closing the gate will prevent starting new compactions while ongoing compaction can be stopped and finally awaiting the close() future will wait for them to unwind and exit the gate after being stopped. Fixes #16305 Closes scylladb/scylladb#16351 * github.com:scylladb/scylladb: compaction: run_on_table: skip compaction also on gate_closed_exception compaction: run_on_table: hold table table: add table_holder and hold method table: stop: allow compactions to be stopped while closing async_gate	2023-12-12 12:50:17 +02:00
Benny Halevy	7843025a53	compaction: run_on_table: skip compaction also on gate_closed_exception Similar to the no_such_column_family error, gate_closed_exception indicates that the table is stopped and we should skip compaction on it gracefully. Fixes #16305 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-12 08:46:37 +02:00
Benny Halevy	92c718c60a	compaction: run_on_table: hold table To ensure the table will not be dropped while the compaction task is ongoing. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-12 08:45:59 +02:00
Aleksandra Martyniuk	ceec5577d8	api: compaction: pass pointer to top level compaction tasks As a preparation for asynchronous compaction api, from which we cannot take values by reference, top level compaction tasks get pointers which need to be set to nullptr when they are not needed (like in async api).	2023-12-11 11:36:10 +01:00
Kefu Chai	f483309165	compaction, api: drop unused functions run_on_existing_tables() is not used at all. and we have two of them. in this change, let's drop them. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16304	2023-12-06 14:31:08 +02:00
Yaniv Kaul	c658bdb150	Typos: fix typos in comments Fixes some typos as found by codespell run on the code. In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc. Follow-up commits will take care of them. Refs: https://github.com/scylladb/scylladb/issues/16255 Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2023-12-02 22:37:22 +02:00
Benny Halevy	b12b142232	api: add /storage_service/compact For major compacting all tables in the database. The advantage of this api is that `commitlog->force_new_active_segment` happens only once in `database::flush_all_tables` rather than once per keyspace (when `nodetool compact` translates to a sequence of `/storage_service/keyspace_compaction` calls). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-11-28 16:37:42 +02:00
Benny Halevy	66ba983fe0	compaction_manager: flush_all_tables before major compaction Major compaction already flushes each table to make sure it considers any mutations that are present in the memtable for the purpose of tombstone purging. See `64ec1c6ec6` However, tombstone purging may be inhibited by data in commitlog segments based on `gc_time_min` in the `tombstone_gc_state` (See `f42eb4d1ce`). Flushing all sstables in the database releases all references to commitlog segments and thereby maximizes the potential for tombstone purging, which is typically the reason for running major compaction. However, flushing all tables too frequently might result in tiny sstables. Since when flushing all keyspaces using `nodetool flush` the `force_keyspace_compaction` api is invoked for each keyspace successively, we need a mechanism to prevent too frequent flushes by major compaction. Hence a `compaction_flush_all_tables_before_major_seconds` interval configuration option is added (defaults to 24 hours). In the case that not all tables are flushed prior to major compaction, we revert to the old behavior of flushing each table in the keyspace before major-compacting it. Fixes scylladb/scylladb#15777 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-11-28 16:37:42 +02:00
Benny Halevy	1fd85bd37b	api: compaction: add flush_memtables option When flushing is done externally, e.g. by running `nodetool flush` prior to `nodetool compact`, flush_memtables=false can be passed to skip flushing of tables right before they are major-compacted. This is useful to prevent creation of small sstables due to excessive memtable flushing. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-11-28 16:37:42 +02:00
Aleksandra Martyniuk	9c2c964b8e	test: test abort of compaction task that isn't started yet Test whether a task whose parent was aborted has a proper status.	2023-11-24 19:25:27 +01:00
Aleksandra Martyniuk	aa7bba2d8b	compaction: abort task manager compaction tasks Set top level compaction tasks as abortable. Compaction tasks which have no children, i.e. compaction task executors, have abort method overridden to stop compaction data.	2023-11-24 15:44:34 +01:00
Botond Dénes	0ae1335daa	Revert "Merge 'compaction: abort compaction tasks' from Aleksandra Martyniuk" This reverts commit `11cafd2fc8`, reversing changes made to `2bae14f743`. Reverting because this series causes frequent CI failures, and the proposed quickfix causes other failures of its own. Fixes: #16113	2023-11-22 17:44:07 +02:00
Aleksandra Martyniuk	6af581301b	test: test abort of compaction task that isn't started yet Test whether a task whose parent was aborted has a proper status.	2023-11-14 10:36:38 +01:00
Aleksandra Martyniuk	599d6ebd52	compaction: abort task manager compaction tasks Set top level compaction tasks as abortable. Compaction tasks which have no children, i.e. compaction task executors, have abort method overridden to stop compaction data.	2023-11-13 15:46:58 +01:00
Botond Dénes	1cccc86813	Revert "Merge 'compaction: abort compaction tasks' from Aleksandra Martyniuk" This reverts commit `2860d43309`, reversing changes made to `a3621dbd3e`. Reverting because rest_api.test_compaction_task started failing after this was merged. Fixes: #16005	2023-11-09 10:43:11 +01:00
Aleksandra Martyniuk	56221f2161	test: test abort of compaction task that isn't started yet Test whether a task whose parent was aborted has a proper status.	2023-10-19 10:47:20 +02:00
Aleksandra Martyniuk	0681795417	compaction: abort task manager compaction tasks Set top level compaction tasks as abortable. Compaction tasks which have no children, i.e. compaction task executors, have abort method overridden to stop compaction data.	2023-10-19 10:47:17 +02:00
Aleksandra Martyniuk	198119f737	compaction: add get_progress method to compaction_task_impl compaction_task_impl::get_progress is used by the lowest level compaction tasks whose progress can be taken from compaction_progress_monitor.	2023-10-12 17:16:05 +02:00
Aleksandra Martyniuk	3553556708	compaction: keep compaction_progress_monitor in compaction_task_executor Keep compaction_progress_monitor in compaction_task_executor and pass a reference to it further, so that the compaction progress could be retrieved out of it.	2023-10-12 17:03:46 +02:00
Aleksandra Martyniuk	d799adc536	tasks: change task_manager::task::impl::is_internal() Most of the time only the roots of tasks tree should be non internal. Change default implementation of is_internal and delete overrides consistent with it. Closes scylladb/scylladb#15353	2023-09-26 14:49:49 +03:00
Aleksandra Martyniuk	e0ce711e4f	compaction: do not swallow compaction_stopped_exception for reshape Loop in shard_reshaping_compaction_task_impl::run relies on whether sstables::compaction_stopped_exception is thrown from run_custom_job. The exception is swallowed for each type of compaction in compaction_manager::perform_task. Rethrow an exception in perform_task for reshape compaction. Fixes: #15058. Closes #15067	2023-08-21 12:41:55 +03:00
Aleksandra Martyniuk	139e147ae1	compaction: turn custom_task_executor into compaction_task_impl custom_task_executor inherits both from compaction_task_executor and compaction_task_impl.	2023-07-28 10:51:55 +02:00
Aleksandra Martyniuk	71db8645d5	compaction: pass task_info through sstables compaction	2023-07-28 10:51:55 +02:00
Aleksandra Martyniuk	4e439ac957	compaction: turn offstrategy_compaction_task_executor into offstrategy_compaction_task_impl offstrategy_compaction_task_executor inherits both from compaction_task_executor and offstrategy_compaction_task_impl.	2023-07-28 10:51:55 +02:00
Aleksandra Martyniuk	92f2987217	compaction: turn cleanup_compaction_task_executor into cleanup_compaction_task_impl cleanup_compaction_task_executor inherits both from compaction_task_executor and cleanup_compaction_task_impl. Add a new version of compaction_manager::perform_task_on_all_files which accepts only the tasks that are derived from compaction_task_impl. After all task executors' conversions are done, the new version replaces the original one.	2023-07-28 10:48:58 +02:00
Aleksandra Martyniuk	77dcdd743e	compaction: add shard_reshard_sstables_compaction_task_impl Add task manager's task covering resharding compaction on one shard.	2023-07-19 17:19:10 +02:00
Aleksandra Martyniuk	f73178a114	compaction: invoke resharding on sharded database In reshard_sstables_compaction_task_impl::run() we call sharded<sstables::sstable_directory>::invoke_on_all. In lambda passed to that method, we use both sharded sstable_directory service and its local instance. To make it straightforward that sharded and local instances are dependent, we call sharded<replica::database>::invoke_on_all instead and access local directory through the sharded one.	2023-07-19 17:19:10 +02:00
Aleksandra Martyniuk	fa10c352a1	compaction: move run_resharding_jobs into reshard_sstables_compaction_task_impl::run()	2023-07-19 17:19:10 +02:00
Aleksandra Martyniuk	7a7e287d8c	compaction: add reshard_sstables_compaction_task_impl Add task manager's task covering resharding compaction. A struct and some functions are moved from replica/distributed_loader.cc to compaction/task_manager_module.cc.	2023-07-19 17:15:40 +02:00
Michał Chojnowski	b511d57fc8	Revert "Merge 'Compaction resharding tasks' from Aleksandra Martyniuk" This reverts commit `2a58b4a39a`, reversing changes made to `dd63169077`. After patch `87c8d63b7a`, table_resharding_compaction_task_impl::run() performs the forbidden action of copying a lw_shared_ptr (_owned_ranges_ptr) on a remote shard, which is a data race that can cause a use-after-free, typically manifesting as allocator corruption. Note: before the bad patch, this was avoided by copying the _contents_ of the lw_shared_ptr into a new, local lw_shared_ptr. Fixes #14475 Fixes #14618 Closes #14641	2023-07-11 19:11:37 +03:00
Aleksandra Martyniuk	87c8d63b7a	compaction: add shard_reshard_sstables_compaction_task_impl Add task manager's task covering resharding compaction on one shard.	2023-06-28 11:43:12 +02:00
Aleksandra Martyniuk	db6e4a356b	compaction: invoke resharding on sharded database In reshard_sstables_compaction_task_impl::run() we call sharded<sstables::sstable_directory>::invoke_on_all. In lambda passed to that method, we use both sharded sstable_directory service and its local instance. To make it straightforward that sharded and local instances are dependent, we call sharded<replica::database>::invoke_on_all instead and access local directory through the sharded one.	2023-06-28 11:43:12 +02:00
Aleksandra Martyniuk	1acaed026a	compaction: move run_resharding_jobs into reshard_sstables_compaction_task_impl::run()	2023-06-28 11:43:11 +02:00
Aleksandra Martyniuk	837d77ba8c	compaction: add reshard_sstables_compaction_task_impl Add task manager's task covering resharding compaction.	2023-06-28 11:41:43 +02:00
Aleksandra Martyniuk	0d6dd3eeda	compaction: replica: copy struct and functions from distributed_loader.cc As a preparation for integrating resharding compaction with task manager a struct and some functions are copied from replica/distributed_loader.cc to compaction/task_manager_module.cc.	2023-06-28 11:41:42 +02:00
Raphael S. Carvalho	83c70ac04f	utils: Extract pretty printers into a header Can be easily reused elsewhere. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-06-26 21:58:20 -03:00

79 Commits
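
The ordering described in commit b39cc01bb3 (force a new active commitlog segment, flush all tables, then clean up), together with the constructor flag that lets cleanup_all do the flush only once, might be sketched as follows. This is a minimal illustration and not ScyllaDB's actual code: `fake_database`, `flush_policy`, `run_keyspace_cleanup` and `run_cleanup_all` are hypothetical names, the real tasks are asynchronous Seastar code, and the forced segment switch is shown as a separate step here purely for clarity.

```cpp
#include <string>
#include <vector>

// Hypothetical stand-in for the database: it only records the order in
// which the operations are requested.
struct fake_database {
    std::vector<std::string> log;
    void force_new_active_segment() { log.push_back("force_new_active_segment"); }
    void flush_all_tables() { log.push_back("flush_all_tables"); }
    void cleanup_keyspace(const std::string& ks) { log.push_back("cleanup:" + ks); }
};

// Hypothetical flag mirroring the extra constructor parameter mentioned in
// the commit message: should this cleanup flush before rewriting sstables?
enum class flush_policy { skip, flush_all_tables };

// Per-keyspace cleanup (the "/storage_service/keyspace_cleanup" case).
// Flushing first means the commitlog no longer holds mutations that could
// be replayed after a restart and resurrect the data removed by cleanup.
void run_keyspace_cleanup(fake_database& db, const std::string& ks, flush_policy policy) {
    if (policy == flush_policy::flush_all_tables) {
        db.force_new_active_segment();  // step 1
        db.flush_all_tables();          // step 2
    }
    db.cleanup_keyspace(ks);            // step 3
}

// Cleanup of every keyspace (the "cleanup_all" case): steps 1 and 2 run
// once up front, and each per-keyspace cleanup skips its own flush.
void run_cleanup_all(fake_database& db, const std::vector<std::string>& keyspaces) {
    db.force_new_active_segment();
    db.flush_all_tables();
    for (const auto& ks : keyspaces) {
        run_keyspace_cleanup(db, ks, flush_policy::skip);
    }
}
```

With N keyspaces, `run_cleanup_all` records one segment switch and one flush followed by N cleanups, whereas calling `run_keyspace_cleanup` with `flush_policy::flush_all_tables` for each keyspace would repeat the first two steps N times, which is the inefficiency commit 5e0b3671d3 describes.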