scylladb

Author	SHA1	Message	Date
Benny Halevy	51a46aa83b	compaction_manager: perform_task_on_all_files: return early when there are no sstables to compact Prevent the creation of a compaction task when the list of sstables is known to be empty ahead of time. Refs scylladb/scylladb#16694 Fixes scylladb/scylladb#16803 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-01-17 11:53:39 +02:00
Benny Halevy	bd1d65ec38	compaction_manager: perform_cleanup: use compaction_manager::eligible_for_compaction `3b424e391b` introduced a loop in `perform_cleanup` that waits until all sstables that require cleanup are cleaned up. However, with `f1bbf705f9`, an sstable that is_eligible_for_compaction (i.e. it is not in staging, awaiting view update generation), may already be compacted by e.g. regular compaction. And so perform_cleanup should interrupt that by calling try_perform_cleanup, since the latter reevaluates `update_sstable_cleanup_state` with compaction disabled - that stops ongoing compactions. Refs scylladb/scylladb#15673 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-01-17 11:53:39 +02:00
Benny Halevy	d6071945c8	compaction, table: ignore foreign sstables replay_position The sstables replay_position in stats_metadata is valid only on the originating node and shard. Therefore, validate the originating host and shard before using it in compaction or table truncate. Fixes #10080 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#16550	2024-01-16 18:45:59 +02:00
Botond Dénes	697ebef149	Merge 'tasks: compaction: drop regular compaction tasks after they are finished' from Aleksandra Martyniuk Make compaction tasks internal. Drop all internal tasks without parents immediately after they are done. Fixes: #16735 Refs: #16694. Closes scylladb/scylladb#16698 * github.com:scylladb/scylladb: compaction: make regular compaction tasks internal tasks: don't keep internal root tasks after they complete	2024-01-11 12:10:44 +02:00
Kefu Chai	eb9216ef11	compaction: do not include unused headers these unused includes were identified by clangd. see https://clangd.llvm.org/guides/include-cleaner#unused-include-warning for more details on the "Unused include" warning. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16707	2024-01-10 11:07:36 +02:00
Aleksandra Martyniuk	6b87778ef2	compaction: make regular compaction tasks internal Regular compaction tasks are internal. Adjust test_compaction_task accordingly: modify test_regular_compaction_task, delete test_running_compaction_task_abort (relying on regular compaction), whose checks are already achieved by test_not_created_compaction_task_abort. Rename the latter.	2024-01-09 13:13:54 +01:00
Aleksandra Martyniuk	6f13e55187	tasks: call release_resources when task is finished Call task_manager::task::impl::release_resources when task is finished instead of putting the responsibility on user. Closes scylladb/scylladb#16660	2024-01-09 11:41:54 +02:00
Lakshmi Narayanan Sreethar	1d6eaf2985	compaction manager: remove: cleanup _compaction_state on exceptions If for some reason an exception is thrown in compaction_manager::remove, it might leave behind stale table pointers in _compaction_state. Fix that by setting up a deferred action to perform the cleanup. Fixes #16635 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com> Closes scylladb/scylladb#16632	2024-01-03 22:03:24 +02:00
Raphael S. Carvalho	d1e6dfadea	sstables: Harden estimate_droppable_tombstone_ratio() interface The interface is fragile because the user may incorrectly use the wrong "gc before". Given that sstable knows how to properly calculate "gc before", let's do it in estimate__d__t__r(), leaving no room for mistakes. sstable_run's variant was also changed to conform to new interface, allowing ICS to properly estimate droppable ratio, using GC before that is calculated using each sstable's range. That's important for upcoming tablets, as we want to query only the range that belongs to a particular tablet in the repair history table. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#15931	2023-12-20 19:04:41 +02:00
Raphael S. Carvalho	dd1a6d6309	compaction: Add splitting compaction task to manager The task for splitting compaction will run until all sstables in the main set are split. The only exceptions are shutdown or user has explicitly asked for abort. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-12-17 11:40:09 -03:00
Raphael S. Carvalho	f87161e556	compaction: Prepare rewrite_sstables_compaction_task_executor to be reused for splitting Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-12-17 11:40:09 -03:00
Raphael S. Carvalho	c96938c49b	compaction: remove scrub-specific code from rewrite_sstables_compaction_task_executor Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-12-17 11:40:09 -03:00
Raphael S. Carvalho	b1c5d5dd4e	compaction: Add splitting compaction Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-12-17 11:40:08 -03:00
Raphael S. Carvalho	3dcb800a96	flat_mutation_reader: Allow interposer consumers to be stacked reader_consumer_v2 being a noncopyable_function imposes a restriction when stacking one interposer consumer on top of another. Think for example of a token-based segregator on top of a timestamp based one. To achieve that, the interposer consumer creator must be reentrant, such that the consumer can be created on each "channel", but today the creator becomes unusable after first usage. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-12-17 11:26:32 -03:00
Botond Dénes	493b6bc65f	Merge 'Guard tables in compaction tasks' from Benny Halevy Currently, if a compaction function enters the table or compaction_group async_gate, we can't stop it on the table/compaction_group stop path as they co_await their respective async_gate.close(). This series introduces a table_ptr smart pointer that guards the table object by entering its async_gate, and it also defers awaiting the gate.close future till after stopping ongoing compaction so that closing the gate will prevent starting new compactions while ongoing compaction can be stopped and finally awaiting the close() future will wait for them to unwind and exit the gate after being stopped. Fixes #16305 Closes scylladb/scylladb#16351 * github.com:scylladb/scylladb: compaction: run_on_table: skip compaction also on gate_closed_exception compaction: run_on_table: hold table table: add table_holder and hold method table: stop: allow compactions to be stopped while closing async_gate	2023-12-12 12:50:17 +02:00
Benny Halevy	7843025a53	compaction: run_on_table: skip compaction also on gate_closed_exception Similar to the no_such_column_family error, gate_closed_exception indicates that the table is stopped and we should skip compaction on it gracefully. Fixes #16305 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-12 08:46:37 +02:00
Benny Halevy	92c718c60a	compaction: run_on_table: hold table To ensure the table will not be dropped while the compaction task is ongoing. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-12 08:45:59 +02:00
Aleksandra Martyniuk	ceec5577d8	api: compaction: pass pointer to top level compaction tasks As a preparation for asynchronous compaction api, from which we cannot take values by reference, top level compaction tasks get pointers which need to be set to nullptr when they are not needed (like in async api).	2023-12-11 11:36:10 +01:00
Kefu Chai	f483309165	compaction, api: drop unused functions run_on_existing_tables() is not used at all. and we have two of them. in this change, let's drop them. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16304	2023-12-06 14:31:08 +02:00
Benny Halevy	0bcce35abd	treewide: get rid of now unused fb_utilities Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-12-05 16:22:49 +02:00
Yaniv Kaul	c658bdb150	Typos: fix typos in comments Fixes some typos as found by codespell run on the code. In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc. Follow-up commits will take care of them. Refs: https://github.com/scylladb/scylladb/issues/16255 Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>	2023-12-02 22:37:22 +02:00
Benny Halevy	b12b142232	api: add /storage_service/compact For major compacting all tables in the database. The advantage of this api is that `commitlog->force_new_active_segment` happens only once in `database::flush_all_tables` rather than once per keyspace (when `nodetool compact` translates to a sequence of `/storage_service/keyspace_compaction` calls). Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-11-28 16:37:42 +02:00
Benny Halevy	66ba983fe0	compaction_manager: flush_all_tables before major compaction Major compaction already flushes each table to make sure it considers any mutations that are present in the memtable for the purpose of tombstone purging. See `64ec1c6ec6` However, tombstone purging may be inhibited by data in commitlog segments based on `gc_time_min` in the `tombstone_gc_state` (See `f42eb4d1ce`). Flushing all sstables in the database releases all references to commitlog segments and thereby maximizes the potential for tombstone purging, which is typically the reason for running major compaction. However, flushing all tables too frequently might result in tiny sstables. Since when flushing all keyspaces using `nodetool flush` the `force_keyspace_compaction` api is invoked for each keyspace successively, we need a mechanism to prevent too frequent flushes by major compaction. Hence a `compaction_flush_all_tables_before_major_seconds` interval configuration option is added (defaults to 24 hours). In the case that not all tables are flushed prior to major compaction, we revert to the old behavior of flushing each table in the keyspace before major-compacting it. Fixes scylladb/scylladb#15777 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-11-28 16:37:42 +02:00
Benny Halevy	1fd85bd37b	api: compaction: add flush_memtables option When flushing is done externally, e.g. by running `nodetool flush` prior to `nodetool compact`, flush_memtables=false can be passed to skip flushing of tables right before they are major-compacted. This is useful to prevent creation of small sstables due to excessive memtable flushing. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-11-28 16:37:42 +02:00
Botond Dénes	3ccf1e020b	Merge ' compaction: abort compaction tasks' from Aleksandra Martyniuk Compaction tasks which do not have a parent are abortable through task manager. Their children are aborted recursively. Compaction tasks of the lowest level are aborted using existing compaction task executors stopping mechanism. Closes scylladb/scylladb#16177 * github.com:scylladb/scylladb: test: test abort of compaction task that isn't started yet test: test running compaction task abort tasks: fail if a task was aborted compaction: abort task manager compaction tasks	2023-11-28 09:08:04 +02:00
Aleksandra Martyniuk	9c2c964b8e	test: test abort of compaction task that isn't started yet Test whether a task whose parent was aborted has a proper status.	2023-11-24 19:25:27 +01:00
Aleksandra Martyniuk	8639eae0ce	test: test running compaction task abort Test whether a task which is aborted while running has a proper status.	2023-11-24 19:25:20 +01:00
Aleksandra Martyniuk	aa7bba2d8b	compaction: abort task manager compaction tasks Set top level compaction tasks as abortable. Compaction tasks which have no children, i.e. compaction task executors, have their abort method overridden to stop compaction data.	2023-11-24 15:44:34 +01:00
Kefu Chai	f99223919a	compaction: add formatter for map<timestamp_type, vector<shared_sstable>> before this change, we rely on the default-generated fmt::formatter created from operator<<, but fmt v10 dropped the default-generated formatter. in this change, we define a formatter for map<timestamp_type, vector<shared_sstable>>. since the operator<< for this type is only used in the .cc file, and the only use case of it is to provide the formatter for fmt, the operator<< based formatter is removed in this change. Refs #13245 Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#16163	2023-11-24 11:56:28 +02:00
Raphael S. Carvalho	157a5c4b1b	treewide: Avoid using namespace sstables in header to avoid conflicts That's needed for compaction_group.hh to be included in headers. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-11-23 17:36:57 +02:00
Botond Dénes	0ae1335daa	Revert "Merge 'compaction: abort compaction tasks' from Aleksandra Martyniuk" This reverts commit `11cafd2fc8`, reversing changes made to `2bae14f743`. Reverting because this series causes frequent CI failures, and the proposed quickfix causes other failures of its own. Fixes: #16113	2023-11-22 17:44:07 +02:00
Nadav Har'El	64d1d5cf62	Merge 'Fix partition estimation with TWCS tables during streaming' from Raphael "Raph" Carvalho TWCS tables require partition estimation adjustment as incoming streaming data can be segregated into the time windows. Turns out we had two problems in this area that lead to suboptimal bloom filters. 1) With off-strategy enabled, data segregation is postponed, but partition estimation was adjusted as if segregation wasn't postponed. Solved by not adjusting estimation if segregation is postponed. 2) With off-strategy disabled, data segregation is not postponed, but streaming didn't feed any metadata into partition estimation procedure, meaning it had to assume the max windows input data can be segregated into (100). Solved by using schema's default TTL for a precise estimation of window count. For the future, we want to dynamically size filters (see https://github.com/scylladb/scylladb/issues/2024), especially for TWCS that might have SSTables that are left uncompacted until they're fully expired, meaning that the system won't heal itself in a timely manner through compaction on a SSTable that had partition estimation really wrong. Fixes https://github.com/scylladb/scylladb/issues/15704. Closes scylladb/scylladb#15938 * github.com:scylladb/scylladb: streaming: Improve partition estimation with TWCS streaming: Don't adjust partition estimate if segregation is postponed	2023-11-14 20:41:36 +02:00
Botond Dénes	11cafd2fc8	Merge 'compaction: abort compaction tasks' from Aleksandra Martyniuk Compaction tasks which do not have a parent are abortable through task manager. Their children are aborted recursively. Compaction tasks of the lowest level are aborted using existing compaction task executors stopping mechanism. Closes scylladb/scylladb#16050 * github.com:scylladb/scylladb: test: test abort of compaction task that isn't started yet test: test running compaction task abort tasks: fail if a task was aborted compaction: abort task manager compaction tasks	2023-11-14 14:55:17 +02:00
Aleksandra Martyniuk	6af581301b	test: test abort of compaction task that isn't started yet Test whether a task whose parent was aborted has a proper status.	2023-11-14 10:36:38 +01:00
Botond Dénes	a66ec1d3c1	Merge 'Drop compaction_manager_test' from Pavel Emelyanov This is continuation of `a34c8dc4` (Drop compaction_manager_for_testing). There's one more wrapper over compaction_manager to access its private fields. All such access was recently moved to sstables::test_env's compaction manager, now it's time to drop the remaining legacy wrapper class. Closes scylladb/scylladb#16017 * github.com:scylladb/scylladb: test/utils: Drop compaction_manager_test test/utils: Get compaction manager from test_env test/sstables: Introduce test_env_compaction_manager::perform_compaction() test/env: Add sstables::test_env& to compaction_manager_test::run() test/utils: Add sstables::test_env& to compact_sstables() test/utils: Simplify and unify compaction_manager_test::run() test/utils: Squash two compact_sstables() helpers test/compaction: Use shorter compact_sstables() helper test/utils: Keep test task compaction gate on task itself test/utils: Move compaction_manager_test::propagate_replacement()	2023-11-14 11:25:17 +02:00
Aleksandra Martyniuk	a63a6dcd93	test: test running compaction task abort Test whether a task which is aborted while running has a proper status.	2023-11-13 16:06:36 +01:00
Aleksandra Martyniuk	599d6ebd52	compaction: abort task manager compaction tasks Set top level compaction tasks as abortable. Compaction tasks which have no children, i.e. compaction task executors, have their abort method overridden to stop compaction data.	2023-11-13 15:46:58 +01:00
Pavel Emelyanov	f4696f21a8	test/utils: Drop compaction_manager_test This class only provides a .run() method which allocates a task and calls sstables::test_env::perform_compaction(). This can be done in a helper method, no need for the whole class for it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-11-13 11:44:51 +03:00
Pavel Emelyanov	9fd270566a	test/sstables: Introduce test_env_compaction_manager::perform_compaction() Take it from compaction_manager_test::run() which is a simplified overwrite of the compaction_manager::perform_compaction(). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-11-13 11:44:51 +03:00
Pavel Emelyanov	aec3fc493a	test/utils: Move compaction_manager_test::propagate_replacement() The purpose of this method is to turn public the private compaction_manager method of the same name. The caller of this method has sstable_test_env at hand with its test_env_compaction_manager, so the de-private-isation call can be moved. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2023-11-13 11:44:51 +03:00
Kefu Chai	efd65aebb2	build: cmake: add check-header target to have feature parity with `configure.py`. we won't need this once we migrate to C++20 modules. but before that day comes, we need to stick with C++ headers. we generate a rule for each .hh file to create a corresponding .cc and then compile it, in order to verify the self-containedness of that header. so the number of rules is quite large. to avoid the unnecessary overhead, the check-header target is enabled only if the `Scylla_CHECK_HEADERS` option is enabled. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#15913	2023-11-13 10:27:06 +02:00
Benny Halevy	68a7bbe582	compaction_manager: perform_cleanup: ignore condition_variable_timed_out The polling loop was intended to ignore `condition_variable_timed_out` and check for progress using a longer `max_idle_duration` timeout in the loop. Fixes #15669 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#15671	2023-11-12 13:53:51 +02:00
Botond Dénes	1cccc86813	Revert "Merge 'compaction: abort compaction tasks' from Aleksandra Martyniuk" This reverts commit `2860d43309`, reversing changes made to `a3621dbd3e`. Reverting because rest_api.test_compaction_task started failing after this was merged. Fixes: #16005	2023-11-09 10:43:11 +01:00
Raphael S. Carvalho	b551f4abd2	streaming: Improve partition estimation with TWCS When off-strategy is disabled, data segregation is not postponed, meaning that getting partition estimate right is important to decrease filter's false positives. With streaming, we don't have min and max timestamps at destination, well, we could have extended the RPC verb to send them, but turns out we can easily deduce the number of windows using default TTL. Given the partitioner's random nature, it's not absurd to assume that a given range being streamed may overlap with all windows, meaning that each range will yield one sstable for each window when segregating incoming data. Today, we assume the worst of 100 windows (which is the max amount of sstables the input data can be segregated into) due to the lack of metadata for estimating the window count. But given that users are recommended to target a max of ~20 windows, it means partition estimate is being downsized 5x more than needed. Let's improve it by using default TTL when estimating window count, so even in the absence of timestamp metadata, the partition estimation won't be way off. Fixes #15704. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-11-08 12:10:03 +02:00
Botond Dénes	2860d43309	Merge 'compaction: abort compaction tasks' from Aleksandra Martyniuk Compaction tasks which do not have a parent are abortable through task manager. Their children are aborted recursively. Compaction tasks of the lowest level are aborted using existing compaction task executors stopping mechanism. Closes scylladb/scylladb#15083 * github.com:scylladb/scylladb: test: test abort of compaction task that isn't started yet test: test running compaction task abort tasks: fail if a task was aborted compaction: abort task manager compaction tasks	2023-11-08 08:45:16 +02:00
Benny Halevy	a1acf6854b	everywhere: reduce dependencies on i_partitioner.hh Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2023-11-05 20:47:44 +02:00
Aleksandra Martyniuk	0c6a3f568a	compaction: delete default_compaction_progress_monitor default_compaction_progress_monitor returns a reference to a static object. So, it should be read-only, but its users need to modify it. Delete default_compaction_progress_monitor and use one's own compaction_progress_monitor instance where it's needed. Closes scylladb/scylladb#15800	2023-10-23 16:03:34 +03:00
Raphael S. Carvalho	fded314e46	sstables: Fix update of tombstone GC settings to have immediate effect After "repair: Get rid of the gc_grace_seconds", the sstable's schema (mode, gc period if applicable, etc) is used to estimate the amount of droppable data (or determine full expiration = max_deletion_time < gc_before). It could happen that the user switched from timeout to repair mode, but sstables will still use the old mode, even though the user asked for a new one. Another example is when you play with value of grace period, to prevent data resurrection if repair won't be able to run in a timely manner. The problem persists until all sstables using old GC settings are recompacted or node is restarted. To fix this, we have to feed latest schema into sstable procedures used for expiration purposes. Fixes #15643. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#15746	2023-10-19 16:27:59 +03:00
Aleksandra Martyniuk	56221f2161	test: test abort of compaction task that isn't started yet Test whether a task whose parent was aborted has a proper status.	2023-10-19 10:47:20 +02:00
Aleksandra Martyniuk	520d9db92d	test: test running compaction task abort Test whether a task which is aborted while running has a proper status.	2023-10-19 10:47:20 +02:00


776 Commits