scylladb

Author	SHA1	Message	Date
Lakshmi Narayanan Sreethar	18c071c94b	compaction: fix use after free when strategy is altered during compaction The `compaction_strategy_state` class holds strategy specific state via a `std::variant` containing different state types. When a compaction strategy performs compaction, it retrieves a reference to its state from the `compaction_strategy_state` object. If the table's compaction strategy is ALTERed while a compaction is in progress, the `compaction_strategy_state` object gets replaced, destroying the old state. This leaves the ongoing compaction holding a dangling reference, resulting in a use after free. Fix this by using `seastar::shared_ptr` for the state variant alternatives (`leveled_compaction_strategy_state_ptr` and `time_window_compaction_strategy_state_ptr`). The compaction strategies now hold a copy of the shared_ptr, ensuring the state remains valid for the duration of the compaction even if the strategy is altered. The `compaction_strategy_state` itself is still passed by reference and only the variant alternatives use shared_ptrs. This allows ongoing compactions to retain ownership of the state independently of the wrapper's lifetime. Fixes #25913 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2025-10-17 22:57:05 +05:30
Botond Dénes	86ed627fc4	compaction: move code to namespace compaction The namespace usage in this directory is very inconsistent, with files and classes scattered in: * global namespace * namespace compaction * namespace sstables With cases where all three are used in the same file. This code used to live in sstables/ and some of it still retains namespace sstables as a heritage of that time. The mismatch between the dir (future module) and the namespace used is confusing, so finish the migration and move all code in compaction/ to namespace compaction too. This patch, although large, is mechanical and only the following kind of changes are made: * replace namespace sstable {} with namespace compaction {} * add namespace compaction {} * drop/add sstables:: * drop/add compaction:: * move around forward-declarations so they are in the correct namespace context This refactoring revealed some awkward leftover coupling between sstables and compaction, in sstables/sstable_set.cc, where the make_sstable_set() methods of compaction strategies are implemented.	2025-09-25 15:03:56 +03:00
Raphael S. Carvalho	9d3755f276	replica: Futurize retrieval of sstable sets in compaction_group_view This will allow upcoming work to gently produce a sstable set for each compaction group view. Example: repaired and unrepaired. Locking strategy for compaction's sstable selection: Since sstable retrieval path became futurized, tasks in compaction manager will now hold the write lock (compaction_state::lock) when retrieving the sstable list, feeding them into compaction strategy, and finally registering selected sstables as compacting. The last step prevents another concurrent task from picking the same sstable. Previously, all those steps were atomic, but we have seen stall in that area in large installations, so futurization of that area would come sooner or later. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2025-08-08 06:58:00 +03:00
Raphael S. Carvalho	20c3301a1a	treewide: Futurize estimation of pending compaction tasks This is to allow futurization of compaction_group_view method that retrieves sstable set. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2025-08-08 06:51:29 +03:00
Raphael S. Carvalho	2c4a9ba70c	treewide: Rename table_state to compaction_group_view Since table_state is a view to a compaction group, it makes sense to rename it as so. With upcoming incremental repair, each replica::compaction_group will be actually two compaction groups, so there will be two views for each replica::compaction_group. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2025-08-08 06:51:28 +03:00
Raphael S. Carvalho	21d1e78457	compaction: Wire table_state into make_sstable_set() This will be useful for feeding token range owned by compaction group into sstable set. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2025-04-29 15:47:33 -03:00
Avi Kivity	f3eade2f62	treewide: relicense to ScyllaDB-Source-Available-1.0 Drop the AGPL license in favor of a source-available license. See the blog post [1] for details. [1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/	2024-12-18 17:45:13 +02:00
Raphael S. Carvalho	0ce8ee03f1	compaction: wire storage free space into reshape procedure After this, TWCS reshape procedure can be changed to limit job to 10% of available space. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2024-06-13 12:53:27 -03:00
Raphael S. Carvalho	8997fe0625	compaction: Switch to strategy_control::candidates() for regular compaction Now everything is prepared for the switch, let's do it. Now let's wait for ICS to enjoy the set of changes. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-09-25 17:18:21 -03:00
Aleksandra Martyniuk	44744d6229	compaction: validate options used in different compaction strategies For each compaction strategy, validate whether options values are valid.	2023-09-13 16:59:40 +02:00
Aleksandra Martyniuk	702c19f941	compaction: make compaction strategy keys static constexpr	2023-09-13 16:59:40 +02:00
Pavel Emelyanov	66e43912d6	code: Switch to seastar API level 7 In that level no io_priority_class-es exist. Instead, all the IO happens in the context of current sched-group. File API no longer accepts prio class argument (and makes io_intent arg mandatory to impls). So the change consists of - removing all usage of io_priority_class - patching file_impl's inheritors to updated API - priority manager goes away altogether - IO bandwidth update is performed on respective sched group - tune-up scylla-gdb.py io_queues command The first change is huge and was made semi-automatically by: - grep io_priority_class \| default_priority_class - remove all calls, found methods' args and class' fields Patching file_impl-s is smaller, but also mechanical: - replace io_priority_class& argument with io_intent* one - pass intent to lower file (if applicable) Dropping the priority manager is: - git-rm .cc and .hh - sed out all the #include-s - fix configure.py and cmakefile The scylla-gdb.py update is a bit hairy -- it needs to use task queues list for IO classes names and shares, but to detect whether it should, it checks that the "commitlog" group is present. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13963	2023-06-06 13:29:16 +03:00
Raphael S. Carvalho	233fe6d3dc	compaction: LCS: wire up compaction_strategy_state LCS no longer keeps internal state, and will now rely on state managed by each compaction group through compaction::table_state. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-28 08:48:15 -03:00
Raphael S. Carvalho	1ffe2f04ef	compaction: add table_state param to compaction_strategy::notify_completion() Once compaction_strategy is made stateless, the state must be retrieved in notify_completion() through table_state. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-27 13:40:02 -03:00
Raphael S. Carvalho	2ffaae97a4	compaction: LCS: extract state into a separate struct Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-27 13:40:02 -03:00
Raphael S. Carvalho	232e71f2cf	compaction: add const-qualifier to a few compaction_strategy methods Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-27 11:13:10 -03:00
Raphael S. Carvalho	b88acffd66	replica: Allow one compaction_backlog_tracker for each compaction_group Today, compaction_backlog_tracker is managed in each compaction_strategy implementation. So every compaction strategy is managing its own tracker and providing a reference to it through get_backlog_tracker(). But this prevents each group from having its own tracker, because there's only a single compaction_strategy instance per table. To remove this limitation, compaction_strategy impl will no longer manage trackers but will instead provide an interface for trackers to be created, such that each compaction group will be allowed to have its own tracker, which will be managed by compaction manager. On compaction strategy change, table will update each group with the new tracker, which is created using the previously introduced compaction_group_sstable_set_updater. Now table's backlog will be the sum of all compaction_group backlogs. The normalization factor is applied on the sum, so we don't have to adjust each individual backlog to any factor. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:22:51 -03:00
Raphael S. Carvalho	736c96cc6f	compaction: LCS: avoid needless work post major compaction completion That's done by picking the ideal level for the input, such that LCS won't have to either promote or demote data, because the output level is not the best candidate for having the size of the output data. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-04-28 20:19:28 -03:00
Raphael S. Carvalho	b25e53a845	compaction: LCS: extract calculation of ideal level for input ideal level is calculated as: ceil(log base10 of ((input_size + max_fragment_size - 1) / max_fragment_size)) such that 20 fragments will be placed at level 2, as level 1 capacity is 10 fragments only. The goal of extracting it is that the formula will be useful for major in addition to reshape. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-04-28 20:19:04 -03:00
Raphael S. Carvalho	840500fc4d	compaction: Make cleanup for Leveled strategy bucket-aware Bucket awareness in cleanup was introduced in `a69d98c3d0`. STCS and TWCS already support it, and now LCS will receive it. The goal of bucket awareness is to reduce writeamp in cleanup, therefore reducing operation time. Additionally, garbage collection becomes more efficient as shadowed data can now be potentially compacted with the data that shadows it, assuming they're on the same level. The implementation for LCS is simple. Will reuse the procedure for STCS for returning jobs in level 0. And one job will be returned for each non-empty level > 0. What allows us to do it is our incremental selection approach used in compaction, that sets a limit on memory usage and disk space requirement. Fixes #10097. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20220331173417.211257-1-raphaelsc@scylladb.com>	2022-04-05 09:10:21 +03:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Raphael S. Carvalho	49f40c8791	compaction: Implement strategy control and wire it This implements strategy control interface for both manager and tests, and wire it. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-12-13 16:05:23 -03:00
Raphael S. Carvalho	8d9704c030	compaction: LCS: kill needless include of database.hh This is part of work for reducing compilation time and removing layer violation in compaction. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20211120042232.106651-1-raphaelsc@scylladb.com>	2021-11-20 18:28:55 +02:00
Raphael S. Carvalho	e2f6a47999	compaction: switch to table_state in estimated_pending_compactions() Last method in compaction_strategy using table. From now on, compaction strategy no longer works directly with table. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-11-09 11:25:28 -03:00
Raphael S. Carvalho	93ae9225f7	compaction: switch to table_state in compaction_strategy::get_major_compaction_job() From now on, get_major_compaction_job() will use table_state instead of a plain reference to table. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-11-09 11:25:22 -03:00
Raphael S. Carvalho	d881310b52	compaction: switch to table_state in compaction_strategy::get_sstables_for_compaction() From now on, get_sstables_for_compaction() will use table_state. With table_state, we avoid layer violations like strategy using manager and also makes testing easier. Compaction unit tests were temporarily disabled to avoid a giant commit which is hard to parse. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-11-09 10:52:14 -03:00
Avi Kivity	daf028210b	build: enable -Winconsistent-missing-override warning This warning can catch a virtual function that thinks it overrides another, but doesn't, because the two functions have different signatures. This isn't very likely since most of our virtual functions override pure virtuals, but it's still worth having. Enable the warning and fix numerous violations. Closes #9347	2021-09-15 12:55:54 +03:00
Raphael S. Carvalho	1924e8d2b6	treewide: Move compaction code into a new top-level compaction dir Since compaction is layered on top of sstables, let's move all compaction code into a new top-level directory. This change will give me extra motivation to remove all layer violations, like sstable calling compaction-specific code, and compaction entanglement with other components like table and storage service. Next steps: - remove all layer violations - move compaction code in sstables namespace into a new one for compaction. - move compaction unit tests into its own file Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20210707194058.87060-1-raphaelsc@scylladb.com>	2021-07-07 23:21:51 +03:00
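
Commit b25e53a845 above gives the LCS ideal-level formula as ceil(log base10 of ((input_size + max_fragment_size - 1) / max_fragment_size)). A minimal sketch of that arithmetic follows. The function name, the parameter types, and the guard for inputs of at most one fragment are assumptions made for illustration, not the code from the commit.

```cpp
#include <cassert>
#include <cmath>
#include <cstdint>

// Hypothetical helper illustrating the formula quoted in commit b25e53a845.
// Names and types are assumptions, not ScyllaDB's actual signature.
inline unsigned ideal_level_for_input(std::uint64_t input_size,
                                      std::uint64_t max_fragment_size) {
    assert(max_fragment_size > 0);
    // Integer division rounded up: number of fragments the input occupies.
    std::uint64_t fragments =
        (input_size + max_fragment_size - 1) / max_fragment_size;
    // Assumption: zero or one fragment maps to level 0. This also avoids
    // evaluating log10(0) for an empty input.
    if (fragments <= 1) {
        return 0;
    }
    return static_cast<unsigned>(
        std::ceil(std::log10(static_cast<double>(fragments))));
}
```

With a fragment size of 160 units, an input of 20 fragments (3200 units) yields level 2, which matches the commit message's remark that level 1 holds only 10 fragments.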

28 Commits
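
Commit 18c071c94b describes fixing a use after free by storing shared pointers inside the state variant, so an in-flight compaction co-owns its state. The sketch below illustrates that ownership pattern only. It uses `std::shared_ptr` in place of `seastar::shared_ptr`, and every type and function name in it is a hypothetical stand-in rather than ScyllaDB code.

```cpp
#include <cassert>
#include <memory>
#include <variant>
#include <vector>

// Hypothetical stand-ins for the strategy-specific state types.
struct leveled_state { std::vector<int> levels{0, 0, 0}; };
struct time_window_state { long last_window = 0; };

using leveled_state_ptr = std::shared_ptr<leveled_state>;
using time_window_state_ptr = std::shared_ptr<time_window_state>;

// Wrapper owned by the table; replaced wholesale when the strategy changes.
struct strategy_state {
    std::variant<leveled_state_ptr, time_window_state_ptr> value;
};

// An in-flight compaction copies the shared pointer out of the wrapper,
// so it co-owns the state instead of holding a reference into the wrapper.
struct running_compaction {
    leveled_state_ptr state;
    explicit running_compaction(const strategy_state& s)
        : state(std::get<leveled_state_ptr>(s.value)) {}
    void finish() {
        assert(state);          // state is still alive here
        state->levels[0] += 1;  // touch the state after the wrapper is gone
    }
};

// Simulates an ALTER while a compaction is running. Returns the number of
// owners of the old state once the wrapper has been replaced.
inline long alter_during_compaction() {
    auto table_state = std::make_unique<strategy_state>();
    table_state->value = std::make_shared<leveled_state>();

    running_compaction job(*table_state);

    // ALTER: the old wrapper is destroyed and a new one takes its place.
    table_state = std::make_unique<strategy_state>();
    table_state->value = std::make_shared<time_window_state>();

    job.finish();  // safe: the job holds its own shared_ptr copy
    return job.state.use_count();
}
```

After the wrapper is replaced, the running job is the sole remaining owner of the old state, so `alter_during_compaction()` returns 1. Had the job kept a plain reference into the old wrapper, `finish()` would touch freed memory.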