scylladb

Author	SHA1	Message	Date
Raphael S. Carvalho	ef72075920	compaction: wire storage free space into reshape procedure After this, TWCS reshape procedure can be changed to limit job to 10% of available space. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> (cherry picked from commit `0ce8ee03f1`)	2024-06-20 20:41:41 +00:00
Ferenc Szili	b50a9f9bab	removed forward declaration of resharding_descriptor resharding_descriptor has been removed in `e40aa042` in 2020	2024-03-22 11:35:10 +01:00
Raphael S. Carvalho	b551f4abd2	streaming: Improve partition estimation with TWCS When off-strategy is disabled, data segregation is not postponed, meaning that getting partition estimate right is important to decrease filter's false positives. With streaming, we don't have min and max timestamps at destination, well, we could have extended the RPC verb to send them, but turns out we can deduce easily the amount of windows using default TTL. Given partitioner random nature, it's not absurd to assume that a given range being streamed may overlap with all windows, meaning that each range will yield one sstable for each window when segregating incoming data. Today, we assume the worst of 100 windows (which is the max amount of sstables the input data can be segregated into) due to the lack of metadata for estimating the window count. But given that users are recommended to target a max of ~20 windows, it means partition estimate is being downsized 5x more than needed. Let's improve it by using default TTL when estimating window count, so even on absence of timestamp metadata, the partition estimation won't be way off. Fixes #15704. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-11-08 12:10:03 +02:00
Raphael S. Carvalho	8997fe0625	compaction: Switch to strategy_control::candidates() for regular compaction Now everything is prepared for the switch, let's do it. Now let's wait for ICS to enjoy the set of changes. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-09-25 17:18:21 -03:00
Raphael S. Carvalho	d6029a195e	Remove DateTieredCompactionStrategy This is the last step of deprecation dance of DTCS. In Scylla 5.1, users were warned that DTCS was deprecated. In 5.2, altering or creation of tables with DTCS was forbidden. 5.3 branch was already created, so this is targeting 5.4. Users that refused to move away from DTCS will have Scylla falling back to the default strategy, either STCS or ICS. See: WARN 2023-07-14 09:49:11,857 [shard 0] schema_tables - Falling back to size-tiered compaction strategy after the problem: Unable to find compaction strategy class 'DateTieredCompactionStrategy Then user can later switch to a supported strategy with alter table. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes #14559	2023-07-14 16:20:48 +03:00
Pavel Emelyanov	66e43912d6	code: Switch to seastar API level 7 In that level no io_priority_class-es exist. Instead, all the IO happens in the context of current sched-group. File API no longer accepts prio class argument (and makes io_intent arg mandatory to impls). So the change consists of - removing all usage of io_priority_class - patching file_impl's inheritants to updated API - priority manager goes away altogether - IO bandwidth update is performed on respective sched group - tune-up scylla-gdb.py io_queues command The first change is huge and was made semi-automatically by: - grep io_priority_class \| default_priority_class - remove all calls, found methods' args and class' fields Patching file_impl-s is smaller, but also mechanical: - replace io_priority_class& argument with io_intent* one - pass intent to lower file (if applicable) Dropping the priority manager is: - git-rm .cc and .hh - sed out all the #include-s - fix configure.py and cmakefile The scylla-gdb.py update is a bit hairy -- it needs to use task queues list for IO classes names and shares, but to detect whether it should, it checks whether the "commitlog" group is present. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes #13963	2023-06-06 13:29:16 +03:00
Raphael S. Carvalho	1ffe2f04ef	compaction: add table_state param to compaction_strategy::notify_completion() Once compaction_strategy is made stateless, the state must be retrieved in notify_completion() through table_state. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-27 13:40:02 -03:00
Raphael S. Carvalho	232e71f2cf	compaction: add const-qualifier to a few compaction_strategy methods Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2023-03-27 11:13:10 -03:00
Avi Kivity	69a385fd9d	Introduce schema/ module Schema related files are moved there. This excludes schema files that also interact with mutations, because the mutation module depends on the schema. Those files will have to go into a separate module. Closes #12858	2023-02-15 11:01:50 +02:00
Raphael S. Carvalho	b88acffd66	replica: Allow one compaction_backlog_tracker for each compaction_group Today, compaction_backlog_tracker is managed in each compaction_strategy implementation. So every compaction strategy is managing its own tracker and providing a reference to it through get_backlog_tracker(). But this prevents each group from having its own tracker, because there's only a single compaction_strategy instance per table. To remove this limitation, compaction_strategy impl will no longer manage trackers but will instead provide an interface for trackers to be created, such that each compaction group will be allowed to have its own tracker, which will be managed by compaction manager. On compaction strategy change, table will update each group with the new tracker, which is created using the previously introduced compaction_group_sstable_set_updater. Now table's backlog will be the sum of all compaction_group backlogs. The normalization factor is applied on the sum, so we don't have to adjust each individual backlog to any factor. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-11-11 09:22:51 -03:00
Benny Halevy	78d6f6a519	compaction: sanitize headers from flat_mutation_reader v1 flat_mutation_reader make_scrubbing_reader no longer exists and there is no need to include flat_mutation_reader.hh nor forward declare the class. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2022-04-28 17:23:04 +03:00
Raphael S. Carvalho	2a9bfa3e3f	compaction_strategy: get_cleanup_compaction_jobs: accept candidates by value Then caller can decide whether to copy or move candidate set into the function. cleanup_sstables_compaction_task can move candidates as it's no longer needed once it retrieves all descriptors. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-03-29 09:49:13 -03:00
Raphael S. Carvalho	44e9e10414	compaction_strategy: Allow strategies to define their own cleanup strategy Today, all compaction strategies will clean up their files using the incremental approach of one sstable being rewritten at a time. Turns out that's not the best approach performance wise. Let's take STCS for example. As cleanup finishes rewriting one file, the output file is placed into the sstable set. Regular now can compact that file with another that was already there (e.g. produced by flush after cleanup started). Inefficient compactions like this can keep happening as cleanup incrementally places output file into the candidate list for regular. This method will allow strategies to clean up their files in batches. For example, STCS can clean up all files in smallest tiers in single round, allowing the output data to be added at once. So next compaction rounds can be more efficient in terms of writeamp. Another benefit is that deduplication and GC can happen more efficiently. The drawback is the space requirement, as we no longer compact one file at a time. However, the impact is minimized by cleaning up the smallest tier first. With leveled strategy for example, even though 90% of data is in highest level, the space requirement is not a problem because we can apply the incremental compaction on its behalf. The same applies to ICS. With STCS, the requirement is the size of the tier being compacted, but that's already expected by its users anyway. For the time being, all strategies have it unimplemented, so they still use the old behavior where files are rewritten one at a time. This will allow us to incrementally implement the cleanup method for all compaction strategies. Refs #10097. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2022-03-23 00:04:03 -03:00
Mikołaj Sielużycki	1d84a254c0	flat_mutation_reader: Split readers by file and remove unnecessary includes. The flat_mutation_reader files were conflated and contained multiple readers, which were not strictly necessary. Splitting optimizes both iterative compilation times, as touching rarely used readers doesn't recompile large chunks of codebase. Total compilation times are also improved, as the size of flat_mutation_reader.hh and flat_mutation_reader_v2.hh have been reduced and those files are included by many files in the codebase. With changes real 29m14.051s user 168m39.071s sys 5m13.443s Without changes real 30m36.203s user 175m43.354s sys 5m26.376s Closes #10194	2022-03-14 13:20:25 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Botond Dénes	1ba19c2aa4	compaction/compaction_strategy: convert make_interposer_consumer() to v2 The underlying timestamp-based splitter is v2 already.	2022-01-07 13:51:59 +02:00
Raphael S. Carvalho	49f40c8791	compaction: Implement strategy control and wire it This implements strategy control interface for both manager and tests, and wire it. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-12-13 16:05:23 -03:00
Raphael S. Carvalho	2f9f089eda	compaction_strategy: kill unused compaction_strategy_type::major Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-12-03 12:27:10 -03:00
Raphael S. Carvalho	9725e5efa9	compaction_strategy: kill unused can_compact_partial_runs() This strategy method was introduced unnecessarily. We assumed it was going to be needed, but turns out it was never needed, not even for ICS. Also it's built on a wrong assumption as an output sstable run being generated can never be compacted in parallel as the non-overlapping requirement can be easily broken. LCS for example can allow parallel compaction on different runs (levels) but correctness cannot be guaranteed when same runs are compacted in parallel. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-12-03 12:20:51 -03:00
Raphael S. Carvalho	bb5a8682f3	compaction: stop including database.hh for compaction_strategy Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-11-09 11:29:47 -03:00
Raphael S. Carvalho	e2f6a47999	compaction: switch to table_state in estimated_pending_compactions() Last method in compaction_strategy using table. From now on, compaction strategy no longer works directly with table. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-11-09 11:25:28 -03:00
Raphael S. Carvalho	93ae9225f7	compaction: switch to table_state in compaction_strategy::get_major_compaction_job() From now on, get_major_compaction_job() will use table_state instead of a plain reference to table. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-11-09 11:25:22 -03:00
Raphael S. Carvalho	d881310b52	compaction: switch to table_state in compaction_strategy::get_sstables_for_compaction() From now on, get_sstables_for_compaction() will use table_state. With table_state, we avoid layer violations like strategy using manager and also makes testing easier. Compaction unit tests were temporarily disabled to avoid a giant commit which is hard to parse. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2021-11-09 10:52:14 -03:00
Asias He	6350a19f73	compaction: Move compaction_strategy.hh to compaction dir The top dir is a mess. Move compaction_strategy.hh and compaction_strategy_type.hh to the new home.	2021-08-07 08:06:37 +08:00

24 Commits