scylladb

Author	SHA1	Message	Date
Avi Kivity	4f87362abb	compaction_manager: drop gratuitous conversion from interval to wrapped_interval The conversion is unnecessary and likely dates back to before the split between interval and wrapped_interval. It gets in the way of making the conversion explicit. Closes scylladb/scylladb#24164	2025-05-15 16:15:55 +03:00
Botond Dénes	efc48caea5	readers/mutation_reader: s/reader_consumer_v2/mutation_reader_consumer/	2025-05-09 07:53:29 -04:00
Botond Dénes	7af0690762	mutation/mutation_compactor: drop v2 from compactor and related names	2025-05-09 07:53:29 -04:00
Botond Dénes	7ba3c3fec3	readers/multi_range: remove flat from name	2025-05-09 07:53:25 -04:00
Raphael S. Carvalho	c77f710a0c	sstables: Fix quadratic space complexity in partitioned_sstable_set Interval map is very susceptible to quadratic space behavior when it's flooded with many entries overlapping all (or most of) intervals, since each such entry will have presence on all intervals it overlaps with. A trigger we observed was a memtable flush storm, which creates many small "L0" sstables that span roughly the entire token range. Since we cannot rely on insertion order, the solution is to store sstables with such wide ranges in a vector (unleveled). There should be no consequence for single-key reads, since the upper layer applies an additional filtering based on the token of the key being queried. And for range scans, there can be an increase in memory usage, but not a significant one, because the sstables span a wide range and would have been selected in the combined reader if the range of the scan overlaps with them. Anyway, this is a protection against a storm of memtable flushes and shouldn't be the common scenario. It works both with tablets and vnodes, by adjusting the token range spanned by the compaction group accordingly. Fixes #23634. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2025-04-29 15:47:33 -03:00
Raphael S. Carvalho	21d1e78457	compaction: Wire table_state into make_sstable_set() This will be useful for feeding token range owned by compaction group into sstable set. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2025-04-29 15:47:33 -03:00
Raphael S. Carvalho	59dad2121f	compaction: Introduce token_range() to table_state This provides a way for compaction layer to know compaction group's token range. It will be important for sstable set impl to know the token range of underlying group. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>	2025-04-29 15:47:33 -03:00
Benny Halevy	e1fe82ed33	utils: phased_barrier, pluggable: use named gate Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-04-12 11:47:00 +03:00
Benny Halevy	747ae5e1c4	compaction: compaction_state: use named gate Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-04-12 11:28:48 +03:00
Benny Halevy	fba88bdd62	database, compaction_manager, large_data_handler: use pluggable<system_keyspace> To allow safe plug and unplug of the system_keyspace. This patch follows up on `917fdb9e53` (more specifically - `f9b57df471`). Just keeping a shared_ptr<system_keyspace> doesn't prevent stopping the system_keyspace shards, whereas using the `pluggable` interface allows safe draining of outstanding async calls on shutdown, before stopping the system_keyspace. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2025-03-05 08:27:23 +02:00
Amnon Heiman	67ca02b361	compaction_manager.cc: label metrics with basic_level The following metrics will be marked with basic_level label: scylla_compaction_manager_compactions	2025-03-03 16:58:38 +02:00
Kefu Chai	9fdbe0e74b	tree: Remove unused boost headers This commit eliminates unused boost header includes from the tree. Removing these unnecessary includes reduces dependencies on the external Boost.Adapters library, leading to faster compile times and a slightly cleaner codebase. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22997	2025-02-25 10:32:32 +03:00
Kefu Chai	ccbfe4f669	compaction: replace boost::range::find with std::ranges::find Replace boost::range::find() calls with std::ranges::find(). This change reduces external dependencies and modernizes the codebase. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22942	2025-02-20 14:25:08 +02:00
Kefu Chai	9c5155fa63	compaction: switch from boost::accumulate to std::views::join Replace boost::accumulate() with the standard library's alternatives to reduce external dependencies and simplify the codebase. This change eliminates the requirement for boost::range and makes the implementation more maintainable. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22856	2025-02-18 10:23:40 +02:00
Pavel Emelyanov	1b44861e8f	Merge 'sstable_loader: fix cross-shard resource cleanup in download_task_impl ' from Kefu Chai This PR addresses two related issues in our task system: 1. Prepares for asynchronous resource cleanup by converting release_resources() to a coroutine. This refactoring enables future improvements in how we handle resource cleanup. 2. Fixes a cross-shard resource cleanup issue in the SSTable loader where destruction of per-shard progress elements could trigger "shared_ptr accessed on non-owner cpu" errors in multi-shard environments. The fix uses coroutines to ensure resources are released on their owner shards. Fixes #22759 --- this change addresses a regression introduced by `d815d7013c`, which is contained by 2025.1 and master branches. so it should be backported to 2025.1 branch. Closes scylladb/scylladb#22791 * github.com:scylladb/scylladb: sstable_loader: fix cross-shard resource cleanup in download_task_impl tasks: make release_resources() a coroutine	2025-02-15 20:32:22 +02:00
Kefu Chai	7ff0d7ba98	tree: Remove unused boost headers This commit eliminates unused boost header includes from the tree. Removing these unnecessary includes reduces dependencies on the external Boost.Adapters library, leading to faster compile times and a slightly cleaner codebase. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22857	2025-02-15 20:32:22 +02:00
Kefu Chai	4c1f1baab4	tasks: make release_resources() a coroutine Convert tasks::task_manager::task::impl::release_resources() to a coroutine to prepare for upcoming changes that will implement asynchronous resource release. This is a preparatory refactoring that enables future coroutine-based implementation of resource cleanup logic. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2025-02-14 11:13:58 +08:00
Kefu Chai	a18069fad7	service: migrate from boost::range::remove_if() to std::ranges::remove_if Replace boost::range::remove_if() with the standard library's std::ranges::remove_if() to reduce external dependencies and simplify the codebase. This change eliminates the requirement for boost::range and makes the implementation more maintainable. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2025-02-11 09:15:14 +08:00
Kefu Chai	9a20fb43ab	tree: replace boost::min_element() with std::ranges::min_element() in order to reduce the external header dependency, let's switch to the standardized std::ranges::min_element(). Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22572	2025-02-05 21:54:01 +02:00
Kefu Chai	f39cfd8eb0	compaction: switch boost::algorithm::any_of to std::ranges::any_of std::any_of was included by C++11, and boost::algorithm::any_of() is provided by Boost for users stuck in the pre-C++11 era. in our case, we've moved into C++23, where the ranges variant of this algorithm is available. in order to reduce the header dependency, let's switch to `std::ranges::any_of()`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22503	2025-01-30 13:22:33 +02:00
Benny Halevy	88ae067ddb	everywhere: add skeletal support for the in_memory_tables feature Forward-ported from scylla-enterprise. Note that the feature has been deprecated and the implementation is provided only for backward compatibility with pre-existing features and schema. Tested manually after adding the following to feature_service: ``` gms::feature workload_prioritization { *this, "WORKLOAD_PRIORITIZATION"sv }; ``` Launched a single-node cluster running 2023.1.10 ``` cqlsh> create KEYSPACE ks WITH replication = {'class': 'SimpleStrategy', 'replication_factor': 1}; cqlsh> create TABLE ks.test ( pk int PRIMARY KEY, val int ) WITH compaction = {'class': 'InMemoryCompactionStrategy'}; ``` log: ``` Scylla version 2023.1.10-0.20241227.21cffccc1ccd with build-id bd65b8399cb13b713a87e57fe333cfcabfd50be7 starting ... ... INFO 2024-12-27 19:45:16,563 [shard 0] migration_manager - Create new ColumnFamily: org.apache.cassandra.config.CFMetaData@0x600000f1b400[cfId=5529c630-c47a-11ef-bd1d-4295734ce5a8,ksName=ks,cfName=test,cfType=Standard,comparator=org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type),comment=,readRepairChance=0,dcLocalReadRepairChance=0,tombstoneGcOptions={"mode":"timeout","propagation_delay_in_seconds":"3600"},gcGraceSeconds=864000,keyValidator=org.apache.cassandra.db.marshal.Int32Type,minCompactionThreshold=4,maxCompactionThreshold=32,columnMetadata=[ColumnDefinition{name=pk, type=org.apache.cassandra.db.marshal.Int32Type, kind=PARTITION_KEY, componentIndex=0, droppedAt=-9223372036854775808}, ColumnDefinition{name=val, type=org.apache.cassandra.db.marshal.Int32Type, kind=REGULAR, componentIndex=null, droppedAt=-9223372036854775808}],compactionStrategyClass=class 
org.apache.cassandra.db.compaction.InMemoryCompactionStrategy,compactionStrategyOptions={enabled=true},compressionParameters={sstable_compression=org.apache.cassandra.io.compress.LZ4Compressor},bloomFilterFpChance=0.01,memtableFlushPeriod=0,caching={"keys":"ALL","rows_per_partition":"ALL"},cdc={},defaultTimeToLive=0,minIndexInterval=128,maxIndexInterval=2048,speculativeRetry=99.0PERCENTILE,triggers=[],isDense=false,in_memory=false,version=5529c631-c47a-11ef-bd1d-4295734ce5a8,droppedColumns={},collections={},indices={}] INFO 2024-12-27 19:45:16,564 [shard 0] schema_tables - Creating ks.test id=5529c630-c47a-11ef-bd1d-4295734ce5a8 version=ec88d510-6aff-344a-914d-541d37081440 ``` Upgraded to this branch and started scylla. Verified that ks.test was successfully loaded: log: ``` INFO 2024-12-27 19:48:58,115 [shard 0:main] init - Scylla version 6.3.0~dev-0.20241227.a64c6dfc153e with build-id f9496134a09cf2e55d3865b9e9ff499f672aa7da starting ... ... WARN 2024-12-27 19:53:02,948 [shard 1:main] CompactionStrategy - InMemoryCompactionStrategy is no longer supported. Defaulting to NullCompactionStrategy. ... 
INFO 2024-12-27 19:53:02,948 [shard 0:main] database - Keyspace ks: Reading CF test id=5529c630-c47a-11ef-bd1d-4295734ce5a8 version=ec88d510-6aff-344a-914d-541d37081440 storage=/home/bhalevy/scylladb/data/ks/test-5529c630c47a11efbd1d4295734ce5a8 ``` Then, tested: ``` cqlsh> describe KEYSPACE ks; CREATE KEYSPACE ks WITH replication = {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true AND tablets = {'enabled': false}; CREATE TABLE ks.test ( pk int, val int, PRIMARY KEY (pk) ) WITH bloom_filter_fp_chance = 0.01 AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'} AND comment = '' AND compaction = {'class': 'InMemoryCompactionStrategy'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND crc_check_chance = 1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND speculative_retry = '99.0PERCENTILE'; cqlsh> alter TABLE ks.test with compaction = {'class': 'SizeTieredCompactionStrategy'}; cqlsh> describe KEYSPACE ks; CREATE KEYSPACE ks WITH replication = {'class': 'org.apache.cassandra.locator.SimpleStrategy', 'replication_factor': '1'} AND durable_writes = true AND tablets = {'enabled': false}; CREATE TABLE ks.test ( pk int, val int, PRIMARY KEY (pk) ) WITH bloom_filter_fp_chance = 0.01 AND caching = {'keys': 'ALL', 'rows_per_partition': 'ALL'} AND comment = '' AND compaction = {'class': 'SizeTieredCompactionStrategy'} AND compression = {'sstable_compression': 'org.apache.cassandra.io.compress.LZ4Compressor'} AND crc_check_chance = 1 AND default_time_to_live = 0 AND gc_grace_seconds = 864000 AND max_index_interval = 2048 AND memtable_flush_period_in_ms = 0 AND min_index_interval = 128 AND speculative_retry = '99.0PERCENTILE' AND tombstone_gc = {'mode': 'timeout', 'propagation_delay_in_seconds': '3600'}; ``` log: ``` INFO 2024-12-27 19:56:40,465 [shard 0:stmt] 
migration_manager - Update table 'ks.test' From org.apache.cassandra.config.CFMetaData@0x60000362d800[cfId=5529c630-c47a-11ef-bd1d-4295734ce5a8,ksName==ks,cfName=test,cfType=Standard,comparator=org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type),comment=,tombstoneGcOptions={"mode":"timeout","propagation_delay_in_seconds":"3600"},gcGraceSeconds=864000,minCompactionThreshold=4,maxCompactionThreshold=32,columnMetadata=[ColumnDefinition{name=pk, type=org.apache.cassandra.db.marshal.Int32Type, kind=PARTITION_KEY, componentIndex=0, droppedAt=-9223372036854775808}, ColumnDefinition{name=val, type=org.apache.cassandra.db.marshal.Int32Type, kind=REGULAR, componentIndex=null, droppedAt=-9223372036854775808}],compactionStrategyClass=class org.apache.cassandra.db.compaction.InMemoryCompactionStrategy,compactionStrategyOptions={enabled=true},compressionParameters={sstable_compression=org.apache.cassandra.io.compress.LZ4Compressor},bloomFilterFpChance=0.01,memtableFlushPeriod=0,caching={"keys":"ALL","rows_per_partition":"ALL"},cdc={},defaultTimeToLive=0,minIndexInterval=128,maxIndexInterval=2048,speculativeRetry=99.0PERCENTILE,triggers=[],isDense=false,version=ec88d510-6aff-344a-914d-541d37081440,droppedColumns={},collections={},indices={}] To org.apache.cassandra.config.CFMetaData@0x60000336e000[cfId=5529c630-c47a-11ef-bd1d-4295734ce5a8,ksName==ks,cfName=test,cfType=Standard,comparator=org.apache.cassandra.db.marshal.CompositeType(org.apache.cassandra.db.marshal.UTF8Type),comment=,tombstoneGcOptions={"mode":"timeout","propagation_delay_in_seconds":"3600"},gcGraceSeconds=864000,minCompactionThreshold=4,maxCompactionThreshold=32,columnMetadata=[ColumnDefinition{name=pk, type=org.apache.cassandra.db.marshal.Int32Type, kind=PARTITION_KEY, componentIndex=0, droppedAt=-9223372036854775808}, ColumnDefinition{name=val, type=org.apache.cassandra.db.marshal.Int32Type, kind=REGULAR, componentIndex=null, 
droppedAt=-9223372036854775808}],compactionStrategyClass=class org.apache.cassandra.db.compaction.SizeTieredCompactionStrategy,compactionStrategyOptions={enabled=true},compressionParameters={sstable_compression=org.apache.cassandra.io.compress.LZ4Compressor},bloomFilterFpChance=0.01,memtableFlushPeriod=0,caching={"keys":"ALL","rows_per_partition":"ALL"},cdc={},defaultTimeToLive=0,minIndexInterval=128,maxIndexInterval=2048,speculativeRetry=99.0PERCENTILE,triggers=[],isDense=false,version=ecccf010-c47b-11ef-b52c-622f2f0e87c4,droppedColumns={},collections={},indices={}] INFO 2024-12-27 19:56:40,466 [shard 0: gms] schema_tables - Altering ks.test id=5529c630-c47a-11ef-bd1d-4295734ce5a8 version=ecccf010-c47b-11ef-b52c-622f2f0e87c4 ``` Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#22068	2025-01-20 16:55:17 +02:00
Kefu Chai	1ef2d9d076	tree: migrate from boost::adaptors::transformed to std::views::transform Replace remaining uses of boost::adaptors::transformed with std::views::transform to reduce Boost dependencies, following the migration pattern established in `bab12e3a`. This change addresses recently merged code that reintroduced Boost header dependencies through boost::adaptors::transformed usage. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22365	2025-01-17 16:56:40 +02:00
Kefu Chai	353b522ca0	treewide: migrate from boost::adaptors::reversed to std::views::reverse now that we are allowed to use C++23. we now have the luxury of using `std::views::reverse`. - replace `boost::adaptors::reversed` with `std::views::reverse` - remove unused `#include <boost/range/adaptor/reversed.hpp>` this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2025-01-07 13:22:00 +02:00
Kefu Chai	f7fd55146d	compaction: do not include unused headers these unused includes are identified by clang-include-cleaner. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#22188	2025-01-07 13:18:31 +02:00
Raphael S. Carvalho	c973254362	Introduce incremental compaction strategy (ICS) ICS is a compaction strategy that inherits size tiered properties -- therefore it's write optimized too -- but fixes its space overhead of 100% due to input files being only released on completion. That's achieved with the concept of sstable run (similar in concept to LCS levels) which breaks a large sstable into fixed-size chunks (1G by default), known as run fragments. ICS picks similar-sized runs for compaction, and fragments of those runs can be released incrementally as they're compacted, reducing the space overhead to about (number_of_input_runs * 1G). This allows user to increase storage density of nodes (from 50% to ~80%), reducing the cost of ownership. NOTE: test_system_schema_version_is_stable adjusted to account for batchlog using IncrementalCompactionStrategy contains: compaction/: added incremental_compaction_strategy.cc (.hh), incremental_backlog_tracker.cc (.hh) compaction/CMakeLists.txt: include ICS cc files configure.py: changes for ICS files, includes test db/legacy_schema_migrator.cc / db/schema_tables.cc: fallback to ICS when strategy is not supported db/system_keyspace: pick ICS for some system tables schema/schema.hh: ICS becomes default test/boost: Add incremental_compaction_test.cc test/boost/sstable_compaction_test.cc: ICS related changes test/cqlpy/test_compaction_strategy_validation.py: ICS related changes docs/architecture/compaction/compaction-strategies.rst: changes to ICS section docs/cql/compaction.rst: changes to ICS section docs/cql/ddl.rst: adds reference to ICS options docs/getting-started/system-requirements.rst: updates sentence mentioning ICS docs/kb/compaction.rst: changes to ICS section docs/kb/garbage-collection-ics.rst: add file docs/kb/index.rst: add reference to <garbage-collection-ics> docs/operating-scylla/procedures/tips/production-readiness.rst: add ICS section some relevant commits throughout the ICS history: commit 
434b97699b39c570d0d849d372bf64f418e5c692 Merge: 105586f747 30250749b8 Author: Paweł Dziepak <pdziepak@scylladb.com> Date: Tue Mar 12 12:14:23 2019 +0000 Merge "Introduce Incremental Compaction Strategy (ICS)" from Raphael " Introduce new compaction strategy which is essentially like size tiered but will work with the existing incremental compaction. Thus incremental compaction strategy. It works like size tiered, but each element composing a tier is a sstable run, meaning that the compaction strategy will look for N similar-sized sstable runs to compact, not just individual sstables. Parameters: * "sstable_size_in_mb": defines the maximum sstable (fragment) size composing a sstable run, which impacts directly the disk space requirement which is improved with incremental compaction. The lower the value the lower the space requirement for compaction because fragments involved will be released more frequently. * all others available in size tiered compaction strategy HOWTO ===== To change an existing table to use it, do: ALTER TABLE mykeyspace.mytable WITH compaction = {'class' : 'IncrementalCompactionStrategy'}; Set fragment size: ALTER TABLE mykeyspace.mytable WITH compaction = {'class' : 'IncrementalCompactionStrategy', 'sstable_size_in_mb' : 1000 } " commit 94ef3cd29a196bedbbeb8707e20fe78a197f30a1 Merge: dca89ce7a5 e08ef3e1a3 Author: Avi Kivity <avi@scylladb.com> Date: Tue Sep 8 11:31:52 2020 +0300 Merge "Add feature to limit space amplification in Incremental Compaction" from Raphael " A new option, space_amplification_goal (SAG), is being added to ICS. This option will allow ICS user to set a goal on the space amplification (SA). It's not supposed to be an upper bound on the space amplification, but rather, a goal. This new option will be disabled by default as it doesn't benefit write-only (no overwrites) workloads and could hurt severely the write performance. 
The strategy is free to delay triggering this new behavior, in order to increase overall compaction efficiency. The graph below shows how this feature works in practice for different values of space_amplification_goal: https://user-images.githubusercontent.com/1409139/89347544-60b7b980-d681-11ea-87ab-e2fdc3ecb9f0.png When strategy finds space amplification crossed space_amplification_goal, it will work on reducing the SA by doing a cross-tier compaction on the two largest tiers. This feature works only on the two largest tiers, because taking into account others, could hurt the compaction efficiency which is based on the fact that the more similar-sized sstables are compacted together the higher the compaction efficiency will be. With SAG enabled, min_threshold only plays an important role on the smallest tiers, given that the second-largest tier could be compacted into the largest tier for a space_amplification_goal value < 2. By making the options space_amplification_goal and min_threshold independent, user will be able to tune write amplification and space amplification, based on the needs. The lower the space_amplification_goal the higher the write amplification, but by increasing the min threshold, the write amplification can be decreased to a desired amount. " commit 7d90911c5fb3fa891ad64a62147c3a6ca26d61b1 Author: Raphael S. Carvalho <raphaelsc@scylladb.com> Date: Sat Oct 16 13:41:46 2021 -0300 compaction: ICS: Add garbage collection Today, ICS lacks an approach to persist expired tombstones in a timely manner, which is a problem because accumulation of tombstones is known to affect latency considerably. For an expired tombstone to be purged, it has to reach the top of the LSM tree and hope that older overlapping data wasn't introduced at the bottom. The conditions are there and must be satisfied to avoid data resurrection. 
STCS, today, has an inefficient garbage collection approach because it only picks a single sstable, which satisfies the tombstone density threshold and file staleness. That's a problem because overlapping data either on same tier or smaller tiers will prevent tombstones from being purged. Also, nothing is done to push the tombstones to the top of the tree, for the conditions to be eventually satisfied. Due to incremental compaction, ICS can more easily have an efficient GC by doing cross-tier compaction of relevant tiers. The trigger will be file staleness and tombstone density, whose threshold values can be configured by tombstone_compaction_interval and tombstone_threshold, respectively. If ICS finds a tier which meets both conditions, then that tier and the larger[1] and closest-in-size[2] tier will be compacted together. [1]: A larger tier is picked because we want tombstones to eventually reach the top of the tree. [2]: It also has to be the closest-in-size tier as the smaller the size difference the higher the efficiency of the compaction. We want to minimize write amplification as much as possible. The staleness condition is there to prevent the same file from being picked over and over again in a short interval. With this approach, ICS will be continuously working to purge garbage while not hurting overall efficiency on a steady state, as same-tier compactions are prioritized. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Message-Id: <20211016164146.38010-1-raphaelsc@scylladb.com> Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#22063	2025-01-04 15:43:52 +02:00
Kefu Chai	6acc5294a4	treewide: migrate from boost::copy_range to std::ranges::to now that we are allowed to use C++23. we now have the luxury of using `std::ranges::to`. in this change, we: - replace `boost::copy_range` with `std::ranges::to` - remove unused `#include` of boost headers Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21880	2024-12-26 11:46:26 +02:00
Avi Kivity	f3eade2f62	treewide: relicense to ScyllaDB-Source-Available-1.0 Drop the AGPL license in favor of a source-available license. See the blog post [1] for details. [1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/	2024-12-18 17:45:13 +02:00
Avi Kivity	9024e4940c	counters.hh: drop unused boost includes Re-add them to source files that need them. Closes scylladb/scylladb#21738	2024-12-05 12:27:41 +02:00
Kefu Chai	bab12e3a98	treewide: migrate from boost::adaptors::transformed to std::views::transform now that we are allowed to use C++23. we now have the luxury of using `std::views::transform`. in this change, we: - replace `boost::adaptors::transformed` with `std::views::transform` - use `fmt::join()` when appropriate where `boost::algorithm::join()` is not applicable to a range view returned by `std::views::transform`. - use `std::ranges::fold_left()` to accumulate the range returned by `std::views::transform` - use `std::ranges::fold_left()` to get the maximum element in the range returned by `std::views::transform` - use `std::ranges::min()` to get the minimal element in the range returned by `std::views::transform` - use `std::ranges::equal()` to compare the range views returned by `std::views::transform` - remove unused `#include <boost/range/adaptor/transformed.hpp>` - use `std::ranges::subrange()` instead of `boost::make_iterator_range()`, to feed `std::views::transform()` a view range. to reduce the dependency on boost for better maintainability, and leverage standard library features for better long-term support. this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible. limitations: there are still a couple places where we are still using `boost::adaptors::transformed` due to the lack of a C++23 alternative for `boost::join()` and `boost::adaptors::uniqued`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21700	2024-12-03 09:41:32 +02:00
Kefu Chai	f436edfa22	mutation: remove unused "#include"s these unused includes are identified by clang-include-cleaner. after auditing the source files, all of the reports have been confirmed. please note, because `mutation/mutation.hh` does not include `seastar/coroutine/maybe_yield.hh` anymore, and quite a few source files were relying on this header to bring in the declaration of `maybe_yield()`, we have to include this header in the places where this symbol is used. the same applies to `seastar/core/when_all.hh`. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-11-29 14:01:44 +08:00
Avi Kivity	7e02f9bbaa	tombstone_gc.hh: remove include of boost/icl/interval_map.hh tombstone_gc.hh is relatively lightweight and is used in many places, but it includes the heavyweight boost/icl/interval_map.hh. Lighten the load for its users by wrapping lw_shared_ptr<some icl map type> in a forward-declared class. Define the class in a new header tombstone_gc-internals.hh, to be used by the two translation units that need it. Ref #1. Closes scylladb/scylladb#21706	2024-11-28 11:24:51 +03:00
Botond Dénes	ccb433d767	Merge 'tasks: add api_task_ttl for tasks started with API' from Aleksandra Martyniuk When users start an operation asynchronously with API, they are expected to check the operation's status. Hence, the status should be kept in task manager for reasonable time after the operation is done. The operations that are started internally usually don't need to stay in task manager for that long. Add api_task_ttl that will be used for tasks started with API. By default it's 1 hour. The time for which non-API tasks stay in task manager isn't changed. Fixes: #21499. Refs: #21425. No backport needed - previous versions may use task_ttl Closes scylladb/scylladb#21505 * github.com:scylladb/scylladb: test: add test to check user_task_ttl tasks: api: move make_task method docs: nodetool: update backup and restore commands docs docs: update task manager docs nodetool: add nodetool tasks user-ttl command node_ops: use user task ttl for node ops virtual task tasks: use user_task_ttl for tasks started by user api: task_manager: add /task_manager/user_ttl to get and set user task ttl tasks: add task_manager::task::is_user_task method tasks: keep updateable_value of task_ttl in task manager db: config: add user_task_ttl_seconds named value	2024-11-27 09:57:57 +02:00
Kefu Chai	a5ee0c896b	treewide: migrate from boost::adaptors::filtered to std::views::filter Modernize the codebase by replacing Boost range adaptors with C++23 standard library views, reducing external dependencies and leveraging modern C++ language features. Key Changes: - Replace `boost::adaptors::filtered` with `std::views::filter` - Remove `#include <boost/range/adaptor/filtered.hpp>` - Utilize standard library range views Motivation: - Reduce project's external dependency footprint - Leverage standard library's range and view capabilities - Improve long-term code maintainability - Align with modern C++ best practices Implementation Challenges and Considerations: 1. Range Conversion and Move Semantics - `std::ranges::to` adaptor requires rvalue references - Necessitated updates to variable and parameter constness - Example: `cql3/restrictions/statement_restrictions.cc` modified to remove `const` from `common` to enable efficient range conversion 2. Range Iteration and Mutation - Range views may mutate internal state during iteration - Cannot pass ranges by const reference in some scenarios - Solution: Pass ranges by rvalue reference to explicitly indicate state invalidation Limitations: - One instance of `boost::adaptors::filtered` temporarily preserved due to lack of a C++23 alternative for `boost::join()` - A comprehensive replacement will be addressed in a follow-up change This change is part of our ongoing effort to modernize the codebase, reducing external dependencies and adopting modern C++ practices. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21648	2024-11-26 14:26:50 +02:00
Aleksandra Martyniuk	292d00463a	tasks: add task_manager::task::is_user_task method	2024-11-25 14:21:53 +01:00
Avi Kivity	3a6c0a9b36	Merge 'compaction: Perform integrity checks on compacting SSTables' from Nikos Dragazis This PR enables compaction tasks to verify the integrity of the input data through checksum and digest checks. The mechanism for integrity checking was introduced in previous PRs (#20207, #20720) as a built-in functionality of the input streams. This PR integrates this mechanism with compaction. The change applies to all compaction types and covers both compressed and uncompressed SSTables adhering to the 3.x format. If a compaction task reads only part of an SSTable, then only the per-chunk checksums are verified, not the digest. The PR consists of: * Changes to mx readers to support integrity checking. The kl readers, considered as compatibility-only, were left unchanged. Also, integrity checking on single-partition reversed reads (`data_consume_reversed_partition()`) remains unsupported by mx readers as this is not used in compaction. * Changes to `sstable` and `sstable_set` APIs to allow toggling integrity checks for mx readers. * Activation of integrity checking for all compaction types. * Tests for all compaction types with corrupted SSTables. Integrity checks come at a cost. For uncompressed SSTables, the cost is the loading of the CRC and Digest components from disk, and the calculation of checksums and digest from the actual data. For compressed SSTables, checksums are stored in-place and they are being checked already on all reads, so the only extra cost is the loading and calculation of the digest. The measurements show a ~5% regression in compaction performance for uncompressed SSTables, and a negligible regression for compressed SSTables. 
Command: `perf-sstable --smp=1 --cpuset=1 --poll-mode --mode=compaction --iterations=1000 --partitions 10000 --sstables=1 --key_size=4096 --num_columns=15 --column_size={32, 1024, 3500, 7000, 14500}`

Uncompressed SSTables:
```
+--------------+-----------------------+----------------------+------------+
| SSTable Size | No Integrity (p/sec)  | Integrity (p/sec)    | Regression |
+--------------+-----------------------+----------------------+------------+
| 50 MiB       | 65175.59 +- 80.82     | 61814.63 +- 72.88    | 5.16%      |
| 200 MiB      | 41795.10 +- 60.39     | 39686.28 +- 45.05    | 5.05%      |
| 500 MiB      | 21087.41 +- 30.72     | 20092.93 +- 25.05    | 4.72%      |
| 1 GiB        | 12781.64 +- 21.77     | 12233.94 +- 21.71    | 4.29%      |
| 2 GiB        | 6629.99 +- 9.40       | 6377.13 +- 8.28      | 3.81%      |
+--------------+-----------------------+----------------------+------------+
```
Compressed SSTables:
```
+--------------+-----------------------+----------------------+------------+
| SSTable Size | No Integrity (p/sec)  | Integrity (p/sec)    | Regression |
+--------------+-----------------------+----------------------+------------+
| 50 MiB       | 53975.05 +- 63.18     | 53825.93 +- 62.28    | 0.28%      |
| 200 MiB      | 28687.94 +- 26.58     | 28689.41 +- 26.91    | 0%         |
| 500 MiB      | 13865.35 +- 15.50     | 13790.41 +- 14.88    | 0.54%      |
| 1 GiB        | 7858.10 +- 7.71       | 7829.75 +- 9.66      | 0.36%      |
| 2 GiB        | 4023.11 +- 2.43       | 4010.54 +- 2.55      | 0.31%      |
+--------------+-----------------------+----------------------+------------+
(p/sec = partitions/sec)
```
Refs #19071. New feature, no backport is needed.
Closes scylladb/scylladb#21153 * github.com:scylladb/scylladb: test: Add test for compaction with corrupted SSTables compaction: Enable integrity checks for all compaction types sstables: Add integrity option to factories for sstable_set readers sstables: Add integrity option to sstable::make_reader() sstables: Add integrity option to mx::make_reader() sstables: Load checksums and digests in mx full-scan reader sstables: Add integrity option to data_consume_single_partition() sstables: Disengage integrity_check from sstable class sstables: Allow data sources to disable digest check	2024-11-17 20:59:31 +02:00
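The distinction the merge message draws between per-chunk checksums and the whole-file digest can be sketched as follows. This is a toy model, not the ScyllaDB implementation: it assumes CRC-32 for both the chunk checksums and the digest, and every name in it (`crc32_sketch`, `integrity_metadata`, `build_metadata`, `verify`) is hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <string_view>
#include <vector>

// Bitwise CRC-32 (reflected polynomial 0xEDB88320), kept simple on purpose.
inline uint32_t crc32_sketch(std::string_view data) {
    uint32_t crc = 0xFFFFFFFFu;
    for (unsigned char byte : data) {
        crc ^= byte;
        for (int bit = 0; bit < 8; ++bit) {
            crc = (crc >> 1) ^ (0xEDB88320u & (0u - (crc & 1u)));
        }
    }
    return ~crc;
}

// Hypothetical stand-in for the checksum and digest metadata of one data file.
struct integrity_metadata {
    std::size_t chunk_size;
    std::vector<uint32_t> chunk_checksums;
    uint32_t digest; // checksum over the whole file
};

inline integrity_metadata build_metadata(std::string_view data, std::size_t chunk_size) {
    integrity_metadata m{chunk_size, {}, crc32_sketch(data)};
    for (std::size_t off = 0; off < data.size(); off += chunk_size) {
        m.chunk_checksums.push_back(crc32_sketch(data.substr(off, chunk_size)));
    }
    return m;
}

// Verifies chunks [first_chunk, last_chunk). The digest is checked only when
// the read covers the whole file, mirroring the partial-read rule above.
inline bool verify(std::string_view data, const integrity_metadata& m,
                   std::size_t first_chunk, std::size_t last_chunk) {
    for (std::size_t i = first_chunk; i < last_chunk; ++i) {
        const auto chunk = data.substr(i * m.chunk_size, m.chunk_size);
        if (crc32_sketch(chunk) != m.chunk_checksums[i]) {
            return false;
        }
    }
    const bool full_scan = first_chunk == 0 && last_chunk == m.chunk_checksums.size();
    return !full_scan || crc32_sketch(data) == m.digest;
}
```

A partial read that skips a corrupted chunk passes verification in this model, which is why the full-scan digest check adds coverage beyond the per-chunk checksums.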
Kefu Chai	4cc9d78801	compaction: document compaction::make_interposer_consumer() for better maintainability Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#14982	2024-11-15 06:44:52 +02:00
Nikos Dragazis	6687eba2db	compaction: Enable integrity checks for all compaction types Compaction tasks create mutation readers to read SSTables from disk. Each compaction type defines its own reader creation logic by implementing the pure virtual function `compaction::make_sstable_reader()`. Modify all implementations of `make_sstable_reader()` to enable integrity checking on the created readers. This way, all compaction tasks will be able to detect corruption issues on the compacting SSTables. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>	2024-11-11 22:25:45 +02:00
Lakshmi Narayanan Sreethar	eb4b407085	compaction: use better partition estimate for split compaction Split compaction divides the partitions in an existing sstable into two groups and writes them into two new sstables, which replace the original one. The partition count from the original sstable is used as an estimate when writing the new ones, but this estimate is not accurate as the partitions are split between the two new sstables and each will contain only a portion of the original partition count. This also causes the bloom filters to be rebuilt at the end of compaction, as they were initially built with inaccurate estimates. Fix this by using a better estimate for the output sstables based on the token ranges written to them. Fixes scylladb#20253 Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2024-11-11 12:26:51 +05:30
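One way to picture an estimate "based on the token ranges" is to scale the original partition count by the share of the token range that each output covers. The sketch below is an assumption made for illustration: it presumes partitions are spread uniformly over tokens, and `token_range_sketch` and `estimate_partitions` are hypothetical names, not the functions the commit touches.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical inclusive token range [first, last].
struct token_range_sketch {
    int64_t first;
    int64_t last;
};

// Scales the original partition count by the fraction of the whole token
// range covered by `part`. Assumes a uniform distribution of partitions.
inline uint64_t estimate_partitions(uint64_t original_partitions,
                                    token_range_sketch whole,
                                    token_range_sketch part) {
    const long double whole_width =
        static_cast<long double>(whole.last) - static_cast<long double>(whole.first) + 1;
    const long double part_width =
        static_cast<long double>(part.last) - static_cast<long double>(part.first) + 1;
    return static_cast<uint64_t>(original_partitions * (part_width / whole_width));
}
```

With an even split, each output sstable is sized for half of the original partitions rather than all of them, which is the kind of over-estimate the commit says led to bloom filters being rebuilt.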
Lakshmi Narayanan Sreethar	67dad99ab5	compaction::table_state: implement `get_token_range_after_split()` wrapper Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>	2024-11-11 12:24:00 +05:30
Kefu Chai	50fbab29ca	compaction: remove unused "#include" we don't use `std::list` in compaction/compaction_manager.hh, neither is this header responsible for exposing the declarations in `<list>`. so let's stop `#include` this header. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21436	2024-11-07 10:25:27 +03:00
Kefu Chai	59eb2ab119	treewide: s/boost::algorithm::any_of/std::ranges::any_of/ now that we are allowed to use C++23. we now have the luxury of using `std::ranges::any_of`. in this change, we replace `boost::algorithm::any_of` with `std::ranges::any_of` to reduce the dependency to boost for better maintainability, and leverage standard library features for better long-term support. this change is part of our ongoing effort to modernize our codebase and reduce external dependencies where possible. Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>	2024-11-05 14:06:09 +08:00
Benny Halevy	6cce67bec8	compaction_manager: stop: await _stop_future if engaged The current condition that consults the compaction manager state for awaiting `_stop_future` works since _stop_future is assigned after the state is set to `stopped`, but it is incidental. What matters is that `_stop_future` is engaged. While at it, exchange _stop_future with a ready future so that stop() can be safely called multiple times. And dropped the superfluous co_return. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-11-03 10:53:35 +02:00
Benny Halevy	a7a55298ea	compaction_manager: really_do_stop: assert that no tasks are left behind stop_ongoing_compactions now ignores any errors returned by tasks, and it should leave no task left behind. Assert that here, before the compaction_manager is destroyed. Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-11-03 10:53:34 +02:00
Benny Halevy	c08ba8af68	compaction_manager: stop_tasks, stop_ongoing_compactions: ignore errors stop() methods, like destructors must always succeed, and returning errors from them is futile as there is nothing else we can do with them but continue with shutdown. Leaked errors on the stop path may cause termination on shutdown, when called in a deferred action destructor. Fixes scylladb/scylladb#21298 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2024-11-03 10:52:58 +02:00
Botond Dénes	d8500472b3	compaction/compaction_manager: stop_tasks(): unlink stopped tasks Stopped tasks currently linger in _tasks until the fiber that created the task is scheduled again and unlinks the task. This window between stop and remove prevents reliable checks for empty _tasks list after all tasks are stopped. Unlink the task early so really_do_stop() can safely check for an empty _tasks list (next patch).	2024-11-03 10:17:11 +02:00
Botond Dénes	e942c074f2	compaction/compaction_manager: make _tasks an intrusive list _tasks is currently std::list<shared_ptr<compaction_task_executor>>, but it has no role in keeping the instances alive, this is done by the fibers which create the task (and pin a shared ptr instance). This lends itself to an intrusive list, avoiding that extra allocation upon push_back(). Using an intrusive list also makes it simpler and much cheaper (O(1) vs. O(N)) to remove tasks from the _tasks list. This will be made use of in the next patch. Code using _task has to be updated because the value_type changes from shared_ptr<compaction_task_executor> to compaction_task_executor&.	2024-11-03 10:17:11 +02:00
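The O(1) removal mentioned in the commit above follows from the list links living inside the element itself. The hand-rolled sketch below shows the idea under that assumption; the commit does not say which intrusive list implementation is used, and `list_hook`, `task`, and `task_list` are hypothetical names.

```cpp
#include <cassert>
#include <cstddef>

// Hook embedded in each element. An unlinked hook points at itself.
struct list_hook {
    list_hook* prev = this;
    list_hook* next = this;

    list_hook() = default;
    list_hook(const list_hook&) = delete;
    list_hook& operator=(const list_hook&) = delete;

    bool is_linked() const { return next != this; }

    // O(1): the element knows its neighbours, so no search is needed.
    void unlink() {
        prev->next = next;
        next->prev = prev;
        prev = next = this;
    }
};

// Hypothetical element type; the list does not own or allocate it.
struct task : list_hook {
    int id;
    explicit task(int i) : id(i) {}
};

class task_list {
    list_hook _head; // sentinel node
public:
    // No allocation: only pointers inside the element are rewired.
    void push_back(task& t) {
        t.prev = _head.prev;
        t.next = &_head;
        _head.prev->next = &t;
        _head.prev = &t;
    }
    bool empty() const { return !_head.is_linked(); }
    std::size_t size() const {
        std::size_t n = 0;
        for (const list_hook* h = _head.next; h != &_head; h = h->next) {
            ++n;
        }
        return n;
    }
};
```

Because the list stores references to elements owned elsewhere, the value type seen by iterating code is the element itself, which matches the commit's note about `value_type` changing from a shared pointer to a reference.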
Kefu Chai	1b8446f92d	compaction: fix the indent in `38ce2c605d`, we left a TODO to reindent the code. in this change, we reindent the code to address this TODO. Refs `38ce2c605d` Signed-off-by: Kefu Chai <kefu.chai@scylladb.com> Closes scylladb/scylladb#21383	2024-11-01 12:55:47 +03:00
Avi Kivity	b5e46077df	sstables: generation_type: replace boost ranges with std ranges Reduce dependency load. Closes scylladb/scylladb#21402	2024-11-01 12:45:24 +03:00
Nadav Har'El	ee2d75b088	Merge 'Generalize "breakpoint" type of error injection' from Pavel Emelyanov This pattern is -- if requested (by test) suspend code execution until requestor (the test) explicitly wakes it up. For that the injected place should inject a lambda that is called with so called "handler" at hand and try to read message from the handler. In many cases the inner lambda additionally prints a message into logs that tests waits upon to make sure injection was stepped on. In the end of the day this "breakpoint" is injected like ``` co_await inject("foo", [] (auto& handler) { log.info("foo waiting"); co_await handler.wait_for_message(timeout); }); ``` This PR makes breakpoints shorter and more unified, like this ``` co_await inject("foo", wait_for_message(timeout)); ``` where `wait_for_message` is a wrapper structure used to pick new `inject()` overload. Closes scylladb/scylladb#21342 * github.com:scylladb/scylladb: sstables: Use inject(wait_for_message_overload) treewide,error_injection: Use inject(wait_for_message) and fix tests treewide,error_injection: Use inject(wait_for_message) overload error_injection: Add inject() overload with wait_for_message wrapper	2024-10-31 21:56:27 +02:00
Benny Halevy	78ceaeabca	compaction_manager: compaction_disabled: return true if not in compaction_state When a compaction_group is removed via `compaction_manager::remove`, it is erased from `_compaction_state`, and therefore compaction is definitely not enabled on it. This triggers an internal error if tablets are cleaned up during drop/truncate, which checks that compaction is disabled in all compaction groups. Note that the callers of `compaction_disabled` aren't really interested in compaction being actively disabled on the compaction_group, but rather in whether it's enabled or not. A follow-up patch can be considered to reverse the logic and expose `compaction_enabled` rather than `compaction_disabled`. Fixes scylladb/scylladb#20060 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#21378	2024-10-31 18:21:29 +03:00
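The lookup rule described in the commit above can be sketched with a plain map. This is a simplified model under stated assumptions: groups are identified by an `int`, and `manager_sketch` with its members is hypothetical rather than the actual `compaction_manager` interface.

```cpp
#include <cassert>
#include <unordered_map>

// Hypothetical per-group state.
struct compaction_state_sketch {
    int disable_count = 0;
};

class manager_sketch {
    std::unordered_map<int, compaction_state_sketch> _state;
public:
    void add(int group) { _state.try_emplace(group); }
    void remove(int group) { _state.erase(group); }
    void disable(int group) { ++_state.at(group).disable_count; }

    bool compaction_disabled(int group) const {
        auto it = _state.find(group);
        if (it == _state.end()) {
            // A removed (or never added) group cannot be compacted, so it is
            // reported as disabled instead of being treated as an error.
            return true;
        }
        return it->second.disable_count > 0;
    }
};
```

Callers that only ask whether compaction can run on a group get a consistent answer whether the group was explicitly disabled or already removed.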


929 Commits