scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-02 14:15:46 +00:00

Author	SHA1	Message	Date
Michał Chojnowski	f851efd4fa	test: add test_sstable_compression_dictionaries_autotrain.py Adds a test which checks that sstable compression dict autotraining does its job.	2025-04-01 00:07:31 +02:00
Michał Chojnowski	62da3d8363	test: add test_sstable_compression_dictionaries_basic.py Add a basic integration test for SSTable compression with shared dictionaries.	2025-04-01 00:07:30 +02:00
Michał Chojnowski	7b0eeefd79	test/pylib/rest_client: add `keyspace_upgrade_sstables` helper	2025-04-01 00:07:30 +02:00
Michał Chojnowski	3f7969313f	main: run a sstable_dict_autotrainer Create an instance of `sstable_dict_autotrainer` in `scylla_main` and run it.	2025-04-01 00:07:30 +02:00
Michał Chojnowski	a19d6d95f7	api: add the estimate_compression_ratios API call Add an API call which estimates the effectiveness of possible compression config changes. This can be used to make an informed decision about whether to change the compression method, without actually recompressing any SSTables.	2025-04-01 00:07:30 +02:00
Michał Chojnowski	4f0d453acf	dict_autotrainer: introduce sstable_dict_autotrainer Add a fiber responsible for periodic re-training of compression dictionaries (for tables which opted into dict-aware compression). As of this patch, it works like this: every `$tick_period` (15 minutes), if we are the current Raft leader, we check for dict-aware tables which have no dict, or a dict older than `$retrain_period`. For those tables, if they have enough data (>1GiB) for a training, we train a new dict and check if it's significantly better than the current one (provides ratio smaller than 95% of current ratio), and if so, we update the dict.	2025-04-01 00:07:30 +02:00
Michał Chojnowski	9d02e2c005	db/system_keyspace: add query_dict_timestamp Adds a helper method which queries the creation timestamp of a given dict in `system.dicts`. We will later use the age of the current SSTable compression dict to decide if another training should be done already.	2025-04-01 00:07:30 +02:00
Michał Chojnowski	cb1b291051	compress: add ZstdWithDictsCompressor and LZ4WithDictsCompressor Add new compressor names to `sstable_compression`. When those names are configured in the schema, new SSTables will be compressed with dict-aware Zstd or LZ4 respectively.	2025-04-01 00:07:30 +02:00
Michał Chojnowski	bea866a46f	main: clean up sstable compression dicts after table drops When a table is dropped, its corresponding dictionary in `system.dicts` -- if any -- should be deleted, otherwise it will remain forever as garbage. This commit implements such cleanup.	2025-04-01 00:07:30 +02:00
Michał Chojnowski	cee504f66f	sstables/compress: discard hidden compression options after the decompressor is created Dictionary contents are kept in the list of "compression options" in the header of `CompressionInfo.db`, and they are loaded from disk into memory when the `sstable::compression` object is populated. After the decompressor for the SSTable is created based on those dict contents, they are not needed in RAM anymore. And since they take up a sizeable amount of memory, we would like to free them. In this patch, we discard all "hidden compression options" (currently: only the dictionary contents) from the `sstable::compression` object right after the decompressor is created. (Those options are not supposed to be used for anything else anyway).	2025-04-01 00:07:30 +02:00
Michał Chojnowski	10fa4abde7	compress: change compressor_ptr from shared_ptr to unique_ptr Cleanup patch. After we moved the ownership of compressors to sstables, compressor objects never have shared lifetime. `unique_ptr` is more appropriate for them than `shared_ptr` now. (And besides expressing the intent better, using `unique_ptr` prevents an accidental cross-shard `shared_ptr` copy).	2025-04-01 00:07:29 +02:00
Michał Chojnowski	58ae278d10	api: add the retrain_dict API call Add an API call which will retrain the SSTable compression dictionary for a given table. Currently, it needs all nodes to be alive to succeed. We can relax this later.	2025-04-01 00:07:29 +02:00
Michał Chojnowski	4115a6fece	storage_service: add some dict-related routines storage_service will be the interface between the API layer (or the automatic training loop) and the dict machinery. This commit implements the relevant interface for that. It adds methods that: 1. Take SSTable samples from the cluster, using the new RPC verbs. 2. Train a dict on the sample. (The trainer will be plugged in from `main`). 3. Publish the trained dictionary. (By adding mutations to Raft group 0). Perhaps this should be moved to a separate "service". But it's not like `storage_service` has a clear purpose anyway.	2025-04-01 00:07:29 +02:00
Michał Chojnowski	94d244ab49	main: in compression_dict_updated_callback, recognize and use SSTable compression dicts Currently, there is at most one dictionary in `system.dicts`: named "general", used by RPC compression. So the callback called on `system.dicts` just always refreshes the RPC compression dict. In a follow-up commit, we will publish SSTable compression dicts to `system.dicts` rows with a name in the "sstables/{table_uuid}" format. We want modification to such rows to be passed as new dictionary recommendations to the SSTable compressor factory. This commit teaches the `system.dicts` modification callback to recognize such modifications and forward them to the compressor factory.	2025-04-01 00:07:29 +02:00
Michał Chojnowski	380f409c46	storage_service: add do_sample_sstables() Adds a helper which uses ESTIMATE_SSTABLE_VOLUME and SAMPLE_SSTABLES RPC calls to gather a combined sample of SSTable Data files for the given table from the entire cluster.	2025-04-01 00:07:29 +02:00
Michał Chojnowski	94c33b6760	messaging_service: add SAMPLE_SSTABLES and ESTIMATE_SSTABLE_VOLUME verbs Add two verbs needed to implement dictionary training for SSTable compression. SAMPLE_SSTABLES returns a list of randomly-selected chunks of Data files with a given cardinality and using a given chunk size, for the given table. ESTIMATE_SSTABLE_VOLUME returns the total uncompressed size of all Data files of the given table.	2025-04-01 00:07:29 +02:00
Michał Chojnowski	4856f4acca	db/system_keyspace: let `system.dicts` helpers be used for dicts other than the RPC compression dict Extend the `system.dicts` helper for querying and modifying `system.dicts` with an ability to use names other than "general". We will use that in later commits to publish dictionaries for SSTable compression.	2025-04-01 00:07:29 +02:00
Michał Chojnowski	b77c611c00	raft/group0_state_machine: on `system.dicts` mutations, pass the affected partition keys to the callback Before this patch, `system.dicts` contains only one dictionary, for RPC compression, with the fixed name "general". In later parts of this series, we will add more dictionaries to system.dicts, one per table, for SSTable compression. To enable that, this patch adjusts the callback mechanism for group0's `write_mutations` command, so that the mutation callbacks for group0-managed tables can see which partition keys were affected. This way, the callbacks can query only the modified partitions instead of doing a full scan. (This is necessary to prevent quadratic behaviours.) For now, only the `system.dicts` callback uses the partition keys.	2025-04-01 00:07:29 +02:00
Michał Chojnowski	d920ab5366	database: add sample_data_files() Add a helper for sampling the Data files for a given table. We will use it to take samples for dictionary training.	2025-04-01 00:07:29 +02:00
Michał Chojnowski	48c06c7e4b	database: add take_sstable_set_snapshot() We want a method that will allow us to take a stable snapshot of SSTables, to asynchronously compute some stats on them. But `take_storage_snapshot` is overly invasive for that, because it flushes memtables on each call. (If `take_storage_snapshot` was, for example, called repetitively, it could create a ton of small memtables and lead to trouble). This commit adds a weaker version which only takes a snapshot of existing SSTables, and doesn't flush memtables by itself. This will be useful for dictionary training, which doesn't care about the semantics of SSTables, only their rough statistical properties.	2025-04-01 00:07:28 +02:00
Michał Chojnowski	64f3d7e364	compress: teach `lz4_processor` about dictionaries Extend `lz4_processor` with the ability to use dictionaries. We won't use this ability yet. It will be used when new compressor names are added.	2025-04-01 00:07:28 +02:00
Michał Chojnowski	b65101b371	compress: teach `zstd_processor` about dictionaries Extend `zstd_processor` with the ability to use dictionaries. We won't use this ability yet. It will be used when new compressor names are added.	2025-04-01 00:07:28 +02:00
Michał Chojnowski	b18ddcb92e	sstables: delegate compressor creation to the compressor factory Remove `compressor::create()`. This enforces that compressors are only created through the `sstable_compressor_factory`. Unlike the synchronous `compressor::create()`, the factory will be able to create dict-aware compressors.	2025-04-01 00:07:28 +02:00
Michał Chojnowski	30a9d471fa	sstables: plug an `sstable_compressor_factory` into `sstables_manager` Create a `sstable_compressor_factory_impl` in `scylla_main`, and pipe it through constructors into `sstables_manager`. In next commits, the factory available through the `sstables_manager` will be used to create compressors for SSTable readers and writers.	2025-04-01 00:07:28 +02:00
Michał Chojnowski	ebf02913a2	sstables: introduce sstable_compressor_factory Before this commit, `compressor` objects are synchronously created, during the creation or opening of SSTables, from `compression_parameters` objects. But we want to add compression dictionaries to SSTables and we want to share dictionary contents across shards. To do that, we need to make the creation of `compressor` objects asynchronous, and give it access to a global dictionary registry. We encapsulate that in a `sstable_compressor_factory`. Instead of calling `compressor::create()` on SSTable opening or creation, we will ask the factory, asynchronously, for a new compressor, and it will return a compressor with a deduplicated, up-to-date dictionary. This commit introduces such a factory. It's not used anywhere yet, and the compressors it produces don't use the provided dictionaries yet.	2025-04-01 00:07:28 +02:00
Michał Chojnowski	2bd393849c	utils/hashers: add get_sha256() Add a helper function which computes the SHA256 for a blob. We will use it to compute identifiers for SSTable compression dictionaries later.	2025-04-01 00:07:28 +02:00
Michał Chojnowski	61316e29df	gms/feature_service: add the SSTABLE_COMPRESSION_DICTS cluster feature This feature will guard against writing SSTables containing compression dictionaries before the entire cluster is able to understand them.	2025-04-01 00:07:28 +02:00
Michał Chojnowski	dd932ebb2f	compress: add hidden dictionary options Before this commit, "compression options" written into CompressionInfo.db (and used to construct a decompressor) have a 1:1 correspondence to "compression options" specified in the schema. But we want to add a new "compression option" -- the compression dictionary -- which will be written into CompressionInfo.db and used to construct decompressors, but won't be specified in the schema. To reconcile that, in this commit we introduce the notion of a "hidden option". If an option name in `CompressionInfo.db` begins with a dot, then this option will be used to construct decompressors, but won't be visible for other uses. (I.e. for the `sstable_info` API call and for recovering a fake `schema` from `CompressionInfo.db` in the `scylla sstable` tool). Then, we introduce the hidden `.dictionary.{0,1,2,..}` options, which hold the contents of the dictionary blob for this SSTable. (The dictionary is split into several parts because the SSTable format limits the length of a single option value to 16 bits, and dictionaries usually have a length greater than that). This commit only introduces helpers which translate dictionary blobs into "options" for CompressionInfo.db, and vice-versa, but it doesn't use those helpers yet. They will be used in later commits.	2025-04-01 00:07:28 +02:00
Michał Chojnowski	11be7c0704	compress: remove `compression_parameters::get_compressor()` Following up on the previous commits, we avoid constructing compressors where not necessary, by checking things directly on `compression_parameters` instead.	2025-04-01 00:07:28 +02:00
Michał Chojnowski	006c631642	sstables/compress: remove get_sstable_compressor() Following up on the previous commit, we avoid constructing a compressor in the `sstable_info` API call, and we instead read the compression options from the `sstable::compression`.	2025-04-01 00:07:28 +02:00
Michał Chojnowski	8e611536b0	sstables/compress: move ownership of `compressor` to `sstable::compression` SSTable readers and writers use `compressor` objects to compress and decompress chunks of SSTable data files. `compressor` objects are read-only, so only one of them is needed for each SSTable. Before this commit, each reader and writer has its own `compressor` object. This isn't necessary, but it's okay. But later in this series it will stop being okay, because the creation of a `compressor` will become an expensive cross-shard operation (because it might require sharing a compression dictionary from another shard). So we have to adjust the code so that there is only one `compressor` per sstable, not one per reader/writer. We stuff the ownership of this compressor into `sstable::compression`. To make the ownership clear, we remove `compressor_ptr` shared pointers from readers and writers, and make them access the compressor via the `sstable::compression` instead.	2025-04-01 00:07:27 +02:00
Michał Chojnowski	7bdcd5e8c1	compress: remove compressor::option_names() It used to be used by `compression_parameters` validation logic to ask the created `compressor` for compressor-specific option names. Since we no longer delegate this to `compressor`, but we just put the knowledge of those options directly into `compression_parameters`, it's dead code now.	2025-04-01 00:07:27 +02:00
Michał Chojnowski	3b0ab8e1ee	compress: clean up the constructor of zstd_processor Since we now parse and validate the compression level during the construction of `compression_parameters`, we can just pass the structured params to `zstd_processor` instead of passing a raw string map.	2025-04-01 00:07:27 +02:00
Michał Chojnowski	6470035a74	compress: squash zstd.cc into compress.cc Unlike all other implementations of `compressor`, `zstd_processor` has its own special object file and its own special late binding mechanism (via the `class_registry`). It doesn't need either. Let's squash it into `compress.cc`. Keeping `zstd_processor` a separate "module" would require adding even more headers and source files later in the series (when adding dictionaries), and there's no benefit in being so granular. All `compressor` logic can be in `compress.cc` and it will still be small enough. This commit also gets rid of the pointless `class_registry` late binding mechanism and just constructs the `zstd_processor` in `compressor::create()` with a regular constructor call.	2025-04-01 00:07:27 +02:00
Michał Chojnowski	cfe69e057f	sstables/compress: break the dependency of `compression_parameters` on `compressor` Note: this commit is meant to be a code refactoring only and is not intended to change the observable behaviour. Today `schema` contains a `compression_parameters`. `compression_parameters` contains an instance of `compressor`, and SSTable writers just share that instance. This is fine because `compressor` is a stateless object, functionally dependent on the schema. But in later parts of the series, we will break this functional dependency by adding dictionaries to compressors. Two writers for the same schema might have different dictionaries, so they won't be able to just share a single instance contained in the schema. And when that happens, having a `compressor` instance in the `schema`/`compression_parameters` will become awkward, since it won't be actually used. It will be only a container for options. In addition, for performance reasons, we will want to share some pieces of compressors across shards, which will require -- in the general case -- a construction of a compressor to be asynchronous, and therefore not possible inside the constructor of `compression_parameters`. This commit modifies `compression_parameters` so that it doesn't hold or construct instances of `compressor`. Before this patch, the `compressor` instance constructed in `compression_parameters` has an additional role of validating and holding compressor-specific options. (Today the only such option is the zstd compression level). This means that the pieces of logic responsible for compressor-specific options have to be rewritten. That ends up being the bulk of this commit.	2025-04-01 00:07:27 +02:00
Michał Chojnowski	f4ca94d13b	compress.hh: switch compressor::name() from an instance member to a virtual call Before this patch, `compressor` is designed to be a proper abstract class, where the creator of a compressor doesn't even know what he's creating -- he passes a name, and it gets turned into a `compressor` behind the scenes. But later, when creation of compressors will involve looking up dictionaries, this abstraction will only get in the way. So we give up on keeping `compressor` abstract, and instead of using "opaque" names we turn to an explicit enum of possible compressor types. The main point of this patch is to add the `algorithm` enum and the `algorithm_to_name()` function. The rest of the patch switches the `compressor::name()` function to use `algorithm_to_name()` instead of the passed-by-constructor `compressor::_name`, to keep a single source of truth for the names.	2025-04-01 00:07:27 +02:00
Michał Chojnowski	4f634de2e9	bytes: adapt fmt_hex to std::span<const std::byte> This allows us to hexdump things other than `bytes_view`. (That is, without reinterpret_casting them to `bytes_view`, which -- aside from the inconvenience -- isn't quite legal. In contrast, any span can be legally cast to `std::span<const std::byte>`).	2025-04-01 00:07:27 +02:00
Tomasz Grabiec	29d1c2adc6	Merge 'Finalize tablet splits earlier' from Lakshmi Narayanan Sreethar Resize finalization is executed in a separate topology transition state, `tablet_resize_finalization`, to ensure it does not overlap with tablet transitions. The topology transitions into the `tablet_resize_finalization` state only when no tablet migrations are scheduled or being executed. If there is a large load-balancing backlog, split finalization might be delayed indefinitely, leaving the tables with large tablets. This PR fixes the issue by updating the load balancer to not schedule any migrations and to not make any repair plans when a resize finalization is pending in any table. Also added a testcase to verify the fix. Fixes #21762 Improvement : No need to backport. Closes scylladb/scylladb#22148 * github.com:scylladb/scylladb: topology_coordinator: fix indentation in generate_migration_updates topology_coordinator: do not schedule migrations when there are pending resize finalizations load_balancer: make repair plans only when there is no pending resize finalization	2025-03-31 14:42:34 +02:00
Botond Dénes	90c20858ed	Merge 'test/database: Remove most of take_snapshot() helper overloads and re-use them more' from Pavel Emelyanov This helper facilitates snapshot creation by various test cases in database_test.cc. This PR generalizes all overloads into one that suits all callers and patches one more test case to use it as well. Closes scylladb/scylladb#23482 * github.com:scylladb/scylladb: test/database: Re-use take_snapshot() helper once more test/database: Remove most of take_snapshot() helper overloads	2025-03-31 15:20:51 +03:00
Benny Halevy	5f2ce0b022	loading_cache_test: test_loading_cache_reload_during_eviction: use manual_clock Rather than lowres_clock, as since `32b7cab917`, loading_cache_for_test uses manual_clock for timing and relying on lowres_clock to time the test might run out of memory on fast test machines. Fixes #23497 Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#23498	2025-03-31 14:53:06 +03:00
Pavel Emelyanov	ac582efb44	test/database: Re-use take_snapshot() helper once more There's a test case that can call the recently patched take_snapshot() helper as well. This changes nothing, but makes further patching a bit simpler (not in this branch). Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-31 13:18:06 +03:00
Pavel Emelyanov	7e6380b6bd	test/database: Remove most of take_snapshot() helper overloads There are 3 of those that help tests (re)shuffle cql_test_env/database, skip_flush == true/false options and keyspace/table/snapshot names. There's little sense in having that many of those, just one overload with default arguments suits most of the callers. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-03-31 13:18:06 +03:00
Botond Dénes	ea55eed037	Merge 'Snapshot several tables at once in scrub API handler' from Pavel Emelyanov The scrub API handler may want to snapshot several tables. For that, it calls snapshot-ctl method to snapshot a single table for each table in the list. That's excessive, snapshot-ctl has a method to snapshot a bunch of tables at once, just what the scrub handler needs. It's an improvement, so no need to backport Closes scylladb/scylladb#23472 * github.com:scylladb/scylladb: snapshot-ctl: Remove unused snapshot-single-table method api: Snapshot all tables at once in scrub handler	2025-03-31 13:00:32 +03:00
Piotr Smaron	aff8cbc6f3	CODEOWNERS: remove expired owners Removing krzaq, who's no longer with the company. Removing core-frontend team members from Alternator areas, as it's no longer the domain of this team. Closes scylladb/scylladb#23500	2025-03-31 11:37:51 +03:00
Pavel Emelyanov	0077acd1bb	api: Properly validate table in tablet add|del replica handlers The handlers in question just go and call database.find_column_family, in case the table in question doesn't exist, the no_such_column_family exception would be thrown, which is not nice. Proper behavior is to throw bad_param one and there's a helper that does it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#23389	2025-03-31 10:03:17 +02:00
Andrzej Jackowski	c89d8c6566	cql3: prevent from empty option use in cf_statement::column_family() Implementation of cf_statement::column_family() dereferences _cf_name option without checking if the option is non-empty. On enterprise branch, there is a safeguard that prevents such an empty option from being dereferenced. Although the current code on master seems to not call column_family() when _cf_name is empty, it is safer to introduce the same workaround on master, to avoid any regression. This change: - Prevent from empty option use in cf_statement::column_family() Fixes: scylla-enterprise#5273 Closes scylladb/scylladb#23366	2025-03-31 09:43:22 +03:00
Michał Chojnowski	e23fdc0799	table: fix a race in table::take_storage_snapshot() `safe_foreach_sstable` doesn't do its job correctly. It iterates over an sstable set under the sstable deletion lock in an attempt to ensure that SSTables aren't deleted during the iteration. The thing is, it takes the deletion lock after the SSTable set is already obtained, so SSTables might get unlinked before we take the lock. Remove this function and fix its usages to obtain the set and iterate over it under the lock. Closes scylladb/scylladb#23397	2025-03-31 09:40:32 +03:00
Avi Kivity	2b9e1e61d0	docs: reader_concurrency_semaphore: document CPU concurrency limit Document the CPU concurrency implemented in `3d816b7c16` and adjusted in `3d12451d1f`. Closes scylladb/scylladb#23404	2025-03-31 09:39:55 +03:00
Dawid Mędrek	b0b0c5905e	test/cluster/test_multidc: Clean up RF-rack-valid keyspaces tests There are some minor things we should fix that are a remnant of the original changes (scylladb/scylladb@7646e14). Closes scylladb/scylladb#23429	2025-03-31 09:38:42 +03:00
David Garcia	1a7be07b8c	docs: renders os-support from json file Closes scylladb/scylladb#23436	2025-03-31 09:36:49 +03:00
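
The "hidden dictionary options" commit (dd932ebb2f) describes splitting a dictionary blob into `.dictionary.{0,1,2,..}` option values because a single option value's length must fit in 16 bits. A minimal Python sketch of that idea follows. It is not the ScyllaDB implementation: the function names are hypothetical, and the 65535-byte cap per part is an assumption derived from the 16-bit length limit mentioned in the commit message.

```python
# Illustrative sketch only; names and the exact part size are assumptions.
MAX_OPTION_LEN = 0xFFFF      # assumed cap: largest length a 16-bit field can hold
PREFIX = ".dictionary."      # hidden options start with a dot, per the commit message


def dict_to_hidden_options(blob: bytes, max_len: int = MAX_OPTION_LEN) -> dict[str, bytes]:
    """Split a dictionary blob into numbered hidden options."""
    return {
        f"{PREFIX}{i}": blob[off:off + max_len]
        for i, off in enumerate(range(0, len(blob), max_len))
    }


def hidden_options_to_dict(options: dict[str, bytes]) -> bytes:
    """Reassemble the blob by concatenating parts 0, 1, 2, ... in order."""
    parts = []
    i = 0
    while (key := f"{PREFIX}{i}") in options:
        parts.append(options[key])
        i += 1
    return b"".join(parts)


def visible_options(options: dict[str, bytes]) -> dict[str, bytes]:
    """Drop hidden (dot-prefixed) options, as a user-facing listing would."""
    return {k: v for k, v in options.items() if not k.startswith(".")}
```

Under these assumptions, a 150 KiB blob yields three parts, and only the decompressor-facing path ever sees the dot-prefixed keys.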

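The decision rule described in the `sstable_dict_autotrainer` commit (4f0d453acf) can be sketched as two predicates. This is a hedged illustration, not the actual fiber: `TableState`, `needs_training`, and `should_publish` are hypothetical names, the retrain period is left as a parameter because the commit message does not state its value, and "ratio" is assumed to mean compressed size divided by uncompressed size (smaller is better).

```python
from dataclasses import dataclass
from typing import Optional

GIB = 1 << 30


@dataclass
class TableState:
    dict_age_s: Optional[float]  # None when the table has no dictionary yet
    data_bytes: int              # estimated volume of the table's data


def needs_training(t: TableState, retrain_period_s: float, is_leader: bool,
                   min_data_bytes: int = GIB) -> bool:
    """Checked on every tick: only the Raft leader trains, and only for
    tables with no dict or a stale dict, and with more than 1 GiB of data."""
    if not is_leader:
        return False
    stale = t.dict_age_s is None or t.dict_age_s > retrain_period_s
    return stale and t.data_bytes > min_data_bytes


def should_publish(current_ratio: float, new_ratio: float,
                   threshold: float = 0.95) -> bool:
    """Publish the new dict only if its ratio is below 95% of the current one."""
    return new_ratio < threshold * current_ratio
```

For example, with a current ratio of 0.50, a candidate dictionary must achieve a ratio below 0.475 to replace the existing one.
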
47265 Commits