scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-30 11:36:54 +00:00

Author	SHA1	Message	Date
copilot-swe-agent[bot]	b818331420	Add ungzip function implementation with libdeflate - Created utils/gzip.hh header with ungzip function declaration - Created utils/gzip.cc implementation using libdeflate - Updated utils/CMakeLists.txt to include gzip.cc and link libdeflate - Created comprehensive test suite in test/boost/gzip_test.cc - Added gzip_test to test/boost/CMakeLists.txt The implementation: - Uses libdeflate for high-performance gzip decompression - Handles chunked_content input/output (vector of temporary_buffer) - Supports concatenated gzip files - Validates gzip headers and detects invalid/truncated/corrupted data - Enforces size limits to prevent memory exhaustion - Runs in async context to avoid blocking the reactor Co-authored-by: nyh <584227+nyh@users.noreply.github.com>	2025-11-19 11:46:29 +00:00
Botond Dénes	8579e20bd1	Merge 'Enable digest+checksum verification for streaming/repair' from Taras Veretilnyk This PR enables integrity check of both checksum and digest for repair/streaming. In the past, streaming readers only verified the checksum of compressed SSTables. This change extends the checks to include the digest and the checksum (CRC) for both compressed and uncompressed SSTables. These additional checks require reading the digest and CRC components from disk, which may cause some I/O overhead. For uncompressed SSTables, this involves loading and computing checksums and digest from the data, while for compressed SSTables - where checksums are already verified inline - the only extra cost is reading and verifying the digest. If the reader range doesn't cover the full SSTable, the digest is not loaded and the check is skipped. To support testing of these changes, a new option was added to the random_mutation_generator that allows disabling compression. Several new test cases were added to verify that the repair_reader correctly detects corruption. These tests corrupt digest or data component of an SSTable and confirm that the system throws the expected `malformed_sstable_exception`. Backport is not required, it is an improvement Refs #21776 Closes scylladb/scylladb#26444 * github.com:scylladb/scylladb: boost/repair_test: add repair reader integrity verification test cases test/lib: allow to disable compression in random_mutation_generator sstables: Skip checksum and digest reads for unlinked SSTables table: enable integrity checks for streaming reader table: Add integrity option to table::make_sstable_reader() sstables: Add integrity option to create_single_key_sstable_reader	2025-11-14 18:00:33 +02:00
Taras Veretilnyk	e7ceb13c3b	boost/repair_test: add repair reader integrity verification test cases Adds test cases to verify that repair_reader correctly detects SSTable (both compressed and uncompressed) checksum mismatch. Digest mismatch verification is not possible as repair reader may skip some sstable data, which automatically disables digest verification. Each test corrupts the Data component on disk and ensures the reader throws a malformed_sstable_exception with the expected error message.	2025-11-13 14:08:33 +01:00
Piotr Dulikowski	2e5eb92f21	Merge 'cdc: use CDC schema that is compatible with the base schema' from Michael Litvak When generating CDC log mutations for some base mutation, use a CDC schema that is compatible with the base schema. The compatible CDC schema has for every base column a corresponding CDC column with the same name. If using a non-compatible schema, we may encounter a situation, especially during ALTER, that we have a mutation with a base column set with some value, but the CDC schema doesn't have a column by that name. This would cause the user request to fail with an error. We add to the schema object a schema_ptr that for CDC-enabled tables points to the schema object of the CDC table that is compatible with the schema. It is set by the schema merge algorithm when creating the schema for a table that is created or altered. We use the fact that a base table and its CDC table are created and altered in the same group0 operation, and this way we can find and set the cdc schema for a base table. When transporting the base schema as a frozen schema between shards, we transport with it the frozen cdc schema as well. The patch starts with a series of refactoring commits that make extending the frozen schema easier and cleans up some duplication in the code about the frozen schema. We combine the two types `frozen_schema_with_base_info` and `view_schema_and_base_info` to a single type `extended_frozen_schema` that holds a frozen schema with additional data that is not part of the schema mutations but needs to be transported with it to unfreeze it - base_info, and the frozen cdc schema which is added in a later commit. 
Fixes https://github.com/scylladb/scylladb/issues/26405 backport not needed - enhancement Closes scylladb/scylladb#24960 * github.com:scylladb/scylladb: test: cdc: test cdc compatible schema cdc: use compatiable cdc schema db: schema_applier: create schema with pointer to CDC schema db: schema_applier: extract cdc tables schema: add pointer to CDC schema schema_registry: remove base_info from global_schema_ptr schema_registry: use extended_frozen_schema in schema load schema_registry: replace frozen_schema+base_info with extended_frozen_schema frozen_schema: extract info from schema_ptr in the constructor frozen_schema: rename frozen_schema_with_base_info to extended_frozen_schema	2025-11-13 10:11:54 +01:00
Tomasz Grabiec	10b893dc27	Merge 'load_stats: fix bug in migrate_tablet_size()' from Ferenc Szili `topology_cooridinator::migrate_tablet_size()` was introduced in `10f07fb95a`. It has a bug where the has_tablet_size() lambda always returns false because of bad comparison of iterators after a table and tablet search: ``` if (auto table_i = tables.find(gid.table); table_i != tables.find(gid.table)) { if (auto size_i = table_i->second.find(trange); size_i != table_i->second.find(trange)) { ``` This change also fixes a problem where the `migrate_tablet_size()` would crash with a `std::out_of_range` if the pending node was not present in load_stats. This change fixes these two problems and moves the functionality into a separate method of `load_stats`. It also adds tests for the new method. A version containing this bug has not been released yet, so no backport is needed. Closes scylladb/scylladb#26946 * github.com:scylladb/scylladb: load_stats: add test for migrate_tablet_size() load_stats: fix problem with tablet size migration	2025-11-12 23:48:37 +01:00
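The iterator comparison bug quoted in the message above can be reproduced in isolation. The following is a minimal sketch, not the ScyllaDB code: `table_id`, `tablet_range`, `tablet_sizes` and both function names are hypothetical stand-ins chosen for illustration. It shows why comparing the result of `find()` with a second `find()` of the same key is always false, and the usual fix of comparing with `end()`.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <string>

// Hypothetical stand-ins for the real load_stats structures.
using table_id = std::string;
using tablet_range = int;
using tablet_sizes = std::map<table_id, std::map<tablet_range, uint64_t>>;

// Buggy shape: each iterator is compared with a second find() of the same key.
// Both calls return the same iterator, so the condition is never true.
bool has_tablet_size_buggy(const tablet_sizes& tables, const table_id& table, tablet_range trange) {
    if (auto table_i = tables.find(table); table_i != tables.find(table)) {
        if (auto size_i = table_i->second.find(trange); size_i != table_i->second.find(trange)) {
            return true;
        }
    }
    return false;
}

// Fixed shape: a successful lookup is one that differs from end().
bool has_tablet_size(const tablet_sizes& tables, const table_id& table, tablet_range trange) {
    if (auto table_i = tables.find(table); table_i != tables.end()) {
        return table_i->second.find(trange) != table_i->second.end();
    }
    return false;
}

// Small usage example.
inline void example() {
    tablet_sizes sizes;
    sizes["t1"][0] = 100;
    assert(!has_tablet_size_buggy(sizes, "t1", 0)); // the bug: always false
    assert(has_tablet_size(sizes, "t1", 0));
}
```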
Ferenc Szili	fcbc239413	load_stats: add test for migrate_tablet_size() This change adds tests which validate the functionality of load_stats::migrate_tablet_size()	2025-11-11 14:28:31 +01:00
Benny Halevy	a290505239	utils: stall_free: add dispose_gently dispose_gently consumes the object moved to it, clearing it gently before it's destroyed. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Closes scylladb/scylladb#26356	2025-11-11 12:20:18 +02:00
Pavel Emelyanov	decf86b146	Merge 'Make AWS & Azure KMS boost testing use fixture + include Azure in pytests' from Calle Wilund * Adds test fixture for AWS KMS * Adds test fixture for Azure KMS * Adds key provider proxy for Azure to pytests (ported dtests) * Make test gather for boost tests handle suites * Fix GCP test snafu Fixes #26781 Fixes #26780 Fixes #26776 Fixes #26775 Closes scylladb/scylladb#26785 * github.com:scylladb/scylladb: gcp_object_storage_test: Re-enable parallelism. test::pylib: Add azure (mock) testing to EAR matrix test::boost::encryption_at_rest: Remove redundant azure test indent test::boost::encryption_at_rest: Move azure tests to use fixture test::lib: Add azure mock/real server fixture test::pylib::boost: Fix test gather to handle test suites utils::gcp::object_storage: Fix typo in semaphore init test::boost::encryption_at_rest_test: Remove redundant indent test::boost::test_encryption_at_rest: Move to AWS KMS fixture for kms test test::boost::test_encryption_at_rest: Reorder tests and helpers ent::encryption: Make text helper routines take std::string test::pylib::dockerized_service: Handle docker/podman bind error message test::lib::aws_kms_fixture: Add a fixture object to run mock AWS KMS test::lib::gcs_fixture: Only set port if running docker image + more retry	2025-11-10 14:35:05 +03:00
Botond Dénes	cdba3bebda	Merge 'Generalize directory checks in database_test's snapshot test cases' from Pavel Emelyanov Those test cases use lister::scan_dir() to validate the contents of snapshot directory of a table against this table's base directory. This PR generalizes the listing code making it shorter. Also, the snapshot_skip_flush_works case is missing the check for "schema.cql" file. Nothing is wrong with it, but the test is more accurate if checking it. Also, the snapshot_with_quarantine_works case tries to check if one set of names is sub-set of another using lengthy code. Using std::includes improves the test readability a lot. Also, the PR replaces lister::scan_dir() with directory_lister. The former is going to be removed some day (see also #26586) Improving existing working test, no backport is needed. Closes scylladb/scylladb#26693 * github.com:scylladb/scylladb: database_test: Simplify snapshot_with_quarantine_works() test database_test: Improve snapshot_skip_flush_works test database_test: Simplify snapshot_works() tests database_test: Use collect_files() to remove files database_test: Use collectz_files() to count files in directory database_test: Introduce collect_files() helper	2025-11-07 16:04:02 +02:00
Michał Chojnowski	b82c2aec96	sstables/trie: fix an assertion violation in bti_partition_index_writer_impl::write_last_key _last_key is a multi-fragment buffer. Some prefix of _last_key (up to _last_key_mismatch) is unneeded because it's already a part of the trie. Some suffix of _last_key (after needed_prefix) is unneeded because _last_key can be differentiated from its neighbors even without it. The job of write_last_key() is to find the middle fragments, (containing the range `[_last_key_mismatch, needed_prefix)`) trim the first and last of the middle fragments appropriately, and feed them to the trie writer. But there's an error in the current logic, in the case where `_last_key_mismatch` falls on a fragment boundary. To describe it with an example, if the key is fragmented like `aaa|bbb|ccc`, `_last_key_mismatch == 3`, and `needed_prefix == 7`, then the intended output to the trie writer is `bbb|c`, but the actual output is `|bbb|c`. (I.e. the first fragment is empty). Technically the trie writer could handle empty fragments, but it has an assertion against them, because they are a questionable thing. Fix that. We also extend bti_index_test so that it's able to hit the assert violation (before the patch). The reason why it wasn't able to do that before the patch is that the violation requires decorated keys to differ on the _first_ byte of a partition key column, but the keys generated by the test only differed on the last byte of the column. (Because the test was using sequential integers to make the values more human-readable during debugging). So we modify the key generation to use random values that can differ on any position. Fixes scylladb/scylladb#26819 Closes scylladb/scylladb#26839	2025-11-07 11:25:07 +02:00
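The fragment-trimming behaviour described above can be illustrated with a small standalone sketch. This is not the `bti_partition_index_writer_impl` code; `slice_fragments` is a hypothetical helper over plain `std::string` fragments, and it only demonstrates the intended result: take the byte range `[begin, end)` of a fragmented key and never emit an empty fragment, including when `begin` falls on a fragment boundary.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Returns the non-empty pieces of `fragments` that cover the byte range
// [begin, end) of the concatenated key. Hypothetical helper for illustration.
std::vector<std::string> slice_fragments(const std::vector<std::string>& fragments,
                                         size_t begin, size_t end) {
    assert(begin <= end);
    std::vector<std::string> out;
    size_t pos = 0; // offset of the current fragment within the whole key
    for (const auto& frag : fragments) {
        const size_t frag_end = pos + frag.size();
        const size_t lo = std::max(pos, begin);
        const size_t hi = std::min(frag_end, end);
        // Skip fragments that do not overlap the range at all. This is the
        // case that matters when `begin` sits exactly on a fragment boundary.
        if (lo < hi) {
            out.push_back(frag.substr(lo - pos, hi - lo));
        }
        pos = frag_end;
    }
    return out;
}
```

With the key from the commit message, `slice_fragments({"aaa", "bbb", "ccc"}, 3, 7)` yields the two fragments `bbb` and `c`, with no leading empty fragment.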
Calle Wilund	b0061e8c6a	gcp_object_storage_test: Re-enable parallelism. Re-enable parallel execution to get better logs. Note, this is somewhat wasteful, as we won't re-use test fixture here, but in the end, it is probably an improvement.	2025-11-05 15:07:26 +00:00
Tomasz Grabiec	f8879d797d	tablet_allocator: Avoid load balancer failure when replacing the last node in a rack Introduced in `9ebdeb2` The problem is specific to node replacing and rack-list RF. The culprit is in the part of the load balancer which determines rack's shard count. If we're replacing the last node, the rack will contain no normal nodes, and shards_per_rack will have no entry for the rack, on which the table still has replicas. This throws std::out_of_range and fails the tablet draining stage, and node replace is failed. No backport because the problem exists only on master. Fixes #26768 Closes scylladb/scylladb#26783	2025-11-05 15:49:51 +03:00
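The failure mode described above, a lookup that throws `std::out_of_range` for a rack with no normal nodes, can be sketched as follows. The map type and the `rack_shard_count` helper are hypothetical, and treating a missing rack as having zero shards is an assumption made for illustration; the commit message does not say how the actual patch handles the case.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>

// Hypothetical stand-in: rack name -> total shard count of its normal nodes.
using shards_per_rack_map = std::map<std::string, size_t>;

// Returns the rack's shard count, or 0 when the rack has no entry
// (for example, its last node is being replaced). Unlike map::at(),
// this does not throw std::out_of_range for a missing rack.
size_t rack_shard_count(const shards_per_rack_map& shards_per_rack, const std::string& rack) {
    auto it = shards_per_rack.find(rack);
    return it == shards_per_rack.end() ? 0 : it->second;
}

// Small usage example.
inline void example() {
    shards_per_rack_map shards{{"rack1", 8}};
    assert(rack_shard_count(shards, "rack1") == 8);
    assert(rack_shard_count(shards, "rack2") == 0); // no normal nodes left
}
```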
Pavel Emelyanov	05d711f221	database_test: Simplify snapshot_with_quarantine_works() test The test collects Data files from table dir, then _all_ files from snapshot dir and then checks whether the former is the subset of the latter. Using std::includes over two sets makes the code much shorter. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-11-05 15:35:28 +03:00
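The subset check mentioned above maps directly onto `std::includes`. A minimal sketch, with a hypothetical `is_subset` helper and made-up file names (the real test works on names collected from the table and snapshot directories):

```cpp
#include <algorithm>
#include <cassert>
#include <set>
#include <string>

// True when every name in `subset` also appears in `superset`.
// std::includes requires both ranges to be sorted, which std::set guarantees.
bool is_subset(const std::set<std::string>& subset, const std::set<std::string>& superset) {
    return std::includes(superset.begin(), superset.end(), subset.begin(), subset.end());
}

// Small usage example with illustrative file names.
inline void example() {
    const std::set<std::string> data_files{"1-Data.db", "2-Data.db"};
    const std::set<std::string> snapshot_files{"1-Data.db", "2-Data.db", "manifest.json", "schema.cql"};
    assert(is_subset(data_files, snapshot_files));
    assert(!is_subset(snapshot_files, data_files));
}
```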
Pavel Emelyanov	c8492b3562	database_test: Improve snapshot_skip_flush_works test It has two inaccuracies. First, when checking the contents of table directory, it uses pre-populated expected list with "manifest.json" in it. Weird. Second, when checking the contents of snapshot directory it doesn't check if the "schema.cql" is there. It's always there, but if something breaks in the future it may go unnoticed. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-11-05 15:35:26 +03:00
Pavel Emelyanov	5a25d74b12	database_test: Simplify snapshot_works() tests No functional changes here, just make use of the new lister to shorten the code. A small side effect -- if the test fails because contents of directories changes, it will print the exact difference in logs, not just that N files are missing/present. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-11-05 15:34:25 +03:00
Pavel Emelyanov	365044cdbb	database_test: Use collect_files() to remove files Some test cases remove files from table directory to perform some checks over the taken snapshots. Using collect_files() helper makes the code easier to read. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-11-05 15:34:24 +03:00
Pavel Emelyanov	e1f326d133	database_test: Use collectz_files() to count files in directory Some test cases want to see that there are more than one file in a directory, so they can just re-use the new helper. Much shorter this way. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-11-05 15:32:58 +03:00
Pavel Emelyanov	60d1f78239	database_test: Introduce collect_files() helper It returns a set of files in a given directory. Will be used by all next patches. Implemented using directory_lister, not lister::scan_dir in order to help removing the latter one in the future. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-11-05 15:32:58 +03:00
Calle Wilund	b8a6b6dba9	test::boost::encryption_at_rest: Remove redundant azure test indent	2025-11-05 10:22:23 +00:00
Calle Wilund	10e591bd6b	test::boost::encryption_at_rest: Move azure tests to use fixture Fixes #26781 Makes the test independent of wrapping scripts. Note: retains the split into "real" and "mock" tests. For other tests, we either all mock, or allow the environment to select mock or real. Here we have them combined. More expensive, but otoh more thorough.	2025-11-05 10:22:22 +00:00
Calle Wilund	2edf6cf325	test::boost::encryption_at_rest_test: Remove redundant indent Removed empty scope and reindents kms test using fixtures.	2025-11-05 10:22:22 +00:00
Calle Wilund	286a655bc0	test::boost::test_encryption_at_rest: Move to AWS KMS fixture for kms test Fixes #26780 Uses fake/real CI endpoint for AWS KMS tests, and moves these into a suite for sharing the mock server.	2025-11-05 10:22:22 +00:00
Calle Wilund	a1cc866f35	test::boost::test_encryption_at_rest: Reorder tests and helpers No code changes. Just reorders code to organize more by provider etc, prepping for fixtures and test suites.	2025-11-05 10:22:22 +00:00
Pavel Emelyanov	fc37518aff	test: Check file existence directly There's a test that checks if temporary-statistics file is gone at some point. It does it by listing the directory it expects the file to be in and then comparing the names met with the temp. stat. file name. It looks like a single file_exists() call is enough for that purpose. As a "sanity" check this patch adds a validation that non-temporary statistics file is there, all the more so this file is removed after the test. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#26743	2025-11-04 19:37:55 +01:00
Avi Kivity	d458dd41c6	Merge 'Avoid input_/output_stream-s default initialization and move-assignment' from Pavel Emelyanov Recent seastar update deprecated in/out streams usage pattern when a stream is default constructed early and then move-assigned with the proper one (see scylladb/seastar#3051). This PR fixes a few places in Scylla that still use one. Adopting newer seastar API, no need to backport Closes scylladb/scylladb#26747 * github.com:scylladb/scylladb: commitlog: Remove unused work::r stream variable ec2_snitch: Fix indentation after previous patch ec2_snitch: Coroutinize the aws_api_call_once() sstable: Construct output_stream for data instantly test: Don't reuse on-stack input stream	2025-10-31 21:22:41 +02:00
Avi Kivity	adf9c426c2	Merge 'db/config: Change default SSTable compressor to LZ4WithDictsCompressor' from Nikos Dragazis `sstable_compression_user_table_options` allows configuring a node-global SSTable compression algorithm for user tables via scylla.yaml. The current default is LZ4Compressor (inherited from Cassandra). Make LZ4WithDictsCompressor the new default. Metrics from real datasets in the field have shown significant improvements in compression ratios. If the dictionary compression feature is not enabled in the cluster (e.g., during an upgrade), fall back to the `LZ4Compressor`. Once the feature is enabled, flip the default back to the dictionary compressor using a listener callback. Fixes #26610. Closes scylladb/scylladb#26697 * github.com:scylladb/scylladb: test/cluster: Add test for default SSTable compressor db/config: Change default SSTable compressor to LZ4WithDictsCompressor db/config: Deprecate sstable_compression_dictionaries_allow_in_ddl boost/cql_query_test: Get expected compressor from config	2025-10-31 21:15:18 +02:00
Michael Litvak	e7dbccd59e	cdc: use chunked_vector instead of vector for stream ids use utils::chunked_vector instead of std::vector to store cdc stream sets for tablets. a cdc stream set usually represents all streams for a specific table and timestamp, and has a stream id per each tablet of the table. each stream id is represented by 16 bytes. thus the vector could require quite large contiguous allocations for a table that has many tablets. change it to chunked_vector to avoid large contiguous allocations. Fixes scylladb/scylladb#26791 Closes scylladb/scylladb#26792	2025-10-31 13:02:34 +01:00
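The allocation-size argument above can be made concrete with a little arithmetic. This is only a sketch: the 16-byte stream id size comes from the commit message, while `assumed_chunk_bytes` is an illustrative assumption and not a statement about the actual chunk size used by `utils::chunked_vector`. Vector capacity growth is ignored.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>

constexpr size_t stream_id_size = 16;              // bytes per stream id (from the commit message)
constexpr size_t assumed_chunk_bytes = 128 * 1024; // assumed chunk limit, for illustration only

// Contiguous bytes a flat vector needs to hold one stream id per tablet.
constexpr size_t flat_allocation_bytes(size_t tablets) {
    return tablets * stream_id_size;
}

// Largest contiguous allocation when the same data is split into chunks
// of at most assumed_chunk_bytes each.
constexpr size_t chunked_allocation_bytes(size_t tablets) {
    return std::min(flat_allocation_bytes(tablets), assumed_chunk_bytes);
}

// A table with 65536 tablets needs 1 MiB in one piece with a flat vector,
// but never more than one chunk at a time with the chunked layout.
static_assert(flat_allocation_bytes(65536) == 1024 * 1024);
static_assert(chunked_allocation_bytes(65536) == assumed_chunk_bytes);
```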
Tomasz Grabiec	1c0d847281	Merge 'load_balancer: load_stats reconcile after tablet migration and table resize' from Ferenc Szili This change adds the ability to move tablets sizes in load_stats after a tablet migration or table resize (split/merge). This is needed because the size based load balancer needs to have tablet size data which is as accurate as possible, in order to work on fresh tablet size distribution and issue correct tablet migrations. This is the second part of the size based load balancing changes: - First part for tablet size collection via load_stats: #26035 - Second part reconcile load_stats: #26152 - The third part for load_sketch changes: #26153 - The fourth part which performs tablet load balancing based on tablet size: #26254 This is a new feature and backport is not needed. Closes scylladb/scylladb#26152 * github.com:scylladb/scylladb: load_balancer: load_stats reconcile after tablet migration and table resize load_stats: change data structure which contains tablet sizes	2025-10-31 09:58:25 +01:00
Avi Kivity	04a289cae6	Merge 'Auto expand to rack list' from Tomasz Grabiec We want to move towards rack-list based replication factor for tablets being the default mode, and in the future the only supported mode. This PR is a step towards that. We auto-expand numeric RF to rack list on keyspace creation and ALTER when rf_rack_valid_keyspaces option is enabled. The PR is mostly about adjusting tests. The main logic change is in the last patch, which modifies option post-processing in ks_prop_defs. Fixes #26397 Closes scylladb/scylladb#26692 * github.com:scylladb/scylladb: cql3: ks_prop_defs: Expand numeric RF to rack list locator: Move rack_list to topology.hh alternator: Do not set RF for zero-token DCs alternator: Switch keyspace creation to use ks_prop_defs test: alternator: Adjust for rack lists cql3: Move validation of invalid ALTER KEYSPACE earlier, to ks_prop_defs test: cqlpy: Mark tests using rack lists as scylla-only test: Switch to rack-list based RF test: Generalize tests to work with both numeric RF and rack lists test: cluster: test_zero_token_nodes_multidc: Adjust to rack list RF test: Prepare for handling errors specific to rack list path test: cluster: dtest: alternator: Force RF=1 in test_putitem_contention test: Create cluster with multiple racks in multi-dc setups test: boost: network_topology_strategy_test: Adjust to rack-list RF test: tablets: Adjust to rack list test: cluster: test_group0_schema_versioning: Use smaller RF to respect rf-rack-validness test: tablets_test: Convert test_per_shard_goal_mixed_dc_rf to be rack-valid test: object_store: test_backup: Adjust for rack lists test: cluster: tablets: Do not move tablet across racks in test_tablet_transition_sanity test: cluster: mv: Do not move tablets across racks test: cluster: util: Fix docstring for parse_replication_options() tablets, topology_coordinator: Skip tablet draining on replace	2025-10-30 21:54:08 +02:00
Tomasz Grabiec	28f6bdc99b	cql3: ks_prop_defs: Expand numeric RF to rack list Auto-expands numeric RF in CREATE/ALTER KEYSPACE statements for new DCs specified in the statement. Doesn't auto-expand existing options, as the rack choice may not be in line with current replica placement. This requires co-locating tablet replicas, and tracking of co-location state, which is not implemented yet. Signed-off-by: Tomasz Grabiec <tgrabiec@scylladb.com>	2025-10-29 23:32:59 +01:00
Tomasz Grabiec	723622cf70	test: boost: network_topology_strategy_test: Adjust to rack-list RF	2025-10-29 23:32:57 +01:00
Tomasz Grabiec	19d0beff38	test: tablets: Adjust to rack list test_decommission_rack_load_failure expects some tablets to land in the rack which only has the decommissioning node. Since the table uses RF=1, auto-expansion may choose the other rack and put all tablets there, and the expected failure will not happen. Force placement by using rack-list RF.	2025-10-29 23:32:57 +01:00
Tomasz Grabiec	0f38f7185c	test: tablets_test: Convert test_per_shard_goal_mixed_dc_rf to be rack-valid	2025-10-29 23:32:57 +01:00
Piotr Wieczorek	2812e67f47	cdc: Emit a preimage for non-clustered tables Until this patch, CDC hasn't fetched a preimage for mutations containing only a partition tombstone. Therefore, single-row deletions in a table without a clustering key didn't include a preimage, which was inconsistent with single-row clustered deletions. This commit addresses this inconsistency. Second reason is compatibility with DynamoDB Streams, which doesn't support entire-partition deletes. Alternator uses partition tombstones for single-row deletions, though, and in these cases the 'OldImage' was missing from REMOVE records. Fixes https://github.com/scylladb/scylladb/issues/26382 Closes scylladb/scylladb#26578	2025-10-29 17:54:58 +02:00
Nikos Dragazis	d95ebe7058	boost/cql_query_test: Get expected compressor from config Since `5b6570be52`, the default SSTable compression algorithm for user tables is no longer hardcoded; it can be configured via the `sstable_compression_user_table_options.sstable_compression` option in scylla.yaml. Modify the `test_table_compression` test to get the expected value from the configuration. Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>	2025-10-29 14:52:43 +02:00
Pavel Emelyanov	37b9cccc1c	test: Don't reuse on-stack input stream The test consists of several snippets, each creating an input_stream for some short operation and checking the result. Each snippet overwrites the local `input_stream in` variable with the new one. This change wraps each of those snippets into its own code block so that each has its own new `input_stream in` variable. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2025-10-28 19:25:07 +03:00
Pavel Emelyanov	d9bfbeda9a	lister: Fix race between readdir and stat Sometimes file::list_directory() returns entries without type set. In that case lister calls file_type() on the entry name to get it. In case the call returns disengaged type, the code assumes that some error occurred and resolves into exception. That's not correct. The file_type() method returns disengaged type only if the file being inspected is missing (i.e. on ENOENT errno). But this can validly happen if a file is removed between readdir and stat. In that case it's not "some error happened", but the entry should just be skipped. If some error did happen, then file_type() would resolve into exceptional future on its own. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#26595	2025-10-28 15:10:22 +02:00
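The distinction drawn above, "file vanished" versus "a real error", can be sketched with the standard library. This is an analogy using `std::filesystem`, not the Seastar `file_type()` API; `type_if_exists` and `keep_existing` are hypothetical helpers. A missing entry yields an empty optional and is skipped, while any other failure is reported as an exception.

```cpp
#include <cassert>
#include <filesystem>
#include <optional>
#include <system_error>
#include <vector>

namespace fs = std::filesystem;

// Returns the entry's type, or std::nullopt if it no longer exists
// (e.g. it was removed between listing the directory and stat-ing it).
// Any other failure is thrown as fs::filesystem_error.
std::optional<fs::file_type> type_if_exists(const fs::path& p) {
    std::error_code ec;
    const fs::file_status st = fs::status(p, ec);
    if (st.type() == fs::file_type::not_found) {
        return std::nullopt; // not an error: the entry is simply gone
    }
    if (ec) {
        throw fs::filesystem_error("cannot stat entry", p, ec);
    }
    return st.type();
}

// Keeps only the entries that still exist, skipping vanished ones.
std::vector<fs::path> keep_existing(const std::vector<fs::path>& entries) {
    std::vector<fs::path> out;
    for (const auto& e : entries) {
        if (type_if_exists(e)) {
            out.push_back(e);
        }
    }
    return out;
}

// Small usage example.
inline void example() {
    const fs::path dir = fs::temp_directory_path();
    assert(type_if_exists(dir).has_value());
    assert(!type_if_exists(dir / "no-such-entry-for-lister-sketch"));
}
```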
Botond Dénes	ac618a53f4	Merge 'db: repair: do not update repair_time if batchlog replay failed' from Aleksandra Martyniuk Currently, batchlog replay is considered successful even if all batches fail to be sent (they are replayed later). However, repair requires all batches to be sent successfully. Currently, if batchlog isn't cleared, the repair never learns and updates the repair_time. If GC mode is set to "repair", this means that the tombstones written before the repair_time (minus propagation_delay) can be GC'd while not all batches were replayed. Consider a scenario: - Table t has a row with (pk=1, v=0); - There is an entry in the batchlog that sets (pk=1, v=1) in table t; - The row with pk=1 is deleted from table t; - Table t is repaired: - batchlog replay fails; - repair_time is updated; - propagation_delay seconds passes and the tombstone of pk=1 is GC'd; - batchlog is replayed and (pk=1, v=1) inserted - data resurrection! Do not update repair_time if sending any batch fails. The data is still repaired. For tablet repair the repair runs, but at the end the exception is passed to topology coordinator. Thanks to that the repair_time isn't updated. The repair request isn't removed as well, due to which the repair will need to rerun. Apart from that, a batch is removed from the batchlog if its version is invalid or unknown. The condition on which we consider a batch too fresh to replay is updated to consider propagation_delay. 
Fixes: https://github.com/scylladb/scylladb/issues/24415 Data resurrection fix; needs backport to all versions Closes scylladb/scylladb#26319 * github.com:scylladb/scylladb: db: fix indentation test: add reproducer for data resurrection repair: fail tablet repair if any batch wasn't sent successfully db/batchlog_manager: fix making decision to skip batch replay db: repair: throw if replay fails db/batchlog_manager: delete batch with incorrect or unknown version db/batchlog_manager: coroutinize replay_all_failed_batches	2025-10-28 14:52:59 +02:00
Ferenc Szili	10f07fb95a	load_balancer: load_stats reconcile after tablet migration and table resize This change adds the ability to move tablets sizes in load_stats after a tablet migration or table resize (split/merge). This is needed because the size based load balancer needs to have tablet size data which is as accurate as possible, in order to issue migrations which improve load balance.	2025-10-28 12:12:09 +01:00
Pavel Emelyanov	54a117b19d	Merge 'retry_strategy: Switch to using seastar's retry_strategy (take two)' from Ernest Zaslavsky With the recent introduction of retry_strategy to Seastar, the pure virtual class previously defined in ScyllaDB is now redundant. This change allows us to streamline our codebase by directly inheriting from Seastar’s implementation, eliminating duplication in ScyllaDB. Despite this update is purely a refactoring effort and does not introduce functional changes it should be ported back to 2025.3 and 2025.4 otherwise it will make future backports of bugfixes/improvements related to `s3_client` near to impossible ref: https://github.com/scylladb/seastar/issues/2803 depends on: https://github.com/scylladb/seastar/pull/2960 Closes scylladb/scylladb#25801 * github.com:scylladb/scylladb: s3_client: remove unnecessary `co_await` in `make_request` s3 cleanup: remove obsolete retry-related classes s3_client: remove unused `filler_exception` s3_client: fix indentation s3_client: simplify chunked download error handling using `make_request` s3_client: reformat `make_request` functions for readability s3_client: eliminate duplication in `make_request` by using overload s3_client: reformat `make_request` function declarations for readability s3_client: reorder `make_request` and helper declarations s3_client: add `make_request` override with custom retry and error handler s3_client: migrate s3_client to Seastar HTTP client s3_client: fix crash in `copy_s3_object` due to dangling stream s3_client: coroutinize `copy_s3_object` response callback aws_error: handle missing `unexpected_status_error` case s3_creds: use Seastar HTTP client with retry strategy retry_strategy: add exponential backoff to `default_aws_retry_strategy` retry_strategy: introduce Seastar-based retry strategy retry_strategy: update CMake and configure.py for new strategy retry_strategy: rename `default_retry_strategy` to `default_aws_retry_strategy` retry_strategy: fix include 
retry_strategy: Copied utils/s3/retry_strategy.hh to utils/s3/default_aws_retry_strategy.hh retry_strategy: Copied utils/s3/retry_strategy.cc to utils/s3/default_aws_retry_strategy.cc	2025-10-28 13:08:42 +03:00
Piotr Dulikowski	fd966ec10d	Merge 'cdc: garbage collect CDC streams for tablets' from Michael Litvak introduce helper functions that can be used for garbage collecting old cdc streams for tablets-based keyspaces. add a background fiber to the topology coordinator that runs periodically and checks for old CDC streams for tablets keyspaces that can be garbage collected. the garbage collection works by finding the newest cdc timestamp that has been closed for more than the configured cdc TTL, and removing all information from the cdc internal tables about cdc timestamps and streams up to this timestamp. in general it should be safe to remove information about these streams because they are closed for more than TTL, therefore all rows that were written to these streams with the configured TTL should be dead. the exception is if the TTL is altered to a smaller value, and then we may remove information about streams that still have live rows that were written with the longer ttl. Fixes https://github.com/scylladb/scylladb/issues/26669 Closes scylladb/scylladb#26410 * github.com:scylladb/scylladb: cdc: garbage collect CDC streams periodically cdc: helpers for garbage collecting old streams for tablets	2025-10-27 16:16:55 +01:00
Botond Dénes	417270b726	Merge 'Port dtest EAR tests to test.py/pytest in scylla CI' from Calle Wilund Fixes #26641 * Adds shared abstraction for dockerized mock services for our pytests (not using python docker, due to both library and podman) * Adds test fixtures for our key providers (except GCS KMS, for which we have no mock server) to do local testing * Ports (and prunes and sharpens) the test cases from dtest::encryption_at_rest_test to our pytest. * Shared KMIP mock between boost test and pytest and speeds up boost test shutdown. When merged, the dtest counterpart can be decommissioned. Closes scylladb/scylladb#26642 * github.com:scylladb/scylladb: test::cluster::object_store::conftest: Make GS proxy use shared docker mock server wrapper test::cluster::test_encryption: Port dtest EAR tests test::cluster::conftest: Add key_provider fixture test::pylib::encryption_provider: Port dtest encryption provider classes test::pylib::dockerized_service: Add helper for running docker/podman test::pylib::kmip_wrapper: Modify to be usable by pytest fixtures test::boost::kmip_wrapper: Move python script for PyKMIP to pylib	2025-10-27 15:42:52 +02:00
Michael Litvak	440caeabcb	cdc: helpers for garbage collecting old streams for tablets introduce helper functions that can be used for garbage collecting old cdc streams for tablets-based keyspaces. - get_new_base_for_gc: finds a new base timestamp given a TTL, such that all older timestamps and streams can be removed. - get_cdc_stream_gc_mutations: given new base timestamp and streams, builds mutations that update the internal cdc tables and remove the older streams. - garbage_collect_cdc_streams_for_table: combines the two functions above to find a new base and build mutations to update it for a specific table - garbage_collect_cdc_streams: builds gc mutations for all cdc tables	2025-10-26 11:01:20 +01:00
Avi Kivity	997b52440e	Merge 'replica/mutation_dump: include empty/dead partitions in the scan results' from Botond Dénes `select * from mutation_fragment()` queries don't return partitions which are completely empty or only contain tombstones which are all garbage collectible. This is because the underlying `mutation_dump` mechanism has a separate query to discover partitions for scans. This query is a regular mutation scan, which is subject to query compaction and garbage collection. Disable the query compaction for mutation queries executed on behalf of mutation fragment queries, so all data is visible in the result, even that which is fully garbage collectible. Fixes scylladb/scylladb#23707. Scans for mutation-fragment are very rare, so a backport is not necessary. We can backport on-demand. Closes scylladb/scylladb#26227 * github.com:scylladb/scylladb: replica/mutation_dump: multi_range_partition_generator: disable garbage-collection replica: add tombstone_gc_enabled parameter to mutation query methods mutation/mutation_compactor: remove _can_gc member tombstone_gc: add tombstone_gc_state factory methods for gc_all and no_gc	2025-10-24 23:26:16 +03:00
Ernest Zaslavsky	47704deb1e	s3_client: simplify chunked download error handling using `make_request` Refactor `chunked_download_source` to eliminate redundant exception handling by leveraging the new `make_request` override with custom retry strategy. This streamlines the download fiber logic, improving readability and maintainability.	2025-10-23 15:58:11 +03:00
Ernest Zaslavsky	bdb3979456	s3_client: migrate s3_client to Seastar HTTP client Eliminate use of `retryable_http_client` in `s3_client` and adopt Seastar's native HTTP client.	2025-10-23 15:58:10 +03:00
Aleksandra Martyniuk	7f20b66eff	db: repair: throw if replay fails Return a flag determining whether all the batches were sent successfully in batchlog_manager::replay_all_failed_batches (batches skipped due to being too fresh are not counted). Throw in repair_flush_hints_batchlog_handler if not all batches were replayed, to ensure that repair_time isn't updated.	2025-10-23 10:38:31 +02:00
Botond Dénes	f8b0142983	Merge 'Add --drop-unfixable-sstables flag for scrub in segregate mode' from Taras Veretilnyk This PR introduces support for a new scrub option: `--drop-unfixable-sstables`, which enables the dropping of corrupted SSTables during scrub only in segregate mode. The patch includes implementation, validation, and a set of tests to ensure correct behavior and error handling. Fixes #19060 Backport is not required, it is a new feature Closes scylladb/scylladb#26579 * github.com:scylladb/scylladb: sstable_compaction_test: add segregate mode tests for drop-unfixable-sstables option test/nodetool: add scrub drop-unfixable-sstables option testcase scrub: add support for dropping unfixable sstables in segregate mode	2025-10-23 11:06:19 +03:00
Taras Veretilnyk	60334c6481	sstable_compaction_test: add segregate mode tests for drop-unfixable-sstables option Added a new test case, sstable_scrub_segregate_mode_drop_unfixable_sstables_test, which verifies that when the drop-unfixable-sstables flag is enabled in segregate mode, corrupted SSTables are correctly dropped.	2025-10-22 17:16:55 +02:00
Taras Veretilnyk	42da7f1eb6	scrub: add support for dropping unfixable sstables in segregate mode This patch adds a new flag `drop-unfixable-sstables` to the scrub operation in segregate mode, allowing SSTables that cannot be fixed during scrub to be dropped automatically. It also includes API support for the 'drop_unfixable_sstables' parameter and validation to ensure this flag is not enabled in modes other than segregate.	2025-10-22 17:16:49 +02:00
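
The CDC stream garbage collection described in commits fd966ec10d and 440caeabcb above picks the newest CDC timestamp that has been closed for longer than the TTL and drops everything up to it. The following is a minimal sketch of that selection step only, not the code from those commits. It assumes a simplified model in which a stream set is closed at the moment the next one becomes active; the function name `newest_collectible_timestamp` and its parameters are hypothetical.

```cpp
#include <cassert>
#include <chrono>
#include <cstddef>
#include <optional>
#include <vector>

using seconds = std::chrono::seconds;

// Hypothetical model (an assumption, not ScyllaDB's data layout):
// sorted_timestamps holds, in ascending order, the time at which each
// stream set became active. Entry i is considered closed at the time
// entry i + 1 became active; the last entry is still open.
//
// Returns the newest timestamp that has been closed for more than `ttl`,
// or std::nullopt if nothing is old enough to collect.
std::optional<seconds> newest_collectible_timestamp(
        const std::vector<seconds>& sorted_timestamps,
        seconds now,
        seconds ttl) {
    assert(ttl >= seconds{0});
    std::optional<seconds> result;
    for (std::size_t i = 0; i + 1 < sorted_timestamps.size(); ++i) {
        const seconds closed_at = sorted_timestamps[i + 1];
        if (now - closed_at > ttl) {
            // Rows written to this stream set with the given TTL
            // should all have expired by now.
            result = sorted_timestamps[i];
        } else {
            // Later stream sets were closed even more recently.
            break;
        }
    }
    return result;
}
```

As the merge message notes, this reasoning only holds while the TTL is not lowered: rows written earlier with a longer TTL may still be live when their stream set is selected.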

4338 Commits