scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-02 06:05:53 +00:00

Author	SHA1	Message	Date
Marcin Maliszkiewicz	80e627c64b	test: perf: add example commands to perf-alternator and perf-cql-raw	2026-01-30 08:48:19 +01:00
Marcin Maliszkiewicz	ea29e4963e	test: perf: add option to write results to json in perf-cql-raw	2026-01-29 10:56:03 +01:00
Marcin Maliszkiewicz	d974ee1e21	test: perf: add option to write results to json in perf-alternator	2026-01-29 10:55:52 +01:00
Marcin Maliszkiewicz	a74b442c65	test: perf: move write_json_result to a common file The implementation is going to be shared with perf-alternator and perf-cql-raw.	2026-01-29 10:54:11 +01:00
Pavel Emelyanov	02af292869	Merge 'Introduce TTL and retries to address resolution' from Ernest Zaslavsky In production environments, we observed cases where the S3 client would repeatedly fail to connect due to DNS entries becoming stale. Because the existing logic only attempted the first resolved address and lacked a way to refresh DNS state, the client could get stuck in a failure loop. Introduce RR TTL and connection failure retry to - re-resolve the RR in a timely manner - forcefully reset and re-resolve addresses - add a special case when the TTL is 0 and the record must be resolved for every request Fixes: CUSTOMER-96 Fixes: CUSTOMER-139 Should be backported to 2025.3/4 and 2026.1 since we already encountered it in the production clusters for 2025.3 Closes scylladb/scylladb#27891 * github.com:scylladb/scylladb: connection_factory: includes cleanup dns_connection_factory: refine the move constructor connection_factory: retry on failure connection_factory: introduce TTL timer connection_factory: get rid of shared_future in dns_connection_factory connection_factory: extract connection logic into a member connection_factory: remove unnecessary `else` connection_factory: use all resolved DNS addresses s3_test: remove client double-close	2026-01-27 18:45:43 +03:00
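The TTL and retry behaviour described in the merge above can be sketched roughly as follows. This is an illustrative Python model, not the Seastar/C++ code from the series: `TtlAddressCache`, `connect_with_retry`, the injected `resolve` and `connect` callables, and the `max_rounds` parameter are all hypothetical names introduced here. It assumes that `resolve` returns a list of addresses together with a TTL in seconds, and that a TTL of 0 means "resolve on every request".

```python
import time
from typing import Callable, List, Tuple


class TtlAddressCache:
    """Hypothetical helper: caches resolved addresses until their TTL expires."""

    def __init__(self,
                 resolve: Callable[[], Tuple[List[str], float]],
                 clock: Callable[[], float] = time.monotonic):
        self._resolve = resolve          # returns (addresses, ttl_seconds)
        self._clock = clock
        self._addresses: List[str] = []
        self._ttl = 0.0
        self._expires_at = 0.0

    def reset(self) -> None:
        """Forget cached addresses so the next lookup re-resolves."""
        self._addresses = []

    def addresses(self) -> List[str]:
        expired = self._clock() >= self._expires_at
        # TTL == 0 is the special case: resolve for every request.
        if not self._addresses or self._ttl == 0 or expired:
            self._addresses, self._ttl = self._resolve()
            self._expires_at = self._clock() + self._ttl
        return list(self._addresses)


def connect_with_retry(cache: TtlAddressCache,
                       connect: Callable[[str], object],
                       max_rounds: int = 2):
    """Try every resolved address; on total failure, reset and re-resolve."""
    last_error = None
    for _ in range(max_rounds):
        for addr in cache.addresses():
            try:
                return connect(addr)
            except OSError as e:
                last_error = e
        cache.reset()                    # stale records: force a fresh lookup
    raise ConnectionError("all resolved addresses failed") from last_error
```

Injecting the clock and the resolver keeps the sketch testable without real DNS traffic; the actual series uses a TTL timer rather than a check at lookup time.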
Gleb Natapov	9daa109d2c	test: get rid of consistent_cluster_management usage in test consistent_cluster_management has been deprecated since scylla-5.2 and is no longer used by ScyllaDB, so it should not be used by tests either. Closes scylladb/scylladb#28340	2026-01-27 11:31:30 +01:00
Avi Kivity	fa5ed619e8	Merge 'test: perf: add perf-cql-raw benchmarking tool' from Marcin Maliszkiewicz The tool supports: - auth or no auth modes - simple read and write workloads - connection pool or connection per request modes - in-process or remote modes, remote may be useful to assess the tool's overhead or to use it as a bigger-scale benchmark - multi table mode - non superuser mode It could support in the future: - TLS mode - different workloads - shard awareness Example usage: > build/release/scylla perf-cql-raw --workdir /tmp/scylla-data --smp 2 --cpus 0,1 \ --developer-mode 1 --workload read --duration 5 2> /dev/null > Running test with config: {workload=read, partitions=10000, concurrency=100, duration=5, ops_per_shard=0} Pre-populated 10000 partitions 97438.42 tps (269.2 allocs/op, 1.1 logallocs/op, 35.2 tasks/op, 113325 insns/op, 80572 cycles/op, 0 errors) 102460.77 tps (261.1 allocs/op, 0.0 logallocs/op, 31.7 tasks/op, 108222 insns/op, 75447 cycles/op, 0 errors) 95707.93 tps (261.0 allocs/op, 0.0 logallocs/op, 31.7 tasks/op, 108443 insns/op, 75320 cycles/op, 0 errors) 102487.87 tps (261.0 allocs/op, 0.0 logallocs/op, 31.7 tasks/op, 107956 insns/op, 74320 cycles/op, 0 errors) 100409.60 tps (261.0 allocs/op, 0.0 logallocs/op, 31.7 tasks/op, 108337 insns/op, 75262 cycles/op, 0 errors) throughput: mean= 99700.92 standard-deviation=3039.28 median= 100409.60 median-absolute-deviation=2759.85 maximum=102487.87 minimum=95707.93 instructions_per_op: mean= 109256.53 standard-deviation=2281.39 median= 108337.37 median-absolute-deviation=1034.83 maximum=113324.69 minimum=107955.97 cpu_cycles_per_op: mean= 76184.36 standard-deviation=2493.46 median= 75320.20 median-absolute-deviation=922.09 maximum=80572.19 minimum=74320.00 Backports: no, new tool Closes scylladb/scylladb#25990 * github.com:scylladb/scylladb: test: perf: reuse stream id main: test: add future and abort_source to after_init_func test: perf: add option to stress multiple tables in perf-cql-raw test: perf: add perf-cql-raw benchmarking tool test: perf: move cut_arg helper func to common code	2026-01-27 12:23:25 +02:00
Avi Kivity	f1c6094150	Merge 'Remove buffer_input_stream and limiting_input_stream from core code' from Pavel Emelyanov These two streams mostly play together. The former provides an input_stream that reads from in-memory temporary buffers; the latter wraps it to limit the size of the provided temporary buffers. Both are used to test contiguous data consumer, also the buffer_input_stream has a caller in sstables reversing reader. This PR removes the buffer_input_stream in favor of seastar memory_data_source, and moves the limiting_input_stream into test/lib. Enhancing testing code, not backporting Closes scylladb/scylladb#28352 * github.com:scylladb/scylladb: code: Move limiting data source to test/lib util: Simplify limiting_data_source API util: Remove buffer_input_stream test: Use seastar::util::temporary_buffer_data_source in data consumer test sstables: Use seastar::util::as_input_stream() in mx reader	2026-01-26 22:05:59 +02:00
Raphael S. Carvalho	0e07c6556d	test: Remove useless compaction group testing in database_test This compaction group testing is useless because the machinery for it to work was removed. This was useful in the early tablet days, where we wanted to test compaction groups directly. Today groups are stressed and tested on every tablet test. I see a ~40% reduction time after this patch, since database_test is one of the most (if not the most) time consuming in boost suite. Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com> Closes scylladb/scylladb#28324	2026-01-26 19:16:27 +02:00
Avi Kivity	32cc593558	Merge 'tools/scylla-sstable: introduce filter command' from Botond Dénes Filter the content of sstable(s), including or excluding the specified partitions. Partitions can be provided on the command line via `--partition`, or in a file via `--partitions-file`. Produces one output sstable per input sstable -- if the filter selects at least one partition in the respective input sstable. Output sstables are placed in the path provided via `--output-dir`. Use `--merge` to filter all input sstables combined, producing one output sstable. Fixes: #13076 New functionality, no backport. Closes scylladb/scylladb#27836 * github.com:scylladb/scylladb: tools/scylla-sstable: introduce filter command tools/scylla-sstable: remove --unsafe-accept-nonempty-output-dir tools/scylla-sstable: make partition_set ordered tools/scylla-stable: remove unused boost/algorithm/string.hpp include	2026-01-26 16:32:38 +02:00
Andrei Chekun	3d3fabf5fb	test.py: change the name of the test in failed directory Square brackets are generally not allowed in a URI, while pytest uses them in the test name to show that the same test was run with additional parameters. When such a test fails, Jenkins shows its directory correctly, but an attempt to download only that directory fails because of the square brackets in the URI. This change substitutes round brackets for the square brackets. Closes scylladb/scylladb#28226	2026-01-26 13:29:45 +01:00
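The substitution described in the commit above amounts to a simple character mapping. A minimal sketch, assuming a hypothetical helper name `safe_test_dir_name` (the actual function in test.py is not shown in the commit message):

```python
def safe_test_dir_name(test_name: str) -> str:
    """Replace pytest's parameter brackets with round ones.

    Hypothetical helper: square brackets are not allowed unescaped in a
    URI path, so a directory named after a parametrized test is renamed.
    """
    return test_name.translate(str.maketrans("[]", "()"))
```

For example, `safe_test_dir_name("test_foo[dev-1]")` yields `test_foo(dev-1)`, which can appear in a download URL without escaping.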
Avi Kivity	ec70cea2a1	test/cqlpy: restore LWT tests marked XFAIL for tablets Commit `0156e97560` ("storage_proxy: cas: reject for tablets-enabled tables") marked a bunch of LWT tests as XFAIL with tablets enabled, pending resolution of #18066. But since that event is now in the past, we undo the XFAIL markings (or in some cases, use an any-keyspace fixture instead of a vnodes-only fixture). Ref #18066. Closes scylladb/scylladb#28336	2026-01-26 12:27:19 +02:00
Pavel Emelyanov	77435206b9	code: Move limiting data source to test/lib Only two tests use it now -- the limit-data-source-test itself and a test that validates continuous_data_consumer template. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2026-01-26 12:49:42 +03:00
Pavel Emelyanov	111b376d0d	util: Simplify limiting_data_source API The source maintains "limit generator" -- a function that returns the maximum size of bytes to return from the next buffer. Currently all callers just return constant numbers from it. Passing a function that returns non-constant one can, probably, be used for a fuzzy test, but even the limiting-data-source-test itself doesn't do it, so what's the point... Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2026-01-26 12:46:37 +03:00
Pavel Emelyanov	4639681907	test: Use seastar::util::temporary_buffer_data_source in data consumer test The test creates buffer_data_source_impl and wraps it with limiting data source. The former data_source duplicates the functionality of the existing seastar temporary_buffer_data_source. This patch makes the test code use seastar facility. The buffer_data_source_impl will be removed soon. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2026-01-26 12:44:33 +03:00
Pavel Emelyanov	1796997ace	Merge 'restore: Enable direct download of fully contained SSTables' from Ernest Zaslavsky This PR refactors the streaming subsystem to support direct download of fully contained sstables. Instead of streaming these files, they are downloaded and attached directly to their corresponding tables. This approach reduces overhead, simplifies logic, and improves efficiency. Expected node scope restore performance improvement: ~4 times faster in best case scenario when all sstables are fully contained. 1. Add storage options field to sstable Introduce a data member to store storage options, enabling distinction between local and object storage types. 2. Add method to create component source Extend the storage interface with a public method to create a data_source for any sstable component. 3. Inline streamer instance creation Remove make_sstable_streamer and inline its usage to allow different sets of arguments at call sites. 4. Skip streaming empty sstable sets Avoid unnecessary streaming calls when the sstable set is empty. 5. Enable direct download of contained sstables Replace streaming of fully contained sstables with direct download, attaching them to their corresponding table. Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-200 Refs: https://github.com/scylladb/scylladb/issues/23908 No need to backport as this code targets 2026.2 release (for tablet-aware restore) Closes scylladb/scylladb#26834 * github.com:scylladb/scylladb: tests: reuse test_backup_broken_streaming streaming: enable direct download of contained sstables storage: add method to create component source streaming: keep sharded database reference on tablet_sstable_streamer streaming: skip streaming empty sstable sets streaming: inline streamer instance creation tests: fix incorrect backup/restore test flow	2026-01-26 10:22:34 +03:00
Ernest Zaslavsky	cb2aa85cf5	aws_error: handle all restartable nested exception types Previously we only inspected std::system_error inside std::nested_exception to support a specific TLS-related failure mode. However, nested exceptions may contain any type, including other restartable (retryable) errors. This change unwraps one nested exception per iteration and re-applies all known handlers until a match is found or the chain is exhausted. Closes scylladb/scylladb#28240	2026-01-26 10:19:57 +03:00
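The "unwrap one nested exception per iteration and re-apply all known handlers" loop from the commit above can be illustrated with Python's exception chaining as a stand-in for `std::nested_exception`. This is an analogy under stated assumptions, not the C++ code in `aws_error`: `is_retryable` and the handler list are hypothetical names, and `__cause__` / `__context__` play the role of the nested exception pointer.

```python
from typing import Callable, Iterable, Optional


def is_retryable(exc: Optional[BaseException],
                 handlers: Iterable[Callable[[BaseException], bool]]) -> bool:
    """Walk the exception chain, applying every handler at each level."""
    handlers = list(handlers)
    seen = set()                       # guard against cyclic chains
    while exc is not None and id(exc) not in seen:
        seen.add(id(exc))
        if any(handler(exc) for handler in handlers):
            return True                # a known retryable error was found
        # Unwrap exactly one level per iteration.
        exc = exc.__cause__ or exc.__context__
    return False                       # chain exhausted without a match
```

Because each level is checked against all handlers, a retryable error of any type is found regardless of how deeply it is nested, which is the generalisation the commit describes.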
Avi Kivity	55422593a7	Merge 'test/lib: Fix bugs in `boost_test_tree_lister.cc`' from Dawid Mędrek In this PR, we fix two bugs present in `boost_test_tree_lister` that affected the output of `--list_json_content` added in scylladb/scylladb@afde5f668a: * The labels test units use were duplicated in the output. * If a test suite or a test file didn't contain any tests, it wasn't listed in the output. Refs scylladb/scylladb#25415 Backport: not needed. The code hasn't been used anywhere yet. Closes scylladb/scylladb#28255 * github.com:scylladb/scylladb: test/lib/boost_test_tree_lister.cc: Record empty test suites test/lib/boost_test_tree_lister.cc: Deduplicate labels	2026-01-25 21:34:32 +02:00
Andrei Chekun	cc5ac75d73	test.py: remove deprecated skip_mode decorator Finishing the deprecation of the skip_mode function in favor of pytest.mark.skip_mode. This PR only cleans up and migrates the leftover tests that still use the old way of skip_mode. Closes scylladb/scylladb#28299	2026-01-25 18:17:27 +02:00
Ernest Zaslavsky	bd9d5ad75b	s3_test: remove client double-close `test_chunked_download_data_source_with_delays` was calling `close()` on a client twice, remove the unnecessary call	2026-01-25 15:42:48 +02:00
Ernest Zaslavsky	70f5bc1a50	tests: reuse test_backup_broken_streaming reuse the `test_backup_broken_streaming` test to check for direct sstable download	2026-01-25 13:27:44 +02:00
Ernest Zaslavsky	13fb605edb	streaming: enable direct download of contained sstables Instead of streaming fully contained sstables, download them directly and attach them to their corresponding table. This simplifies the process and avoids unnecessary streaming overhead.	2026-01-25 13:27:44 +02:00
Pavel Emelyanov	3e09d3cc97	test: Keep test_gossiper_live_endpoints checks together There are two checks for live endpoints performed in test_gossiper.py, but one of those sits in test_gossiper_unreachable_endpoints somehow. This patch moves live endpoints check into live endpoints test. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com> Closes scylladb/scylladb#28224	2026-01-23 16:53:48 +02:00
Piotr Dulikowski	3ec4f67407	Merge 'vector_index: Implement rescoring' from Szymon Malewski This series implements rescoring algorithm. Index options allowing to enable this functionality were introduced in earlier PR https://github.com/scylladb/scylladb/pull/28165. When Vector Index has enabled quantization, Vector Store uses reduced vector representation to save memory, but it may degrade correctness of ANN queries. For quantized index we can enable rescoring algorithm, which recalculates similarity score from full vector representation stored in Scylla and reorder returned result set. It works also with oversampling - we fetch more candidates from Vector Store, rescore them at Scylla and return only requested number of results. Example: Creating a Vector Index with Rescoring ```sql -- Create a table with a vector column CREATE TABLE ks.products ( id int PRIMARY KEY, embedding vector<float, 128> ); -- Create a vector index with rescoring enabled CREATE INDEX products_embedding_idx ON ks.products (embedding) USING 'vector_index' WITH OPTIONS = { 'similarity_function': 'cosine', 'quantization': 'i8', 'oversampling': '2.0', 'rescoring': 'true' }; ``` 1. Quantization (`i8`) compresses vectors in the index, reducing memory usage but introducing precision loss in distance calculations 2. Oversampling (`2.0`) retrieves 2× more candidates than requested from the vector store (e.g., `LIMIT 10` fetches 20 candidates) 3. Rescoring (`true`) recalculates similarity scores using full-precision (`f32`) vectors from the base table and re-ranks results Query example: ```sql -- Find 10 most similar products SELECT id, similarity_cosine(embedding, [0.1, 0.2, ...]) AS score FROM ks.products ORDER BY embedding ANN OF [0.1, 0.2, ...] LIMIT 10; ``` With rescoring enabled, the query: 1. Fetches 20 candidates from the quantized index (due to oversampling=2.0) 2. Reads full-precision embeddings from the base table 3. Recalculates similarity scores with full precision 4. Re-ranks and returns the top 10 results In this implementation we use CQL similarity function implementation to calculate new score values and use them in post query ordering. We add that column manually to selection, but it has to be removed from the final response. Follow-up https://github.com/scylladb/scylladb/pull/28165 Fixes https://scylladb.atlassian.net/browse/SCYLLADB-83 New feature - doesn't need backport. Closes scylladb/scylladb#27769 * github.com:scylladb/scylladb: vector_index: rescoring: Fetch oversampled rows vector_index: rescoring: Sort by similarity column select_statement: Modify `needs_post_query_ordering` condition vector_index: rescoring: Add hidden similarity score column vector_index: Refactor extracting ANN query information	2026-01-23 15:20:10 +01:00
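The four query steps listed in the merge above (oversample, read full-precision vectors, recompute scores, re-rank and trim) can be modelled in a few lines. This is a simplified sketch, not Scylla's implementation: the function names are hypothetical, rounding the oversampled count up with `ceil` is an assumption, and plain cosine similarity stands in for whatever the CQL `similarity_cosine` function computes exactly.

```python
import math
from typing import Dict, List, Sequence


def candidates_to_fetch(limit: int, oversampling: float) -> int:
    """Number of candidates requested from the quantized index (assumed ceil)."""
    return math.ceil(limit * oversampling)


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def rescore(query: Sequence[float],
            candidate_ids: List[int],
            full_vectors: Dict[int, Sequence[float]],
            limit: int) -> List[int]:
    """Re-rank candidates by full-precision similarity and keep the top `limit`."""
    scored = [(cosine_similarity(query, full_vectors[i]), i)
              for i in candidate_ids]
    scored.sort(key=lambda pair: pair[0], reverse=True)  # best score first
    return [i for _, i in scored[:limit]]
```

With `LIMIT 10` and `oversampling` of 2.0, `candidates_to_fetch` returns 20, matching the example in the commit message; `rescore` then trims those 20 back to the 10 best by exact similarity.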
Patryk Jędrzejczak	a41d7a9240	Merge 'test_lwt_shutdown: fix flakiness by removing storage_proxy::stop injection' from Petr Gusev The storage_proxy::stop() is not called by main (it is commented out due to #293), so the corresponding message injection is never hit. When the test releases paxos_state_learn_after_mutate, shutdown may already be in progress or even completed by the time we try to trigger the storage_proxy::stop injection, which makes the test flaky. Fix this by completely removing the storage_proxy::stop injection. The injection is not required for test correctness. Shutdown must wait for the background LWT learn to finish, which is released via the paxos_state_learn_after_mutate injection. The shutdown process blocks on in-flight HTTP requests through seastar::httpd::http_server::stop and its _task_gate, so the HTTP request that releases paxos_state_learn_after_mutate is guaranteed to complete before the node is shut down. Fixes scylladb/scylladb#28260 backport: 2025.4, the `test_lwt_shutdown` test was introduced in this version Closes scylladb/scylladb#28315 * https://github.com/scylladb/scylladb: storage_proxy: drop stop() method test_lwt_shutdown: fix flakiness by removing storage_proxy::stop injection	2026-01-23 15:18:17 +01:00
Avi Kivity	30d6f3b8e0	test: test_proxy_protocol: bump timeout It was observed twice that the test times out in debug mode. Fix by increasing the timeout. The test never expects a timeout, so increasing it won't increase the test duration. Fixes #28028 Closes scylladb/scylladb#28272	2026-01-23 15:37:00 +02:00
Łukasz Paszkowski	09fde82a33	test/scylla_gdb: fix coro_task request usage, rename duplicate test - Pass pytest request fixture into coro_task (used for scylla_tmp_dir and core dump path) - Rename duplicate `test_sstable_summary` that runs sstable-index-cache to `test_sstable_index_cache` so both tests are collected Refs https://github.com/scylladb/scylladb/issues/22501 Closes scylladb/scylladb#28286	2026-01-23 15:25:58 +02:00
Piotr Dulikowski	fe9237fdc9	Merge 'alternator: don't require rf_rack flag for indexes, validate instead' from Michael Litvak In `8df61f6d99` we changed the requirements for creating materialized views and MV-based indexes - instead of requiring the rf_rack_valid_keyspaces flag to be set, we now require the keyspace to be RF-rack-valid at the time of creation, and it is enforced to remain RF-rack-valid while the MV exists. This validation is done in the cql create view/index statements. The same should be done also for alternator - when creating a table with GSI or LSI, or when adding a GSI to an existing table, previously we required the flag rf_rack_valid_keyspaces to be set. Now we change it to instead check if the keyspace is RF-rack-valid, and if not the operation fails with an appropriate error. Fixes https://github.com/scylladb/scylladb/issues/28214 backport to 2025.4 to add RF-rack-valid enforcements in alternator Closes scylladb/scylladb#28154 * github.com:scylladb/scylladb: locator: document the exception type of assert_rf_rack_valid_keyspace alternator: don't require rf_rack flag for indexes, validate instead	2026-01-23 11:49:02 +01:00
Botond Dénes	c4c2f87be7	Merge 'db: fail reads and writes with local consistency level to a DC with RF=0' from null When read or write operations are performed on a DC with RF=0 with LOCAL_QUORUM or LOCAL_ONE consistency level, Cassandra throws `Unavailable` exception. Scylla allowed such read operations and failed write operations with a cryptic "broken promise" error. This occurred because the initial availability check passed (quorum of 0 requires 0 replicas), but execution failed later when no replicas existed to process the mutation. This patch adds an explicit RF=0 validation for LOCAL_ONE and LOCAL_QUORUM that throws before attempting operation execution. The change also requires `test_query_dc_with_rf_0_does_not_crash_db` to be upgraded. This testcase was asserting a somewhat similar scenario, but wasn't taking into account the whole matrix of combinations: - scenarios: successful vs unsuccessful operation outcome - local consistency levels: LOCAL_QUORUM & LOCAL_ONE - operations: SELECT (read) & INSERT (write) and so it's been extended to cover both the pre-existing and the current issues and the whole matrix of combinations. Fixes: scylladb/scylladb#27893 A minor change, no need to backport. Closes scylladb/scylladb#27894 * github.com:scylladb/scylladb: db: fail reads and writes with local consistency level to a DC with RF=0 db: consistency_level: split `local_quorum_for()` db: consistency_level: fix nrs -> nts abbreviation	2026-01-23 12:36:20 +02:00
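The reasoning in the merge above (a quorum of 0 is trivially satisfied, so the availability check passes and the failure surfaces later) can be made concrete with a small model. This is a sketch under assumptions: `Unavailable`, `local_quorum_for`, and `validate_local_rf` are hypothetical Python names, and the quorum formula is the usual `rf // 2 + 1` with the RF=0 case returning 0 as the commit message implies.

```python
from typing import Dict

LOCAL_LEVELS = {"LOCAL_ONE", "LOCAL_QUORUM"}


class Unavailable(Exception):
    """Stand-in for the Unavailable error reported to the client."""


def local_quorum_for(rf: int) -> int:
    # With RF=0 the quorum is 0, so "enough replicas alive" is always true.
    return rf // 2 + 1 if rf > 0 else 0


def validate_local_rf(consistency_level: str,
                      local_dc: str,
                      rf_per_dc: Dict[str, int]) -> None:
    """Explicit RF=0 check performed before the operation is executed."""
    if consistency_level in LOCAL_LEVELS and rf_per_dc.get(local_dc, 0) == 0:
        raise Unavailable(
            f"{consistency_level} requested in {local_dc}, which has RF=0")
```

Calling `validate_local_rf("LOCAL_QUORUM", "dc2", {"dc1": 3, "dc2": 0})` raises `Unavailable` up front, whereas a check based only on `local_quorum_for(0) == 0` live replicas would have let the request proceed.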
Petr Gusev	f5ed3e9fea	test_lwt_shutdown: fix flakiness by removing storage_proxy::stop injection storage_proxy::stop() is not called by main (it is commented out due to #293), so the corresponding message injection is never hit. When the test releases paxos_state_learn_after_mutate, shutdown may already be in progress or even completed by the time we try to trigger the storage_proxy::stop injection, which makes the test flaky. Fix this by completely removing the storage_proxy::stop injection. The injection is not required for test correctness. Shutdown must wait for the background LWT learn to finish, which is released via the paxos_state_learn_after_mutate injection. The shutdown process blocks on in-flight api HTTP requests through seastar::httpd::http_server::stop and its _task_gate, so the shutdown will not prevent the HTTP request that released the paxos_state_learn_after_mutate from completing successfully. Fixes scylladb/scylladb#28260	2026-01-23 11:20:36 +01:00
Botond Dénes	35c9a00275	Merge 'test.py: pass correctly extra cmd line arguments' from Andrei Chekun During rewrite --extra-scylla-cmdline-options was missed and it was not passed to the tests that are using pytest. The issue that there were no possibility to pass these parameters via cmd to the Scylla, while tests were not affected because they were using the parameters from the yaml file. This PR fixes this issue so it will be easier to modify the Scylla start parameters without modifying code. No backport needed, only framework enhancement. Closes scylladb/scylladb#28156 * github.com:scylladb/scylladb: test.py: do not crash when there is no boost log test.py: pass correctly extra cmd line arguments	2026-01-23 11:26:01 +02:00
Andrzej Jackowski	c493a66668	test: check cql_requests_count instead of tasks_processed in SL Before this change, the test function `_verify_tasks_processed_metrics` verified that after service level reconfiguration, a given number of `scylla_scheduler_tasks_processed` were processed by a given scheduling group. Moreover, the check verified that another scheduling group didn't process a high number of requests. The second check was vulnerable to flakiness, because sometimes additional load caused extensive work in the second scheduling group (e.g. password hashing in `sl:driver` due to new connections being created). To avoid test failures, this commit changes which metric is verified: instead of `scylla_scheduler_tasks_processed`, the metric `scylla_transport_cql_requests_count` is checked. This prevents similar problems, because there is no reason for a high number of requests to be processed by the second scheduling group. Moreover, it allows decreasing the number of requests that are sent for verification, and thus speeds up the test. Fixes: scylladb/scylladb#27715 Closes scylladb/scylladb#28318	2026-01-23 10:19:29 +01:00
Patryk Jędrzejczak	4e984139b2	Merge 'strongly consistent tables: basic implementation' from Petr Gusev In this PR we add a basic implementation of the strongly-consistent tables: * generate raft group id when a strongly-consistent table is created * persist it into system.tables table * start raft groups on replicas when a strongly-consistent tablet_map reaches them * add strongly-consistent version of the storage_proxy, with the `query` and `mutate` methods * the `mutate` method submits a command to the tablet's raft group, the `query` method reads the data with `raft.read_barrier()` * strongly-consistent versions of the `select_statement` and `modification_statement` are added * a basic `test_strong_consistency.py/test_basic_write_read` is added to check that we can write and read data in a strongly consistent fashion. Limitations: * for now the strongly consistent tables can have tablets only on shard zero. This is because we (ab/re) use the existing raft system tables which live only on shard0. In the next PRs we'll create separate tables for the new tablets raft groups. * No Scylla-side proxying - the test has to figure out who is the leader and submit the command to the right node. This will be fixed separately. * No tablet balancing -- migration/split/merges require separate complicated code. The new behavior is hidden behind `STRONGLY_CONSISTENT_TABLES` feature, which is enabled when the `STRONGLY_CONSISTENT_TABLES` experimental feature flag is set. Requirements, specs and general overview of the feature can be found [here](https://scylladb.atlassian.net/wiki/spaces/RND/pages/91422722/Strong+Consistency). 
Short term implementation plan is [here](https://docs.google.com/document/d/1afKeeHaCkKxER7IThHkaAQlh2JWpbqhFLIQ3CzmiXhI/edit?tab=t.0#heading=h.thkorgfek290) One can check the strongly consistent writes and reads locally via cqlsh: scylla.yaml: ``` experimental_features: - strongly-consistent-tables ``` cqlsh: ``` CREATE KEYSPACE IF NOT EXISTS my_ks WITH replication = {'class': 'NetworkTopologyStrategy', 'replication_factor': 1} AND tablets = {'initial': 1} AND consistency = 'local'; CREATE TABLE my_ks.test (pk int PRIMARY KEY, c int); INSERT INTO my_ks.test (pk, c) VALUES (10, 20); SELECT * FROM my_ks.test WHERE pk = 10; ``` Fixes SCYLLADB-34 Fixes SCYLLADB-32 Fixes SCYLLADB-31 Fixes SCYLLADB-33 Fixes SCYLLADB-56 backport: no need Closes scylladb/scylladb#27614 * https://github.com/scylladb/scylladb: test_encryption: capture stderr test/cluster: add test_strong_consistency.py raft_group_registry: disable metrics for non-0 groups strong consistency: implement select_statement::do_execute() cql: add select_statement.cc strong consistency: implement coordinator::query() cql: add modification_statement cql: add statement_helpers strong consistency: implement coordinator::mutate() raft.hh: make server::wait_for_leader() public strong_consistency: add coordinator modification_statement: make get_timeout public strong_consistency: add groups_manager strong_consistency: add state_machine and raft_command table: add get_max_timestamp_for_tablet tablets: generate raft group_id-s for new table tablet_replication_strategy: add consistency field tablets: add raft_group_id modification_statement: remove virtual where it's not needed modification_statement: inline prepare_statement() system_keyspace: disable tablet_balancing for strongly_consistent_tables cql: rename strongly_consistent statements to broadcast statements	2026-01-23 09:52:33 +01:00
Botond Dénes	f375288b58	tools/scylla-sstable: introduce filter command Filter the content of sstable(s), including or excluding the specified partitions. Partitions can be provided on the command line via `--partition`, or in a file via `--partitions-file`. Produces one output sstable per input sstable -- if the filter selects at least one partition in the respective input sstable. Output sstables are placed in the path provided via `--output-dir`. Use `--merge` to filter all input sstables combined, producing one output sstable.	2026-01-22 17:20:07 +02:00
Michael Litvak	1f7a65904e	alternator: don't require rf_rack flag for indexes, validate instead In `8df61f6d99` we changed the requirements for creating materialized views and MV-based indexes - instead of requiring the rf_rack_valid_keyspaces flag to be set, we now require the keyspace to be RF-rack-valid at the time of creation, and it is enforced to remain RF-rack-valid while the MV exists. This validation is done in the cql create view/index statements. The same should be done also for alternator - when creating a table with GSI or LSI, or when adding a GSI to an existing table, previously we required the flag rf_rack_valid_keyspaces to be set. Now we change it to instead check if the keyspace is RF-rack-valid, and if not the operation fails with an appropriate error.	2026-01-22 16:11:35 +01:00
Szymon Malewski	29d090845a	vector_index: rescoring: Fetch oversampled rows So far with oversampling the extended set of keys was returned from VS, but query to the base table was still limited by the query `limit`. Now for rescoring we want to fetch rows for all the keys returned from VS. However later we need to restore the command limit, to trim result_set accordingly. For non-rescoring scenarios we trim directly keys set returned from VS if it happens to exceed query limit. With this change rescoring validation tests (except `no_nulls_in_rescored_results`) pass fully. Fixes https://scylladb.atlassian.net/browse/SCYLLADB-83	2026-01-22 15:38:44 +01:00
Szymon Malewski	57e7a4fa4f	select_statement: Modify `needs_post_query_ordering` condition Our plan for rescoring is to use the existing post-query ordering mechanism to sort (and trim) result_set by similarity column. For general SELECT case this ordering is permitted only for queries with IN on the partition key and an ORDER BY, which is checked in `needs_post_query_ordering`. Recently this check was overridden for ANN queries in https://github.com/scylladb/scylladb/pull/28109 to enable IN queries handled by VS without excessive post-processing. In this patch we revert that change - ANN case will be handled by general check. However we change the condition - we will enable post processing anytime `_ordering_comparator` is set. In current implementation `_ordering_comparator` is created only in `select_statement::prepare` with `get_ordering_comparator`, only for the same conditions as were checked in `needs_post_query_ordering`, so this change should be transparent for general SELECT. For ANN query it is also not set (yet), so it will not influence ANN filtering, but we confirm that this functionality still works by adding filtering test: `test/vector_search/filter_test.cc::vector_store_client_test_filtering_ann_cql`. Rescoring ordering for ANN queries will be enabled when we add `_ordering_comparator` in following patch.	2026-01-22 15:38:44 +01:00
Pavel Emelyanov	cb6ee05391	Merge 'Extend snapshot manifest.json with tablet-aware metadata' from Benny Halevy This series extends the json manifest file we create when taking snapshots. It adds the following metadata: - manifest version and scope - snapshot name - created_at and expires_at timestamps (#24061) - node metadata (host_id, dc, rack) - keyspace and table metadata - tablet_count (#26352) - per-sstable metadata (#26352) Fixes [SCYLLADB-189](https://scylladb.atlassian.net/browse/SCYLLADB-189) Fixes [SCYLLADB-195](https://scylladb.atlassian.net/browse/SCYLLADB-195) Fixes [SCYLLADB-196](https://scylladb.atlassian.net/browse/SCYLLADB-196) * Enhancement, no backport needed [SCYLLADB-189]: https://scylladb.atlassian.net/browse/SCYLLADB-189?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ [SCYLLADB-195]: https://scylladb.atlassian.net/browse/SCYLLADB-195?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ [SCYLLADB-196]: https://scylladb.atlassian.net/browse/SCYLLADB-196?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ Closes scylladb/scylladb#27945 * github.com:scylladb/scylladb: snapshot: keep per-sstable metadata in manifest.json snapshot: add table info and tablet_count to manifest.json snapshot: add basic support for snapshot ttl in manifest.json table: snapshot_on_all_shards: take snapshot_options db: snapshot_ctl: move skip_flush to struct snapshot_options snapshot: add snapshot name in manifest.json test: lib: cql_test_env: apply db::config::tablets_mode_for_new_keyspaces snapshot: add node info to manifest.json snapshot: add manifest info to manifest.json test: database_test: snapshot_works: add validate_manifest	2026-01-22 15:19:11 +03:00
Patryk Jędrzejczak	67045b5f17	Merge 'raft_topology, tablets: Drain tablets in parallel with other topology operations' from Tomasz Grabiec Allows other topology operations to execute while tablets are being drained on decommission. In particular, bootstrap on scale-out. This is important for elasticity. Allows multiple decommission/removenode to happen in parallel, which is important for efficiency. Flow of decommission/removenode request: 1) pending and paused, has tablet replicas on target node. Tablet scheduler will start draining tablets. 2) No tablets on target node, request is pending but not paused 3) Request is scheduled, node is in transition 4) Request is done Nodes are considered draining as soon as there is a leave or remove request on them. If there are tablet replicas present on the target node, the request is in a paused state and will not be picked by topology coordinator. The paused state is computed from topology state automatically on reload. When request is not paused, its execution starts in write_both_read_old state. The old tablet_draining state is not entered (it's deprecated now). Tablet load balancing will yield the state machine as soon as some request is no longer paused and ready to be scheduled, based on standard preemption mechanics. 
Fixes #21452 Closes scylladb/scylladb#24129 * https://github.com/scylladb/scylladb: docs: Document parallel decommission and removenode and relevant task API test: Add tests for parallel decommission/removenode test: util: Introduce ensure_group0_leader_on() test: tablets: Check that there are no migrations scheduled on draining nodes test: lib: topology_builder: Introduce add_draining_request() topology_coordinator, tablets: Fail draining operations when tablet migration fails due to critical disk utilization tablets: topology_coordinator: Refactor to propagate reason for migration rollback tablet_allocator: Skip co-location on draining nodes node_ops: task_manager_module: Populate entity field also for active requests tasks: node_ops: Put node id in the entity field tasks, node_ops: Unify setting of task_stats in get_status() and get_stats() topology: Protect against empty cancelation reason tasks, topology: Make pending node operations abortable doc: topology-over-raft.md: Fix diagram for replacing, tablet_draining is not engaged raft_topology, tablets: Drain tablets in parallel with other topology operations virtual_tables: Show draining and excluded fields in system.cluster_status and system.load_by_node locator: topology: Add "draining" flag to a node topology_coordinator: Extract generate_cancel_request_update() storage_service: Drop dependency in topology_state_machine.hh in the header locator: Extract common code in assert_rf_rack_valid_keyspace() topology_coordinator, storage_service: Validate node removal/decommission at request submission time	2026-01-22 13:06:53 +01:00
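The four-step request flow in the merge message above can be sketched as a small classifier. This is only an illustrative Python sketch: `RequestPhase`, `classify_request` and its parameters are hypothetical names, not Scylla's topology coordinator code, and it assumes the paused state depends solely on whether the target node still holds tablet replicas.

```python
from enum import Enum


class RequestPhase(Enum):
    """Hypothetical labels for the four steps listed in the merge message."""
    PAUSED = "pending and paused"
    PENDING = "pending, not paused"
    IN_TRANSITION = "scheduled, node in transition"
    DONE = "done"


def classify_request(tablet_replicas_on_node: int,
                     scheduled: bool,
                     finished: bool) -> RequestPhase:
    """Derive the phase of a decommission/removenode request.

    Assumption: a request stays paused for as long as the target node
    still holds tablet replicas, and nothing else affects pausing.
    """
    if finished:
        return RequestPhase.DONE
    if scheduled:
        return RequestPhase.IN_TRANSITION
    if tablet_replicas_on_node > 0:
        # The tablet scheduler is still draining the node.
        return RequestPhase.PAUSED
    # Drained: the topology coordinator may now pick the request up.
    return RequestPhase.PENDING
```

Because the phase is computed from current state rather than stored, it can be re-derived after a reload, which matches the message's note that the paused state is computed automatically.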
Botond Dénes	21900c55eb	tools/scylla-sstable: remove --unsafe-accept-nonempty-output-dir This flag was added to operations which have an --output-dir command-line argument. These operations write sstables and need a directory where to write them. Back in the numeric-generation world this posed a problem: if the directory contained any sstable, generation clash was almost guaranteed, because each scylla-sstable command invocation would start output generations from 1. To avoid this, empty output directory was a requirement, with the --unsafe-accept-nonempty-output-dir allowing for a force-override. Now in the timeuuid generation days, all this is not necessary anymore: generations are unique, so it is not a problem if the output directory already contains sstables: the probability of generation clash is almost 0. Even if it happens, the tool will just simply fail to write the new sstable with the clashing generation. Remove this historic relic of a flag and the related logic, it is just a pointless nuisance nowadays.	2026-01-22 13:55:59 +02:00
Piotr Smaron	d4c28690e1	db: fail reads and writes with local consistency level to a DC with RF=0 When read or write operations are performed on a DC with RF=0 with LOCAL_QUORUM or LOCAL_ONE consistency level, Cassandra throws `Unavailable` exception. Scylla allowed such read operations and failed write operations with a cryptic "broken promise" error. This occurred because the initial availability check passed (quorum of 0 requires 0 replicas), but execution failed later when no replicas existed to process the mutation. This patch adds an explicit RF=0 validation for LOCAL_ONE and LOCAL_QUORUM that throws before attempting operation execution. The change also requires `test_query_dc_with_rf_0_does_not_crash_db` to be upgraded. This testcase was asserting somewhat similar scenario, but wasn't taking into account the whole matrix of combinations: - scenarios: successful vs unsuccessful operation outcome - local consistency levels: LOCAL_QUORUM & LOCAL_ONE - operations: SELECT (read) & INSERT (write) and so it's been extended to cover both the pre-existing and the current issues and the whole matrix of combinations. Fixes: scylladb/scylladb#27893	2026-01-22 12:49:45 +01:00
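A minimal Python sketch of why the old availability check passed vacuously, and what an explicit RF=0 guard looks like. All names here (`Unavailable`, `quorum_for`, `required_replicas`) are hypothetical stand-ins, not Scylla's C++ code. The only thing taken from the commit message is that a quorum over RF=0 evaluates to 0 replicas.

```python
LOCAL_LEVELS = {"LOCAL_ONE", "LOCAL_QUORUM"}


class Unavailable(Exception):
    """Stand-in for the CQL Unavailable error (hypothetical class)."""


def quorum_for(rf: int) -> int:
    # Per the commit message, a quorum over RF=0 requires 0 replicas,
    # so a check of the form `live >= quorum_for(rf)` passes with 0 live.
    return rf // 2 + 1 if rf > 0 else 0


def required_replicas(consistency_level: str, local_rf: int) -> int:
    """Replica count needed in the local DC, rejecting RF=0 up front."""
    if consistency_level not in LOCAL_LEVELS:
        raise ValueError(f"not a local consistency level: {consistency_level}")
    if local_rf == 0:
        # Explicit validation: fail before any execution is attempted.
        raise Unavailable(f"{consistency_level} on a DC with RF=0")
    if consistency_level == "LOCAL_ONE":
        return 1
    return quorum_for(local_rf)
```

With the guard in place, a LOCAL_QUORUM request against a DC with RF=0 fails early with a descriptive error instead of reaching execution with no replicas.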
Marcin Maliszkiewicz	32543625fc	test: perf: reuse stream id When one request is super slow and req/s high in theory we have a collision on id, this patch avoids that by reusing id and aborting when there is no free one (unlikely).	2026-01-22 12:26:50 +01:00
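The id-reuse idea in the commit above can be sketched as a free-list allocator. This is a hedged Python illustration, not the tool's implementation: `StreamIdPool` is a hypothetical name, the default of 32768 ids assumes the non-negative range of a signed 16-bit CQL stream id, and the sketch raises an exception where the commit message says the tool aborts.

```python
class StreamIdPool:
    """Hands out stream ids and takes them back once a response arrives.

    An id is never given to a second in-flight request, so a very slow
    request cannot collide with a later one, however high the request rate.
    """

    def __init__(self, size: int = 32768):
        # Reverse order so that pop() returns the lowest free id first.
        self._free = list(range(size - 1, -1, -1))

    def acquire(self) -> int:
        if not self._free:
            # Every id is in flight; the commit calls this case unlikely.
            raise RuntimeError("no free stream id")
        return self._free.pop()

    def release(self, stream_id: int) -> None:
        # Called when the response for stream_id has been received.
        self._free.append(stream_id)
```

Compared with a wrapping counter, the free list trades a small amount of memory for the guarantee that ids in flight are unique.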
Marcin Maliszkiewicz	7bf7ff785a	main: test: add future and abort_source to after_init_func This commit avoids leaking seastar::async future from two benchmark tools: perf-alternator and perf-cql-raw. Additionally it adds abort_source for fast and clean shutdown.	2026-01-22 12:26:50 +01:00
Marcin Maliszkiewicz	0d20300313	test: perf: add option to stress multiple tables in perf-cql-raw	2026-01-22 12:26:50 +01:00
Marcin Maliszkiewicz	a033b70704	test: perf: add perf-cql-raw benchmarking tool The tool supports: - auth or no auth modes - simple read and write workloads - connection pool or connection per request modes - in-process or remote modes, remote may be useful to assess tool's overhead or use it as bigger scale benchmark - uses prepared statements by default - connection only mode, for testing storms It could support in the future: - TLS mode - different workloads - shard awareness Example usage: > build/release/scylla perf-cql-raw --workdir /tmp/scylla-data --smp 2 --cpus 0,1 \ --developer-mode 1 --workload read --duration 5 2> /dev/null Running test with config: {workload=read, partitions=10000, concurrency=100, duration=5, ops_per_shard=0} Pre-populated 10000 partitions 97438.42 tps (269.2 allocs/op, 1.1 logallocs/op, 35.2 tasks/op, 113325 insns/op, 80572 cycles/op, 0 errors) 102460.77 tps (261.1 allocs/op, 0.0 logallocs/op, 31.7 tasks/op, 108222 insns/op, 75447 cycles/op, 0 errors) 95707.93 tps (261.0 allocs/op, 0.0 logallocs/op, 31.7 tasks/op, 108443 insns/op, 75320 cycles/op, 0 errors) 102487.87 tps (261.0 allocs/op, 0.0 logallocs/op, 31.7 tasks/op, 107956 insns/op, 74320 cycles/op, 0 errors) 100409.60 tps (261.0 allocs/op, 0.0 logallocs/op, 31.7 tasks/op, 108337 insns/op, 75262 cycles/op, 0 errors) throughput: mean= 99700.92 standard-deviation=3039.28 median= 100409.60 median-absolute-deviation=2759.85 maximum=102487.87 minimum=95707.93 instructions_per_op: mean= 109256.53 standard-deviation=2281.39 median= 108337.37 median-absolute-deviation=1034.83 maximum=113324.69 minimum=107955.97 cpu_cycles_per_op: mean= 76184.36 standard-deviation=2493.46 median= 75320.20 median-absolute-deviation=922.09 maximum=80572.19 minimum=74320.00	2026-01-22 12:26:50 +01:00
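The `throughput:` summary line in the example output above can be reproduced from the five printed tps samples. The Python sketch below is a reconstruction, not the tool's code: `summarize` is a hypothetical helper, and two choices are inferred from the printed numbers rather than from the source, namely that the standard deviation is the sample (n-1) form and that the median-absolute-deviation is taken around the mean rather than the median.

```python
import statistics

# The five per-interval throughput samples printed in the example run.
tps = [97438.42, 102460.77, 95707.93, 102487.87, 100409.60]


def summarize(samples):
    """Compute the summary fields shown on the `throughput:` line."""
    mean = statistics.fmean(samples)
    return {
        "mean": mean,
        # Sample standard deviation (divides by n - 1); inferred choice.
        "standard-deviation": statistics.stdev(samples),
        "median": statistics.median(samples),
        # Median of |x - mean|; deviations around the mean are an
        # inference that happens to match the printed value.
        "median-absolute-deviation": statistics.median(
            abs(x - mean) for x in samples
        ),
        "maximum": max(samples),
        "minimum": min(samples),
    }


for name, value in summarize(tps).items():
    print(f"{name}={value:.2f}")
```

Taking deviations around the median instead would give a noticeably smaller figure for these samples, which is why the around-the-mean reading is assumed here.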
Patryk Jędrzejczak	e11450ccca	test: test_raft_recovery_user_data: prevent repeated ALTER KEYSPACE request The test is currently flaky. With `remove_dead_nodes_with == "remove"`, it sends several ALTER KEYSPACE requests. The request performed just after adding 3 new nodes can unexpectedly be sent twice to two different nodes by the driver. The second receiver rejects the request through the new guardrail added in `2e7ba1f8ce`, and the test fails. This has been acknowledged as a bug in the Python driver. It shouldn't retry non-idempotent requests with the default retry policy. There could be one more bug in the driver, as it looks like the driver decides to resend the request after it disconnects from the first receiver. The first receiver has just bootstrapped, so the driver shouldn't disconnect. We deflake the test by reconnecting the driver before performing the problematic ALTER KEYSPACE request. The change has been tested in byo, as the failure reproduces only in CI. Without the change, the test fails once in ~250 runs in dev mode. With the change, more than 1000 runs passed. Fixes #27862 No backport needed as `2e7ba1f8ce` is only in master. Closes scylladb/scylladb#28290	2026-01-22 14:13:42 +03:00
Botond Dénes	7d2e6c0170	Merge 'config: add enforce_rack_list option' from Aleksandra Martyniuk Add enforce_rack_list option. When the option is set to true, all tablet keyspaces have rack list replication factor. When the option is on: - CREATE STATEMENT always auto-extends rf to rack lists; - ALTER STATEMENT fails when there is numeric rf in any DC. The flag is set to false by default and a node needs to be restarted in order to change its value. Starting a node with enforce_rack_list option will fail, if there are any tablet keyspaces with numeric rf in any DC. enforce_rack_list is a per-node option and a user needs to ensure that no tablet keyspace is altered or created while nodes in the cluster don't have the consistent value. Mark rf_rack_valid_keyspaces as deprecated. Fixes: https://github.com/scylladb/scylladb/issues/26399. New feature; no backport needed Closes scylladb/scylladb#28084 * github.com:scylladb/scylladb: test: add test for enforce_rack_list option db: mark rf_rack_valid_keyspaces as deprecated config: add enforce_rack_list option Revert "alternator: require rf_rack_valid_keyspaces when creating index"	2026-01-22 10:27:35 +02:00
Benny Halevy	d6557764b9	snapshot: keep per-sstable metadata in manifest.json Adds a "sstables" array member to manifest.json. For each sstable, keep the following metadata: id - a uuid for the sstable (the sstable identifier if the use-sstable-identifier option was used, otherwise the sstable uuid generation) toc_name - the name of the TOC.txt file data_size and index_size - in bytes first_token and last_token - of the sstable first and last keys. Fixes: SCYLLADB-196 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-22 09:42:52 +02:00
Benny Halevy	dc9093303d	snapshot: add table info and tablet_count to manifest.json Add a table member to manifest.json with the keyspace_name, table_name, table_id, tablets_type, and, for tablets-enabled tables, get tablet_count on each shard and write the minimum to manifest.json. For vnodes-based tables, tablet_count=0. For now, `tablets_type` may be either `none` for vnodes tables, or `powof2` for tablets tables. In the future, when we support arbitrary tablet boundaries, this will be reflected here, and it is likely we would back up the whole tablets map separately to get all tablet boundaries. Fixes SCYLLADB-195 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-22 09:36:52 +02:00
Benny Halevy	91df129e21	snapshot: add basic support for snapshot ttl in manifest.json Store the snapshot `created_at` time and an optional `expires_at` time. Fixes SCYLLADB-189 Signed-off-by: Benny Halevy <bhalevy@scylladb.com>	2026-01-22 09:12:56 +02:00

10663 Commits