Most of the functionality is tested in cqlpy tests located in
`test_guardrail_write_consistency_level.py`. Add two tests
that require the cluster framework:
- `test_invalid_write_cl_guardrail_config` checks the node startup
path when incorrect `write_consistency_levels_warned` and
`write_consistency_levels_disallowed` values are used.
- `test_write_cl_default` checks the behavior of the default
configuration using a multi-node cluster.
Tests execution time:
- Dev: 10s
- Debug: 18s
Refs: SCYLLADB-259
Implement basic tests for write consistency level guardrails,
verifying that they work for each type of write request (inserts,
updates, deletes, logged batches, unlogged batches, conditional batches,
and counter operations).
All tests are marked as Scylla-only because they currently don't
pass with Cassandra due to differences in handling superusers (see:
SCYLLADB-882).
Tests execution time:
- Dev: 3s
- Debug: 14s
Refs: SCYLLADB-259
Refs: SCYLLADB-882
Enable verification of write consistency level guardrails in
`modification_statement` and `batch_statement`.
Neither guardrail is enabled by default, so as not to disrupt clusters
that are currently using any of the CLs for writes. The warning
guardrail may seem harmless, as it only adds a warning to the CQL
response; however, enabling it can significantly increase network
traffic (as a warning message is added to each response) and also
decrease throughput due to additional allocations required to prepare
the warning. Therefore, both guardrails should be enabled with care.
The newly added `writes_per_consistency_level` metric, which is
incremented unconditionally, can help decide whether a guardrail can
be safely enabled in an existing cluster.
This commit adds additional `if` instructions on the critical path.
However, based on the `perf_simple_query` benchmark for writes,
the difference is marginal (~40 additional instructions, which is
a relative difference smaller than 0.001).
BEFORE:
```
291443.35 tps ( 53.3 allocs/op, 16.0 logallocs/op, 14.2 tasks/op, 48067 insns/op, 18885 cycles/op, 0 errors)
throughput:
mean= 289743.07 standard-deviation=6075.60
median= 291424.69 median-absolute-deviation=1702.56
maximum=292498.27 minimum=261920.06
instructions_per_op:
mean= 48072.30 standard-deviation=21.15
median= 48074.49 median-absolute-deviation=12.07
maximum=48119.87 minimum=48019.89
cpu_cycles_per_op:
mean= 18884.09 standard-deviation=56.43
median= 18877.33 median-absolute-deviation=14.71
maximum=19155.48 minimum=18821.57
```
AFTER:
```
290108.83 tps ( 53.3 allocs/op, 16.0 logallocs/op, 14.2 tasks/op, 48121 insns/op, 18988 cycles/op, 0 errors)
throughput:
mean= 289105.08 standard-deviation=3626.58
median= 290018.90 median-absolute-deviation=1072.25
maximum=291110.44 minimum=274669.98
instructions_per_op:
mean= 48117.57 standard-deviation=18.58
median= 48114.51 median-absolute-deviation=12.08
maximum=48162.18 minimum=48087.18
cpu_cycles_per_op:
mean= 18953.43 standard-deviation=28.76
median= 18945.82 median-absolute-deviation=20.84
maximum=19023.93 minimum=18916.46
```
Fixes: SCYLLADB-259
Add `write_consistency_levels_disallowed_violations` and
`write_consistency_levels_warned_violations` metrics to track
violations of write_consistency_levels guardrails.
Add `writes_per_consistency_level` to track what CL is used by
writes, regardless of the guardrails configuration.
Data gathered by this metric can be used to decide whether enabling
a particular write consistency level guardrail in a particular
existing cluster is safe.
Refs: SCYLLADB-259
Add enum_sets to query_processor that track the configuration
values of `write_consistency_levels_warned` and
`write_consistency_levels_disallowed`.
Refs: SCYLLADB-259
The current way of checking the boost's stdout can have a race
condition when pytest will try to read the file before it was really
flushed. So this PR should eliminate this possibility.
Closesscylladb/scylladb#28783
PR #28703 was merged into master but not with the latest version of the
changes. This patch is an incremental fix for this.
Currently, the elements of the tablet_sizes_per_shard vector are
incremented in separate shards. This is prone to false sharing of cache
lines, and ping-pong of memory, which leads to reduced performance.
In this patch, in order to avoid cache line collisions while updating
the sum of tablet sizes per shard, we align the counter to 64 bytes.
Fixes: SCYLLADB-678
Closesscylladb/scylladb#28757
This pr adds await for each one of the tasks to wait for the MV schema to be added successfully
and then to start the server shutdown
With this change we dont need will not get the shutdown races.
Closesscylladb/scylladb#28774
This commit removes the information that Alternator doesn't support tablets.
The limitation is no longer valid.
Fixes SCYLLADB-778
Closesscylladb/scylladb#28781
`test_autoretrain_dict` sporadically fails because the default
compression algorithm was changed after the test was written.
`9ffa62a986815709d0a09c705d2d0caf64776249` was an attempt to fix it by
changing the compression configuration during node startup. However,
the configuration change had an incorrect YAML format and was
ignored by ScyllaDB. This commit fixes it.
Fixes: scylladb/scylladb#28204Closesscylladb/scylladb#28746
For a while, we have seen coroutine related tests (those that use the
coroutine_task fixture) fail occasionally, because no coroutine frame is
found. Multiple attempts were made to make this problem self-diagnosing
and dump enough information to be able to debug this post-mortem. To no
avail so far. A lot of time was invested into this this benign issue:
See the long discussion at https://github.com/scylladb/scylladb/issues/22501.
It is not known if the bug is in gdb, or the gdb script trying to find
the coroutine frame. In any case, both are only used for debugging, so
we can tolerate occasional failures -- we are forced to do so when
working with gdb anyway.
Instead of piling on more effor there, just skip these tests when the
problem occurs. This solves the CI flakyness.
Fixes: #22501Closesscylladb/scylladb#28745
Add --continue-after-error true to perf-cql-raw and perf-alternator
tests, and --stop-on-error false to perf-simple-query test, so that
tests don't abort on the first error.
Reason for this is that tests are flaky with example failure:
Perf test failed: std::runtime_error (server returned ERROR to EXECUTE)
When CPU is starved on CI we can return timeouts and/or other errors.
The change should make tests more robust on the expense of smaller test
scope. But those tests were written mostly to test startup sequence
as it differs from Scylla's starup.
Fixes https://scylladb.atlassian.net/browse/SCYLLADB-759Closesscylladb/scylladb#28767
Lua doesn't have separate integer and floating point numbers,
so we check if a number can fit in an integer and if so convert
it to an integer.
The conversion routine invokes undefined behavior (and even
acknowledges it!). More recent compilers changed their behavior
when casting infinities, breaking test_user_function_double_return
which tests this conversion.
Fix by tightening the conversion to not invoke undefined behavior.
Closesscylladb/scylladb#28503
This patchset:
- ensures the loading semaphore is acquired in cross-shard callbacks
- fixes iterator invalidation problem when reloading all cached permissions
Fixes https://scylladb.atlassian.net/browse/SCYLLADB-780
Backport: no, affected code not released yet
Closesscylladb/scylladb#28766
* github.com:scylladb/scylladb:
auth: cache: fix permissions iterator invalidation in reload_all_permissions
auth/cache: acquire _loading_sem in cross-shard callbacks
The hostent::addr_list is deprecated in favor of address_entry::addr
field that contains the very same addresses.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#28565
- add an overload to the rest http client to accept retry strategy instance as an argument
- remove hand rolled error handling from object storage client and replace with common machinery that supports handling and retrying when appropriate
No backport neede since it is only refactoring
Closesscylladb/scylladb#28161
* github.com:scylladb/scylladb:
object_storage: add retryable machinery to object storage
rest_client: add `simple_send` overload
Move the storage test suite from test/storage/ to test/cluster/storage/
to consolidate related cluster-based tests.This removes the standalone
test/storage/suite.yaml as the tests will use the cluster's test configuration.
Initially these tests were in cluster, but to use unshare at first
iteration they were moved outside. Now they are using another way to
handle volumes without unshare, they should be in cluster
Closesscylladb/scylladb#28634
The ANN vector queries with all-zero vectors are allowed even on vector indexes with similarity function set to cosine.
When enabling the rescoring option, those queries would fail as the rescoring calls `similarity_cosine` function underneath, causing an `InvalidRequest` exception as all-zero vectors were not allowed matching Cassandra's behaviour.
To eliminate the discrepancy we want the all-zero vector `similarity_cosine` calls to pass, but return the NaN as the cosine similarity for zero vectors is mathematically incorrect. We decided not to use arbitrary values contrary to USearch, for which the distance (not to be confused with similarity) is defined as cos(0, 0) = 0, cos(0, x) = 1 while supporting the range of values [0, 2].
If we wanted to convert that to similarity, that would mean sim_cos(0, x) = 0.5, which does not support mathematical reasoning why that would be more similar than for example vectors marking obtuse angles.
It's safe to assume that all-zero vectors for cosine similarity shouldn't make any impact, therefore we return NaN and eliminate them from best results.
Adjusted the tests accordingly to check both proper Cassandra and Scylla's behaviour.
Fixes: SCYLLADB-456
Backport to 2026.1 needed, as it fixes the bug for ANN vector queries using rescoring introduced there.
Closesscylladb/scylladb#28609
* github.com:scylladb/scylladb:
test/vector_search: add reproducer for rescoring with zero vectors
vector_search: return NaN for similarity_cosine with all-zero vectors
This patch series moves `test/cluster/dtest/guardrails_test.py`
to `test/cluster/test_guardrails.py`, and migrates it from `cluster/dtest/`
to `cluster/` framework.
There are two motivations for moving the test:
- Execution time reduction (from 12s to 9s in 'dev' in my env)
- Facilitate adding new tests to the `guardrails_test.py` file
No backport, `dtest/guardrails_test.py` is only on master
Closesscylladb/scylladb#28737
* github.com:scylladb/scylladb:
test: move dtest/guardrails_test.py to test_guardrails.py
test: prepare guardrails_test.py to be moved to test/cluster/
The inner loops in reload_all_permissions iterate role's permissions
and _anonymous_permissions maps across yield points. Concurrent
load_permissions calls (which don't hold _loading_sem) can emplace
into those same maps during a yield, potentially triggering a rehash
that invalidates the active iterator.
We want to avoid adding semaphore acquire in load_permissions
because it's on a common path (get_permissions).
Fixing by snapshotting the keys into a vector before iterating with
yields, so no long-lived map iterator is held across suspension
points.
This series hardens MV shutdown behavior by fixing lifecycle tracking for detached view-builder callbacks and aligning update handling with the same async dispatch style used by create/drop.
Patch 1 refactors on_update_view to use a dedicated coroutine dispatcher (dispatch_update_view), keeping update logic serialized under the existing view-builder lock and consistent with the callback architecture already used for create/drop paths.
Patch 2 adds explicit callback lifetime coordination in view_builder:
- introduce a seastar::gate member
- acquire _ops_gate.hold() when launching detached create/update/drop dispatch futures
- keep the hold alive until each detached future resolves
- close the gate during view_builder::drain() so shutdown waits for in-flight callback work before final teardown
Together, these changes reduce shutdown race exposure in MV event handling while preserving existing behavior for normal operation.
Testing:
- pytest --test-py-init test/cluster/mv (47 passed, 7 skipped)
backport: not required started happening in master
fixes: SCYLLADB-687
Closesscylladb/scylladb#28648
* github.com:scylladb/scylladb:
db/view: gate detached view-builder callbacks during shutdown
db:view: refactor on_update_view to use coroutine dispatcher
distribute_role() modifies _roles on non-zero shards via
invoke_on_others() without holding _loading_sem. Similarly, load_all()'s
invoke_on_others() callback calls prune_all() without the semaphore.
When these run concurrently with reload_all_permissions(), which
iterates _roles across yield points, an insertion can trigger
absl::flat_hash_map::resize(), freeing the backing storage while
an iterator still references it.
Fix by acquiring _loading_sem on the target shard in both
distribute_role()'s and load_all()'s invoke_on_others callbacks,
serializing all _roles mutations with coroutines that iterate
the map.
remove hand rolled error handling from object storage client
and replace with common machinery that supports exception
handling and retrying when appropriate
The test uses create_ks_and_cf helper duplicating the existing code that does the same. This PR patches basic tests to use standard facilities. Also it prepares the ground for testing keyspace storage options with rf=3
Cleaning tests, not backporting
Closesscylladb/scylladb#28600
* https://github.com/scylladb/scylladb:
test/object_store: Remove create_ks_and_cf() helper
test/object_store: Replace create_ks_and_cf() usage with standard methods
test/object_store: Shift indentation right for test cases
Currently, test_secondary_index.py::test_indexing_paging_and_aggregation
is very slow, and the slowest test in the test/cqlpy framework: It takes
around 13 seconds on dev build, and because it is CPU-bound (doesn't sleep),
it is much slower on debug builds. The reason for this slowness is that it
needs to set up and read over 10,000 rows which is the default
select_internal_page_size.
But after the patches in pull request (#25368), we can configure
select_internal_page_size, so in this patch we change the test to
temporarily reduce this option to just 50, and then the test can reach
the same code paths with just 142 rows instead of 20120 rows before this
patch.
As a result, the test should now be 140 times faster than it was before.
In practice, because of some fixed overheads (the test creates several
tables and indexes), in dev build mode the test run speedup is "only"
26-fold (to around half a second).
I verified that removing the code added in bb08af7 indeed makes the new
shorter test fail - and this is the only test in test_secondary_index.py
that starts to fail besides test_index_paging_group_by which is also
related (so my revert didn't just break secondary indexing completely).
So the shorter test is still a good regression test.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#28268
The future toolchain did not build the sanitizers, so debug
executables did not link. Fix by not disabling the sanitizers.
Closesscylladb/scylladb#28733
The test_restore_with_streaming_scopes among other things checks how data streams flow while restoring. Whether or not to check the streams is decided based on the min tablet count value, which is compared with a hardcoded 512. This value of 512 matched the tablet count used by this test until it was "optimized" by #27839, where this number changed to 5 and streaming checks became off.
Good news is that the very same checks are still performed by test_refresh_with_streaming_scopes. But it's better to have a working restoration test anyway.
Minor test fix, not backporting
Closesscylladb/scylladb#28607
* github.com:scylladb/scylladb:
test: Fix the condition for streaming directions validation
test: Split test_backup.py::check_data_is_back() into two
Currently, the test assumes that when
'topology_coordinator_pause_before_processing_backlog: waiting' is
logged, the task for decommission must be there. This was based on the
assumption that topology coordinator is idle and decommission request
wakes it up. But if the server is slow enough, it may still be running
the load balancer in reaction to table creation, and block on that
injection point before decommission request was added.
Fix by waiting for the task to appear rather than the injection.
Fixes SCYLLADB-715
Only 2026.1 vulnerable.
Closesscylladb/scylladb#28688
* github.com:scylladb/scylladb:
test_tablets_parallel_decommission: Fix flakiness due to delayed task appearance
test: cluster: task_manager_client: Introduce wait_task_appears()
tests: pylib: util: Add exponential backoff to wait_for
There's a bunch of incremental repair tests that want to call scylla
sstable command. For that they try to find where scylla binary by
scanning /proc directory (see local_process_id and get_scylla_path
helpers).
There's shorter way -- just call manager.get_server_exe().
Same for backup-restore test.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#28676
There are three tests and a function with a pair of boolean parameters
called by those. It's less code if the function becomes a test with
parameters.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#28677
The test_backup_simple creates a ks/cf, takes a snapshot, backs it up,
then checks that the files were uploaded. The test_backup_move does the
same, but also plays with 'move_files' parameter to be true/false.
In fact, the "move" test was the copy of "simple" one that dropepd check
for scheduling group being "streaming" (backup with --move-files can
check the same, it's not bad), and check for destination bucket to
contain needed files (same here -- checking that files arrived to bucket
after --move-files is good).
In the end of the day, after the change backup test is run two times,
instead of three, and performs extra checks for --move-files case.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#28606
https://github.com/scylladb/scylladb/pull/25746 added a new column to `system.clients`: `client_options frozen<map<text, text>>`. This column stores all options sent by the client in the `STARTUP` message.
This PR also added `CLIENT_OPTIONS` to the list of values sent in `SUPPORTED` message, and documented that drivers can send their configuration (as JSON) in `STARTUP` under this key.
Documentation for the new column was not added to the description of `system.clients` table, and documentation about the new `STARTUP` key was added in `protocol-extensions.md`, but in the section about shard awareness extension.
This PR adds missing `system.clients` column description, moves the documentation of `CLIENT_OPTIONS` into its own section, and expands it a bit.
Backport: none, because this fixes internal documentation.
Closesscylladb/scylladb#28126
* github.com:scylladb/scylladb:
protocol-extensions.md: Fix client_options docs
system_keyspace.md: Add client_options column
system_keyspace.md: Fix order in system.clients
Doing it with format("{}", foo) is correct, but to_string is
a bit more lightweight.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#28630
The `try-catch` expression is pretty much useless in its current form. If we return the future, the awaiting will only be performed by the caller, completely circumventing the exception handling.
As a result, instead of handling `raft::request_aborted` with a proper error message, the user will face `seastar::abort_requested_exception` whose message is cryptic at best. It doesn't even point to the root of the problem.
Fixes SCYLLADB-665
Backport: This is a small improvement and may help when debugging, so let's backport it to all supported versions.
Closesscylladb/scylladb#28624
* https://github.com/scylladb/scylladb:
test: raft: Add test_aborting_wait_for_state_change
raft: Describe exception types for wait_for_state_change and wait_for_leader
raft: Await instead of returning future in wait_for_state_change
This commit moves `guardrails_test.py`, prepared in the previous
commit of this patch series, to `test/cluster/test_guardrails.py`.
It also cleans up `suite.yaml`.
Disable `test/cluster/dtest/guardrails_test.py` in `suite.yaml` and
make it compatible with the `test/cluster/` framework. This will
allow moving this file from `test/cluster/dtest/` to `test/cluster/`
in the next commit of this patch series.
There are two motivations for moving the test:
- Execution time reduction (from 12s to 9s in 'dev' in my env)
- Facilitate adding new tests to the `guardrails_test.py` file
There are 3 metrics (that goes in every compaction_history entry):
total_tombstone_purge_attempt
total_tombstone_purge_failure_due_to_overlapping_with_memtable
total_tombstone_purge_failure_due_to_overlapping_with_uncompacting_sstable
When a tombstone is not expired (e.g. doesn't satisfy "gc_before" or
grace period), it can be currently accounted as failure due to
overlapping with either memtable or uncompacting sstable.
So those 2 last metrics have noise of *unexpired* tombstones.
What we should do is to only account for expired tombstones in all
those 3 metrics. We lose the info of knowing the amount of tombstones
processed by compaction, now we'll only know about the expired ones.
But those metrics were primarily added for explaining why expired
tombstones cannot be removed.
We could have alternatively added a new field
purge_failure_due_to_being_unexpired or something, but
it requires adding a new field to compaction_history.
Fixes https://scylladb.atlassian.net/browse/SCYLLADB-737.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closesscylladb/scylladb#28669
Links were pointing to the `debian` subdirectory. However, there docker build was refactored to use `redhat`: 1abf981a73, see https://github.com/scylladb/scylladb/pull/22910
No backport, just a README link fixes.
Closesscylladb/scylladb#28699
* github.com:scylladb/scylladb:
docs: fix path to the build_docker.sh which was moved from debian to redhat subdirectory
docs: fix link to docker build README.MD
The patchset fixes abort_source implementation for perf-alternator and perf-cql-raw. It moves
run_standalone function to common code in perf.hh with necessary templating.
We also add extensive testing so that it's more difficult to break the tooling in the future.
Fixes SCYLLADB-560
Backport: no, internal tooling improvement
Closesscylladb/scylladb#28541
* github.com:scylladb/scylladb:
test: cluster: add tests for perf tools
test: perf: fix port race condition on startup in connect workload
test: perf: prepare benchmarks to bind to custom host
test: perf: make perf-alterantor remote port configurable
test: perf: fix ASAN leak warnings in perf-alternator
Reapply "main: test: add future and abort_source to after_init_func"
Some assertions in the Raft-based topology are likely to cause crashes of
multiple nodes due to the consistent nature of the Raft-based code. If the
failing assertion is executed in the code run by each follower (e.g., the code
reloading the in-memory topology state machine), then all nodes can crash. If
the failing assertion is executed only by the leader (e.g., the topology
coordinator fiber), then multiple consecutive group0 leaders will chain-crash
until there is no group0 majority.
Crashing multiple nodes is much more severe than necessary. It's enough to
prevent the topology state machine from making more progress. This will
naturally happen after throwing a runtime error. The problematic fiber will be
killed or will keep failing in a loop. Note that it should be safe to block
the topology state machine, but not the whole group0, as the topology state
machine is mostly isolated from the rest of group0.
We replace some occurrences of `on_fatal_internal_error` and `SCYLLA_ASSERT`
with `on_internal_error`. These are not all occurrences, as some fatal
assertions make sense, for example, in the bootstrap procedure.
We also raise an internal error to prevent a segmentation fault in a few places.
Fixes#27987
Backporting this PR is not required, but we can consider it at least for 2026.1
because:
- it is LTS,
- the changes are low-risk,
- there shouldn't be many conflicts.
Closesscylladb/scylladb#28558
* github.com:scylladb/scylladb:
raft topology: prevent accessing nullptr returned by topology::find
raft topology: make some assertions non-crashing