The test_backup_simple creates a ks/cf, takes a snapshot, backs it up,
then checks that the files were uploaded. The test_backup_move does the
same, but also plays with 'move_files' parameter to be true/false.
In fact, the "move" test was the copy of "simple" one that dropepd check
for scheduling group being "streaming" (backup with --move-files can
check the same, it's not bad), and check for destination bucket to
contain needed files (same here -- checking that files arrived to bucket
after --move-files is good).
In the end of the day, after the change backup test is run two times,
instead of three, and performs extra checks for --move-files case.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#28606
https://github.com/scylladb/scylladb/pull/25746 added a new column to `system.clients`: `client_options frozen<map<text, text>>`. This column stores all options sent by the client in the `STARTUP` message.
This PR also added `CLIENT_OPTIONS` to the list of values sent in `SUPPORTED` message, and documented that drivers can send their configuration (as JSON) in `STARTUP` under this key.
Documentation for the new column was not added to the description of `system.clients` table, and documentation about the new `STARTUP` key was added in `protocol-extensions.md`, but in the section about shard awareness extension.
This PR adds missing `system.clients` column description, moves the documentation of `CLIENT_OPTIONS` into its own section, and expands it a bit.
Backport: none, because this fixes internal documentation.
Closesscylladb/scylladb#28126
* github.com:scylladb/scylladb:
protocol-extensions.md: Fix client_options docs
system_keyspace.md: Add client_options column
system_keyspace.md: Fix order in system.clients
Doing it with format("{}", foo) is correct, but to_string is
a bit more lightweight.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#28630
The `try-catch` expression is pretty much useless in its current form. If we return the future, the awaiting will only be performed by the caller, completely circumventing the exception handling.
As a result, instead of handling `raft::request_aborted` with a proper error message, the user will face `seastar::abort_requested_exception` whose message is cryptic at best. It doesn't even point to the root of the problem.
Fixes SCYLLADB-665
Backport: This is a small improvement and may help when debugging, so let's backport it to all supported versions.
Closesscylladb/scylladb#28624
* https://github.com/scylladb/scylladb:
test: raft: Add test_aborting_wait_for_state_change
raft: Describe exception types for wait_for_state_change and wait_for_leader
raft: Await instead of returning future in wait_for_state_change
There are 3 metrics (that goes in every compaction_history entry):
total_tombstone_purge_attempt
total_tombstone_purge_failure_due_to_overlapping_with_memtable
total_tombstone_purge_failure_due_to_overlapping_with_uncompacting_sstable
When a tombstone is not expired (e.g. doesn't satisfy "gc_before" or
grace period), it can be currently accounted as failure due to
overlapping with either memtable or uncompacting sstable.
So those 2 last metrics have noise of *unexpired* tombstones.
What we should do is to only account for expired tombstones in all
those 3 metrics. We lose the info of knowing the amount of tombstones
processed by compaction, now we'll only know about the expired ones.
But those metrics were primarily added for explaining why expired
tombstones cannot be removed.
We could have alternatively added a new field
purge_failure_due_to_being_unexpired or something, but
it requires adding a new field to compaction_history.
Fixes https://scylladb.atlassian.net/browse/SCYLLADB-737.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closesscylladb/scylladb#28669
Links were pointing to the `debian` subdirectory. However, there docker build was refactored to use `redhat`: 1abf981a73, see https://github.com/scylladb/scylladb/pull/22910
No backport, just a README link fixes.
Closesscylladb/scylladb#28699
* github.com:scylladb/scylladb:
docs: fix path to the build_docker.sh which was moved from debian to redhat subdirectory
docs: fix link to docker build README.MD
The patchset fixes abort_source implementation for perf-alternator and perf-cql-raw. It moves
run_standalone function to common code in perf.hh with necessary templating.
We also add extensive testing so that it's more difficult to break the tooling in the future.
Fixes SCYLLADB-560
Backport: no, internal tooling improvement
Closesscylladb/scylladb#28541
* github.com:scylladb/scylladb:
test: cluster: add tests for perf tools
test: perf: fix port race condition on startup in connect workload
test: perf: prepare benchmarks to bind to custom host
test: perf: make perf-alterantor remote port configurable
test: perf: fix ASAN leak warnings in perf-alternator
Reapply "main: test: add future and abort_source to after_init_func"
Some assertions in the Raft-based topology are likely to cause crashes of
multiple nodes due to the consistent nature of the Raft-based code. If the
failing assertion is executed in the code run by each follower (e.g., the code
reloading the in-memory topology state machine), then all nodes can crash. If
the failing assertion is executed only by the leader (e.g., the topology
coordinator fiber), then multiple consecutive group0 leaders will chain-crash
until there is no group0 majority.
Crashing multiple nodes is much more severe than necessary. It's enough to
prevent the topology state machine from making more progress. This will
naturally happen after throwing a runtime error. The problematic fiber will be
killed or will keep failing in a loop. Note that it should be safe to block
the topology state machine, but not the whole group0, as the topology state
machine is mostly isolated from the rest of group0.
We replace some occurrences of `on_fatal_internal_error` and `SCYLLA_ASSERT`
with `on_internal_error`. These are not all occurrences, as some fatal
assertions make sense, for example, in the bootstrap procedure.
We also raise an internal error to prevent a segmentation fault in a few places.
Fixes#27987
Backporting this PR is not required, but we can consider it at least for 2026.1
because:
- it is LTS,
- the changes are low-risk,
- there shouldn't be many conflicts.
Closesscylladb/scylladb#28558
* github.com:scylladb/scylladb:
raft topology: prevent accessing nullptr returned by topology::find
raft topology: make some assertions non-crashing
In https://github.com/scylladb/scylladb/pull/27262 table audit has been
re-enabled by default in `scylla.yaml`, logging certain categories to a table,
which should make new Scylla deployments have audit enabled.
Now, in the next release, we also want to enable audit in `db/config.cc`,
which should enable audit for all deployments, which don't explicitly configure
audit otherwise in `scylla.yaml` (or via cmd line).
BTW. Because this commit aligns audit's default config values in `db/config.cc`
to those of `scylla.yaml`, `docs/reference/configuration-parameters.rst`, which
is based on `db/config.cc` will start showing that table audit is the default.
Refs: https://github.com/scylladb/scylladb/issues/28355
Refs: https://scylladb.atlassian.net/browse/SCYLLADB-222
No backport: table audit has been enabled in 2026.1 in `scylla.yaml`,
and should be always on starting from the next release,
which is the release we're currently merging to (2026.2).
Closesscylladb/scylladb#28376
* github.com:scylladb/scylladb:
docs: decommission: note audit ks may require ALTERing
docs: mention table audit enabled by default
audit: disable DDL by default
db/config: enable table audit by default
test/cluster: fix `test_table_desc_read_barrier` assertion
test/cluster: adjust audit in tests involving decommissioning its ks
audit_test: fix incorrect config in `test_audit_type_none`
Compaction and statement groups are carried over on those configs, but
are in fact unused. Drop both.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#28540
There are four tests that check how restore with primary-replica-only option works in various scopes and topologies. Cases that check same-racks and same-datacenters are very very similar, so are those that check different-racks and different-datacenters. Parametrizing them and merging saves lots of code (+30 lines, -116 lines)
It's probably worth merging the resulting same-domain with different-domain tests, because the similarity is still large in both, but the result becomes too if-y, so not done here. Maybe later.
Improving tests, not backporting
Closesscylladb/scylladb#28569
* https://github.com/scylladb/scylladb:
test: Merge test_restore_primary_replica_different_... tests
test: Merge test_restore_primary_replica_same_... tests
test: Don't specify expected_replicas in test_restore_primary_replica_different_dc_scope_all
test: Remove local r_servers variable from test_restore_primary_replica_different_dc_scope_all
Fix the build of the test and the upload operation flow
No need to backport since it is only a test we barely use
Closesscylladb/scylladb#28595
* github.com:scylladb/scylladb:
s3_perf: fix upload operation flow
s3_perf: fix the CMake build
Tablet migration keeps sstable snapshot during streaming, which may
cause temporary increase in disk utilization if compaction is running
concurrently. SSTables compacted away are kept on disk until streaming
is done with them. The more tablets we allow to migrate concurrently,
the higher disk space can rise. When the target tablet size is
configured correcly, every tablet should own about 1% of disk
space. So concurrency of 4 shouldn't put us at risk. But target tablet
size is not chosen dynamically yet, and it may not be aligned with
disk capacity.
Also, tablet sizes can temporarily grow above the target, up to 2x
before the split starts, and some more because splits take a while to
complete.
To reduce the impact from this, reduce concurrency of
migration. Concurrency of 2 should still be enough to saturate
resources on the leaving shard.
Also, reducing concurrency means that load balancing is more
responsive to preemption. There will be less bandwidth sharing, so
scheduled migrations complete faster. This is important for scale-out,
where we bootstrap a node and want to start migrations to that new
node as soon as possible.
Refs scylladb/siren#15317Closesscylladb/scylladb#28563
* github.com:scylladb/scylladb:
tablets, config: Reduce migration concurrency to 2
tablets: load_balancer: Always accept migration if the load is 0
config, tablets: Make tablet migration concurrency configurable
The methods of `raft::server` are abortable and if the passed
`abort_source` is triggered, they throw `raft::request_aborted`.
We document that.
Although `raft::server` is an interface, this is consistent with
the descriptions of its other methods.
The `try-catch` expression is pretty much useless in its current form.
If we return the future, the awaiting will only be performed by the
caller, completely circumventing the exception handling.
As a result, instead of handling `raft::request_aborted` with a proper
error message, the user will face `seastar::abort_requested_exception`
whose message is cryptic at best. It doesn't even point to the root
of the problem.
Fixes SCYLLADB-665
Due to lack of checks present in process_execute_internal from
transport/server.cc needs_authorization bool was always set to true
doing some extra work (check_access()) for each request.
We mirror the logic in this patch in test env which perf-simple-query
uses. This can also potentially improve runtime of unittests (marginally).
Note that bug is only in perf tool not scylla itself, the fix
decreases insns/op by around 10%:
Before: 41065 insns/op
After: 37452 insns/op
Command: ./build/release/scylla perf-simple-query --duration 5 --smp 1
Fixes https://github.com/scylladb/scylladb/issues/27941Closesscylladb/scylladb#28704
Using an outdated image can cause problems when `microdnf update`
runs, if the distribution doesn't maintain good update hygiene.
Although, I suspect that when update failures happen they're really
caused by propagation delay of packages to mirrors.
Fix by using --pull=always to get a fresh image.
Ref https://scylladb.atlassian.net/browse/SCYLLADB-714Closesscylladb/scylladb#28680
In storage_service::load_stats_for_tablet_based_tables(), we are passing
a reference to sum_tablet_sizes to the lambda which increments this value
on each shard via map_reduce0(). This means we could have a race
condition because this is executed on separate threads/CPUs.
This patch fixed the problem by collecting the sums by shard into a
vector, then summing those up.
Refs: SCYLLADB-678
Closesscylladb/scylladb#28703
interval_data's move constructor is conditionally noexcept. It
contains a throw statemnt for the case that the underlying type's
move constructor can throw; that throw statemnt is never executed
if we're in the noexept branch. Clang 23 however doesn't understand
that, and warns about throwing in a noexcept function.
Fix that by rewriting the logic using seastar::defer(). In the
noexcept case, the optimizer should eliminate it as dead code.
Closesscylladb/scylladb#28710
Correct the upload operation logic. The previous flow incorrectly
checked for the test file on S3 even when performing operations that do
not download the file, such as uploads.
Remove bootstrap and decomission from allowed_repair_based_node_ops.
Using RBNO over streaming for these operations has no benefits, as they
are not exposed to the out-of-date replica problem that replace,
removenode and rebuild are.
On top of that, RBNO is known to have problems with empty user tables.
Using streaming for boostrap and decomission is safe and faster
than RBNO in all condition, especially when the table is small.
One test needs adjustment as it relies on RBNO being used for all node
ops.
Fixes: SCYLLADB-105
Closesscylladb/scylladb#28080
It checks if all workloads can be properly
executed with succesfull startup and teardown.
Especially testing alternator in remote mode is important
because it's invoked like this during pgo training in pgo.py.
Test runtime:
Release - 24s
Debug - 1m 15s
Test time consists mostly of Scylla startup in various modes.
Other workloads at startup call prepopulate() which connects
with retry loop therefore it waits until cql port is open.
This commit adds a single place where we will wait for port
for all workloads.
Timeout is set to 5 minutes so that even slowest machines
are able to start.
There is a handful of places in the code related to dictionary
compression which calls get_units to acquire semaphore units but the
returned future is not awaited, seemingly by mistake. The result of
get_units is assigned to a variable - which is reasonable at a glance
because the semaphore units need to be assigned to a variable in order
to control their scope - but at the same time if co_await is mistakenly
omitted, like here, doing so will silence the nodiscard check of
seastar::future and, effectively, the get_units call will be nearly
useless. Unfortunately, this is an easy mistake to make.
Fix the places in the code that acquire semaphore units via get_units
but never await the future returned by it. I found them by manual code
inspection, so I hope that I didn't miss any.
Closesscylladb/scylladb#28581
With audit feature enabled, it's not immediately obvious that its
pseudo-system keyspace `audit` may require adjusting its RF across DCs
before decommissioning a node, and this should be documented.
DDL audit category doesn't make sense if its enabled by default on its
own, as no DDL statements are going to be audited if audit_keyspaces/audit_tables
setting is empty. This may be counter-intuitive to our users, who may
expect to actually see these statements logged if we're enabling this by
default. Also, it doesn't make sense to enable a setting by default if
it has no effect.
Additionally, listed all possible audit categories for user's
convenience.
In https://github.com/scylladb/scylladb/pull/27262 table audit has been
re-enabled by default in `scylla.yaml`, logging certain categories to a table,
which should make new Scylla deployments have audit enabled.
Now, in the next release, we also want to enable audit in `db/config.cc`,
which should enable audit for all deployments, which don't explicitly configure
audit otherwise in `scylla.yaml` (or via cmd line).
BTW. Because this commit aligns audit's default config values in `db/config.cc`
to those of `scylla.yaml`, `docs/reference/configuration-parameters.rst`, which
is based on `db/config.cc` will start showing that table audit is the default.
Refs: https://github.com/scylladb/scylladb/issues/28355
Refs: https://scylladb.atlassian.net/browse/SCYLLADB-222
The test `assertion desc_schema[0] == desc_schema[1]` does a direct
list comparison, which is order-sensitive. Before enabling audit by default,
both nodes would return only the test keyspace/table, so the order
didn't matter. With audit enabled, there will be multiple keyspaces,
and they can be returned in different order by different nodes.
When table audit is enabled, Scylla creates the "audit" ks with
NetworkTopologyStrategy and RF=3. During node decommission, streaming can fail
for the audit ks with "zero replica after the removal" when all nodes from a DC
are removed, and so we have to ALTER audit ks to either zero the number of its
replicas, to allow for a clear decommission, or have them in the 2nd DC.
BTW. https://github.com/scylladb/scylladb/issues/27395 is the same change, but
in dtests repository.
Passing Python `None` to setup is incorrect, because config updates are sent
as a dict and `None` is treated as "unset" - meaning: use Scylla's default.
Using the explicit string "none" to guarantee that audit is disabled.
There is no point running repair for tables using RF one. Row level
repair will skip it but the auto repair scheduler will keep scheduling
such repairs since repair_time could not be updated.
Skip such repairs at the scheduler level for auto repair.
If the request is issued by user, we will have to schedule such
repair otherwise the user request will never be finished.
Fixes SCYLLADB-561
Closesscylladb/scylladb#28640
This commit introduces four changes:
- In the `table` example, singular forms (node, partition) are changed to
plural forms (nodes, partitions). Currently, the default `table`
audit configuration is RF=3 and writes use CL=ONE. Therefore,
a `table` audit log write failure should not be caused by a single
node unavailability, and plural forms are more adequate.
- In the `table` example, unreachability due to network issues is
mentioned because with RF=3, audit failure due to network problems
is more likely to happen than a simultaneous failure of three
nodes (such network failures happened in SCYLLADB-706).
- In the `syslog` example, a slash `/` is changed to `or`, so `table`
and `syslog` examples have similar structure.
- As the `syslog` line is already being changed, I also change `unix`
to `Unix`, as the capitalized form is the correct one.
Refs SCYLLADB-706
Closesscylladb/scylladb#28702
The connection's `cpu_concurrency_t` struct tracks the state of a connection
to manage the admission of new requests and prevent CPU overload during
connection storms. When a connection holds units (allowed only 0 or 1), it is
considered to be in the "CPU state" and contributes to the concurrency limits
used when accepting new connections.
The bug stems from the fact that `counted_data_source_impl::get` and
`counted_data_sink_impl::put` calls can interleave during execution. This
occurs because of `should_parallelize` and `_ready_to_respond`, the latter being
a future chain that can run in the background while requests are being read.
Consequently, while reading request (N), the system may concurrently be
writing the response for request (N-1) on the same connection.
This interleaving allows `return_all()` to be called twice before the
subsequent `consume_units()` is invoked. While the second `return_all()` call
correctly returns 0 units, the matching `consume_units()` call would
mistakenly take an extra unit from the semaphore. Over time, a connection
blocked on a read operation could end up holding an unreturned semaphore
unit. If this pattern repeats across multiple connections, the semaphore
units are eventually depleted, preventing the server from accepting any
new connections.
The fix ensures that we always consume the exact number of units that were
previously returned. With this change, interleaved operations behave as
follows:
get() return_all — returns 1 unit
put() return_all — returns 0 units
get() consume_units — takes back 1 unit
put() consume_units — takes back 0 units
Logically, the networking phase ends when the first network operation
concludes. But more importantly, when a network operation
starts, we no longer hold any units.
Other solutions are possible but the chosen one seems to be the
simplest and safest to backport.
Fixes SCYLLADB-485
Backport: all supported affected versions, bug introduced with initial feature implementation in: ed3e4f33fdClosesscylladb/scylladb#28530
* github.com:scylladb/scylladb:
test: auth_cluster: add test for hanged AUTHENTICATING connections
transport: fix connection code to consume only initially taken semaphore units
This patchset replaces permissions cache based on loading_cache with a new unified (permissions and roles), full, coherent auth cache.
Reason for the change is that we want to improve scenarios under stress and simplify operation manuals. New cache doesn't require any tweaking. And it behaves particularly better in scenarios with lots of schema entities (e.g. tables) combined with unprepared queries. Old cache can generate few thousands of extra internal tps due to cache refresh.
Benchmark of unprepared statements (just to populate the cache) with 1000 tables shows 3k tps of internal reads reduction and 9.1% reduction of median instructions per op. So many tables were used to show resource impact, cache could be filled with other resource types to show the same improvement.
Backport: no, it's a new feature.
Fixes https://github.com/scylladb/scylladb/issues/7397
Fixes https://github.com/scylladb/scylladb/issues/3693
Fixes https://github.com/scylladb/scylladb/issues/2589
Fixes https://scylladb.atlassian.net/browse/SCYLLADB-147Closesscylladb/scylladb#28078
* github.com:scylladb/scylladb:
test: boost: add auth cache tests
auth: add cache size metrics
docs: conf: update permissions cache documentation
auth: remove old permissions cache
auth: use unified cache for permissions
auth: ldap: add permissions reload to unified cache
auth: add permissions cache to auth/cache
auth: add service::revoke_all as main entry point
auth: explicitly life-extend resource in auth_migration_listener
The hostent::addr_list is deprecated in favor of address_entry::addr
field that contains the very same addresses.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#28566
All users of it had been updated to get the streaming group elsewhere,
so this getter is no longer needed.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#28527
sccache combines the functions of ccache and distcc, and
promises to support C++20 modules in the future. Switch
to sccache in anticipation of modules support.
The documentation is adjusted since cache will be
persistent for sccache without further work.
Closesscylladb/scylladb#28524
There are some places that get `map<foo, bar>` and return it to the caller as `"key": string(foo), "value": string(bar)` json. For that there's `map_to_key_value()` helper in api.hh that re-formats the map into a vector of json elements and returns it, letting seastar json-ize that vector.
Recently in seastar there appeared stream_range_as_array() helper that helps streaming any range without converting it into intermediate collection. Some of the hottest users of `map_to_key_value()` had been converted, this PR converts few remainders and removes the helper in question to encourage further usage of the stream_range_as_array().
Code cleanup, not backporting
Closesscylladb/scylladb#28491
* github.com:scylladb/scylladb:
api: Remove map_to_key_value() helpers
api: Streamify view_build_statuses handler
api: Streamify few more storage_service/ handlers
api: Add map_to_json() helper
api: Coroutinize view_build_statuses handler