Commit Graph

52383 Commits

Author SHA1 Message Date
Calle Wilund
ef795eda5b test_encryption: Fix test_system_auth_encryption
Fixes: SCYLLADB-915

Test was quite broken; Not waiting for coro:s, as well as a bunch
of checks no longer even close to valid (this is a ported dtest, and
not a very good one).

Closes scylladb/scylladb#28887
2026-03-09 14:38:31 +02:00
Marcin Maliszkiewicz
f177259316 Merge 'vector_search: small improvements' from Karol Nowacki
vector_search: small improvements

This PR addresses several minor code quality issues and style inconsistencies within the vector_search module.

No backport is needed as these improvements are not visible to the end user.

Closes scylladb/scylladb#28718

* github.com:scylladb/scylladb:
  vector_search: fix names of private members
  vector_search: remove unused global variable
2026-03-09 11:42:35 +01:00
Botond Dénes
6bba4f7ca1 Merge 'test: cluster: util: sleep for 0.01s between writes in do_writes' from Patryk Jędrzejczak
Tests use `start_writes` as a simple write workload to test that writes
succeed when they should (e.g., there is no availability loss), but not to
test performance. There is no reason to overload the CPU, which can lead to
test failures.

I suspect this function to be the cause of SCYLLADB-929, where the failures
of `test_raft_recovery_user_data` (that creates multiple write workloads
with `start_writes`) indicated that the machine was overloaded.
The relevant observations:
- two runs failed at the same time in debug mode,
- there were many reactor stalls and RPC timeouts in the logs (leading to
  unexpected events like servers marking each other down and group0
  leader changes).

I didn't prove that `start_writes` really caused this, but adding this sleep
should be a good change, even if I'm wrong.

The number of writes performed by the test decreases 30-50 times with the
sleep.

Note that some other util functions like `start_writes_to_cdc_table` have
such a sleep.

This PR also contains some minor updates to `test_raft_recovery_user_data`.

Fixes SCYLLADB-929

No backport:
- the failures were observed only in master CI,
- no proof that the change fixes the issue, so backports could be a waste
  of time.

Closes scylladb/scylladb#28917

* github.com:scylladb/scylladb:
  test: test_raft_recovery_user_data: replace asyncio.gather with gather_safely
  test: test_raft_recovery_user_data: use the exclude_node API
  test: test_raft_recovery_user_data: drop tablet_load_stats_cfg
  test: cluster: util: sleep for 0.01s between writes in do_writes
2026-03-09 12:12:04 +02:00
Nadav Har'El
47e8206482 test/alternator: test, and document, Alternator's data encoding
This patch adds a test file, test/alternator/test_encoding.py, testing
how Alternator stores its data in the underlying CQL database. We test
how tables are named, and how attributes of different types are encoded
into CQL.

The test, which begins with a long comment, also doubles as developer-
oriented *documention* on how Alternator encodes its data in the CQL
database. This documentation is not intended for end-users - we do
not want to officially support reading or writing Alternator tables
through CQL. But once in a while, this information can come in handy
for developers.

More importantly, this test will also serve as a regression test,
verifying that Alternator's encoding doesn't change unintentionally.
If we make an unintentional change to the way that Alternator stores
its data, this can break upgrades: The new code might not be able to
read or write the old table with its old encoding. So it's important
to make sure we never make such unintentional changes to the encoding
of Alternator's data. If we ever do make *intentional* changes to
Alternator's data encoding, we will need to fix the relevant test;
But also not forget to make sure that the new code is able to read
the old encoding as well.

The new tests use both "dynamodb" (Alternator) and "cql" fixtures,
to test how CQL sees the Alternator tables. So naturally are these
tests are marked "scylla_only" and skipped on DynamoDB.

Fixes #19770.

Closes scylladb/scylladb#28866
2026-03-09 10:50:09 +01:00
Andrzej Jackowski
6fb5ab78eb db/config: move guardrails config to one place and reorder
The motivations for this patch are as follows:
 - Guardrails should follow similar conventions, e.g. for config names,
   metrics names, testing. Keeping guardrails together makes it easier
   to find and compare existing guardrails when new guardrails are
   implemented.
 - The configuration is used to auto-generate the documentation
   (particularly, the `configuration-parameters` page). Currently,
   the order of parameters in the documentation is inconsistent (e.g.
   `minimum_replication_factor_fail_threshold` before
   `minimum_replication_factor_warn_threshold` but
   `maximum_replication_factor_fail_threshold` after
   `maximum_replication_factor_warn_threshold`), which can be confusing
   to customers.

Fixes: SCYLLADB-256

Closes scylladb/scylladb#28932
2026-03-09 10:50:00 +01:00
Patryk Jędrzejczak
46b7170347 Merge 'test/pylib: centralize timeout scaling and propagate build_mode in LWT helpers' from Alex Dathskovsky
This series improves timeout handling consistency across the test framework and makes build-mode effects explicit in LWT tests. (starting with LWT test that got flaky)

1. Centralize timeout scaling
Introduce scale_timeout(timeout) fixture in runner.py to provide a single, consistent mechanism for scaling test timeouts based on build mode.
Previously, timeout adjustments were done in an ad-hoc manner across different helpers and tests. Centralizing the logic:
Ensures consistent behavior across the test suite
Simplifies maintenance and reasoning about timeout behavior
Reduces duplication and per-test scaling logic
This becomes increasingly important as tests run on heterogeneous hardware configurations, where different build modes (especially debug) can significantly impact execution time.

2. Make scale_timeout explicit in LWT helpers
Propagate scale_timeout explicitly through BaseLWTTester and Worker, validating it at construction time instead of relying on implicit pytest fixture injection inside helper classes.
Additionally:
Update wait_for_phase_ops() and wait_for_tablet_count() to use scale_timeout_by_mode() for consistent polling behavior across modes
Update all LWT test call sites to pass build_mode explicitly
Increase default timeout values, as the previous defaults were too short and prone to flakiness, particularly under slower configurations such as debug builds

Overall, this series improves determinism, reduces flakiness, and makes the interaction between build mode and test timing explicit and maintainable.

backport: not required just an enhansment for test.py infra

Closes scylladb/scylladb#28840

* https://github.com/scylladb/scylladb:
  test/auth_cluster: align service-level timeout expectations with scaled config
  test/lwt: propagate scale_timeout through LWT helpers; scale resize waits Pass scale_timeout explicitly through BaseLWTTester and Worker, validating it at construction time instead of relying on implicit pytest fixture injection inside helper classes. Update wait_for_phase_ops() and wait_for_tablet_count() to use scale_timeout_by_mode() so polling behavior remains consistent across build modes. Adjust LWT test call sites to pass scale_timeout explicitly. Increase default timeout values, as the previous defaults were too short and prone to flakiness under slower configurations (notably debug/dev builds).
  test/pylib: introduce scale_timeout fixture helper
2026-03-09 10:28:19 +01:00
Patryk Jędrzejczak
4c8dba15f1 Merge 'strong_consistency/state_machine: ensure and upgrade mutations schema' from Michał Jadwiszczak
This patch fixes 2 issues within strong consistency state machine:
- it might happen that apply is called before the schema is delivered to the node
- on the other hand, the apply may be called after the schema was changed and purged from the schema registry

The first problem is fixed by doing `group0.read_barrier()` before applying the mutations.
The second one is solved by upgrading the mutations using column mappings in case the version of the mutations' schema is older.

Fixes SCYLLADB-428

Strong consistency is in experimental phase, no need to backport.

Closes scylladb/scylladb#28546

* https://github.com/scylladb/scylladb:
  test/cluster/test_strong_consistency: add reproducer for old schema during apply
  test/cluster/test_strong_consistency: add reproducer for missing schema during apply
  test/cluster/test_strong_consistency: extract common function
  raft_group_registry: allow to drop append entries requests for specific raft group
  strong_consistency/state_machine: find and hold schemas of applying mutations
  strong_consistency/state_machine: pull necessary dependencies
  db/schema_tables: add `get_column_mapping_if_exists()`
2026-03-09 09:49:22 +01:00
Marcin Maliszkiewicz
4150c62f29 Merge 'test_proxy_protocol: fix flaky system.clients visibility checks' from Piotr Smaron
`test_proxy_protocol_port_preserved_in_system_clients` failed because it
didn't see the just created connection in system.clients immediately. The
last lines of the stacktrace are:
```
            # Complete CQL handshake
            await do_cql_handshake(reader, writer)

            # Now query system.clients using the driver to see our connection
            cql = manager.get_cql()
            rows = list(cql.execute(
                f"SELECT address, port FROM system.clients WHERE address = '{fake_src_addr}' ALLOW FILTERING"
            ))

            # We should find our connection with the fake source address and port
>           assert len(rows) > 0, f"Expected to find connection from {fake_src_addr} in system.clients"
E           AssertionError: Expected to find connection from 203.0.113.200 in system.clients
E           assert 0 > 0
E            +  where 0 = len([])
```
Explanation: we first await for the hand-made connection to be completed,
then, via another connection, we're querying system.clients, and we don't
get this hand-made connection in the resultset.
The solution is to replace the bare cql.execute() calls with await wait_for_results(), a helper
 that polls via cql.run_async() until the expected row count is reached
 (30 s timeout, 100 ms period).

Fixes: SCYLLADB-819

The flaky test is present on master and in previous release, so backporting only there.

Closes scylladb/scylladb#28849

* github.com:scylladb/scylladb:
  test_proxy_protocol: introduce extra logging to aid debugging
  test_proxy_protocol: fix flaky system.clients visibility checks
2026-03-09 08:37:57 +01:00
Yaron Kaikov
977bdd6260 .github/workflows/trigger-scylla-ci: fix heredoc injection in trigger-scylla-ci workflow
Move all ${{ }} expression interpolations into env: blocks so they are
passed as environment variables instead of being expanded directly into
shell scripts. This prevents an attacker from escaping the heredoc in
the Validate Comment Trigger step and executing arbitrary commands on
the runner.

The Verify Org Membership step is hardened in the same way for
defense-in-depth.

Refs: GHSA-9pmq-v59g-8fxp
Fixes: SCYLLADB-954

Closes scylladb/scylladb#28935
2026-03-08 21:34:51 +02:00
Artsiom Mishuta
fda68811e8 test.py: fix strict-config argument.
The ini-level strict_config was removed/never existed as a config key in pytest 8 — it's only a command-line flag(and  back in pytest 9)
In pytest 8.3.5, the equivalent is the --strict-config CLI flag, not an ini option

Fixes SCYLLADB-955

Closes scylladb/scylladb#28939
2026-03-08 16:09:29 +02:00
Dawid Mędrek
5feed00caa Merge 'raft: read_barrier: update local commit_idx to read_idx when it's safe' from Patryk Jędrzejczak
When the local entry with `read_idx` belongs to the current term, it's
safe to update the local `commit_idx` to `read_idx`.

The motivation for this change is to speed up read barriers. `wait_for_apply`
executed at the end of `read_barrier` is delayed until the follower learns
that the entry with `read_idx` is committed. It usually happens quickly in
the `read_quorum` message. However, non-voters don't receive this message,
so they have to wait for `append_entries`. If no new entries are being
added, `append_entries` can come only from `fsm::tick_leader()`. For group0,
this happens once every 100ms.

The issue above significantly slows down cluster setups in tests. Nodes
join group0 as non-voters, and then they are met with several read barriers
just after a write to group0. One example is `global_token_metadata_barrier`
in `write_both_read_new` performed just after `update_topology_state` in
`write_both_read_old`.

I tested the performance impact of this change with the following test:
```python
for _ in range(10):
    await manager.servers_add(3)
```
It consistently takes 44-45s with the change and 50-51s without the change
in dev mode.

No backport:
- non-critical performance improvement mostly relevant in tests,
- the change requires some soak time in master.

Closes scylladb/scylladb#28891

* github.com:scylladb/scylladb:
  raft: server: fix the repeating typo
  raft: clarify the comment about read_barrier_reply
  raft: read_barrier: update local commit_idx to read_idx when it's safe
  raft: log: clarify the specification of term_for
2026-03-06 18:50:08 +01:00
Piotr Smaron
f12e4ea42b test_proxy_protocol: introduce extra logging to aid debugging
In case of an error, we want to see the contents of the system.clients
table to have a better understanding of what happened - whether the
row(s) are really missing or maybe they are there, but 1 digit doesn't
match or the row is half-written.
We'll therefore query for the whole table on the CQL side, and then
filter out the rows we want to later proceed with on the python side.
This way we can dump the contents of the whole system.clients table if
something goes south.
2026-03-06 14:50:12 +01:00
Piotr Smaron
d8cf2c5f23 test_proxy_protocol: fix flaky system.clients visibility checks
`test_proxy_protocol_port_preserved_in_system_clients` failed because it
didn't see the just created connection in system.clients immediately. The
last lines of the stacktrace are:
```
            # Complete CQL handshake
            await do_cql_handshake(reader, writer)

            # Now query system.clients using the driver to see our connection
            cql = manager.get_cql()
            rows = list(cql.execute(
                f"SELECT address, port FROM system.clients WHERE address = '{fake_src_addr}' ALLOW FILTERING"
            ))

            # We should find our connection with the fake source address and port
>           assert len(rows) > 0, f"Expected to find connection from {fake_src_addr} in system.clients"
E           AssertionError: Expected to find connection from 203.0.113.200 in system.clients
E           assert 0 > 0
E            +  where 0 = len([])
```
Explanation: we first await for the hand-made connection to be completed,
then, via another connection, we're querying system.clients, and we don't
get this hand-made connection in the resultset.
The solution is to replace the bare cql.execute() calls with await wait_for_results(), a helper
 that polls via cql.run_async() until the expected row count is reached
 (30 s timeout, 100 ms period).

Fixes: SCYLLADB-819
2026-03-06 14:49:59 +01:00
Botond Dénes
4fdc0a5316 Merge 'Relax test's check_mutation_replicas() argument list' from Pavel Emelyanov
The one accepts long list of arguments, some of those is not really needed. Also some callers can be relaxed not to provide default values for arguments with such.

Improving tests, not backporting

Closes scylladb/scylladb#28861

* github.com:scylladb/scylladb:
  test: Remove passing default "expected_replicas" to check_mutation_replicas()
  test: Remove scope and primary-replica-only arguments from check_mutation_replicas() helper
2026-03-06 11:25:00 +02:00
Szymon Malewski
d817e56e87 vector_similarity_fcts.cc: fix strict aliasing violation in extract_float_vector
Previous code performed endian conversion by bulk-copying raw bytes
into a std::vector<float> and then iterating over it via a
reinterpret_cast<uint32_t*> pointer. Accessing float storage through a
uint32_t* violates C++ strict aliasing rules, giving the compiler
freedom to reorder or elide the stores, causing undefined behavior.

Replace the two-pass approach with a single-pass loop using
seastar::consume_be<uint32_t>() and std::bit_cast<float>(), which is
both well-defined and auto-vectorizable.

Follow-up #28754

Closes scylladb/scylladb#28912
2026-03-06 09:15:45 +01:00
Artsiom Mishuta
5d7a73cc5b test.py add support if non_gating tests
Add support for non_gating, the opposite of gating in dtest terminology, tests in test.py
codebase

This test will/should not be run by any current gating job (ci/next/nightly)

Closes scylladb/scylladb#28902
2026-03-06 09:39:32 +02:00
Andrei Chekun
01498a00d5 test.py: make HostRegistry singleton
HostRegistry initialized in several places in the framework, this can
lead to the overlapping IP, even though the possibility is low it's not
zero. This PR makes host registry initialized once for the master
thread and pytest. To avoid communication between with workers, each
worker will get its own subnet that it can use solely for its own goals.
This simplifies the solution while providing the way to avoid overlapping IP's.

Closes scylladb/scylladb#28520
2026-03-06 09:25:29 +02:00
Artsiom Mishuta
2be4d8074d test.py disable XFail tests on CI run
This PR disables running FXAIL tests on ci run to speed it up.

tests will continue run on "nightly" job and FAIL on unexpected pass
and will continue run on "NEXT" job and NOT FAIL on unexpected pass

Closes scylladb/scylladb#28886
2026-03-06 09:12:06 +02:00
Szymon Malewski
f9d213547f cql3: selection: fix add_column_for_post_processing for ORDER BY
The purpose of `add_column_for_post_processing` is to add columns that are required for processing of a query,
but are not part of SELECT clause and shouldn't be returned. They are added to the final result set, but later are not serialized.
Mainly it is used for filtering and grouping columns, with a special case of `WHERE primary_key IN ...  ORDER BY ...` when the whole result set needs additional final sorting,
and ordering columns must be added as well.
There was a bug that manifested in #9435, #8100 and was actually identified in #22061.
In case of selection with processing (e.g functions involved), result set row is formed in two stages.
Initially it is a list of columns fetched from replicas - on which filtering and grouping is performed.
After that the actual selection is resolved and the final number of columns can change.
Ordering is performed on this final shape, but the ordering column index returned by `add_column_for_post_processing` refereed to initial shape.
If selection refereed to the same column twice (e.g. `v, TTL(v)` as in #9435) final row was longer than initial and ordering refereed to incorrect column.
If a function in selection refereed to multiple columns (e.g. as_json(.., ..) which #8100 effectively uses) the final row was shorter
and ordering tried to use a non-existing column.

This patch fixes the problem by making sure that column index of the final result set is used for ordering.

The previously crashing test `cassandra_tests/validation/entities/json_test.py::testJsonOrdering` doesn't have to be skipped, but now it is failing on issue #28467.

Fixes #9435
Fixes #8100
Fixes #22061

Closes scylladb/scylladb#28472
2026-03-05 19:22:34 +02:00
Patryk Jędrzejczak
c8c57850d9 test: test_raft_recovery_user_data: replace asyncio.gather with gather_safely 2026-03-05 17:13:52 +01:00
Patryk Jędrzejczak
c3aa4ed23c test: test_raft_recovery_user_data: use the exclude_node API
The API is now available.
2026-03-05 17:13:52 +01:00
Patryk Jędrzejczak
dd75687251 test: test_raft_recovery_user_data: drop tablet_load_stats_cfg
The issue has been fixed.
2026-03-05 17:13:52 +01:00
Patryk Jędrzejczak
52940c4f31 test: cluster: util: sleep for 0.01s between writes in do_writes
Tests use `start_writes` as a simple write workload to test that writes
succeed when they should (e.g., there is no availability loss), but not to
test performance. There is no reason to overload the CPU, which can lead to
test failures.

I suspect this function to be the cause of SCYLLADB-929, where the failures
of `test_raft_recovery_user_data` (that creates multiple write workloads
with `start_writes`) indicated that the machine was overloaded.
The relevant observations:
- two runs failed at the same time in debug mode,
- there were many reactor stalls and RPC timeouts in the logs (leading to
  unexpected events like servers marking each other down and group0
  leader changes).

I didn't prove that `start_writes` really caused this, but adding this sleep
should be a good change, even if I'm wrong.

The number of writes performed by the test decreases 30-50 times with the
sleep.

Note that some other util functions like `start_writes_to_cdc_table` have
such a sleep.

Fixes SCYLLADB-929
2026-03-05 17:13:40 +01:00
Calle Wilund
ab3d3d8638 build: add slirp4netns to dependencies
Needed for port forwarded podman-in-podman containers

[avi:
  - move from Dockerfile to install-dependencies.sh so non-container
  builds also get it
  - regenerate frozen toolchain with optimized clang from

    https://devpkg.scylladb.com/clang/clang-21.1.8-Fedora-43-aarch64.tar.gz
    https://devpkg.scylladb.com/clang/clang-21.1.8-Fedora-43-x86_64.tar.gz
]

Closes scylladb/scylladb#28870
2026-03-05 17:44:17 +02:00
Michał Jadwiszczak
37bbbd3a27 test/cluster/test_strong_consistency: add reproducer for old schema during apply 2026-03-05 13:50:20 +01:00
Michał Jadwiszczak
6aef4d3541 test/cluster/test_strong_consistency: add reproducer for missing schema during apply 2026-03-05 13:50:16 +01:00
Michał Jadwiszczak
4795f5840f test/cluster/test_strong_consistency: extract common function 2026-03-05 13:47:43 +01:00
Michał Jadwiszczak
3548b7ad38 raft_group_registry: allow to drop append entries requests for specific raft group
Similar to `raft_drop_incoming_append_entries`, the new error injection
`raft_drop_incoming_append_entries_for_specified_group` skips handler
for `raft_append_entries` RPC but it allows to specify id of raft group
for which the requests should be dropped.

The id of a raft group should be passed in error injection parameters
under `value` key.
2026-03-05 13:47:43 +01:00
Michał Jadwiszczak
b0cffb2e81 strong_consistency/state_machine: find and hold schemas of applying mutations
It might happen that a strong consistency command will arrive to a node:
- before it knows about the schema
- after the schema was changes and the old version was removed from the
  memory

To fix the first case, it's enough to perform a read barrier on group0.
In case of the second one, we can use column mapping the upgrade the
mutation to newer schema.

Also, we should hold pointers to schemas until we finish `_db.apply()`,
so the schema is valid for the whole time.
And potentially we should hold multiple pointers because commands passed
to `state_machine::apply()` may contain mutations to different schema
versions.

This commit relies on a fact that the tablet raft group and its state
machine is created only after the table is created locally on the node.

Fixes SCYLLADB-428
2026-03-05 13:47:40 +01:00
Tomasz Grabiec
b90fe19a42 Merge 'service: assert that tables updated via group0 use schema commitlog' from Aleksandra Martyniuk
Set enable_schema_commitlog for each group0 tables.

Assert that group0 tables use schema commitlog in ensure_group0_schema
(per each command).

Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-914.

Needs backport to all live releases as all are vulnerable

Closes scylladb/scylladb#28876

* github.com:scylladb/scylladb:
  test: add test_group0_tables_use_schema_commitlog
  db: service: remove group0 tables from schema commitlog schema initializer
  service: ensure that tables updated via group0 use schema commitlog
  db: schema: remove set_is_group0_table param
2026-03-05 13:28:13 +01:00
Botond Dénes
509f2af8db Merge 'repair: Fix rwlock in compaction_state and lock holder lifecycle' from Raphael Raph Carvalho
Consider this:

- repair takes the lock holder
- tablet merge filber destories the compaction group and the compaction state
- repair fails
- repair destroy the lock holder

This is observed in the test:

```
repair - repair[5d73d094-72ee-4570-a3cc-1cd479b2a036] Repair 1 out of 1 tablets: table=sec_index.users range=(432345564227567615,504403158265495551] replicas=[0e9d51a5-9c99-4d6e-b9db-ad36a148b0ea:15, 498e354c-1254-4d8d-a565-2f5c6523845a:9, 5208598c-84f0-4526-bb7f-573728592172:28]

...

repair - repair[5d73d094-72ee-4570-a3cc-1cd479b2a036]: Started to repair 1 out of 1 tables in keyspace=sec_index, table=users, table_id=ea2072d0-ccd9-11f0-8dba-c5ab01bffb77, repair_reason=repair
repair - Enable incremental repair for table=sec_index.users range=(432345564227567615,504403158265495551]
table - Disabled compaction for range=(432345564227567615,504403158265495551] session_id=a13a72cc-cd2d-11f0-8e9b-76d54580ab09 for incremental repair
table - Got unrepaired compaction and repair lock for range=(432345564227567615,504403158265495551] session_id=a13a72cc-cd2d-11f0-8e9b-76d54580ab09 for incremental repair
table - Disabled compaction for range=(432345564227567615,504403158265495551] session_id=a13a72cc-cd2d-11f0-8e9b-76d54580ab09 for incremental repair
table - Got unrepaired compaction and repair lock for range=(432345564227567615,504403158265495551] session_id=a13a72cc-cd2d-11f0-8e9b-76d54580ab09 for incremental repair
repair - repair[5d73d094-72ee-4570-a3cc-1cd479b2a036]: get_sync_boundary: got error from node=0e9d51a5-9c99-4d6e-b9db-ad36a148b0ea, keyspace=sec_index, table=users, range=(432345564227567615,504403158265495551], error=seastar::rpc::remote_verb_error (Compaction state for table [0x60f008fa34c0] not found)
compaction_manager - Stopping 1 tasks for 1 ongoing compactions for table sec_index.users compaction_group=238 due to tablet merge
compaction_manager - Stopping 1 tasks for 1 ongoing compactions for table sec_index.users compaction_group=238 due to tablet merge

....

scylla[10793] Segmentation fault on shard 28, in scheduling group streaming
```

The rwlock in compaction_state could be destroyed before the lock holder
of the rwlock is destroyed. This causes user after free when the lock
the holder is destroyed.

To fix it, users of repair lock will now be waited when a compaction
group is being stopped.
That way, compaction group - which controls the lifetime of rwlock -
cannot be destroyed while the lock is held.
Additionally, the merge completion fiber - that might remove groups -
is properly serialized with incremental repair.

The issue can be reproduced using sanitize build consistently and can not
be reproduced after the fix.

Fixes #27365

Closes scylladb/scylladb#28823

* github.com:scylladb/scylladb:
  repair: Fix rwlock in compaction_state and lock holder lifecycle
  repair: Prevent repair lock holder leakage after table drop
2026-03-05 14:18:25 +02:00
Patryk Jędrzejczak
f1978d8a22 raft: server: fix the repeating typo 2026-03-05 13:06:08 +01:00
Patryk Jędrzejczak
5a43695f6a raft: clarify the comment about read_barrier_reply
The comment could be misleading. It could suggest that the returned index is
already safe to read. That's not necessarily true. The entry with the
returned index could, for example, be dropped by the leader if the leader's
entry with this index had a different term.
2026-03-05 13:06:08 +01:00
Patryk Jędrzejczak
1ae2ae50a6 raft: read_barrier: update local commit_idx to read_idx when it's safe
When the local entry with `read_idx` belongs to the current term, it's
safe to update the local `commit_idx` to `read_idx`. The argument for
safety is in the new comment above `maybe_update_commit_idx_for_read`.

The motivation for this change is to speed up read barriers. `wait_for_apply`
executed at the end of `read_barrier` is delayed until the follower learns
that the entry with `read_idx` is committed. It usually happens quickly in
the `read_quorum` message. However, non-voters don't receive this message,
so they have to wait for `append_entries`. If no new entries are being
added, `append_entries` can come only from `fsm::tick_leader()`. For group0,
this happens once every 100ms.

The issue above significantly slows down cluster setups in tests. Nodes
join group0 as non-voters, and then they are met with several read barriers
just after a write to group0. One example is `global_token_metadata_barrier`
in `write_both_read_new` performed just after `update_topology_state` in
`write_both_read_old`.

Writing a test for this change would be difficult, so we trust the nemesis
tests to do the job. They have already found consistency issues in read
barriers. See #10578.
2026-03-05 13:06:08 +01:00
Patryk Jędrzejczak
1cbd0da519 raft: log: clarify the specification of term_for
When `idx > last_idx()`, the function does an out-of-bounds access to `_log`.
This may look contradictory to the current specification.
2026-03-05 13:06:07 +01:00
Michał Jadwiszczak
33a16940be strong_consistency/state_machine: pull necessary dependencies
Both migration manager and system keyspace will be used in next commit.
The first one is needed to execute group0 read barrier and we need
system keyspace to get column mappings.
2026-03-05 12:33:17 +01:00
Alex
b32ef8ecd5 test/auth_cluster: align service-level timeout expectations with scaled config
Use scale_timeout_by_mode() in make_scylla_conf() to derive
  request_timeout_in_ms in test/pylib/scylla_cluster.py.

  Update test_connections_parameters_auto_update in
  test/cluster/auth_cluster/test_raft_service_levels.py to expect the
  mode-specific timeout string returned by the REST endpoint after this
  scaling change.
2026-03-05 13:32:15 +02:00
Alex
a66565cc42 test/lwt: propagate scale_timeout through LWT helpers; scale resize waits
Pass scale_timeout explicitly through BaseLWTTester and Worker, validating it at construction time instead of relying on implicit pytest fixture injection inside helper classes.
Update wait_for_phase_ops() and wait_for_tablet_count() to use scale_timeout_by_mode() so polling behavior remains consistent across build modes.
Adjust LWT test call sites to pass scale_timeout explicitly.
Increase default timeout values, as the previous defaults were too short and prone to flakiness under slower configurations (notably debug/dev builds).
2026-03-05 13:07:09 +02:00
Alex
73f1a65203 test/pylib: introduce scale_timeout fixture helper
Introduce scale_timeout(mode) to centralize test timeout scaling logic based on build mode, the function will return a callable that will handle the timeout by mode.
This ensures consistent timeout behavior across test helpers and eliminates ad-hoc per-test scaling adjustments.
Centralizing the logic improves maintainability and makes timeout behavior easier to reason about.
This becomes increasingly important as we run tests on heterogeneous hardware configurations.
Different build modes (especially debug) can significantly affect execution time, and having a single scaling mechanism helps keep test stability predictable across environments.

No functional change beyond unifying existing timeout scaling behavior.
2026-03-05 13:07:09 +02:00
Anna Stuchlik
855c503c63 doc: fix the unified installer instructions
This commit updates the documentation for the unified installer.

- The Open Source example is replaced with version 2025.1 (Source Available, currently supported, LTS).
- The info about CentOS 7 is removed (no longer supported).
- Java 8 is removed.
- The example for cassandra-stress is removed (as it was already removed on other installation pages).

Fixes https://github.com/scylladb/scylladb/issues/28150

Closes scylladb/scylladb#28152
2026-03-05 12:57:06 +02:00
Michał Jadwiszczak
d25be9e389 db/schema_tables: add get_column_mapping_if_exists()
In scenarios where we want to firsty check if a column mapping exists
and if we don't want do flow control with exception, it is very wasteful
to do
```
if (column_mapping_exists()) {
  get_column_mapping();
}
```
especially in a hot path like `state_machine::apply()` becase this will
execute 2 internal queries.

This commit introduces `get_column_mapping_if_exists()` function,
which simply wrapps result of `get_column_mapping()` in optional and
doesn't throw an exception if the mapping doesn't exist.
2026-03-05 11:55:57 +01:00
Artsiom Mishuta
7b30a3981b test.py: enable strict_config,xfail_strict,strict-markers
this commit enables 3 strict pytest options:

strict_config - if any warnings encountered while parsing the pytest section of the configuration file will raise errors.
xfail_strict - if markers not registered in the markers section of the configuration file will raise errors.
strict-markers - if tests marked with @pytest.mark.xfail that actually succeed will by default fail the test suite

and fix errors that occur after enabling these options

Closes scylladb/scylladb#28859
2026-03-05 12:54:26 +02:00
Dawid Mędrek
7564a56dc8 Merge 'tombstone_gc: allow using repair-mode tombstone gc with RF=1 tables' from Botond Dénes
Currently, repair-mode tombstone-gc cannot be used on tables with RF=1. We want to make repair-mode the default for all tablet tables (and more, see https://github.com/scylladb/scylladb/issues/22814), but currently a keyspace created with RF=1 and later altered to RF>1 will end up using timeout-mode tombstone gc. This is because the repair-mode tombstone-gc code relies on repair history to determine the gc-before time for keys/ranges. RF=1 tables cannot run repairs so they will have empty repair history and consequently won't be able to purge tombstones.
This PR solves this by keeping a registry of RF=1 tables and consulting this registry when creating `tombstone_gc_state` objects. If the table is RF=1, tombstone-gc will work as if the table used immediate-mode tombstone-gc. The registry is updated on each replication update. As soon as the table is not RF=1 anymore, the tombstone-gc reverts to the natural repair-mode behaviour.

After this PR, tombstone-gc defaults to repair-mode for all tables, regardless of RF and tablets/vnodes.

Fixes: SCYLLADB-106.

New feature, no backport required.

Closes scylladb/scylladb#22945

* github.com:scylladb/scylladb:
  test/{boost,cluster}: add test for tombstone gc mode=repair with RF=1
  tombstone_gc: allow use of repair-mode for RF=1 tables
  replica/table: update rf=1 table registry in shared tombstone-gc state
  tombstone_gc: tombstone_gc_before_getter: consider RF when getting gc before time
  tombstone_gc: unpack per_table_history_maps
  tombstone_gc: extract _group0_gc_time from per_table_history_map
  tombstone_gc: drop tombstone_gc_state(nullptr) ctor and operator bool()
  test/lib/random_schema: use timeout-mode tombstone_gc
  tombstone_gc_options: add C++ friendly constructor
  test: move away from tombstone_gc_state(nullptr) ctor
  treewide: move away from tombstone_gc_state(nullptr) ctor
  sstable: move away from tombstone_gc_mode::operator bool()
  replica/table: add get_tombstone_gc_state()
  compaction: use tombstone_gc_state with value semantics
  db/row_cache: use tombstone_gc_state with value semantics
  tombstone_gc: introduce tombstone_gc_state::for_tests()
2026-03-05 11:50:31 +01:00
Piotr Dulikowski
a2669e9983 test: test_mv_merge_allowed: add mistakenly omitted awaits
The test test_mv_merge_allowed asserts in two places that the tablet
count is 2. It does so by calling an async function but, mistakenly, the
returned coroutine was not awaited. The coroutine is, apparently, truthy
so the assertions always passed.

Fix the test to properly await the coroutines in the assertions.

Fixes: SCYLLADB-905

Closes scylladb/scylladb#28875
2026-03-05 11:29:23 +01:00
Avi Kivity
5ae40caa6d dist: tune tcp_mem to 3% of total memory in scylla-kernel-conf package
tcp_mem defaults to 9% of total memory. ScyllaDB defaults to 93%. The
sum is more than 100%.

Fix by tuning tcp_mem to 3% of total memory.

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-734

Closes scylladb/scylladb#28700
2026-03-05 12:51:04 +03:00
Patryk Jędrzejczak
bb1a798c2c Merge 'raft: Throw stopped_error if server aborted' from Dawid Mędrek
This PR solves a series of similar problems related to executing
methods on an already aborted `raft::server`. They materialize
in various ways:

* For `add_entry` and `modify_config`, a `raft::not_a_leader` with
  a null ID will be returned IF forwarding is disabled. This wasn't
  a big problem because forwarding has always been enabled for group0,
  but it's something that's nice to fix. It's might be relevant for
  strong consistency that will heavily rely on this code.

* For `wait_for_leader` and `wait_for_state_change`, the calls may
  hang and never resolve. A more detailed scenario is provided in a
  commit message.

For the last two methods, we also extend their descriptions to indicate
the new possible exception type, `raft::stopped_error`. This change is
correct since either we enter the functions and throw the exception
immediately (if the server has already been aborted), or it will be
thrown upon the call to `raft::server::abort`.

We fix both issues. A few reproducer tests have been included to verify
that the calls finish and throw the appropriate errors.

Fixes SCYLLADB-841

Backport: Although the hanging problems haven't been spotted so far
          (at least to the best of my knowledge), it's best to avoid
          running into a problem like that, so let's backport the
          changes to all supported versions. They're small enough.

Closes scylladb/scylladb#28822

* https://github.com/scylladb/scylladb:
  raft: Make methods throw stopped_error if server aborted
  raft: Throw stopped_error if server aborted
  test: raft: Introduce get_default_cluster
2026-03-05 10:47:39 +01:00
Marcin Maliszkiewicz
c3f59e4fa1 Merge 'cql3: implement write_consistency_levels guardrails' from Andrzej Jackowski
This patch series implements `write_consistency_levels_warned` and `write_consistency_levels_disallowed` guardrails, allowing the configuration of which consistency levels are unwanted for writes. The motivation for these guardrails is to forbid writing with consistency levels that don't provide high durability guarantees (like CL=ANY, ONE, or LOCAL_ONE).

Neither guardrail is enabled by default, so as not to disrupt clusters that are currently using any of the CLs for writes. The warning guardrail may seem harmless, as it only adds a warning to the CQL response; however, enabling it can significantly increase network traffic (as a warning message is added to each response) and also decrease throughput due to additional allocations required to prepare the warning. Therefore, both guardrails should be enabled with care. The newly added `writes_per_consistency_level` metric, which is incremented unconditionally, can help decide whether a guardrail can be safely enabled in an existing cluster.

This commit adds additional `if` instructions on the critical path. However, based on the `perf_simple_query` benchmark for writes, the difference is marginal (~40 additional instructions, which is a relative difference smaller than 0.001).

BEFORE:
```
291443.35 tps ( 53.3 allocs/op,  16.0 logallocs/op,  14.2 tasks/op,   48067 insns/op,   18885 cycles/op,        0 errors)
throughput:
 mean=   289743.07 standard-deviation=6075.60
 median= 291424.69 median-absolute-deviation=1702.56
 maximum=292498.27 minimum=261920.06
instructions_per_op:
 mean=   48072.30 standard-deviation=21.15
 median= 48074.49 median-absolute-deviation=12.07
 maximum=48119.87 minimum=48019.89
cpu_cycles_per_op:
 mean=   18884.09 standard-deviation=56.43
 median= 18877.33 median-absolute-deviation=14.71
 maximum=19155.48 minimum=18821.57
```

AFTER:
```
290108.83 tps ( 53.3 allocs/op,  16.0 logallocs/op,  14.2 tasks/op,   48121 insns/op,   18988 cycles/op,        0 errors)
throughput:
 mean=   289105.08 standard-deviation=3626.58
 median= 290018.90 median-absolute-deviation=1072.25
 maximum=291110.44 minimum=274669.98
instructions_per_op:
 mean=   48117.57 standard-deviation=18.58
 median= 48114.51 median-absolute-deviation=12.08
 maximum=48162.18 minimum=48087.18
cpu_cycles_per_op:
 mean=   18953.43 standard-deviation=28.76
 median= 18945.82 median-absolute-deviation=20.84
 maximum=19023.93 minimum=18916.46
```

Fixes: SCYLLADB-259
Refs: SCYLLADB-739
No backport, it's a new feature

Closes scylladb/scylladb#28570

* github.com:scylladb/scylladb:
  scylla.yaml: add write CL guardrails to scylla.yaml
  scylla.yaml: reorganize guardrails config to be in one place
  test: add cluster tests for write CL guardrails
  test: implement test_guardrail_write_consistency_level
  cql3: start using write CL guardrails
  cql3/query_processor: implement metrics to track CL of writes
  db: cql3/query_processor: add write_consistency_levels enum_sets
  config: add write_consistency_levels_* guardrails configuration
2026-03-05 09:55:38 +01:00
Yauheni Khatsianevich
aa85f5a9c3 test: migrating alternator ttl tests to scylla repo
migrating alternator_ttl_tests.py to scylla repo as part of
deprecating dtest framework
migrated tests:
- test_ttl_with_load_and_decommission

Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-869

Closes scylladb/scylladb#28858
2026-03-05 10:04:14 +02:00
Nadav Har'El
8e32d97be6 test/alternator: fix run script
The test/alternator/run script currently fails, Scylla fails to boot
complaining that "--alternator-ttl-period-in-seconds" is specified
twice (which is, unfortunately, not allowed). The problem is that
recently we started to set this option in test/cqlpy/run.py, for
CQL's new row-level TTL, so now it is no longer needed in
test/alternator/run - and in fact not allowed and we must remove it.

This patch only affects the script test/alternator/run, and has no
affect on running tests through test.py or Jenkins.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#28868
2026-03-05 10:06:38 +03:00
Botond Dénes
9b2242c752 test/cluster/test_repair.py: fix test_repair_timtestamp_difference
This test forgot to await its check() calls, which is the pass-condition
of the test. Once the await was added, the test started failing. Turns
out, the test was broken, but this was never discovered, because due to
the missing await, the errors were not propagated.
This patch adds the missing await and fixes the discovered problems:
* Use cql.run_async() instead of cql.execute()
* Fix json path for timestamp
* Missing flush/compact

Fixes: SCYLLADB-911

Closes scylladb/scylladb#28883
2026-03-05 10:04:49 +03:00