Commit Graph

53948 Commits

Author SHA1 Message Date
Ferenc Szili
6b3e18c4a9 test: verify load balancer handles dropped tables gracefully
Add test_load_balancing_with_dropped_table that simulates the race between
DROP TABLE and the load balancer by capturing a token metadata snapshot
before dropping the table, then passing the stale snapshot to
balance_tablets(). Verifies it completes without aborting and produces no
migrations for the dropped table.
2026-04-27 10:33:56 +02:00
Ferenc Szili
4987204f71 tablet_allocator: handle dropped tables gracefully in get_schema_and_rs
The load balancer's get_schema_and_rs() would trigger on_internal_error when
a table present in the token metadata snapshot had been concurrently dropped
from the live schema. This race is possible because the balancer coroutine
yields between building the candidate list and checking replication
constraints, allowing a DROP TABLE schema mutation to be applied by another
fiber in the meantime.

Change get_schema_and_rs() to return {nullptr, nullptr} for dropped tables
instead of aborting. Update all callers to skip dropped tables:
- make_sizing_plan: continue to next table
- make_resize_plan: continue to next table (merge suppression is moot)
- check_constraints: return skip_info with empty viable targets
- get_rs: return nullptr, checked by check_constraints
2026-04-27 10:33:53 +02:00
Anna Mikhlin
86472e43e1 Update ScyllaDB version to: 2026.3.0-dev 2026-04-26 15:30:13 +03:00
Andrei Chekun
f2f4915e09 test.py: fix framework test
Framework test was not skipping unit directory where C++ tests are
located. With bug fixing this started to fail. Add ignoring this
directory as well.
2026-04-25 18:04:55 +02:00
Piotr Szymaniak
d5efd1f676 test/cluster: wait for Alternator readiness in server startup
server_add() only waits for CQL readiness before returning. The
Alternator HTTP port may not be listening yet, causing
ConnectionRefused with Alternator tests.

Extend the ServerUpState enum and startup loop to also check Alternator
port readiness when configured. Whenever Alternator port(s) is/are
configured, each is verified if connectable and queryable,
similar to how CQL ports are probed.

Fixes SCYLLADB-1701

Closes scylladb/scylladb#29625
2026-04-25 16:35:44 +03:00
Piotr Smaron
d14d07a079 test: fix flaky test_sstable_write_large_{row,cell} by using a fixed partition key
Commit ce00d61917 ("db: implement large_data virtual tables with feature
flag gating") changed these two tests to construct their mutation with
a randomly generated partition key (simple_schema::make_pkey()) instead
of the previously fixed pk "pv", with the comment that this avoids a
"Failed to generate sharding metadata" error.

simple_schema::make_pkey() delegates to tests::generate_partition_key(),
which defaults to key_size{1, 128}, i.e. the partition key length is
uniformly random in [1, 128] bytes. That interacts badly with the fact
that both tests pick thresholds at exact byte boundaries of the MC
sstable row encoding:

  - The large-data handler records a row's size as
      _data_writer->offset() - current_pos
    (sstables/mx/writer.cc: collect_row_stats()), i.e. the number of
    bytes the row took on disk.
  - For the first clustering row, the body includes a vint-encoded
    prev_row_size = pos - _prev_row_start.
  - _prev_row_start is captured at the start of the partition
    (consume_new_partition()) before the partition key is written to
    the data stream, so prev_row_size rolls in the partition key's
    serialized length (2-byte prefix + pk bytes) + deletion_time +
    static row size.

A random-size partition key therefore perturbs the first clustering
row's encoded size by 1-2 bytes across runs (the vint of prev_row_size
crosses the 128 boundary), flipping the test's byte-exact threshold
comparison. On seed 2104744000 this produced:

  critical check row_size_count == expected.size() has failed [3 != 2]

Fix the two byte-exact-sensitive tests by reverting their partition key
to the fixed s.new_mutation("pv") used before ce00d61917. Under smp=1
(which these tests run with, per -c1 in the test invocation) a fixed
key is always shard-local, so no sharding-metadata issue arises here.

The other tests modified by ce00d61917 (test_sstable_log_too_many_rows,
test_sstable_log_too_many_dead_rows, test_sstable_too_many_collection_elements,
test_large_data_records_round_trip, etc.) assert on row/element counts
or use thresholds with enough slack that the partition key size does
not matter, and are left unchanged.

Add an explanatory comment to each fixed site so the pitfall is not
re-introduced by a future refactor.

Verified stable with:
  ./test.py --mode=dev     test/boost/sstable_3_x_test.cc::test_sstable_write_large_row  --repeat 100 --max-failures 1
  ./test.py --mode=dev     test/boost/sstable_3_x_test.cc::test_sstable_write_large_cell --repeat 100 --max-failures 1
  ./test.py --mode=release test/boost/sstable_3_x_test.cc::test_sstable_write_large_row  --repeat 100 --max-failures 1
  ./test.py --mode=release test/boost/sstable_3_x_test.cc::test_sstable_write_large_cell --repeat 100 --max-failures 1

All four invocations: 100/100 passed.

Fixes: SCYLLADB-1685

Closes scylladb/scylladb#29621
2026-04-25 16:32:02 +03:00
Andrei Chekun
92c09d106d test.py: fix test collection bug
In certain circumstances current way of collecting can be error prone.
Collection can stop when the first file is skipped in the mode leaving
the rest of the files in CLI not collected.
Another issue that if the file specified twice, with directory and file
explicitly, it will produce incorrect CppFile in the stash causing
KeyError.

Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1714
2026-04-24 17:57:11 +02:00
Dimitrios Symonidis
c40842f60a db, sstables: add node_owner to sstables registry primary key
Add a node_owner column (locator::host_id) to system.sstables and
make it part of the partition key, so the primary key becomes
  PRIMARY KEY ((table_id, node_owner), generation).

This is the first step toward moving the sstables registry into
system_distributed: once distributed, each node's startup scan
must read only the rows it owns, which requires the owning node
to be part of the partition key. Partitioning by (table_id,
node_owner) turns that scan into a single-partition read of
exactly the local node's rows.

The new column is populated via sstables_manager::get_local_host_id().
No backward compatibility is preserved; the feature is experimental
and gated by keyspace-storage-options.
2026-04-24 16:41:09 +02:00
Dimitrios Symonidis
ce78c5113e db, sstables: rename sstables registry column owner to table_id
The partition-key column in system.sstables named 'owner' actually
holds a table_id. Rename the CQL column and the matching C++
parameter and member names so the identifier describes what it
stores. No behavior change.

This prepares the schema for an upcoming node_owner partition-key
column (the local host id), which needs a free name.
2026-04-24 16:24:07 +02:00
Pavel Emelyanov
71b9704464 storage_proxy: Use shared updateable_timeout_config for CAS contention timeout
The cas_contention_timeout_in_ms option is already exposed via the
shared updateable_timeout_config as cas_timeout_in_ms. Read it from
there instead of going through db::config, dropping another use of
database as a db::config proxy.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-24 16:24:32 +03:00
Pavel Emelyanov
33cd3b5d68 alternator: Use shared updateable_timeout_config by reference
Pass sharded<updateable_timeout_config>& into alternator::controller
and through to alternator::server, which now stores a reference
instead of constructing its own updateable_timeout_config from
proxy.data_dictionary().get_config(). This removes the last
creator of a per-owner updateable_timeout_config copy and completes
the consolidation onto the single sharded<updateable_timeout_config>
instance built in main.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-24 15:29:39 +03:00
Pavel Emelyanov
1a045d0cdd cql_transport: Use shared updateable_timeout_config by reference
Pass sharded<updateable_timeout_config>& into cql_transport::controller,
which feeds the shard-local instance as a reference into
cql_server_config::timeout_config. This drops the per-shard local
updateable_timeout_config constructed from db::config inside the
controller's sharded_parameter lambda, replacing it with a reference
into the shared sharded instance.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-24 15:21:31 +03:00
Pavel Emelyanov
aa99c1fd6e storage_proxy: Use shared updateable_timeout_config by reference
Drop storage_proxy's own updateable_timeout_config member built from
db::config and take a reference to the shared sharded instance
introduced by the previous patch. Both main and cql_test_env pass
std::ref(timeout_cfg) into storage_proxy::start so each shard's
storage_proxy references its shard-local updateable_timeout_config.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-24 15:07:21 +03:00
Pavel Emelyanov
7b7295fde0 main: Introduce sharded<updateable_timeout_config>
Build a single sharded updateable_timeout_config from db::config in
both main and cql_test_env, sitting next to sharded<cql_config>.
Subsequent patches migrate storage_proxy, the CQL transport controller
and alternator server from their per-owner updateable_timeout_config
copies to references into this shared instance.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-24 15:03:35 +03:00
Andrzej Jackowski
8855e77465 auth: make shutdown the exact reverse of startup
The previous parallel stop of the authenticator and authorizer
was a micro-optimization that obscured the lifecycle invariant
that shutdown should reverse startup.

Refs SCYLLADB-1679
2026-04-24 13:34:09 +02:00
Andrzej Jackowski
adf1e26bab test: ldap: add test for pruner crash during shutdown
Verify that service::stop() drains the LDAP pruner before
clearing the permission loader. The test installs a slow
permission loader and confirms the pruner is actively
reloading when teardown begins.

Refs SCYLLADB-1679
2026-04-24 13:34:09 +02:00
Andrzej Jackowski
37a547604f auth: start authorizer and set permission loader before role manager
LDAP role manager starts a pruner fiber that calls
reload_all_permissions() which asserts _permission_loader is set.
The permission loader calls _authorizer->authorize(), so the
authorizer must be started before the loader is set.

Start authorizer, then set the permission loader, then start the
role manager, ensuring both dependencies are satisfied before the
pruner can fire.

Fixes SCYLLADB-1679
2026-04-24 13:34:09 +02:00
Andrzej Jackowski
c3e5285d45 auth: stop role manager before clearing permission loader
service::stop() cleared the permission loader and stopped
the role manager concurrently (via when_all_succeed). The
LDAP pruner could be mid-reload at a yield point when the
loader was set to null, causing it to call a null function.

Stop the role manager first so the pruner is fully drained
before the loader is cleared.

Fixes SCYLLADB-1679
2026-04-24 13:34:09 +02:00
Pavel Emelyanov
7ca8a863d9 storage_proxy: Keep own updateable_timeout_config
Storage_proxy was reading read_request_timeout_in_ms and
write_request_timeout_in_ms directly from db::config via
database::get_config() at four call sites. Give storage_proxy its own
updateable_timeout_config member (built from db::config the same way
cql transport controller and alternator server do) and use its
read_timeout_in_ms / write_timeout_in_ms observers instead.

Storage_proxy no longer needs database::get_config() for coordinator
timeout values. A later refactor may turn these per-owner copies into
references to a single shared updateable_timeout_config.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-24 14:27:09 +03:00
Andrzej Jackowski
f75e5ac65b auth: reload LDAP permission cache on local shard only
The LDAP role manager's _cache_pruner fiber used
invoke_on_all() to reload permissions on every shard.
Since auth::service::start() runs on all shards in
parallel via invoke_on_all(), the pruner on shard X
could call reload_all_permissions() on shard Y before
shard Y finished start() and set its permission loader,
hitting SCYLLA_ASSERT(_permission_loader). The same
cross-shard race existed during shutdown.

Each shard runs its own pruner instance, so reloading
locally is sufficient — all shards are still covered.
This also removes redundant N-squared reload calls.

Refs SCYLLADB-1679
2026-04-24 13:06:58 +02:00
Pavel Emelyanov
111165d9de view: Turn calculate_view_update_throttling_delay into node_update_backlog member
The free function calculate_view_update_throttling_delay() took the
view_flow_control_delay_limit_in_ms as a parameter, which forced its
two callers (storage_proxy and view_update_generator) to fish the
option out of db::config via database::get_config(). Now that the
option lives on node_update_backlog, make the throttling calculation a
member of node_update_backlog and have the callers invoke it on their
node_update_backlog reference.

This removes two database::get_config() call sites.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-24 13:52:12 +03:00
Pavel Emelyanov
855372db3c view: Place view_flow_control_delay_limit_in_ms on node_update_backlog
Store the view_flow_control_delay_limit_in_ms config option as an
updateable_value on node_update_backlog. The value is threaded from
main.cc into the backlog object at construction time. Existing call
sites (tests) that construct node_update_backlog without the option
continue to work via a default argument.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-24 13:47:54 +03:00
Pavel Emelyanov
ec2339e635 view: Add node_update_backlog reference to view_update_generator
Pass node_update_backlog explicitly to view_update_generator via its
constructor and start() call. This is plumbing only; no behavior change.
A subsequent patch will use this reference to compute view update
throttling delays without going through database::get_config().

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-24 13:45:46 +03:00
Botond Dénes
70261dc674 Merge 'test/cluster: scale failure_detector_timeout_in_ms by build mode' from Marcin Maliszkiewicz
The failure_detector_timeout_in_ms override of 2000ms in 6 cluster test files is too aggressive for debug/sanitize builds. During node joins, the coordinator's failure detector times out on RPC pings to the joining node while it is still applying schema snapshots, marks it DOWN, and bans it — causing flaky test failures.

Scale the timeout by MODES_TIMEOUT_FACTOR (3x for debug/sanitize, 2x for dev, 1x for release) via a shared failure_detector_timeout fixture in conftest.py.

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1587
Backport: no, elasticsearch analyser shows only a single failure

Closes scylladb/scylladb#29522

* github.com:scylladb/scylladb:
  test/cluster: scale failure_detector_timeout_in_ms by build mode
  test/cluster: add failure_detector_timeout fixture
2026-04-24 09:10:43 +03:00
Botond Dénes
d280517e27 test/cluster/test_incremental_repair: fix flaky do_tablet_incremental_repair_and_ops
The log grep in get_sst_status searched from the beginning of the log
(no from_mark), so the second-repair assertions were checking cumulative
counts across both repairs rather than counts for the second repair alone.

The expected values (sst_add==2, sst_mark==2) relied on this cumulative
behaviour: 1 from the first repair + 1 from the second = 2. This works
when the second repair encounters exactly one unrepaired sstable, but
fails whenever the second repair sees two.

The second repair can see two unrepaired sstables when the 100 keys
inserted before it (via asyncio.gather) trigger a background auto-flush
before take_storage_snapshot runs. take_storage_snapshot always flushes
the memtable itself, so if an auto-flush already split the batch into two
sstables on disk, the second repair's snapshot contains both and logs
"Added sst" twice, making the cumulative count 3 instead of 2.

Fix: take a log mark per-server before each repair call and pass it to
get_sst_status so each check counts only the entries produced by that
repair. The expected values become 1/0/1 and 1/1/1 respectively,
independent of how many sstables happened to exist beforehand.

get_sst_status gains an optional from_mark parameter (default None)
which preserves existing call sites that intentionally grep from the
start of the log.

Fixes: SCYLLADB-1086

Closes scylladb/scylladb#29484
2026-04-23 17:17:16 +02:00
Wojciech Mitros
7634d3f7d4 test/cluster: fix flaky test_hints_consistency_during_replace
The test creates a sync point immediately after writing 100 rows
with CL=ANY, without waiting for pending hint writes to complete.

store_hint() is fire-and-forget: it submits do_store_hint() to a gate
and returns immediately. do_store_hint() updates _last_written_rp only
after writing to the commitlog. If create_sync_point() is called before
all do_store_hint() coroutines complete, the captured replay position
is stale, and await_sync_point() returns DONE before all hints are
replayed, leaving some rows missing.

Fix by waiting for the size_of_hints_in_progress metric to reach zero
before creating the sync point, ensuring all in-flight hint writes have
completed and _last_written_rp is up to date. This follows the same
pattern already used in test_sync_point.

Fixes: SCYLLADB-1560

Closes scylladb/scylladb#29623
2026-04-23 17:03:48 +02:00
Botond Dénes
b49cf6247f test: fix flaky test_read_repair_with_trace_logging by reading tracing with CL=ALL
Tracing events are written to system_traces.events with CL=ANY, so they
are only guaranteed to be present on the local node of the query
coordinator. Reading them back with the driver default (CL=LOCAL_ONE)
may route the query to a replica that has not yet received all events,
causing the assertion on 'digest mismatch, starting read repair' to fail
intermittently.

Fix execute_with_tracing() to read tracing via the ResponseFuture API
with query_cl=ConsistencyLevel.ALL, so events from all replicas are
merged before the caller inspects them.

Fixes: SCYLLADB-1633

Closes scylladb/scylladb#29566
2026-04-23 16:57:29 +02:00
Michał Jadwiszczak
878f341338 test/cluster/test_view_building_coordinator: fix view_updates_drained predicate
The previous fix for the flakiness in test_file_streaming waited for
the scylla_database_view_update_backlog metric to drop to 0 via
wait_for(view_updates_drained, ...). However, the predicate returned
True/False, while wait_for treats any non-None result as 'done' and
keeps retrying only on None. So when the backlog was non-zero the
predicate returned False, which wait_for interpreted as success and
returned immediately - the test could then stop servers[0]/servers[1]
before the view updates generated by new_server from the migrated
staging sstable were actually delivered, leading to a partially
populated MV (e.g. 431/1000 rows) and a failing assertion.

Fix the predicate to return None instead of False when the backlog is
not yet drained, so wait_for will actually retry until the metric
reaches 0 (or the deadline is hit).

Fixes SCYLLADB-1182

Closes scylladb/scylladb#29587
2026-04-23 17:52:22 +03:00
Andrei Chekun
67b3ad94a0 test.py: enhance error output in case no tests were executed
By default, pytest produces the error if provided file is not exists. But
coupled with xdist it will produce no errors. This is due how the pytest
works with xdist. test.py always uses the parameter -n, so if something
will go wrong there will be no errors produced, only exit code 5 will be
thrown. This PR will print warning in case pytest's exit code is 5.

Closes scylladb/scylladb#29584
2026-04-23 14:03:55 +02:00
Calle Wilund
c97ce32f47 Update position in dma_read(iovec) in create_file_for_seekable_source
Fixes: SCYLLADB-1523

The returned file object does not increment file pos as is. One line fix.
Added test to make sure this read path works as expected.

Closes scylladb/scylladb#29456
2026-04-23 14:54:20 +03:00
Michael Litvak
3468e8de8b test/mv/test_mv_staging: wait for cql after restart
Wait for cql on all hosts after restarting a server in the test.

The problem that was observed is that the test restarts servers[1] and
doesn't wait for the cql to be ready on it. On test teardown it drops
the keyspace, trying to execute it on the host that is not ready, and
fails.

Fixes SCYLLADB-1632

Closes scylladb/scylladb#29562
2026-04-23 12:40:19 +02:00
Benny Halevy
6cb4c27f8c test/cluster/dtest/ccmlib/scylla_node: add debug logging
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2026-04-23 09:21:06 +03:00
Andrzej Jackowski
2503546251 test: audit: parameterize source address in audit assertions
Maintenance socket connections report a different source address
than regular CQL connections. Make the source field configurable
in the audit test helpers so that upcoming maintenance socket
tests can verify the correct address.

Also fix the syslog backend address parser to handle IPv6
addresses formatted as [ip]:port.

Refs SCYLLADB-1615
2026-04-23 07:02:02 +02:00
Marcin Maliszkiewicz
3df951bc9c Merge 'audit: set audit_info for native-protocol BATCH messages' from Andrzej Jackowski
Commit 16b56c2451 ("Audit: avoid dynamic_cast on a hot path") moved
audit info into batch_statement via set_audit_info(), but only wired it
for the CQL-text BATCH path (raw::batch_statement::prepare()).
Native-protocol BATCH messages (opcode 0x0D), handled by
process_batch_internal in transport/server.cc, construct a
batch_statement without setting audit_info. This causes audit to
silently skip the entire batch.

Set audit_info on the batch_statement so these batches are audited.

Fixes SCYLLADB-1652

No backport - bug introduced recently.

Closes scylladb/scylladb#29570

* github.com:scylladb/scylladb:
  test/audit: add reproducer for native-protocol batch not being audited
  audit: set audit_info for native-protocol BATCH messages
  test/audit: rename internal test methods to avoid CI misdetection
2026-04-22 18:56:28 +02:00
Piotr Szymaniak
9a86044c63 test: Stop providing alternator-streams experimental flag
Now that alternator-streams is no longer an experimental feature,
stop passing it in test configurations.
2026-04-22 15:25:37 +02:00
Piotr Szymaniak
870013b437 alternator: Graduate Alternator Streams from experimental
Alternator Streams were experimental until 2026.2, when they became GA.
Stop requiring `--experimental-features=alternator-streams` by:

- Removing ALTERNATOR_STREAMS from the experimental feature enum
- Mapping "alternator-streams" to UNUSED for backward compatibility
- Removing the gating that disabled the ALTERNATOR_STREAMS gossip
  feature when the experimental flag was absent
- Removing the runtime guard that rejected StreamSpecification requests
  without the feature flag
- Updating config_test to reflect the new UNUSED mapping

The gms::feature alternator_streams is kept for rolling upgrade
compatibility with older nodes.

Fixes SCYLLADB-1680
2026-04-22 15:22:15 +02:00
Botond Dénes
eb3326b417 Merge 'test.py: migrate all bare skips to typed skip markers' from Artsiom Mishuta
should be merged after #29235

Complete the typed skip markers migration started in the plugin PR.
Every bare `@pytest.mark.skip` decorator and `pytest.skip()` runtime call
across the test suite is replaced with a typed equivalent, making skip
reasons machine-readable in JUnit XML and Allure reports.

**62 files changed** across 8 commits, covering ~127 skip sites in total.

Bare `pytest.skip` provides only a free-text reason string. CI dashboards
(JUnit, Allure) cannot distinguish between a test skipped due to a known
bug, a missing feature, a slow test, or an environment limitation. This
makes it hard to track skip debt, prioritize fixes, or filter dashboards
by skip category.

The typed markers (`skip_bug`, `skip_not_implemented`, `skip_slow`,
`skip_env`) introduced by the `skip_reason_plugin` solve this by embedding
a `skip_type` field into every skip report entry.

| Type | Count | Files | Description |
|------|-------|-------|-------------|
| `skip_bug` | 24 | 16 | Skip reason references a known bug/issue |
| `skip_not_implemented` | 10 | 5 | Feature not yet implemented in Scylla |
| `skip_slow` | 4 | 3 | Test too slow for regular CI runs |
| `skip_not_implemented` (bare) | 2 | 1 | Bare `@pytest.mark.skip` with no reason (COMPACT STORAGE, #3882) |

| Type | Count | Files | Description |
|------|-------|-------|-------------|
| `skip_env` | ~85 | 34 | Feature/config/topology not available at runtime |
| `skip_bug` | 2 | 2 | Known bugs: Streams on tablets (#23838), coroutine task not found (#22501) |

- **Comments**: 7 comments/docstrings across 5 files updated from `pytest.skip()` to `skip()`
- **Plugin hardened**: `warnings.warn()` → `pytest.UsageError` for bare `@pytest.mark.skip` at collection time — bare skips are now a hard error, not a warning
- **Guard tests**: New `test/pylib_test/test_no_bare_skips.py` with 3 tests that prevent regression:
  - AST scan for bare `@pytest.mark.skip` decorators
  - AST scan for bare `pytest.skip()` runtime calls
  - Real `pytest --collect-only` against all Python test directories

Runtime skip sites use the convenience wrappers from `test.pylib.skip_types`:
```python
from test.pylib.skip_types import skip_env
```

Usage:
```python
skip_env("Tablets not enabled")
```

1. **test: migrate @pytest.mark.skip to @pytest.mark.skip_bug for known bugs** — 24 decorator sites, 16 files
2. **test: migrate @pytest.mark.skip to @pytest.mark.skip_not_implemented** — 10 decorator sites, 5 files
3. **test: migrate @pytest.mark.skip to @pytest.mark.skip_slow** — 4 decorator sites, 3 files
4. **test: migrate bare @pytest.mark.skip to skip_not_implemented** — 2 bare decorators, 1 file
5. **test: migrate runtime pytest.skip() to typed skip_env()** — ~85 sites, 34 files
6. **test: migrate runtime pytest.skip() to typed skip_bug()** — 2 sites, 2 files
7. **test: update comments referencing pytest.skip() to skip()** — 7 comments, 5 files
8. **test/pylib: reject bare pytest.mark.skip and add codebase guards** — plugin hardening + 3 guard tests

- All 60 plugin + guard tests pass (`test/pylib_test/`)
- No bare `@pytest.mark.skip` or `pytest.skip()` calls remain in the codebase
- `pytest --collect-only` succeeds across all test directories with the hardened plugin

SCYLLADB-1349

Closes scylladb/scylladb#29305

* github.com:scylladb/scylladb:
  test/alternator: replace bare pytest.skip() with typed skip helpers
  test: migrate new bare skips introduced by upstream after rebase
  test/pylib: reject bare pytest.mark.skip and add codebase guards
  test: update comments referencing pytest.skip() to skip_env()
  test: migrate runtime pytest.skip() to typed skip_bug()
  test: migrate runtime pytest.skip() to typed skip_env()
  test: migrate bare @pytest.mark.skip to skip_not_implemented
  test: migrate @pytest.mark.skip to @pytest.mark.skip_slow
  test: migrate @pytest.mark.skip to @pytest.mark.skip_not_implemented
  test: migrate @pytest.mark.skip to @pytest.mark.skip_bug for known bugs
2026-04-22 15:48:27 +03:00
Avi Kivity
e84e7dfb7a build: drop utils/rolling_max_tracker.hh from precompiled header
Added by mistake. Precompiled headers should only include library
headers that rarely change, since any dependency change causes a
full rebuild.

Closes scylladb/scylladb#29560
2026-04-22 15:46:50 +03:00
Botond Dénes
3aced88586 Merge 'audit: decrease allocations / instructions on will_log() fast path' from Marcin Maliszkiewicz
Audit::will_log() runs on every CQL/Alternator request. Since
9646ee05bd it constructs three temporary sstrings per call to look up
the audited keyspaces set / tables map with std::string_view keys,
costing ~180 insns/op and 2 allocations if sstring misses SSO.

This series switches the containers to std::less<> comparators to
enable heterogeneous lookup, then drops the sstring temporaries from
will_log().

perf-simple-query --smp 1 --duration 15 --audit "table"
                  --audit-keyspaces "ks-non-existing"
                  --audit-categories "DCL,DDL,AUTH,DML,QUERY"

  baseline         3d0582d51e          36777 insns/op
  regression     9646ee05bd          36952        (+175)
  this series                                      36768        (-184, fixed)

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1616
Backport: no, offending commit is not backported

Closes scylladb/scylladb#29565

* github.com:scylladb/scylladb:
  audit: drop sstring temporaries on the will_log() fast path
  audit: enable heterogeneous lookup on audited keyspaces/tables
2026-04-22 15:46:16 +03:00
Marcin Maliszkiewicz
4043d95810 Merge 'storage_service: fix REST API races during shutdown and cross-shard forwarding' from Piotr Smaron
REST route removal unregisters handlers but does not wait for requests
that already entered storage_service.  A request can therefore suspend
inside an async operation, restart proceeds to tear the service down,
and the coroutine later resumes against destroyed members such as
_topology_state_machine, _group0, or _sys_ks — a use-after-destruction
bug that surfaces as UBSAN dynamic-type failures (e.g. the crash seen
from topology_state_load()).

Fix this by holding storage_service::_async_gate from the entry
boundary of every externally-triggered async operation so that stop()
drains them before teardown begins.  The gate is acquired in
run_with_api_lock, run_with_no_api_lock, and in individual REST
handlers that bypass those wrappers (reload_raft_topology_state,
mark_excluded, removenode, schema reload, topology-request
waits/abort, cleanup, ring/schema queries, SSTable dictionary
training/publish, and sampling).

Additionally, fix get_ownership() and abort_topology_request() which
forward work to shard 0 but were still referencing the caller-shard's
`this` pointer instead of the destination-shard instance, causing
silent cross-shard access to shard-local state.
Add a cluster regression test that repeatedly exercises the multi-shard
ownership REST path to cover the forwarding fix.

Fixes: SCYLLADB-1415

Should be backported to all branches, the code has been introduced around 2024.1 release.

Closes scylladb/scylladb#29373

* github.com:scylladb/scylladb:
  storage_service: fix shard-0 forwarding in REST helpers
  storage_service: gate REST-facing async operations during shutdown
  storage_service: prepare for async gate in REST handlers
2026-04-22 14:43:31 +02:00
Radosław Cybulski
cc39b54173 alternator: use stream_arn instead of std::string in list_streams
Use `stream_arn` object for storage of last returned to the user stream
instead of raw `std::string`. `stream_arn` is used for parsing ARN
incoming from the user, for returning `std::string` was used because of
buggy copy / move operations of `stream_arn`. Those were fixed, so we're
fixing usage as well.

Fixes: SCYLLADB-1241

Closes scylladb/scylladb#29578
2026-04-22 14:02:53 +02:00
Artsiom Mishuta
183c6d120e test: exclude pylib_test from default test runs
Add pylib_test to norecursedirs in pytest.ini so it is not collected
during ./test.py or pytest test/ runs, but can still be run directly
via 'pytest test/pylib_test'.

Also fix pytest log cleanup: worker log files (pytest_gw*) were not
being deleted on success because cleanup was restricted to the main
process only. Now each process (main and workers) cleans up its own
log file on success.

Closes scylladb/scylladb#29551
2026-04-22 11:38:40 +02:00
Piotr Smaron
dffb266b79 storage_service: fix shard-0 forwarding in REST helpers
get_ownership() and abort_topology_request() forward work to shard 0
via container().invoke_on(0, ...) but the lambda captured 'this' and
accessed members through it instead of through the shard-0 'ss'
parameter.  This means the lambda used the caller-shard's instance,
defeating the purpose of the forwarding.

Use the 'ss' parameter consistently so the operations run against the
correct shard-0 state.
2026-04-22 10:30:33 +02:00
Piotr Smaron
6a91d046f3 storage_service: gate REST-facing async operations during shutdown
Hold _async_gate in all REST-facing async operations so that stop()
drains in-flight requests before teardown, preventing use-after-free
crashes when REST calls race with shutdown.

A centralized gated() wrapper in set_storage_service (api/storage_service.cc)
automatically holds the gate for every REST handler registered there,
so new handlers get shutdown-safety by default.

run_with_api_lock_internal and run_with_no_api_lock hold _async_gate on
shard 0 as well, because REST requests arriving on any shard are forwarded
there for execution.

Methods that previously self-forwarded to shard 0 (mark_excluded,
prepare_for_tablets_migration, set_node_intended_storage_mode,
get_tablets_migration_status, finalize_tablets_migration) now assert
this_shard_id() == 0.  Their REST handlers call them via
run_with_no_api_lock, which performs the shard-0 hop and gate hold
centrally.

Fixes: SCYLLADB-1415
2026-04-22 10:30:33 +02:00
Piotr Smaron
74dd33811e storage_service: prepare for async gate in REST handlers
Add hold_async_gate() public accessor for use by the REST registration
layer in a followup commit.

Convert run_with_no_api_lock to a coroutine so a followup commit can
hold the async gate across the entire forwarded operation.

No functional changes.
2026-04-22 10:28:54 +02:00
Michał Jadwiszczak
2b29962583 test/strong_consistency: verify metrics
This patch adds simple asserts to an existing `test_basic_write_read`
to verify that strong consistency metrics are correctly collected.
2026-04-22 10:06:49 +02:00
Botond Dénes
18ceeaf3ef Merge 'Restrict tombstone GC sstable set to repaired sstables for tombstone_gc=repair mode' from Raphael Raph Carvalho
When tombstone_gc=repair, the repaired compaction view's sstable_set_for_tombstone_gc()
previously returned all sstables across all three views (unrepaired, repairing, repaired).
This is correct but unnecessarily expensive: the unrepaired and repairing sets are never
the source of a GC-blocking shadow when tombstone_gc=repair, for base tables.

The key ordering guarantee that makes this safe is:
- topology_coordinator sends send_tablet_repair RPC and waits for it to complete.
  Inside that RPC, mark_sstable_as_repaired() runs on all replicas, moving D from
  repairing → repaired (repaired_at stamped on disk).
- Only after the RPC returns does the coordinator commit repair_time + sstables_repaired_at
  to Raft.
- gc_before = repair_time - propagation_delay only advances once that Raft commit applies.

Therefore, when a tombstone T in the repaired set first becomes GC-eligible (its
deletion_time < gc_before), any data D it shadows is already in the repaired set on
every replica. This holds because:
- The memtable is flushed before the repairing snapshot is taken (take_storage_snapshot
  calls sg->flush()), capturing all data present at repair time.
- Hints and batchlog are flushed before the snapshot, ensuring remotely-hinted writes
  arrive before the snapshot boundary.
- Legitimate unrepaired data has timestamps close to 'now', always newer than any
  GC-eligible tombstone (USING TIMESTAMP to write backdated data is user error / UB).

Excluding the repairing and unrepaired sets from the GC shadow check cannot cause any
tombstone to be wrongly collected. The memtable check is also skipped for the same
reason: memtable data is either newer than the GC-eligible tombstone, or was flushed
into the repairing/repaired set before gc_before advanced.

Safety restriction — materialized views:
The optimization IS applied to materialized view tables. Two possible paths could inject
D_view into the MV's unrepaired set after MV repair: view hints and staging via the
view-update-generator. Both are safe:

(1) View hints: flush_hints() creates a sync point covering BOTH _hints_manager (base
mutations) AND _hints_for_views_manager (view mutations). It waits until ALL pending view
hints — including D_view entries queued in _hints_for_views_manager while the target MV
replica was down — have been replayed to the target node before take_storage_snapshot() is
called. D_view therefore lands in the MV's repairing sstable and is promoted to repaired.
When a repaired compaction then checks for shadows it finds D_view in the repaired set,
keeping T_mv non-purgeable.

(2) View-update-generator staging path: Base table repair can write a missing D_base to a
replica via a staging sstable. The view-update-generator processes the staging sstable
ASYNCHRONOUSLY: it may fire arbitrarily later, even after MV repair has committed
repair_time and T_mv has been GC'd from the repaired set. However, the staging processor
calls stream_view_replica_updates() which performs a READ-BEFORE-WRITE via
as_mutation_source_excluding_staging(): it reads the CURRENT base table state before
building the view update. If T_base was written to the base table (as it always is before
the base replica can be repaired and the MV tombstone can become GC-eligible), the
view_update_builder sees T_base as the existing partition tombstone. D_base's row marker
(ts_d < ts_t) is expired by T_base, so the view update is a no-op: D_view is never
dispatched to the MV replica. No resurrection can occur regardless of how long staging is
delayed.

A potential sub-edge-case is T_base being purged BEFORE staging fires (leaving D_base as
the sole survivor, so stream_view_replica_updates would dispatch D_view). This is blocked
by an additional invariant: for tablet-based tables, the repair writer stamps repaired_at
on staging sstables (repair_writer_impl::create_writer sets mark_as_repaired = true and
perform_component_rewrite writes repaired_at = sstables_repaired_at + 1 on every staging
sstable). After base repair commits sstables_repaired_at to Raft, the staging sstable
satisfies is_repaired(sstables_repaired_at, staging_sst) and therefore appears in
make_repaired_sstable_set(). Any subsequent base repair that advances sstables_repaired_at
further still includes the staging sstable (its repaired_at ≤ new sstables_repaired_at).
D_base in the staging sstable thus shadows T_base in every repaired compaction's shadow
check, keeping T_base non-purgeable as long as D_base remains in staging.

A base table hint also cannot bypass this. A base hint is replayed as a base mutation. The
resulting view update is generated synchronously on the base replica and sent to the MV
replica via _hints_for_views_manager (path 1 above), not via staging.

USING TIMESTAMP with timestamps predating (gc_before + propagation_delay) is explicitly
UB and excluded from the safety argument.

For tombstone_gc modes other than repair (timeout, immediate, disabled) the invariant
does not hold for base tables either, so the full storage-group set is returned.

The expected gain is reduced bloom filter and memtable key-lookup I/O during repaired
compactions: the unrepaired set is typically the largest (it holds all recent writes),
yet for tombstone_gc=repair it never influences GC decisions.

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-231.

Closes scylladb/scylladb#29310

* github.com:scylladb/scylladb:
  compaction: Restrict tombstone GC sstable set to repaired sstables for tombstone_gc=repair mode
  test/repair: Add tombstone GC safety tests for incremental repair
2026-04-22 10:21:37 +03:00
Michał Jadwiszczak
7352b37048 test/cluster/test_view_building_coordinator: add reproducer for tombstone threshold warning 2026-04-22 09:10:14 +02:00
Michał Jadwiszczak
396d4b17a0 docs: document tombstone avoidance in view_building_tasks 2026-04-22 09:10:14 +02:00
Michał Jadwiszczak
1162fd315e view_building: add task_uuid_generator to view_building_task_mutation_builder
Following previous commit, use the generator in view building task mutation builder.
2026-04-22 09:10:14 +02:00