Commit Graph

11361 Commits

Author SHA1 Message Date
Dawid Mędrek
a8bc90a375 Merge 'cql3: fix DESCRIBE INDEX WITH INTERNALS name' from Piotr Smaron
This series fixes two related inconsistencies around secondary-index
names.
1. `DESCRIBE INDEX ... WITH INTERNALS` returned the backing
   materialized-view name in the `name` column instead of the logical
   index name.
2. The snapshot REST API accepted backing table names for MV-backed
   secondary indexes, but not the logical index names exposed to users.

The snapshot side now resolves logical secondary-index names to backing
table names where applicable, reports logical index names in snapshot
details, rejects vector index names with HTTP 400, and keeps multi-keyspace
DELETE atomic by resolving all keyspaces before deleting anything.
The tests were also extended accordingly, and the snapshot test helper
was fixed to clean up multi-table snapshots using one DELETE per table.

Fixes: SCYLLADB-1122

Minor bugfix, no need to backport.

Closes scylladb/scylladb#29083

* github.com:scylladb/scylladb:
  cql3: fix DESCRIBE INDEX WITH INTERNALS name
  test: add snapshot REST API tests for logical index names
  test: fix snapshot cleanup helper
  api: clarify snapshot REST parameter descriptions
  api: surface no_such_column_family as HTTP 400
  db: fix clear_snapshot() atomicity and use C++23 lambda form
  db: normalize index names in get_snapshot_details()
  db: add resolve_table_name() to snapshot_ctl
2026-04-09 08:37:51 +03:00
Piotr Smaron
d458ff50b0 cql3: fix DESCRIBE INDEX WITH INTERNALS name
DESCRIBE INDEX ... WITH INTERNALS returned the name of
the backing materialized view in the name column instead
of the logical index name.

Return the logical index name from schema::describe()
for index schemas so all callers observe the
user-facing name consistently.

Fixes: SCYLLADB-1122
2026-04-08 13:38:17 +02:00
Piotr Smaron
04837ba20f test: add snapshot REST API tests for logical index names
Add focused REST coverage for logical secondary-index
names in snapshot creation, deletion, and details
output.

Also cover vector-index rejection and verify
multi-keyspace delete resolves all keyspaces before
deleting anything so mixed index kinds cannot cause
partial removal.
2026-04-08 13:38:17 +02:00
Piotr Smaron
6b85da3ce3 test: fix snapshot cleanup helper
The snapshot REST helper cleaned up multi-table
snapshots with a single DELETE request that passed a
comma-separated cf filter, but the API accepts only one
table name there.

Delete each table snapshot separately so existing tests
that snapshot multiple tables use the API as
documented.
2026-04-08 13:36:27 +02:00
Piotr Smaron
39baa1870e db: normalize index names in get_snapshot_details()
Snapshot details exposed backing secondary-index view
names instead of logical index names.

Normalize index entries in get_snapshot_details() so the
REST API reports the user-facing name, and update the
existing REST test to assert that behavior directly.
2026-04-08 13:36:27 +02:00
Petr Gusev
7750d5737c strong consistency: replace local consistency with global
Currently we don't support 'local' consistency, which would
imply maintaining separate raft group for each dc. What we
support is actually 'global' consistency -- one raft group
per tablet replica set. We don't plan to support local
consistency for the first GA.

Closes scylladb/scylladb#29221
2026-04-08 12:52:32 +02:00
Patryk Jędrzejczak
850db950f8 Merge 'raft: include demoted voters in read barrier during joint config' from Qian Cheng
Hi, thanks for Scylla!

We found a small issue in tracker::set_configuration() during joint consensus and put together a fix.

When a server is demoted from voter to non-voter, set_configuration processes the current config first (can_vote=false), then the previous config. But when it finds the server already in the progress map (tracker.cc:118), it hits `continue` without updating can_vote. So the server's follower_progress::can_vote stays false even though it's still a voter in the previous config.

This causes broadcast_read_quorum (fsm.cc:1055) to skip the demoted server, reducing the pool of responders. Since committed() correctly includes the server in _previous_voters for quorum calculation, read barriers can stall if other servers are slow.

The fix is to use configuration::can_vote() in tracker::set_configuration.

We included a reproduction unit test (test_tracker_voter_demotion_joint_config) that extracts the set_configuration algorithm and demonstrates the mismatch. We weren't able to build the full Scylla test suite to add an in-tree test, so we kept it as a standalone file for reference.

No backport: the bug is non-critical and the change needs some soak time in master.

Closes scylladb/scylladb#29226

* https://github.com/scylladb/scylladb:
  fix: use is_voter::yes instead of true in test assertions
  test: add tracker voter demotion test to fsm_test.cc
  fix: use configuration::can_vote() in tracker::set_configuration
2026-04-08 12:37:27 +02:00
Qian-Cheng-nju
a416238155 test: add tracker voter demotion test to fsm_test.cc 2026-04-08 12:37:19 +02:00
Michał Jadwiszczak
568f20396a test: fix flaky test_create_index_synchronous_updates trace event race
The test_create_index_synchronous_updates test in test_secondary_index_properties.py
was intermittently failing with 'assert found_wanted_trace' because the expected
trace event 'Forcing ... view update to be synchronous' was missing from the
trace events returned by get_query_trace().

Root cause: trace events are written asynchronously to system_traces.events.
The Python driver's populate() method considers a trace complete once the
session row in system_traces.sessions has duration IS NOT NULL, then reads
events exactly once. Since the session row and event rows are written as
separate mutations with no transactional guarantee, the driver can read an
incomplete set of events.

Evidence from the failed CI run logs:
- The entire test (CREATE TABLE through DROP TABLE) completed in ~300ms
  (01:38:54,859 - 01:38:55,157)
- The INSERT with tracing happened in a ~50ms window between the second
  CREATE INDEX completing (01:38:55,108) and DROP TABLE starting
  (01:38:55,157)
- The 'Forcing ... synchronous' trace message is generated during the
  INSERT write path (db/view/view.cc:2061), so it was produced, but
  not yet flushed to system_traces.events when the driver read them
- This matches the known limitation documented in test/alternator/
  test_tracing.py: 'we have no way to know whether the tracing events
  returned is the entire trace'

Fix: replace the single-shot trace.events read with a retry loop that
directly queries system_traces.events until the expected event appears
(with a 30s timeout). Use ConsistencyLevel.ONE since system_traces has
RF=2 and cqlpy tests run on a single-node cluster.

The same race condition pattern exists in test_mv_synchronous_updates in
test_materialized_view.py (which this test was modeled after), so the
same fix is proactively applied there as well.

Fixes SCYLLADB-1314

Closes scylladb/scylladb#29374
2026-04-08 12:35:10 +02:00
Botond Dénes
418141ec08 Merge 'Drop create_dataset() helper from object_store tests' from Pavel Emelyanov
There's only one test left that uses it, and it can be patched to use standard ks/cf creation helpers from pylib. This patch does so and drops the lengthy create_dataset() helper

Tests improvements, no need to backport

Closes scylladb/scylladb#29176

* github.com:scylladb/scylladb:
  test/backup: drop create_dataset helper
  test/backup: use new_test_keyspace in test_restore_primary_replica
2026-04-08 12:19:54 +03:00
Petr Gusev
1e3c8c5a87 test_mutation_schema_change: use tablets
The enable_tablets(false) was added when LWT wasn't supported for tablets, now it's, so no need in this attribute are more.

The test covers behavior which should work in similar way for both vnodes and tablets -> it doesn't seem it would benefit much from running it in both enable_tablets(true) and enable_tablets(false) modes.

Closes scylladb/scylladb#29167
2026-04-08 12:19:54 +03:00
Botond Dénes
aeefbda304 Merge 'Simplify and improve API descibe_ring code flow' from Pavel Emelyanov
The endpoint in question has some places worth fixing, in particular

- the keyspace parameter is not validated
- the validated table name is resolved into table_id, but the id is unused
- two ugly static helpers to stream obtained token ranges into json

Improving the API code flow, not backporting

Closes scylladb/scylladb#29154

* github.com:scylladb/scylladb:
  api: Inline describe_ring JSON handling
  storage_service: Make describe_ring_for_table() take table_id
2026-04-08 10:50:07 +03:00
Artsiom Mishuta
b1e9c0b867 test/pylib: add typed skip markers plugin
Add skip_reason_plugin.py — a framework-agnostic pytest plugin that
provides typed skip markers (skip_bug, skip_not_implemented, skip_slow,
skip_env) so that the reason a test is skipped is machine-readable in
JUnit XML and Allure reports.  Bare untyped pytest.mark.skip now
triggers a warning (to become an error after full migration).  Runtime
skips via skip() are also enriched by parsing the [type] prefix from
the skip message.

The plugin is a class (SkipReasonPlugin) that receives the concrete
SkipType enum and an optional report_callback from conftest.py, keeping
it decoupled from allure and project-specific types.

Extract SkipType enum and convenience runtime skip wrappers (skip_bug,
skip_env, etc.) into test/pylib/skip_types.py so callers only need a
single import instead of importing both SkipType and skip() separately.
conftest.py imports SkipType from the new module and registers the
plugin instance unconditionally (for all test runners).

New files:
- test/pylib/skip_reason_plugin.py: core plugin — typed marker
  processing, bare-skip warnings, JUnit/Allure report enrichment
  (including runtime skip() parsing via _parse_skip_type helper)
- test/pylib/skip_types.py: SkipType enum and convenience wrappers
  (skip_bug, skip_not_implemented, skip_slow, skip_env)
- test/pylib_test/test_skip_reason_plugin.py: 17 pytester-based
  test functions (51 cases across 3 build modes) covering markers,
  warnings, reports, callbacks, and skip_mode interaction

Infrastructure changes:
- test/conftest.py: import SkipType from skip_types, register
  SkipReasonPlugin with allure report callback
- test/pylib/runner.py: set SKIP_TYPE_KEY/SKIP_REASON_KEY stash keys
  for skip_mode so the report hook can enrich JUnit/Allure with
  skip_type=mode without longrepr parsing
- test/pytest.ini: register typed marker definitions (required for
  --strict-markers even when plugin is not loaded)

Migrated test files (representative samples):
- test/cluster/test_tablet_repair_scheduler.py:
  skip -> skip_bug (#26844), skip -> skip_not_implemented
- test/cqlpy/.../timestamp_test.py: skip -> skip_slow
- test/cluster/dtest/schema_management_test.py: skip -> skip_not_implemented
- test/cluster/test_change_replication_factor_1_to_0.py: skip -> skip_bug (#20282)
- test/alternator/conftest.py: skip -> skip_env
- test/alternator/test_https.py: use skip_env() wrapper

Fixes SCYLLADB-79

Closes scylladb/scylladb#29235
2026-04-08 10:38:56 +03:00
Pavel Emelyanov
e0fa9ee332 Merge 'storage: implement sstable clone for object storage' from Ernest Zaslavsky
This patch series implements `object_storage_base::clone`, which was previously a stub that aborted at runtime. Clone creates a copy of an sstable under a new generation and is used during compaction.

The implementation uses server-side object copies (S3 CopyObject / GCS Objects: rewrite) and mirrors the filesystem clone semantics: TemporaryTOC is written first to mark the operation as in-progress, component objects are copied, and TemporaryTOC is removed to commit (unless the caller requested the destination be left unsealed).

The first two patches fix pre-existing bugs in the underlying storage clients that were exposed by the new clone code path:
- GCS `copy_object` used the wrong HTTP method (PUT instead of POST) and sent an invalid empty request body.
- S3 `copy_object` silently ignored the abort_source parameter.

1. **gcp_client: fix copy_object request method and body** — Fix two bugs in the GCS rewrite API call.
2. **s3_client: pass through abort_source in copy_object** — Stop ignoring the abort_source parameter.
3. **object_storage: add copy_object to object_storage_client** — New interface method with S3 and GCS implementations.
4. **storage: add make_object_name overload with generation** — Helper for building destination object names with a different generation.
5. **storage: make delete_object const** — Needed by the const clone method.
6. **storage: implement object_storage_base::clone** — The actual clone implementation plus a copy_object wrapper.
7. **test/boost: enable sstable clone tests for S3 and GCS** — Re-enable the previously skipped tests.

A test similar to `sstable_clone_leaving_unsealed_dest_sstable` was added to properly test the sealed/unsealed states for object storage. Works for both  S3 and GCS.

Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1045
Prerequisite: https://github.com/scylladb/scylladb/pull/28790
No need to backport since this code targets future feature

Closes scylladb/scylladb#29166

* github.com:scylladb/scylladb:
  compaction_test: enable sstable clone tests for S3 and GCS
  storage: implement object_storage_base::clone
  storage: make delete_object const in object_storage_base
  storage: add make_object_name overload with generation
  sstables: add get_format() accessor to sstable
  object_storage: add copy_object to object_storage_client
  s3_client: pass through abort_source in copy_object
  gcp_client: fix copy_object request method and body
2026-04-08 09:35:10 +03:00
Nadav Har'El
4eeb9f4120 lwt, vector: write to CDC when vector index is enabled.
The vector-search feature introduced the somewhat confusing feature of
enabling CDC without explicitly enabling CDC: When a vector index is
enabled on a table, CDC is "enabled" for it even if the user didn't
ask to enable CDC.

For this, write-path code began to use a new cdc_enabled() function
instead of checking schema.cdc_options.enabled() directly.  This
cdc_enabled() function checks if either this enabled() is true, or
has_vector_index() is true.

Unfortunately, LWT writes continued to use cdc_options.enabled() instead
of the new cdc_enabled(). This means that if a vector index is used and
a vector is written using an LWT write, the new value is not indexed.

This patch fixes this bug. It also adds a regression test that fails
before this patch and passes afterwards - the new test verifies that
when a table has a vector index (but no explicit CDC enabled), the CDC
log is updated both after regular writes and after successful LWT writes.

This patch was also tested in the context of the upcoming vector-search-
for-Alternator pull request, which has a test reproducing this bug
(Alternator uses LWT frequently, so this is very important there).
It will also be tested by the vector-store test suite ("validator").

Fixes SCYLLADB-1342

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#29300
2026-04-08 07:55:05 +03:00
Marcin Maliszkiewicz
1bf3110adb Merge 'test: add test_upgrade_preserves_ddl_audit_for_tables' from Andrzej Jackowski
Verify that upgrading from 2025.1 to master does not silently drop DDL
auditing for table-scoped audit configurations ([SCYLLADB-1155](https://scylladb.atlassian.net/browse/SCYLLADB-1155)).

Test time in dev: 4s

Refs: SCYLLADB-1155
Fixes: SCYLLADB-1305
No backport, test for bug on master

[SCYLLADB-1155]: https://scylladb.atlassian.net/browse/SCYLLADB-1155?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ

Closes scylladb/scylladb#29223

* github.com:scylladb/scylladb:
  test: add test_upgrade_preserves_ddl_audit_for_tables
  test: audit: split validate helper so callers need not pass audit_settings
  test: audit: declare manager attribute in AuditTester base class
2026-04-07 17:29:11 +02:00
Marcin Maliszkiewicz
895fdb6d29 Merge 'ldap: fix double-free of LDAPMessage in poll_results()' from Andrzej Jackowski
In the unregistered-ID branch, ldap_msgfree() was called on a result
already owned by an RAII ldap_msg_ptr, causing a double-free on scope
exit. Remove the redundant manual free.

Fixes: SCYLLADB-1344

Backport: 2026.1, 2025.4, 2025.1 - it's a memory corruption, with a one-line fix, so better backport it everywhere.

Closes scylladb/scylladb#29302

* github.com:scylladb/scylladb:
  test: ldap: add regression test for double-free on unregistered message ID
  ldap: fix double-free of LDAPMessage in poll_results()
2026-04-07 17:27:43 +02:00
Ernest Zaslavsky
422f107122 compaction_test: enable sstable clone tests for S3 and GCS
Now that object_storage_base::clone is implemented,
remove the early-return skips and re-enable the
sstable_clone_leaving_unsealed_dest_sstable tests for
both S3 and GCS storage backends.
2026-04-07 18:16:52 +03:00
Nadav Har'El
a0e79f391f Merge 'alternator: fix batch write item squashing cdc entries' from Radosław Cybulski
When `BatchWriteItem` operates on multiple items sharing the same partition key in `always_use_lwt` write isolation mode, all CDC log entries are emitted under a single timestamp. The previous `get_records` parsing algorithm in `alternator/streams.cc` assumed that all CDC log entries sharing the same timestamp correspond to a single DynamoDB item change. As a result, it would incorrectly squash multiple distinct item changes into a single Streams record — producing wrong event data (e.g., one INSERT instead of four, with mismatched key/attribute values).

Note: the bug is specific to `always_use_lwt` mode because only in LWT mode does the entire batch share a single timestamp. In non-LWT modes, each item in the batch receives a separate timestamp, so the entries naturally stay separate.

**Commit 1: alternator: add BatchWriteItem Streams test**

- Adds new tests `test_streams_batchwrite_no_clustering_deletes_non_existing_items` and `test_streams_batchwrite_no_clustering_deletes_existing_items` that cover the corner cases of batch-deleting a existing and non-existing item in a table without a clustering key. CDC tables without clustering keys are handled differently, and this path was previously untested for delete operations.
- Adds a new test `test_streams_batchwrite_into_the_same_partition_will_report_wrong_stream_data`, that is a simple way to trigger a bug.
- Adds a new test `test_streams_batchwrite_into_the_same_partition_deletes_existing_items`, that validates various combinations of puts and deletes in a single BatchWrite against the same partition.
- Adds a new `test_table_ss_new_and_old_images_write_isolation_always` fixture and extends `create_table_ss` to accept `additional_tags`, enabling tests with a specific write isolation mode.

**Commit 2: alternator: fix BatchWriteItem squashed Streams entries**

The core fix rewrites the CDC log entry parsing in `get_records` to distinguish items by their clustering key:

- Introduces `managed_bytes_ptr_hash` and `managed_bytes_ptr_equal` helper structs for pointer-based hash map lookups on `managed_bytes`.
- Replaces the single `record`/`dynamodb` pair with a `std::unordered_map<const managed_bytes*, Record, ...>` (`records_map`) keyed by the base table's clustering key value from each CDC log row. For tables without a clustering key, all entries map to a single sentinel key.
- Adds a validation that Alternator tables have at most one clustering key column (as required by the DynamoDB data model).
- On end-of-record (`eor`), flushes all accumulated per-clustering-key records into the output, each with a unique `eventID` (the `event_id` format now includes an index suffix).
- Adjusts the limit check: since a single CDC timestamp bucket can now produce multiple output records, the limit may be slightly exceeded to avoid breaking mid-batch.

Fixes #28439
Fixes: SCYLLADB-540

Closes scylladb/scylladb#28452

* github.com:scylladb/scylladb:
  alternator/test: explain why 'always' write isolation mode is used in tests
  alternator/test: add scylla_only to always write isolation fixture
  alternator: fix BatchWriteItem squashed Streams entries
  alternator: add BatchWriteItem test (failing)
2026-04-07 17:49:23 +03:00
Nadav Har'El
22e7ef46a7 Merge 'vector_search: fix SELECT on local vector index' from Karol Nowacki
Queries against local vector indexes were failing with the error:
```ANN ordering by vector requires the column to be indexed using 'vector_index'```

This was a regression introduced by 15788c3734, which incorrectly
assumed the first column in the targets list is always the vector column.
For local vector indexes, the first column is the partition key, causing
the failure.

Previously, serialization logic for the target index option was shared
between vector and secondary indexes. This is no longer viable due to
the introduction of local vector indexes and vector indexes with filtering
columns, which have  different target format.

This commit introduces a dedicated JSON-based serialization format for
vector index targets, identifying the target column (tc), filtering
columns (fc), and partition key columns (pk). This ensures unambiguous
serialization and deserialization for all vector index types.

This change is backward compatible for regular vector indexes. However,
it breaks compatibility for local vector indexes and vector indexes with
filtering columns created in version 2026.1.0. To mitigate this, usage
of these specific index types will be blocked in the 2026.1.0 release
by failing ANN queries against them in vector-store service.

Fixes: SCYLLADB-895

Backport to 2026.1 is required as this issue occurs also on this branch.

Closes scylladb/scylladb#28862

* github.com:scylladb/scylladb:
  index: fix DESC INDEX for vector index
  vector_search: test: refactor boilerplate setup
  vector_search: fix SELECT on local vector index
  index: test: vector index target option serialization test
  index: test: secondary index target option serialization test
2026-04-07 17:43:35 +03:00
Pavel Emelyanov
58e59e8c0d Merge 'test: add test_sstable_clone_preserves_staging_state' from Benny Halevy
Add a test that verifies filesystem_storage::clone preserves the sstable
state: an sstable in staging is cloned to a new generation, the clone is
re-loaded from the staging directory, and its state is asserted to still
be staging.

The change proves that https://scylladb.atlassian.net/browse/SCYLLADB-1205
is invalid, and can be closed.

* No functional change and no backport needed

Closes scylladb/scylladb#29209

* github.com:scylladb/scylladb:
  test: add test_sstable_clone_preserves_staging_state
  test: derive sstable state from directory in test_env::make_sstable
  sstables: log debug message in filesystem_storage::clone
2026-04-07 17:02:04 +03:00
Botond Dénes
816f2bf163 Merge 'cql3: fix null handling in data_value formatting' from Dario Mirovic
`data_value::to_parsable_string()` crashes with a null pointer dereference when called on a `null` data_value. Return `"null"` instead.

Added tests after the fix. Manually checked that tests fail without the fix.

Fixes SCYLLADB-1350

This is a fix that prevents format crash. No known occurrence in production, but backport is desirable.

Closes scylladb/scylladb#29262

* github.com:scylladb/scylladb:
  test: boost: test null data value to_parsable_string
  cql3: fix null handling in data_value formatting
2026-04-07 16:35:31 +03:00
Dimitrios Symonidis
701808d7aa test/object_store: parametrize test_basic over replication factor
Extend test_basic to run with both RF=1 and RF=3 to verify that
object storage works correctly with multiple replicas. The test now
starts one server per replica (each on its own rack), flushes all
nodes, validates tablet replica counts for RF>1, and restarts all
servers before verifying data is still readable.

Fixes: SCYLLADB-546

Closes scylladb/scylladb#28583
2026-04-07 16:27:44 +03:00
Nadav Har'El
f642db0693 test/alternator: tests for missing support of ReturnConsumedCapacity
As noted in issue #5027 and issue #29138, Alternator's support for
ReturnConsumedCapacity is lacking in a two areas:

1. While ReturnConsumedCapacity is supported for most relevant
   operations, it's not supported in two operations: Query and Scan.

2. While ReturnConsumedCapacity=TOTAL is supported, INDEXES is not
   supported at all.

This patch adds extensive tests for all these cases. All these tests
pass on DynamoDB but fail on Alternator, so are marked with "xfail".

The tests for ReturnConsumedCapacity=INDEXES are deliberately split
into two: First, we test the case where the table has no indexes, so
INDEXES is almost the same as TOTAL and should be very easy to
implement. A second test checks the cases where there are indexes,
and different operations increment the capacity of the base table
and/or indexes differently - it will require significantly more work
to make the second test pass.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#29188
2026-04-07 16:07:41 +03:00
Avi Kivity
8c629d55b0 test: vector_search: check [[nodiscard]] return values of expected<> types
Clang 22 verifies [[nodiscard]] for co_await,
causing compilation failures where return values of expected<> were
silently discarded.

These call sites were discarding the return value of client::request()
and vector_store_client::ann(), both of which return expected<> types
marked [[nodiscard]]. Rather than suppressing the warning with (void)
casts, properly check the return values using the established test
patterns: BOOST_CHECK(result) where the call is expected to succeed,
and BOOST_CHECK(!result) where the call is expected to fail.

Closes scylladb/scylladb#29297
2026-04-07 15:25:08 +03:00
Andrei Chekun
93583bf193 test.py: use safe_drive_shutdown in the tests
These methods for closing driver was missed during original fix.

Fixes: SCYLLADB-900

Closes scylladb/scylladb#29093
2026-04-07 14:35:18 +03:00
Avi Kivity
00409b61f1 Merge 'Add Vnodes to Tablets Migration Procedure' from Nikos Dragazis
This PR introduces the vnodes-to-tablets migration procedure, which enables converting an existing vnode-based keyspace to tablets.

The migration is implemented as a manual, operator-driven process executed in several stages. The core idea is to first create tablet maps with the same token boundaries and replica hosts as the vnodes, and then incrementally convert the storage of each node to the tablets layout. At a high level, the procedure is the following:
1. Create tablet maps for all tables in the keyspace.
2. Sequentially upgrade all nodes from vnodes to tablets:
    1. Mark a node for upgrade in the topology state.
    2. Restart the node. During startup, while the node is offline, it reshards the SSTables on vnode boundaries and switches to a tablet ERM.
    3. Wait for the node to return online before proceeding to the next node.
4. Finalize the migration:
    1. Update the keyspace schema to mark it as tablet-based.
    2. Clear the group0 state related to the migration.

From the client's perspective, the migration is online; the cluster can still serve requests on that keyspace, although performance may be temporarily degraded.

During the migration, some nodes use vnode ERMs while others use tablet ERMs. Cluster-level algorithms such as load balancing will treat the keyspace's tables as vnode-based. Once migration is finalized, the keyspace is permanently switched to tablets and cannot be reverted back to vnodes. However, a rollback procedure is available before finalization.

The patch series consists of:
* Load balancer adjustments to ignore tablets belonging to a migrating keyspace.
* A new vnode-based resharding mode, where SSTables are segregated on vnode boundaries rather than with the static sharder.
* A new per-node `intended_storage_mode` column in `system.topology`. Represents migration intent (whether migration should occur on restart) and direction.
* Four new REST endpoints for driving the migration (start, node upgrade/downgrade, finalize, status), along with `nodetool` wrappers. The finalization is implemented as a global topology request.
* Wiring of the migration process into the startup logic: the `distributed_loader` determines a migrating table's ERM flavor from the `intended_storage_mode` and the ERM flavor determines the `table_populator`'s resharding mode. Token metadata changes have been adjusted to preserve the ERM flavor.
* Cluster tests for the migration process.

Fixes SCYLLADB-722.
Fixes SCYLLADB-723.
Fixes SCYLLADB-725.
Fixes SCYLLADB-779.
Fixes SCYLLADB-948.

New feature, no backport is needed.

Closes scylladb/scylladb#29065

* github.com:scylladb/scylladb:
  docs: Add ops guide for vnodes-to-tablets migration
  test: cluster: Add test for migration of multiple keyspaces
  test: cluster: Add test for error conditions
  test: cluster: Add vnodes->tablets migration test (rollback)
  test: cluster: Add vnodes->tablets migration test (1 table, 3 nodes)
  test: cluster: Add vnodes->tablets migration test (1 table, 1 node)
  scylla-nodetool: Add migrate-to-tablets subcommand
  api: Add REST endpoint for vnode-to-tablet migration status
  api: Add REST endpoint for migration finalization
  topology_coordinator: Add `finalize_migration` request
  database: Construct migrating tables with tablet ERMs
  api: Add REST endpoint for upgrading nodes to tablets
  api: Add REST endpoint for starting vnodes-to-tablets migration
  topology_state_machine: Add intended_storage_mode to system.topology
  distributed_loader: Wire vnode-based resharding into table populator
  replica: Pick any compaction group for resharding
  compaction: resharding_compaction: add vnodes_resharding option
  storage_service: Preserve ERM flavor of migrating tables
  tablet_allocator: Exclude migrating tables from load balancing
  feature_service: Add vnodes_to_tablets_migrations feature
2026-04-07 14:32:22 +03:00
Łukasz Paszkowski
6f364fd3b7 db: fix system.size_estimates to aggregate sstable estimates across all shards
The estimate() function in the size_estimates virtual reader only
considered sstables local to the shard that happened to own the
keyspace's partition key token. Since sstables are distributed across
shards, this caused partition count estimates to be approximately
1/smp_count of the actual value.

This bug has been present since the virtual reader was introduced in
225648780d.

Use db.container().map_reduce0() to aggregate sstable estimates
across all shards. Each shard contributes its local count and
estimated_histogram, which are then merged to produce the correct
total.

Also fix the `test_partitions_estimate_full_overlap` test which becomes
flaky (xpassing ~1% of runs) because autocompaction could merge the
two overlapping sstables before the size estimate was read. Wrap the
test body in nodetool.no_autocompaction_context to prevent this race.

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1179
Refs https://github.com/scylladb/scylladb/issues/9083

Closes scylladb/scylladb#29286
2026-04-07 14:13:26 +03:00
Avi Kivity
8b4a91982b cmake: add missing rolling_max_tracker_test and symmetric_key_test
Added in 5b2a07b408 and c596ae6eb1 without cmake integration.

Closes scylladb/scylladb#29328
2026-04-07 14:09:00 +03:00
Avi Kivity
d01c9a425f test: test_out_of_storage_prevention: fix invalid escape in regex
Python warns that the sequence "\(" is an invalid escape and
might be rejected in the future. Protect against that by using
a raw string.

Closes scylladb/scylladb#29334
2026-04-07 14:06:32 +03:00
Pavel Emelyanov
0ae781c008 Merge 'test: auth_test: coroutinize' from Avi Kivity
Convert auth_test.cc to coroutines for improved readability. Each test is converted in its own commit. Some
are trivial.

Indentation is left broken in some commits to reduce the diff, then fixed up in the last commit.

Code cleanup, so no backport.

Closes scylladb/scylladb#29336

* github.com:scylladb/scylladb:
  auth_test: fix whitespace
  auth_test: coroutinize test_try_describe_schema_with_internals_and_passwords_as_anonymous_user
  auth_test: coroutinize test_try_login_after_creating_roles_with_hashed_password
  auth_test: coroutinize test_create_roles_with_hashed_password_and_log_in
  auth_test: coroutinize test_try_create_role_with_hashed_password_as_anonymous_user
  auth_test: coroutinize test_try_to_create_role_with_password_and_hashed_password
  auth_test: coroutinize test_try_to_create_role_with_hashed_password_and_password
  auth_test: coroutinize test_alter_with_workload_type
  auth_test: coroutinize test_alter_with_timeouts
  auth_test: coroutinize role_permissions_table_is_protected
  auth_test: coroutinize role_members_table_is_protected
  auth_test: coroutinize roles_table_is_protected
  auth_test: coroutinize test_password_authenticator_operations
  auth_test: coroutinize test_password_authenticator_attributes
  auth_test: coroutinize test_default_authenticator
2026-04-07 14:05:32 +03:00
Botond Dénes
513af59130 encryption: improve error message when KMS host is not configured
When an SSTable was encrypted with a KMS host that is not present in
scylla.yaml, the error thrown was:

  std::invalid_argument (No such host: <host-name>)

This message is very obscure in general, and especially confusing when
encountered while using the scylla-sstable tool: it gives no indication
that the SSTable is encrypted, that a KMS host lookup is involved, or
what the user needs to do to fix the problem.

Replace it with a message that names the missing host and points
directly to the relevant scylla.yaml section:

  Encryption host "<host-name>" is not defined in scylla.yaml.
  Make sure it is listed under the "kmip_hosts" section.

The wording is intentionally kept neutral (not framed as an SSTable tool
problem) because the same code path is exercised by production ScyllaDB
when a node's configuration no longer contains a host referenced by an
existing data file (e.g. after a config rollback or when restoring data
from a different cluster). The production use-case takes precedence, but
the message is equally actionable from the tool.

Closes scylladb/scylladb#29228
2026-04-07 14:00:27 +03:00
Pavel Emelyanov
d6df5ef60a Merge 'compaction_test: Make compaction tests backend‑agnostic and add S3/GCS support' from Ernest Zaslavsky
This series updates the storage abstraction and extends the compaction tests to support object‑storage backends (S3 and GCS), while tightening several parts of the test environment.

The changes include:

- New exists/object_exists helpers across storage backends and clock fixes in the S3 client to make signature generation stable under test conditions.

- A new get_storage_for_tests accessor and adjustments to the test environment to avoid premature teardown of the sstable registry.

- Refactoring of compaction tests to remove direct sstable access, ensure proper schema setup, and avoid use of moved‑from objects.

- Extraction of test_env‑based logic into reusable functions and addition of S3/GCS variants of the compaction tests.

Not all tests were converted to be backend‑agnostic yet, and a few require further investigation before they can run cleanly against S3/GCS backends. These will be addressed in follow‑up work.

Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-704 however, followup is needed

No backport needed since this change targeting future feature

Closes scylladb/scylladb#28790

* github.com:scylladb/scylladb:
  compaction_test: fix formatting after previous patches
  compaction_test: add S3/GCS variations to tests
  compaction_test: extract test_env-based tests into functions
  compaction_test: replace file_exists with storage::exists
  compaction_test: initialize tables with schema via make_table_for_tests
  compaction_test: use sstable APIs to manipulate component files
  compaction_test: fix use-after-move issue
  sstable_utils: add `get_storage` and `open_file` helpers
  test_env: delay unplugging sstable registry
  storage: add `exists` method to storage abstraction
  s3_client: use lowres_system_clock for aws_sigv4
  s3_client: add `object_exists` helper
  gcs_client: add `object_exists` helper
2026-04-07 13:53:48 +03:00
Avi Kivity
bc10e1a171 test: fix flaky test_login by not retrying authentication failures
The fix for SCYLLADB-1373 (b4f652b7c1) changed get_session() to use
the default timeout=30 for the retry loop in patient_*_cql_connection
(previously timeout=0.1). This correctly allowed retrying transient
NoHostAvailable errors during node startup, but introduced a new
flakiness in test_login and other auth tests.

The failure chain:

1. test_login connects with bad credentials (e.g. user="doesntexist")
2. get_session() calls patient_exclusive_cql_connection(), which calls
   retry_till_success() with bypassed_exception=NoHostAvailable
3. The first attempt correctly fails: the server rejects the credentials
   with AuthenticationFailed, wrapped in NoHostAvailable
4. retry_till_success() catches NoHostAvailable indiscriminately and
   retries, not distinguishing between transient errors (node not ready)
   and permanent errors (bad credentials)
5. A subsequent retry attempt times out (connect_timeout=5), producing
   OperationTimedOut wrapped in NoHostAvailable
6. After 30 seconds, the last NoHostAvailable is raised -- now wrapping
   OperationTimedOut instead of the original AuthenticationFailed
7. The assertion `isinstance(..., AuthenticationFailed)` fails

With the old timeout=0.1, the deadline was already exceeded after the
first attempt, so the original AuthenticationFailed propagated.

Fix: Add a `should_retry` predicate parameter to retry_till_success()
and use it in patient_cql_connection() and
patient_exclusive_cql_connection() to immediately re-raise
NoHostAvailable when it wraps AuthenticationFailed. Retrying
authentication failures is never useful since the credentials won't
change between attempts.

Fixes: SCYLLADB-1382

Closes scylladb/scylladb#29348
2026-04-07 10:17:31 +03:00
Avi Kivity
b4f652b7c1 test: fix flaky test_create_ks_auth by removing bad retry timeout
get_session() was passing timeout=0.1 to patient_exclusive_cql_connection
and patient_cql_connection, leaving only 0.1 seconds for the retry loop
in retry_till_success(). Since each connection attempt can take up to 5
seconds (connect_timeout=5), the retry loop effectively got only one
attempt with no chance to retry on transient NoHostAvailable errors.

Use the default timeout=30 seconds, consistent with all other callers.

Fixes: SCYLLADB-1373

Closes scylladb/scylladb#29332
2026-04-05 19:13:15 +03:00
Avi Kivity
2f0d178510 auth_test: fix whitespace
Fix over-indented lines inside do_with_mc lambda bodies introduced
during coroutinization.
2026-04-05 18:28:23 +03:00
Avi Kivity
7a24da9e88 auth_test: coroutinize test_try_describe_schema_with_internals_and_passwords_as_anonymous_user
Use co_await instead of return for improved readability.
2026-04-05 18:26:30 +03:00
Avi Kivity
e1b52cf337 auth_test: coroutinize test_try_login_after_creating_roles_with_hashed_password
Use co_await instead of return for improved readability.
2026-04-05 18:26:30 +03:00
Avi Kivity
24d36ad459 auth_test: coroutinize test_create_roles_with_hashed_password_and_log_in
Use co_await instead of return for improved readability.
2026-04-05 18:26:30 +03:00
Avi Kivity
6f20129eec auth_test: coroutinize test_try_create_role_with_hashed_password_as_anonymous_user
Use co_await instead of return for improved readability.
2026-04-05 18:26:30 +03:00
Avi Kivity
cece181113 auth_test: coroutinize test_try_to_create_role_with_password_and_hashed_password
Use co_await instead of return for improved readability.
2026-04-05 18:26:30 +03:00
Avi Kivity
752391f757 auth_test: coroutinize test_try_to_create_role_with_hashed_password_and_password
Use co_await instead of return for improved readability.
2026-04-05 18:26:30 +03:00
Avi Kivity
287625b297 auth_test: coroutinize test_alter_with_workload_type
Use co_await instead of return for improved readability.
2026-04-05 18:26:30 +03:00
Avi Kivity
4eeb5ef54d auth_test: coroutinize test_alter_with_timeouts
Use co_await instead of return for improved readability.
2026-04-05 18:26:30 +03:00
Avi Kivity
170c71b25d auth_test: coroutinize role_permissions_table_is_protected
Use co_await for improved readability.
2026-04-05 18:26:30 +03:00
Avi Kivity
13eccf519f auth_test: coroutinize role_members_table_is_protected
Use co_await for improved readability.
2026-04-05 18:26:30 +03:00
Avi Kivity
43ff3798ad auth_test: coroutinize roles_table_is_protected
Use co_await for improved readability.
2026-04-05 18:26:30 +03:00
Avi Kivity
c586eeb003 auth_test: coroutinize test_password_authenticator_operations
Flatten continuation chains (.then()) into linear thread-style code
with .get() calls for improved readability. Remove the now-unused
require_throws helper template.
2026-04-05 18:26:25 +03:00
Avi Kivity
fbccfe5c9d auth_test: coroutinize test_password_authenticator_attributes
Use co_await instead of return+do_with_cql_env+make_ready_future
for improved readability.
2026-04-05 17:28:09 +03:00
Avi Kivity
e3dee64003 auth_test: coroutinize test_default_authenticator
Use co_await instead of return+do_with_cql_env+make_ready_future
for improved readability.
2026-04-05 17:27:45 +03:00