Commit Graph

11801 Commits

Author SHA1 Message Date
Avi Kivity
c2a2e11c40 Merge 'Prepare the way for incremental repair' from Botond Dénes
With incremental repair, each replica::compaction_group will have 3 logical compaction groups, repaired, repairing and unrepaired. The definition of group is a set of sstables that can be compacted together. The logical groups will share the same instance of sstable_set, but each will have its own logical sstable set. Existing compaction::table_state is a view for a logical compaction group. So it makes sense that each replica::compaction_group will have multiple views. Each view will provide to compaction layer only the sstables that belong to it. That way, we preserve the existing interface between replica and compaction layer, where each compaction::table_state represents a single logical group.
The idea is that all the incremental repair knowledge is confined to repair and replica layer, compaction doesn't want to know about it, it just works on logical groups, what each represents doesn't matter from the perspective of the subsystem. This is the best way forward to not violate layers and reduce the maintenance burden in the long run.
We also proceed to rename table_state to compaction_group_view, since it's a better description. Working with multiple terms is confusing. The placeholder for implementing the sstable classifier is also left in tablet_storage_group_manager, by the time being, all sstables will go to the unrepaired logical set, which preserves the current behavior.

New functionality, no backport required

Closes scylladb/scylladb#25287

* github.com:scylladb/scylladb:
  test: Add test that compaction doesn't cross logical group boundary
  replica: Introduce views in compaction_group for incremental repair
  compaction: Allow view to be added with compaction disabled
  replica: Futurize retrieval of sstable sets in compaction_group_view
  treewide: Futurize estimation of pending compaction tasks
  replica: Allow compaction_group to have more than one view
  Move backlog tracker to replica::compaction_group
  treewide: Rename table_state to compaction_group_view
  tests: adjust for incremental repair
2025-08-09 17:21:17 +03:00
Emil Maskovsky
7c54401d3d raft: enforce odd number of voters in group0
Implement odd number voter enforcement in the group0 voter calculator to
ensure proper Raft consensus behavior. Raft consensus requires a majority
of voters to make decisions, and odd numbers of voters is preferred
because an even number doesn't add additional reliability but introduces
the risk of scenarios where no group can make progress. If an even number
of voters is divided into two groups of equal size during a network
partition, neither group will have majority and both will be unable to
commit new entries. With an odd number of voters, such equal partition
scenarios are impossible (unless the network is partitioned into at least
three groups).

Fixes: scylladb/scylladb#23266
2025-08-08 19:49:20 +02:00
Emil Maskovsky
29ddb2aa18 test/raft: adapt test_tablets_lwt.py for odd voter number enforcement
The test_lwt_timeout_while_creating_paxos_state_table was failing after
implementing odd number voter enforcement in the group0 voter calculator.

Previously with 2 nodes:
- 2 nodes → 2 voters → stop 1 node → 1/2 voters (no quorum) → expected Raft timeout

With odd voter count enforcement:
- 2 nodes → 1 voter → stop 1 node → 0/1 voters → Cassandra availability error

This change updates the test to use 3 nodes instead of 2, ensuring proper
no-quorum scenarios:
- 3 nodes → 3 voters → stop 2 nodes → 1/3 voters (no quorum) → Raft timeout

The test now correctly validates LWT timeout behavior while being compatible
with the odd number voter enforcement requirement.
2025-08-08 19:49:10 +02:00
Emil Maskovsky
7fc75aff3e test/raft: adapt test_raft_no_quorum.py for odd voter enforcement
Update the no-quorum cluster tests to work correctly with the new odd
number voter enforcement in the group0 voter calculator. The tests now
properly account for the changed voter counts when validating no-quorum
scenarios.
2025-08-08 19:48:58 +02:00
Benny Halevy
0a20834d2a replica: table: get rid of update_sstables_known_generation
It is not needed anymore.
With that database::_sstable_generation_generator can
be a regular member rather than optional and initialized
later.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-08 11:46:21 +03:00
Benny Halevy
42cb25c470 sstables: sstable_directory: stop tracking highest_generation
It is not needed anymore as we always generate
uuid generations.

Convert sstable_directory_test_table_simple_empty_directory_scan
to use the newly added empty() method instead of
checking the highest generation seen.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-08 11:46:21 +03:00
Benny Halevy
b01524c5a3 replica: distributed_loader: stop tracking highest_generation
It is not needed anymore as we always generate
uuid generations.

Move highest_generation_seen(sharded<sstables::sstable_directory>& directory)
to sstables/sstable_directory module.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-08 11:46:21 +03:00
Benny Halevy
6cc964ef16 sstables: sstable_generation: get rid of uuid_identifiers bool class
Now that all call sites enable uuid_identifiers.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-08 11:46:21 +03:00
Benny Halevy
0ad1898f0a feature_service: move UUID_SSTABLE_IDENTIFIERS to supported_feature_set
The feature is supported by all live versions since
version 5.4 / 2024.1.

(Although up to 6da758d74c
it could be disabled using the config option)

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-08 11:46:15 +03:00
Botond Dénes
70aa81990b Merge 'Alternator - add the ability to write, not just read, system tables' from Nadav Har'El
In commit 44a1daf we added the ability to read Scylla system tables with Alternator. This feature is useful, among other things, in tests that want to read Scylla's configuration through the system table system.config. But tests often want to modify system.config, e.g., to temporarily reduce some threshold to make tests shorter. Until now, this was not possible

This series add supports for writing to system tables through Alternator, and examples of tests using this capability (and utility functions to make it easy).

Because the ability to write to system tables may have non-obvious security consequences, it is turned off by default and needs to be enabled with a new configuration option "alternator_allow_system_table_write"

No backports are necessary - this feature is only intended for tests. We may later decide to backport if we want to backport new tests, but I think the probability we'll want to do this is low.

Fixes #12348

Closes scylladb/scylladb#19147

* github.com:scylladb/scylladb:
  test/alternator: utility functions for changing configuration
  alternator: add optional support for writing to system table
  test/alternator: reduce duplicated code
2025-08-08 09:13:15 +03:00
Raphael S. Carvalho
beaaf00fac test: Add test that compaction doesn't cross logical group boundary
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2025-08-08 06:58:01 +03:00
Raphael S. Carvalho
d351b0726b replica: Introduce views in compaction_group for incremental repair
Wired the unrepaired, repairing and repaired views into compaction_group.

Also the repaired filter was wired, so tablet_storage_group_manager
can implement the procedure to classify the sstable.

Based on this classifier, we can decide which view a sstable belongs
to, at any given point in time.

Additionally, we made changes changes to compaction_group_view
to return only sstables that belong to the underlying view.

From this point on, repaired, repairing and unrepaired sets are
connected to compaction manager through their views. And that
guarantees sstables on different groups cannot be compacted
together.
Repairing view specifically has compaction disabled on it altogether,
we can revert this later if we want, to allow repairing sstables
to be compacted with one another.

The benefit of this logical approach is having the classifier
as the single source of truth. Otherwise, we'd need to keep the
sstable location consistest with global metadata, creating
complexity

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2025-08-08 06:58:00 +03:00
Raphael S. Carvalho
9d3755f276 replica: Futurize retrieval of sstable sets in compaction_group_view
This will allow upcoming work to gently produce a sstable set for
each compaction group view. Example: repaired and unrepaired.

Locking strategy for compaction's sstable selection:
Since sstable retrieval path became futurized, tasks in compaction
manager will now hold the write lock (compaction_state::lock)
when retrieving the sstable list, feeding them into compaction
strategy, and finally registering selected sstables as compacting.
The last step prevents another concurrent task from picking the
same sstable. Previously, all those steps were atomic, but
we have seen stall in that area in large installations, so
futurization of that area would come sooner or later.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2025-08-08 06:58:00 +03:00
Raphael S. Carvalho
20c3301a1a treewide: Futurize estimation of pending compaction tasks
This is to allow futurization of compaction_group_view method that
retrieves sstable set.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2025-08-08 06:51:29 +03:00
Raphael S. Carvalho
2c4a9ba70c treewide: Rename table_state to compaction_group_view
Since table_state is a view to a compaction group, it makes sense
to rename it as so.

With upcoming incremental repair, each replica::compaction_group
will be actually two compaction groups, so there will be two
views for each replica::compaction_group.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2025-08-08 06:51:28 +03:00
Asias He
acc367c522 tests: adjust for incremental repair
The separatation of sstables into the logical repaired and unrepaired
virtual sets, requires some adjustments for certain tests, in particular
for those that look at number of compaction tasks or number of sstables.
The following tests need adjustment:
* test/cluster/tasks/test_tablet_tasks.py
* test/boost/memtable_test.cc

The adjustments are done in such a way that they accomodate both the
case where there is separate repaired/unrepaired states and when there
isn't.
2025-08-08 06:49:17 +03:00
Benny Halevy
3f44dba014 sstables: make_entry_descriptor: make regex non-greedy
With greedy matching, an sstable path in a snapshot
directory with a tag that resembles a name-<uuid>
would match the dir regular expression as the longest match,
while a non-greedy regular expression would correctly match
the real keyspace and table as the shortest match.

Also, add a regression unit test reproducing the issue and
validating the fix.

Fixes #25242

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#25323
2025-08-07 15:35:11 +03:00
Avi Kivity
8164f72f6e Merge 'Separate local_effective_replication_map from vnode_effective_replication_map' from Benny Halevy
Derive both vnode_effective_replication_map
and local_effective_replication_map from
static_effective_replication_map as both are static and per-keyspace.

However, local_effective_replication_map does not need vnodes
for the mapping of all tokens to the local node.

Refs #22733

* No backport required

Closes scylladb/scylladb#25222

* github.com:scylladb/scylladb:
  locator: abstract_replication_strategy: implement local_replication_strategy
  locator: vnode_effective_replication_map: convert clone_data_gently to clone_gently
  locator: abstract_replication_map: rename make_effective_replication_map
  locator: abstract_replication_map: rename calculate_effective_replication_map
  replica: database: keyspace: rename {create,update}_effective_replication_map
  locator: effective_replication_map_factory: rename create_effective_replication_map
  locator: abstract_replication_strategy: rename vnode_effective_replication_map_ptr et. al
  locator: abstract_replication_strategy: rename global_vnode_effective_replication_map
  keyspace: rename get_vnode_effective_replication_map
  dht: range_streamer: use naked e_r_m pointers
  storage_service: use naked e_r_m pointers
  alternator: ttl: use naked e_r_m pointers
  locator: abstract_replication_strategy: define is_local
2025-08-07 12:51:43 +03:00
Nadav Har'El
6f415b2f10 Merge 'test/cqlpy: Adjust test_describe.py to work against Cassandra' from Dawid Mędrek
We adjust most of the tests in `cqlpy/test_describe.py`
so that they work against both Scylla and Cassandra.
This PR doesn't cover all of them, just those I authored.

Refs scylladb/scylladb#11690

Backport: not needed. This is effectively a code cleanup.

Closes scylladb/scylladb#25060

* github.com:scylladb/scylladb:
  test/cqlpy/test_describe.py: Adjust test_create_role_with_hashed_password_authorization to work with Cassandra
  test/cqlpy/test_describe.py: Adjust test_desc_restore to work with Cassandra
  test/cqlpy/test_describe.py: Mark Scylla-only tests as such
2025-08-07 12:43:04 +03:00
Patryk Jędrzejczak
161521b674 test: properly unset recovery_leader in the recovery procedure tests
After changing the type of the `recovery_leader` config option from
`sstring` to `UUID` in #25032, setting `recovery_leader` to an empty
string became an incorrect way to unset it. The following error started
to appear in the recovery procedure tests:
```
init - marshaling error: UUID string size mismatch: '' : recovery_leader
```
We fix it in this commit by removing `recovery_leader` from the config
file.
2025-08-07 11:20:00 +02:00
Patryk Jędrzejczak
31372843e4 test: manager_client: allow removing a config option
Currently, there is no simple way to remove an option from the server's
config file in tests. One example when this is needed is removing the
`recovery_leader` option on all servers during the recovery procedure.

In this commit, we add a new method to `ManagerClient` that removes
an option from the given server's config file.
2025-08-07 11:20:00 +02:00
Patryk Jędrzejczak
ce26896704 test: manager_client: add docstring to server_update_config 2025-08-07 11:19:54 +02:00
Avi Kivity
90eb6e6241 Merge 'sstables/trie: implement BTI node format serialization and traversal' from Michał Chojnowski
This is the next part in the BTI index project.

Overarching issue: https://github.com/scylladb/scylladb/issues/19191
Previous part: https://github.com/scylladb/scylladb/pull/25154
Next part: implementing a trie cursor (the "set to key, step forwards, step backwards" thing) on top of the `node_reader` added here.

The new code added here is not used for anything yet, but it's posted as a separate PR
to keep things reviewably small.

This part implements the BTI trie node encoding, as described in https://github.com/apache/cassandra/blob/trunk/src/java/org/apache/cassandra/io/sstable/format/bti/BtiFormat.md#trie-nodes.
It contains the logic for encoding the abstract in-memory `writer_node`s (added in the previous PR)
into the on-disk format, and the logic for traversing the on-disk nodes during a read.

New functionality, no backporting needed.

Closes scylladb/scylladb#25317

* github.com:scylladb/scylladb:
  sstables/trie: add tests for BTI node serialization and traversal
  sstables/trie: implement BTI node traversal
  sstables/trie: implement BTI serialization
  utils/cached_file: add get_shared_page()
  utils/cached_file: replace a std::pair with a named struct
2025-08-07 12:15:42 +03:00
Benny Halevy
02b922ac40 test: cql_query_test: add test_sstable_load_mixed_generation_type
Test that we can load sstables with mixed, numerical and uuid
generation types, and verify the expected data.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-07 12:04:23 +03:00
Benny Halevy
9b65856a26 test: sstable_datafile_test: move copy_directory helper to test/lib/test_utils
It's a generic helper that can be used by all tests.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-07 12:04:23 +03:00
Benny Halevy
7c9ce235d7 test: database_test: move table_dir helper to test/lib/test_utils
It's a generic helper that can be used by all tests.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-07 12:04:23 +03:00
Nadav Har'El
d632599a92 Merge 'test.py: native pytest repeats' from Andrei Chekun
Previous way of execution repeat was to launch pytest for each repeat.
That was resource consuming, since each time pytest was doing discovery
of the tests. Now all repeats are done inside one pytest process.

Backport for 2025.3 is needed, since this functionality is framework only, and 2025.3 affected with this slow repeats as well.

Closes scylladb/scylladb#25073

* github.com:scylladb/scylladb:
  test.py: add repeats in pytest
  test.py: add directories and filename to the log files
  test.py: rename log sink file for boost tests
  test.py: better error handling in boost facade
2025-08-06 18:18:03 +03:00
Benny Halevy
5e5e63af10 scylla-sstable: print_query_results_json: continue loop if row is disengaged
Otherwise it is accessed right when exiting the if block.
Add a unit test reproducing the issue and validating the fix.

Fixes #25325

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#25326
2025-08-06 16:44:51 +03:00
Szymon Malewski
eb11485969 test/alternator: enable more relevant logs in CI.
This patch sets, for alternator test suite, all 'alternator-*' loggers and 'paxos' logger to trace level. This should significantly ease debugging of failed tests, while it has no effect on test time and increases log size only by 7%.
This affects running alternator tests only with `test.py`, not with `test/alternator/run`.

Closes #24645

Closes scylladb/scylladb#25327
2025-08-06 16:37:25 +03:00
Nikos Dragazis
ee92fcc078 encryption_at_rest_test: Preserve tmpdir from failing KMIP tests
The KMIP tests start a local PyKMIP server and configure it to write
logs in the test's temporary directory (`tmpdir`). However, the tmpdir
is a RAII object that deletes the directory once it goes out of scope,
causing PyKMIP server logs to be lost on test failures.

To assist with debugging, preserve the whole directory if the test
failed with an exception. Allow the user to disable this by setting the
SCYLLA_TEST_PRESERVE_TMP_ON_EXCEPTION environment variable.

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
2025-08-06 16:29:19 +03:00
Benny Halevy
6dbbb80aae locator: abstract_replication_strategy: implement local_replication_strategy
Derive both vnode_effective_replication_map
and local_effective_replication_map from
static_effective_replication_map as both are static and per-keyspace.

However, local_effective_replication_map does not need vnodes
for the mapping of all tokens to the local node.

Note that everywhere_replication_strategy is not abstracted in a similar
way, although it could, since the plan is to get rid of it
once all system keyspaces areconverted to local or tablets replication
(and propagated everywhere if needed using raft group0)

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-06 16:05:11 +03:00
Benny Halevy
babb4a41a8 locator: abstract_replication_map: rename calculate_effective_replication_map
to calculate_vnode_effective_replication_map since
it is specific to vnode-based range calculations.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-06 16:03:53 +03:00
Benny Halevy
cbad497859 locator: abstract_replication_strategy: rename vnode_effective_replication_map_ptr et. al
to static_effective_replication_map_ptr, in preparation
for separating local_effective_replication_map from
vnode_effective_replication_map.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-06 16:03:53 +03:00
Benny Halevy
bd62421c05 keyspace: rename get_vnode_effective_replication_map
to get_static_effective_replication_map, in preparation
for separating local_effective_replication_map from
vnode_effective_replication_map (both are per-keyspace).

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-08-06 13:40:43 +03:00
Nikos Dragazis
1eb99fb5f5 test/lib: Add option to preserve tmpdir on exception
Extend the tmpdir class with an option to preserve the directory if the
destructor is called during stack unwinding (i.e., uncaught exception).
To be used in tests where the tmpdir contains non-temporary resources
that may help in diagnosing test failures (e.g., logs from external
services such as PyKMIP).

This will be used in the next patch.

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
2025-08-06 13:07:52 +03:00
Pavel Emelyanov
0616407be5 Merge 'rest_api: add endpoint which drops all quarantined sstables' from Taras Veretilnyk
Added a new POST endpoint `/storage_service/drop_quarantined_sstables` to the REST API.
This endpoint allows dropping all quarantined SSTables either globally or
for a specific keyspace and tables.
Optional query parameters `keyspace` and `tables` (comma-separated table names) can be
provided to limit the scope of the operation.

Fixes scylladb/scylladb#19061

Backport is not required, it is new functionality

Closes scylladb/scylladb#25063

* github.com:scylladb/scylladb:
  docs: Add documentation for the nodetool dropquarantinedsstables command
  nodetool: add command for dropping quarantine sstables
  rest_api: add endpoint which drops all quarantined sstables
2025-08-06 11:55:15 +03:00
Nadav Har'El
10588958e0 test/alternator: add regression test for keep-alive support
An Alternator user complained about suspiciously many new connections being
opened, which raised a suspicion that maybe Alternator doesn't support
HTTP and HTTPS keep-alive (allowing a client to reuse the same connection
for multiple requests). It turns out that we never had a regression test
that this feature actually works (and doesn't break), so this patch adds
one.

The test confirms that Alternator's connection reuse (keep-alive) feature
actually works correctly. Of course, only if the driver really tries to
reuse a connection - which is a separate question and needs testing on
the driver side (scylladb/alternator-load-balancing#82).

The test sends two requests using Python's "requests" library which can
normally reuse connections (it uses a "connection pool"), and checks if the
connection was really reused. Unfortunately "requests" doesn't give us
direct knowledge of whether or not it reused a connection, so we check
this using simple monkey-patching. I actually tried multiple other
approaches before settling on this one. The approach needs to work
on both HTTP and HTTPS, and also on AWS DynamoDB.

Importantly, the test checks both keep-alive and non-keep-alive cases.
This is very important for validating the test itself and its tricky
monkey-patching code: The test is meant to detect when the socket is not
reused for the second request, so we want to also check the non-keep-
alive case where we know the socket isn't reused, to see the test code
really detected this situation.

By default, this test runs (like all of Alternator's test suite) on HTTP
sockets. Running this test with "test/alternator/run --https" will run
it on HTTPS sockets. The test currently passes on both HTTP and HTTPS.
It also passes on AWS DynamoDB ("test/alternator/run --aws")

Fixes #23067

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#25202
2025-08-06 11:41:21 +03:00
Karol Nowacki
032e8f9030 test/boost/vector_store_client_test.cc: Fix flaky tests
The vector_store_client_test was observed to be flaky, sometimes hanging while waiting for a response from HTTP server.

Problem:
The default load balancing algorithm (in Seastar's posix_server_socket_impl::accept) could route an incoming connection to a different shard than the one executing the test.
Because the HTTP server is a non-sharded service running only on the test's originating shard, any connection submitted to another shard would never be handled, causing the test client to hang waiting for response.

Solution:
The patch resolves the issue by explicitly setting fixed cpu load balancing algorithm.
This ensures that incoming connections are always handled on the same shard where the HTTP server is running.

Closes scylladb/scylladb#25314
2025-08-06 11:24:51 +03:00
Nadav Har'El
fa86405b1f test/alternator: utility functions for changing configuration
Now that the previous patch made it possible to write to system tables
in Alternator tests, this patch introduces utility functions for changing
the configuration - scylla_config_write() in addition to the
scylla_config_read() we already had, and scylla_config_temporary() to
temporarily change a configurable parameter and then restore it to its
old value.

This patch adds a silly test that temporarily modifies the
query_tombstone_page_limit configuration parameter. Later we can
add more tests that use the new test functions for more "serious"
testing of real features. In particular, we don't have an Alternator
test for the max_concurrent_requests_per_shard configuration - and
I want to write one.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2025-08-06 10:02:24 +03:00
Nadav Har'El
a896e2dbb9 alternator: add optional support for writing to system table
In commit 44a1daf we added the ability to read system tables through
the DynamoDB API (actually, the Scan and Query requests only).
This ability is useful for tests, and can also be useful to users who
want to read information that is only available through system tables.

This patch adds support also for *writing* into system tables. This will
be useful for Alternator tests, were we want to temporarily change
some live-updatable configuration option - and so far haven't been
able to do that like we did do in some cql-pytest tests.

For reasons explained in issue #23218, only superuser roles are allowed to
write to system tables - it is not enough for the role to be granted
MODIFY permissions on the system table or on ALL KEYSPACES. Moreover,
the ability to modify system tables carries special risks, so this
patch only allows writes to the system tables if a new configuration
option "alternator_allow_system_table_write" turned on. This option is
turned off by default.

This patch also includes a test for this new configuration-writing
capability. The test scripts test/alternator/run and test.py now
run Scylla with alternator_allow_system_table_write turned on, but
the new test can also run without this option, and will be skipped
in that case (to allow running the test suite against some manually-
run instance of Scylla).

Fixes: #12348

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2025-08-06 10:00:04 +03:00
Nadav Har'El
5913498fff test/alternator: reduce duplicated code
Four tests had almost identical code to read an item from Scylla
configuration (using the system.config system table). It's time
to make this into a new utility function, scylla_config_read().

This is a good time to do it, because in a later patch I want
to also add a similar function to *write* into the configuration.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2025-08-06 09:56:47 +03:00
Nadav Har'El
d46dda0840 Merge 'cql, vector_search: implement read path' from null
This pull request is an addition of ANN OF queries.

The patch contains:

- CQL syntax for ORDER BY `vector_column_name` ANN OF `vector_literal` clause of SELECT statements.
- implementation of external ANN queries (using vector-store service)
- tests

Example syntax:

```
SELECT comment
    FROM cycling.comments_vs
    ORDER BY comment_vector ANN OF [0.1, 0.15, 0.3, 0.12, 0.05]
    LIMIT 3;
```
Limit can be between 1 and 1000 - same as for Cassandra.

Co-authored-by: @janpiotrlakomy @smoczy123
Fixes: VECTOR-48
Fixes: VECTOR-46

Closes scylladb/scylladb#24444

* github.com:scylladb/scylladb:
  cql3/statements: implement external `ANN OF` queries
  vector_store_client: implement ann_error_visitor
  test/cqlpy: check ANN queries disallow filtering properly
  cassandra_tests: translate vector_invalid_query_test
  cassandra_tests: copy vector_invalid_query_test from Cassandra
  vector_index: make parameter names case insensitive
  cql3/statements: add `ANN OF` queries support to select statements
  cql/Cql.g: extend the grammar to allow for `ANN OF` queries
  cql3/raw: add ANN ordering to the raw statement layer
2025-08-06 09:53:38 +03:00
Avi Kivity
bb922b2aa9 Merge 'truncate: change check for write during truncate into a log warning' from Ferenc Szili
TRUNCATE TABLE performs a memtable flush and then discards the sstables of the table being truncated. It collects the highest replay position for both of these. When the highest replay position of the discarded sstables is higher than the highest replay position of the flushed memtable, that means that we have had writes during truncate which have been flushed to disk independently of the truncate process. We check for this and trigger an on_internal_error() which throws an exception, informing the user that writing data concurrently with TRUNCATE TABLE is not advised.

The problem with this is that truncate is also called from DROP KEYSPACE and DROP TABLE. These are raft operations and exceptions thrown by them are caught by the (...) exception handler in the raft applier fiber, which then exits leaving the node without the ability to execute subsequent raft commands.

This commit changes the on_internal_error() into a warning log entry. It also outputs to keyspace/table names, and the offending replay positions which caused the check to fail.

This PR also adds a test which validates that TRUNCATE works correctly with concurrent writes. More specifically, it checks that:
- all data written before TRUNCATE starts is deleted
- none of the data after TRUNCATE completes is deleted

Fixes: #25173
Fixes: #25013

Backport is needed in versions which check for truncate with concurrent writes using `on_internal_error()`: 2025.3 2025.2 2025.1

Closes scylladb/scylladb#25174

* github.com:scylladb/scylladb:
  truncate: add test for truncate with concurrent writes
  truncate: change check for write during truncate into a log warning
2025-08-06 00:03:37 +03:00
Michał Chojnowski
9930cd59eb sstables/trie: add tests for BTI node serialization and traversal
Adds tests which check that nodes serialized by `bti_node_sink`
are readable by `bti_node_reader` with the right result.

(Note: there are no tests which check compatibility of the encoded nodes
with Cassandra or with handwritten hexdumps. There are only tests
for mutual compatibility between Scylla's writers and readers.
This can be considered a gap in testing.)
2025-08-05 21:48:24 +02:00
Pavel Emelyanov
10056a8c6d Merge 'Simplify credential reload: remove internal expiration checks' from Ernest Zaslavsky
This PR introduces a refinement in how credential renewal is triggered. Previously, the system attempted to renew credentials one hour before their expiration, but the credentials provider did not recognize them as expired—resulting in a no-op renewal that returned existing credentials. This led the timer fiber to immediately retry renewal, causing a renewal storm.

To resolve this, we remove expiration (or any other checks) in `reload` method, assuming that whoever calls this method knows what he does.

Fixes: https://github.com/scylladb/scylladb/issues/25044

Should be backported to 2025.3 since we need this fix for the restore

Closes scylladb/scylladb#24961

* github.com:scylladb/scylladb:
  s3_creds: code cleanup
  s3_creds: Make `reload` unconditional
  s3_creds: Add test exposing credentials renewal issue
2025-08-05 17:49:13 +03:00
Michael Litvak
faebfdf006 test/cluster/test_tablets_colocation: fix flaky test
When restarting the server in the test, wait for it to become ready
before requesting tablet repair.

Fixes scylladb/scylladb#25261

Closes scylladb/scylladb#25263
2025-08-05 15:36:03 +02:00
Avi Kivity
4c785b31c7 Merge 'List Alternator clients in system.clients virtual table' from Nadav Har'El
Before this series, the "system.clients" virtual table lists active connections (and their various properties, like client address, logged in username and client version) only for CQL requests. This series adds also Alternator clients to system.clients. One of the interesting use cases of this new feature is understanding exactly which SDK a user is using -without inspecting their application code.  Different SDKs pass different "User-Agent" headers in requests, and that User-Agent will be visible in the system.clients entries for Alternator requests as the "driver_name" field.

Unlike CQL where logged in username, driver name, etc. applies to a complete connection, in the Alternator API, different requests can theoretically be signed by different users and carry different headers but still arrive over the same HTTP connection. So instead of listing the currently open Alternator *connections*, we will list the currently active *requests*.

The first three patches introduce utilities that will be useful in the implementation. The fourth patch is the implementation itself (which is quite simple with the utility introduced in the second patch), and the fifth patch a regression test for the new feature. The sixth patch adds documentation, the seventh patch refactors generic_server to use the newly introduced utility class and reduce code duplication, and the eighth patch adds a small check to an existing check of CQL's system.clients.

Fixes #24993

This patch adds a new feature, so doesn't require a backport. Nevertheless, if we want it to get to existing customers more quickly to allow us to better understand their use case by reading the system.clients table, we may want to consider backporting this patch to existing branches. There is some risk involved in this patch, because it adds code that gets run on every Alternator request, so a bug on it can cause problems for every Alternator request.

Closes scylladb/scylladb#25178

* github.com:scylladb/scylladb:
  test/cqlpy: slightly strengthen test for system.clients
  generic_server: use utils::scoped_item_list
  docs/alternator: document the system.clients system table in Alternator
  alternator: add test for Alternator clients in system.clients
  alternator: list active Alternator requests in system.clients
  utils: unit test for utils::scoped_item_list
  utils: add a scoped_item_list utility class
  utils: add "fatal" version of utils::on_internal_error()
2025-08-05 15:55:41 +03:00
Ferenc Szili
33488ba943 truncate: add test for truncate with concurrent writes
test_validate_truncate_with_concurrent_writes checks if truncate deletes
all the data written before the truncate starts, and does not delete any
data after truncate completes.
2025-08-05 13:54:14 +02:00
Dawid Pawlik
74f603fe99 test/cqlpy: check ANN queries disallow filtering properly
Add tests checking if filtering with clustering column
or using index is disallowed while performing ANN query.
2025-08-05 12:34:48 +02:00
Artsiom Mishuta
4b975668f6 tiering (test.py): introduce tiering labels
introduce tiering marks
1 “unstable” - For unstable tests that will be will continue runing every night and generate up-to-date statistics with failures without failing the “Main” verification path(scylla-ci, Next)

2 “nightly” - for tests that are quite old, stable, and test functionality that rather not be changed or affected by other features, are partially covered in other tests, verify non-critical functionality, have not found any issues or regressions, too long to run on every PR,  and can be popped out from the CI run.

set 7 long tests(according to statistic in elastic) as nightly(theses 8 tests took 20% of CI run,
about 4 hours without paralelization)
1 test as unstable(as exaple ot marker usage)

Closes scylladb/scylladb#24974
2025-08-04 15:38:16 +03:00