Commit Graph

49457 Commits

Author SHA1 Message Date
Nadav Har'El
5bd503ad43 test/alternator: reproducer tests for faux GSI range key problem
In issue #5320 we noticed that when we have a GSI with a hash key only
(no range key) but the implementation's MV needs to add a clustering
key for the original base key columns, the output of DescribeTable
wrongly lists that exta "range key" - which isn't a real range key of
the GSI.

It turns out that the fact that the extra attribute is not a real GSI
range key has another implication: It should not be allowed in
KeyConditions or KeyConditionExpression - which should allow only real
key columns of the GSI.

This patch adds two reproducing tests for this issue (issue #2601),
both pass on DynamoDB but xfail on Alternator.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2025-09-18 12:17:21 +03:00
Nadav Har'El
0b30688641 test/alternator: fix test "test_17119a" to pass on DynamoDB
As noticed in issue #26079 the Alternator test test_gsi.py::test_17119a
fails on DynamoDB. The problem was that the test added to KeyConditions
reading from a GSI an unnecessary attribute - one which was accidentally
allowed by Alternator (Refs #26103) but not allowed by DynamoDB.

This is easy to fix - just remove the unnecessary attribute from
KeyConditions, and the test still works properly and passes on both
DynamoDB and Alternator.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2025-09-18 11:38:35 +03:00
Nadav Har'El
22f88bff30 test/alternator: fix test to pass on DynamoDB
As noticed in issue #26079, the Alternator test
test_number.py::test_invalid_numbers failed on DynamoDB, because one of
the things it did, as a "sanity check", was to check that the number
0e1000 was a valid number. But it turns out it isn't allowed by
DynamoDB.

So this patch removes 0e1000 from the list of *valid* numbers in
test_invalid_numbers, and instead creates a whole new test for the
case of 0e1000.

It turns out that DynamoDB has a bug (it appears to be a regression,
because test_invalid_numbers used to pass on DynamoDB!) where it
allows 0.0e1000 (since it's just zero, really!) but forbids 0e1000
which is incorrectly considered to have a too-large magnitude.

So we introduce a test that confirms that Alternator correctly allows
both 0.0e1000 and 0e1000. DynamoDB fails this test (it allows the
first, forbidding the second), making it the first Alternator test
tagged as a "dynamodb_bug".

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2025-09-18 10:28:01 +03:00
Wojciech Mitros
f17beba834 load_balancer: include dead nodes when calculating rack load
Load balancer aims to preserve a balance in rack loads when generating
tablet migrations. However, this balance might get broken when dead nodes
are present. Currently, these nodes aren't include in rack load calculations,
even if they own tablet replicas. As a result, load balancer treats racks
with dead nodes as racks with a lower load, so I generates migrations to these
racks.
This is incorrect, because a dead node might come back alive, which would result
in having multiple tablet replicas on the same rack. It's also inefficient
even if we know that the node won't come back - when it's being replaced or removed.
In that case we know we are going to rebuild the lost tablet replicas
so migrating tablets to this rack just doubles the work. Allowing such migrations
to happen would also require adjustments in the materialized view pairing code
because we'd temporarily allow having multiple tablet replicas on the same rack.
So in this patch we include dead nodes when calculating rack loads in the load
balancer. The dead nodes still aren't treated as potential migration sources or
destinations.
We also add a test which verifies that no migrations are performed by doing a node
replace with a mv workload in parallel. Before the patch, we'd get pairing errors
and after the patch, no pairing errors are detected.

Fixes https://github.com/scylladb/scylladb/issues/24485

Closes scylladb/scylladb#26028
2025-09-17 20:49:18 +02:00
Avi Kivity
3acfc577d8 Merge 'tools/scylla-sstable: extract json mutation stream parser into own hh,cc' from Botond Dénes
tools/scylla-sstable.cc has 3.5k SLOC, out of which this class alone is 1K. Extract into own hh and cc. Since this class was already using pimpl, the header remains nice and small.

Code cleanup, no backport needed.

Closes scylladb/scylladb#26064

* github.com:scylladb/scylladb:
  tools: extract json_mtuation_stream_parser to its own hh,cc files
  tools/scylla-sstable: fix indentation
  tools/scylla-sstable: prepare for extracting json_mutation_stream_parser
2025-09-17 18:30:30 +03:00
Ernest Zaslavsky
54aa552af7 treewide: Move type related files to a type directory As requested in #22110, moved the files and fixed other includes and build system.
Moved files:
- duration.hh
- duration.cc
- concrete_types.hh

Fixes: #22110

This is a cleanup, no need to backport

Closes scylladb/scylladb#25088
2025-09-17 17:32:19 +03:00
Ernest Zaslavsky
a1f18a8883 treewide: Move schema related files to a schema directory As requested in #22111, moved the files and fixed other includes and build system.
Moved files:
- frozen_schema.hh
- frozen_schema.cc
- schema_mutations.hh
- schema_mutations.cc
- column_computation.hh

Fixes: #22111

Closes scylladb/scylladb#25089
2025-09-17 17:31:05 +03:00
Botond Dénes
bde7d8ddbd Merge 'service: pass current session_id to repair rpc' from Aleksandra Martyniuk
Currently, in repair_tablet we retrieve session_id from tablet
map (and throw if it isn't specified). In case of topology
coordinator failover, we may end up in a situation where a node
runs outdated repair, treating session of a different operation
as the repair's session:

- topology coordinator starts repair transition (A);
- topology coordinator sends tablet repair rpc to node1;
- topology coordinator is separated from the cluster;
- new topology coordinator is elected;
- new topology coordinator sees waiting repair request (A_2)
  and executes it;
- new repair of the same tablet is requested (B);
- new topology coordinator starts repair transition (B);
- new topology coordinator sends tablet repair rpc to node2;
- node2 starts repair (B) as repair master;
- node1 starts repair (A), checks the current session (B), proceeds
  with repair (B) as repair master.

Send current session_id in repair_tablet rpc. If this session_id
and session id got from tablet map don't match, an exception
is thrown.

Fixes: https://github.com/scylladb/scylladb/issues/23318.

No backport; changes in rpc signatures

Closes scylladb/scylladb#25369

* github.com:scylladb/scylladb:
  test: check that repair with outdated session_id fails
  service: pass current session_id to repair rpc
2025-09-17 17:28:35 +03:00
Botond Dénes
cc5153ef8c Merge 'db: cache: consider preempting after each partition' from Aleksandra Martyniuk
Currently, during cache invaldation we check if we need to preempt
only after the partition gets invaldaited. This may lead to stalls
if we have a chain of filtered out partitions.

Check for preemption even if the partition does not get invaldated.

Refs: https://github.com/scylladb/scylladb/issues/9136.

Optimization; no backport

Closes scylladb/scylladb#26053

* github.com:scylladb/scylladb:
  db: fix indentation
  db: cache: consider preempting after each partition
2025-09-17 17:26:29 +03:00
Botond Dénes
a8d22a66fa Merge 'Improve Encryption at Rest documentation' from Nikos Dragazis
This PR introduces a major rewrite of the EaR document. The initial motivation for this PR was to fully cover all our supported key providers with working examples, and to add instructions for key rotation. However, many other improvements were made along the way.

Main changes in this PR:
* Add a high-level description for every key provider. Mention limitations.
* Better organize existing provider-specific instructions by placing them into clearly separated, tabbed sections.
* Add instructions for the replicated key provider. Mention explicitly that it cannot be used as default option for user or system encryption, and that it does not support key rotation.
* Add more examples for KMS and GCP to cover all credential types.
* Document missing configuration options.
* Add a new section for key rotation.

Notes:
* Some of the patches in this series have been cherry-picked from Laszlo's wip branch.
* This PR is expected to conflict with the Azure Key Vault PR, which should be merged first. (https://github.com/scylladb/scylladb/pull/23920/)
* Support for KMIP system keys in the Replicated Key Provider is currently broken. (https://github.com/scylladb/scylladb/issues/24443)

Fixes scylladb/scylla-enterprise#3535.
Refs scylladb/scylla-enterprise#3183.

Only doc changes. No backport is needed.

Closes scylladb/scylladb#24558

* github.com:scylladb/scylladb:
  encryption-at-rest.rst: add "Rotate Encryption Keys" section
  encryption-at-rest.rst: rewrite "Encrypt System Resources" section
  encryption-at-rest.rst: rewrite "Update Encryption Properties of Existing Tables" section
  encryption-at-rest.rst: rewrite "Encrypt a Single Table" section
  encryption-at-rest.rst: rewrite "Encrypt Tables" section
  encryption-at-rest.rst: update "Set the Azure Host" section
  encryption-at-rest.rst: update "Set the GCP Host" section
  encryption-at-rest.rst: update "Set the KMS Host" section
  encryption-at-rest.rst: update "Set the KMIP Host" section
  encryption-at-rest.rst: rewrite "Create Encryption Keys" section
  encryption-at-rest.rst: rewrite "Key Providers" section
  encryption-at-rest.rst: hoist and update "Cipher Algorithm Descriptors"
  encryption-at-rest.rst: rewrite/replace section "Encryption Key Types"
  encryption-at-rest.rst: About: describe high-level operation more precisely
  encryption-at-rest.rst: improve wording / formatting in About intro
  encryption-at-rest.rst: users (plural) typo fix
  encryption-at-rest.rst: rewrap
  encryption-at-rest.rst: strip trailing whitespace
2025-09-17 17:25:25 +03:00
Nadav Har'El
d63fdd1e8b test/cqlpy: fix run-cassandra to run with Java 21
The script test/cqpy/run-cassandra aims to make it easy to run
any version of Cassandra using whatever version of Java the user
has installed. Sadly, the fact that Java keeps changing and the
Cassandra developers are very slow to adapt to new Javas makes
doing this non-trivial.

This patch makes it possible for run-cassandra to run Cassandra 5
on the Java 21 that is now the default on Fedora 42. Fedora 42
no longer carries antique version of Java (like Java 8 or 11), not
even as an optional package.

Sadly, even with this patch it is not possible to run older
versions of Cassandra (4 and 3) with Java 21, because the new
Java is missing features such as Netty that the older Cassandra
require. But at least it restores the ability to run our cqlpy
tests against Cassandra 5.

Also, this patch adds to test/cqlpy/README.md simple instructions on
how to install Java 11 (in addition to the system's default Java 21)
on Fedora 42. Doing this is very easy and very recommended because
it restores the ability to run Cassandra 3 and 4, not just Cassandra 5.

Fixes #25822.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#25825
2025-09-17 17:24:47 +03:00
Botond Dénes
85f6eeda30 Merge 'compaction/scrub: register sstables for compaction before validation' from Lakshmi Narayanan Sreethar
compaction/scrub: register sstables for compaction before validation

When `scrub --validate` runs, it collects all candidate sstables at the
start and validates them one by one in separate compaction tasks.
However, scrub in validate mode does not register these sstables for
compaction, which allows regular compaction to pick them up and
potentially compact them away before validation begins. This leads to
scrub failures because the sstables can no longer be found.

This patch fixes the issue by first disabling compaction, collecting the
sstables, and then registering them for compaction before starting
validation. This ensures that the enqueued sstables remain available for
the entire duration of the scrub validation task.

Fixes #23363

This reported scrub failure occurs on all versions that have the
checksum/digest validation feature for uncompressed sstables.
So, backport it to older versions.

Closes scylladb/scylladb#26034

* github.com:scylladb/scylladb:
  compaction/scrub: register sstables for compaction before validation
  compaction/scrub: handle exceptions when moving invalid sstables to quarantine
2025-09-17 17:22:00 +03:00
Piotr Smaron
bdb90ee15c set ssl_* columns in system.clients
Depends on https://github.com/scylladb/seastar/pull/2651

Missing columns have been present since probably forever -
they were added to the schema but never assigned any value:
```
cqlsh> select * from system.clients;
------------------+------------------------
...
 ssl_cipher_suite | null
 ssl_enabled      | null
 ssl_protocol     | null
...
```

This patch sets values of these columns:
- with a TLS connection, the 3 TLS-related fields are filled in,
- without TLS, `ssl_enabled` is set to `false` and other columns are
  `null`,
- if there's an error while inspecting TLS values, the connection is
  dropped.

We want to save the TLS info of a connection just after accepting it,
but without waiting for a TLS handshake to complete, so once the
connection is accepted, we're inspecting it in the background for the
server to be able to accept next connections immediately.
Later, when we construct system.clients virtual table, the previously
saved data can be instantaneously assigned to client_data, which is a
struct representing a row in system.clients table. This way we don't
slow down constructing this table by more than necessary, which is
relevant for cases with plenty of connections.

Fixes: #9216

Closes scylladb/scylladb#22961
2025-09-17 16:29:55 +03:00
Nadav Har'El
3c0032deb4 alternator: fix bug in combination of AttributeUpdates + ReturnValues
In test/alternator/test_returnvalues.py we had tests for the
ReturnValues feature on UpdateItem requests - but we only tested
UpdateItem requests with the "modern" UpdateExpression, and forgot to
test the combination of ReturnValues with the old AttributeUpdates API.

It turns out this combination is buggy: when both ReturnValues=ALL_OLD
and AttributeUpdates need the previous value of the item, we may wrongly
std::move() the value out, and the operation will fail with a strange
error:

    An error occurred (ValidationException) when calling the UpdateItem
    operation: JSON assert failed on condition 'IsObject()'

The fix in this patch is trivial - just move the std::move() to the
correct place, after both UpdateExpression and AttributeUpdates
handling is done.

This patch also includes a reproducing test, which fails before this
patch and passes with it - and of course passes on DynamoDB. This
test reproduces two cases where the bug happened, as well as one
case where it didn't (to make sure we don't regress in what already
worked).

Fixes #25894

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#25900
2025-09-17 16:04:01 +03:00
Piotr Dulikowski
6a90a1fd29 Merge 'db/view/view_building_worker: split batch's data preparation and execution' from Michał Jadwiszczak
The view building batch lives on shard0 but it might be doing
work on shard which owns the tablet replica.
Until now the batch data was accessed from multiple shards (shard0 and
where the batch was executed).

This patch fixes this by splitting tasks execution into:
- preparation which is always happening on shard0
- actual execution of the tasks on relevant shard, but all necessary
  data is copied to the shard and batch object isn't accessed.

Fixes https://github.com/scylladb/scylladb/issues/25804

View building coordinator hasn't been released yet, so no backport needed.

Closes scylladb/scylladb#26058

* github.com:scylladb/scylladb:
  db/view/view_building_worker: move try-catch outside `invoke_on()`
  db/view/view_building_worker: split batch's data preparation and execution
2025-09-17 14:17:25 +02:00
Botond Dénes
30a3f61fa0 Merge 'compaction: handle exception in expected_total_workload' from Aleksandra Martyniuk
expected_total_workload methods of scrub compaction tasks create a vector of table_info based on table names. If any table was already dropped, then the exception is thrown. It leaves table_info in corrupted state and node crashes with `free(): invalid size`.

Return std::nullopt if an exception was thrown to indicate that total workload cannot be found.

Fixes: #25941.

No release branches affected

Closes scylladb/scylladb#25944

* github.com:scylladb/scylladb:
  tasks: get progress of failed task based on children
  compaction: handle exception in expected_total_workload
2025-09-17 15:10:19 +03:00
Nadav Har'El
e322902506 Merge 'index, metrics: add per-index metrics' from Michał Hudobski
This patch adds the possibility to track metrics
per secondary index. Currently, only a histogram
of query latencies is tracked, but more metrics
can be added in the future. To add a new metric,
it needs to be added to the index_metrics struct
in index/secondary_index_manager.hh and then
initialized in index/secondary_index_manager.cc
in the constructor of the index_metrics struct.
The metrics are created when the index is created
and removed when the index is dropped.

First lines of the new metric:
\# HELP scylla_index_query_latencies Index query latencies
\# TYPE scylla_index_query_latencies histogram
scylla_index_query_latencies_sum{idx="test_i_idx",ks="test"} 640
scylla_index_query_latencies_count{idx="test_i_idx",ks="test"} 1
scylla_index_query_latencies_bucket{idx="test_i_idx",ks="test",le="640.000000"} 1
scylla_index_query_latencies_bucket{idx="test_i_idx",ks="test",le="768.000000"} 1

Fixes: https://github.com/scylladb/scylladb/issues/25970

Closes scylladb/scylladb#25995

* github.com:scylladb/scylladb:
  test: verify that the index metric is added
  index, metrics: add per-index metrics
2025-09-17 14:54:12 +03:00
Szymon Malewski
776f90e2f8 alternator/expressions.g: Fix antlr3 missing token leak
This patch overrides the antlr3 function that allocates the missing
tokens that would eventually leak. The override stores these tokens in
a vector, ensuring memory is freed whenever the parser is destroyed.
Solution is copied from CQL implementation.

A unit test to reproduce the issue is added - leak would be reported
by ASAN, when running this test in debug mode - the test passed but
the leak is discovered when the test file exits.

Fixes #25878

Closes scylladb/scylladb#25930
2025-09-17 13:05:24 +03:00
Botond Dénes
2fa0f82910 tools: extract json_mtuation_stream_parser to its own hh,cc files
tools/scylla-sstable.cc has 3.5k SLOC, out of which this class alone is
1K. Extract into own hh and cc, just a copy-paste after the preparation
commit.
2025-09-17 12:18:07 +03:00
Botond Dénes
ffe8918522 tools/scylla-sstable: fix indentation
Left broken by previous patch.
2025-09-17 12:16:22 +03:00
Botond Dénes
8c36a983cc tools/scylla-sstable: prepare for extracting json_mutation_stream_parser
Make methods out-of-line, so class declaration stands on its own,
without definition of impl.
Move auxiliary structures, used only by impl, out of the class scope.
Move parser to tools namespace, and auxiliaries to anonymous namespace
within the tools one.
Pass down logger ref to parser impl and below, to prepare for sst_log
not being available in scope.
Add comment to parser class explaining what it does.
2025-09-17 12:16:21 +03:00
Benny Halevy
3a6208b319 utils: stall_free: clear_gently: release wrapped objects
As discussed in https://github.com/scylladb/scylladb/pull/24606#discussion_r2281870939
clear_gently of shared pointers should release the wrapped
object reference and when the object's use_count reaches 1,
the object itself would be cleared_gently, before it's destroyed.

This behavior is similar to the way we clear gently containers
like arrays or vectors, and so it is extended in this patch
to smart pointers like unique_ptr and foreign_ptr.

The unit tests are adjusted respectively to expect the
smart pointers to be reset after clear_gently, plus
the use of `reset()` for `foreign_ptr<shared_ptr<>>` was
replaced by `clear_gently().get()` which now ensures the
reference to a shared object is released, and awaited for,
if it happens on a foreign owner shard, unlike reset of
a foreign_ptr that kicks off destroy of that shared object
in the background on the owner shard - causing flakiness.

Fixes #25723

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#25759
2025-09-17 11:44:26 +03:00
Patryk Jędrzejczak
454eb08cb4 Merge 'group0: remove obsolete "stop_before_becoming_raft_voter" error injection' from Emil Maskovsky
The Raft topology workflow was changed by the limited voters feature: nodes no longer request votership themselves. As a result, the "stop_before_becoming_raft_voter" error injection is now obsolete and has been removed.

Fixes: scylladb/scylladb#23418

No backport: This re-enables a test, only needed for master.

Closes scylladb/scylladb#26042

* https://github.com/scylladb/scylladb:
  group0: remove obsolete "stop_before_becoming_raft_voter" error injection
  test/random_failures: preserve test repeatability when removing error injections
2025-09-17 10:38:32 +02:00
Michał Jadwiszczak
d98237b33c db/view/view_building_worker: move try-catch outside invoke_on()
It's just stylist change, to me doing `invoke_on()` in try-catch
block looks better than the other way.
2025-09-16 23:15:44 +02:00
Michał Jadwiszczak
9458ceff8f db/view/view_building_worker: split batch's data preparation and execution
The view building batch lives on shard0 but it might be doing
work on shard which owns the tablet replica.
Until now the batch data was accessed from multiple shards (shard0 and
where the batch was executed).

This patch fixes this by splitting tasks execution into:
- preparation which is always happening on shard0
- actual execution of the tasks on relevant shard, but all necessary
  data is copied to the shard and batch object isn't accessed.

Fixes scylladb/scylladb#25804
2025-09-16 23:13:36 +02:00
Patryk Jędrzejczak
368d70ee15 Merge 'LWT: implement fencing' from Petr Gusev
This PR consists of three parts:
* Small refactoring of the fencing APIs in storage_proxy (renames + comments + some functions were extracted)
* Implement the fencing for LWT verbs itself. This includes checking the fencing token before and after local replica data accesses.
* Two new `test.py` tests in `test_fencing.py`, which check the fencing in some real-world scenarios.

Backport: no need -- fencing for LWT requests is needed primarily for LWT over tablets, which is not released yet.

Fixes scylladb/scylladb#22332

Closes scylladb/scylladb#25550

* https://github.com/scylladb/scylladb:
  test_tablets_lwt: eliminate redundant disable_tablet_balancing
  test_fencing: add test_lwt_fencing_upgrade
  pylib: extract upgrade helpers from test_sstable_compression_dictionaries_upgrade.py
  test_fencing: add test_fenced_out_on_tablet_migration_while_handling_paxos_verb
  test_fencing: test_fence_lwt_during_bootstap
  pylib/rest_client.py: encode injection name
  storage_proxy_stats: add fenced_out_requests metric
  storage_proxy: add fencing to Paxos verbs
  storage_proxy::apply_fence: add overload that throws on failure
  storage_proxy: extract apply_fence_result
  sp::apply_fence: rename to apply_fence_on_ready
  sp::apply_fence: rename to check_fence
  sp::apply_fence: make non-generic
2025-09-16 23:40:48 +03:00
Ernest Zaslavsky
d624413ddd treewide: Move query related files to a new query directory
As requested in #22120, moved the files and fixed other includes and build system.

Moved files:
- query.cc
- query-request.hh
- query-result.hh
- query-result-reader.hh
- query-result-set.cc
- query-result-set.hh
- query-result-writer.hh
- query_id.hh
- query_result_merger.hh

Fixes: #22120

This is a cleanup, no need to backport

Closes scylladb/scylladb#25105
2025-09-16 23:40:47 +03:00
Pavel Emelyanov
6fb66b796a s3: Add metrics to show S3 prefetch bytes
The chunked download source sends large GET requests and then consumes data
as it arrives. Sometimes it can stop reading from socket early and drop the
in-flight data. The existing read-bytes metrics show only the number of
consumed bytes, we we also want to know the number of requested bytes

Refs #25770 (accounting of read-bytes)
Fixes #25876

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#25877
2025-09-16 23:40:47 +03:00
Sergey Zolotukhin
2640b288c2 raft: disable caching for raft log.
This change disables caching for raft log table due to the following reasons:
* Immediate reason is a deficiency in handling emerging range tombstones in the cache, which causes stalls.
* Long-term reason is that sequential reads from the raft log do not benefit from the cache, making it better to bypass it to free up space and avoid stalls.

Fixes scylladb/scylladb#26027

Closes scylladb/scylladb#26031
2025-09-16 23:40:47 +03:00
Pavel Emelyanov
d69a51f42a compaction: Use function when filtering compaction tasks for stopping
The compaction_manager::stop_compaction() method internally walks the
list of tasks and compares each task's compacting_table (which is
compaction group view pointer) with the given one. In case this
stop_compaction() method is called via API for a specific table, the
method walks the list of tasks for every compaction group from the
table, thus resulting in nr_groups * nr_tasks complexity. Not terrible,
but not nice either.

The proposal is to pass filtering function into the inner
do_stop_ongoing_compactions() method. Some users will pass a simple
"return true" lambda, but those that need to stop compactions for a
specitif table (e.g. -- the API handler) will effectively walk the
list of tasks once comparing the given compaction group's schema with
the target table one (spoiler: eventually this place will also be
simplified not to mess with replica::table at all).

One ugliness with the change is the way "scope" for logging message is
collected. If all tasks belong to the same table, then "for table ..."
is printed in logs. With the change the scope is no longer known
instantly and is evaluated dynamically while walking the list of tasks.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#25846
2025-09-16 23:40:47 +03:00
Michał Chojnowski
68e6141211 scylla-gdb: add scylla prepared-statements
Add a helper which prints all prepared statements currently
present in the query processor.

Example output:
```
(gdb) scylla prepared-statements
(cql3::cql_statement*)(0x600003d71050): SELECT * FROM ks.ks WHERE pk = ?
(cql3::cql_statement*)(0x600003972b50): SELECT pk FROM ks.ks WHERE pk = ?
```

Closes scylladb/scylladb#26007
2025-09-16 23:40:47 +03:00
Botond Dénes
0cf6a648bb Merge 'Default create keyspace syntax' from Dario Mirovic
Allow for the following CQL syntax:

```
CREATE KEYSPACE [IF NOT EXISTS] <name>;
```
for example:
```
CREATE KEYSPACE test_keyspace;
```

With this syntax all the keyspace's parameters would be defaulted to:

replication strategy = `NetworkTopologyStrategy`,
replication factor = number of racks , but excluding racks that only have arbiter nodes
storage options, durable writes = defaults we normally would use,
tablets enabled if they are enabled in the db configuration, e.g. scylla.yaml or db/config.cc by default.

Options besides `replication` already have defaults. `replication` had to be specified, but it could be an empty set, where defaults for sub-options (replication strategy and replication factor) would be used - `replication = {}`. Now there is no need for specifying an empty set - omitting `replication = {}` has the same effect as `replication = {}`.

Since all the options now have defaults, `WITH` is optional for `CREATE KEYSPACE` statement.

Fixes #25145

This is an improvement, no backport needed.

Closes scylladb/scylladb#25872

* github.com:scylladb/scylladb:
  docs: cql: default create keyspace syntax
  test: cqlpy: add test for create keyspace with no options specified
  cql: default `CREATE KEYSPACE` syntax
2025-09-16 23:40:47 +03:00
Emil Maskovsky
943af1ef1c topology_coordinator: consistently rethrow raft::request_aborted for direct/global commands
Ensure all direct and global topology commands rethrow the
`raft::request_aborted` exception when aborted, typically due to
leadership changes. This makes abortion explicit to callers, enabling
proper handling such as retries or workflow termination.

This change completes the work started in PR scylladb/scylladb#23962,
covering all remaining cases where the exception was not rethrown.

Fixes: scylladb/scylladb#23589

No backport: No related issues observed in previous versions; backport
not required.

Closes scylladb/scylladb#26021
2025-09-16 23:40:47 +03:00
Emil Maskovsky
87bd328873 group0: remove obsolete "stop_before_becoming_raft_voter" error injection
The Raft topology workflow was changed by the limited voters feature:
nodes no longer request votership themselves. As a result, the
"stop_before_becoming_raft_voter" error injection is now obsolete and
has been removed.

Fixes: scylladb/scylladb#23418
2025-09-16 18:24:27 +02:00
Emil Maskovsky
0453052d66 test/random_failures: preserve test repeatability when removing error injections
The order of entries in the ERROR_INJECTIONS list determines test
repeatability for a given random seed.

To allow removing error injections without affecting the order of the
remaining ones, removed injections are now renamed with a "REMOVED_"
prefix instead of being deleted.

This ensures they are ignored by the tests, while the sequence of active
injections—and thus test reproducibility—remains unchanged.
2025-09-16 18:22:45 +02:00
Michał Hudobski
3364cc96f5 test: verify that the index metric is added
This commit adds a test that performs
a sanity check that the implemented metric
is actually being added to Scylla's metrics
and has the correct value.
2025-09-16 18:10:01 +02:00
Aleksandra Martyniuk
3324f08e9c tasks: get progress of failed task based on children
Currently, for failed tasks task_manager::task::impl::get_progress
attempts to find expected_total_workload. However, if the task has
finished long time ago, the state might have totally changed, e.g.
some tables might have been dropped or have changed their sizes.
Due to that, the result of expected_total_workload might be irrelevant.

Count the progress of a finish task based on children only, regardless
whether the task has succeeded or failed.
2025-09-16 17:15:01 +02:00
Aleksandra Martyniuk
17e9ec11d7 db: fix indentation 2025-09-16 14:49:54 +02:00
Aleksandra Martyniuk
0024339a71 db: cache: consider preempting after each partition
Currently, during cache invaldation we check if we need to preempt
only after the partition gets invaldaited. This may lead to stalls
if we have a chain of filtered out partitions.

Check for preemption even if the partition does not get invaldated.

Refs: #9136.
2025-09-16 14:45:28 +02:00
Michał Hudobski
b09d1f0a98 index, metrics: add per-index metrics
This patch adds the possibility to track metrics
per secondary index. Currently, only a histogram
of query latencies is tracked, but more metrics
can be added in the future. To add a new metric,
it needs to be added to the index_metrics struct
in index/secondary_index_manager.hh and then
initialized in index/secondary_index_manager.cc
in the constructor of the index_metrics struct.
The metrics are created when the index is created
and removed when the index is dropped.

First lines of the new metric:
\# HELP scylla_index_query_latencies Index query latencies
\# TYPE scylla_index_query_latencies histogram
scylla_index_query_latencies_sum{idx="test_i_idx",ks="test"} 640
scylla_index_query_latencies_count{idx="test_i_idx",ks="test"} 1
scylla_index_query_latencies_bucket{idx="test_i_idx",ks="test",le="640.000000"} 1
scylla_index_query_latencies_bucket{idx="test_i_idx",ks="test",le="768.000000"} 1
2025-09-16 14:03:43 +02:00
Lakshmi Narayanan Sreethar
7cdda510ee compaction/scrub: register sstables for compaction before validation
When `scrub --validate` runs, it collects all candidate sstables at the
start and validates them one by one in separate compaction tasks.
However, scrub in validate mode does not register these sstables for
compaction, which allows regular compaction to pick them up and
potentially compact them away before validation begins. This leads to
scrub failures because the sstables can no longer be found.

This patch fixes the issue by first disabling compaction, collecting the
sstables, and then registering them for compaction before starting
validation. This ensures that the enqueued sstables remain available for
the entire duration of the scrub validation task.

Fixes #23363

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2025-09-16 15:29:57 +05:30
Patryk Jędrzejczak
9efe250a8f Merge 'gossiper: ensure gossiper operations are executed in gossiper scheduling group' from Sergey Zolotukhin
Sometimes gossiper operations invoked from storage_service and other components run under a non-gossiper scheduling group. If these operations acquire gossiper locks, priority inversion can occur: higher-priority gossiper tasks may wait behind lower-priority tasks (e.g. streaming), which can cause gossiper slowness or even failures.
This patch ensures that gossiper operations requiring locks on gossiper structures are explicitly executed in the gossiper scheduling group. To help detect similar issues in the future, a warning is logged whenever a gossiper lock is acquired under a non-gossiper scheduling group.

Fixes scylladb/scylladb#25907
Refs: scylladb/scylladb#25702

Backport: this patch fixes an issue with gossiper operations scheduling group, that might affect topology operations, therefore backport is needed to 2025.1, 2025.2, 2025.3

Closes scylladb/scylladb#25981

* https://github.com/scylladb/scylladb:
  gossiper: ensure gossiper operations are executed in gossiper scheduling group
  gossiper: fix wrong gossiper instance used in `force_remove_endpoint`
2025-09-16 10:14:15 +02:00
Asias He
54162a026f scylla-nodetool: Add --incremental-mode option to cluster repair
The `--incremental-mode` option specifies the incremental repair mode.
Can be 'disabled', 'regular', or 'full'.

'regular': The incremental repair logic is enabled. Unrepaired sstables
will be included for repair.  Repaired sstables will be skipped. The
incremental repair states will be updated after repair.

'full': The incremental repair logic is enabled. Both repaired and
unrepaired sstables will be included for repair. The incremental repair
states will be updated after repair.

'disabled': The incremental repair logic is disabled completely. The
incremental repair states, e.g., repaired_at in sstables and
sstables_repaired_at in the system.tablets table, will not be updated
after repair.

When the option is not provided, it defaults to regular.

Fixes #25931

Closes scylladb/scylladb#25969
2025-09-16 10:23:22 +03:00
Botond Dénes
ee7c85919e Revert "treewide: seastar module update and fix broken rest client"
This reverts commit 44d34663bc of PR
https://github.com/scylladb/scylladb/pull/25915.

Breaks articact tests on ARM, blocking us from building new images from
master.
2025-09-16 08:31:08 +03:00
Lakshmi Narayanan Sreethar
84f2e99c05 compaction/scrub: handle exceptions when moving invalid sstables to quarantine
In validate mode, scrub moves invalid sstables into the quarantine
folder. If validation fails because the sstable files are missing from
disk, there is nothing to move, and the quarantine step will throw an
exception. Handle such exceptions so scrub can return a proper
compaction_result instead of propagating the exception to the caller.
This will help the testcase for #23363 to reliably determine if the
scrub has failed or not.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>
2025-09-15 23:10:23 +05:30
Sergey Zolotukhin
6c2a145f6c gossiper: ensure gossiper operations are executed in gossiper scheduling group
Sometimes gossiper operations invoked from storage_service and other components
run under a non-gossiper scheduling group. If these operations acquire gossiper
locks, priority inversion can occur: higher-priority gossiper tasks may wait
behind lower-priority tasks (e.g. streaming), which can cause gossiper slowness
or even failures.
This patch ensures that gossiper operations requiring locks on gossiper
structures are explicitly executed in the gossiper scheduling group.
To help detect similar issues in the future, a warning is logged whenever
a gossiper lock is acquired under a non-gossiper scheduling group.

Fixes scylladb/scylladb#25907
2025-09-15 12:49:07 +02:00
Petr Gusev
1d270020f2 test_tablets_lwt: eliminate redundant disable_tablet_balancing
This is a refactoring commit.
2025-09-15 12:40:10 +02:00
Petr Gusev
7060265d5f test_fencing: add test_lwt_fencing_upgrade
This test verifies that upgrading to a Scylla version
with LWT fencing does not disrupt existing LWT workloads.
2025-09-15 12:34:45 +02:00
Petr Gusev
49b036cf2b pylib: extract upgrade helpers from test_sstable_compression_dictionaries_upgrade.py
We want to reuse them to test upgade for LWT fencing
2025-09-15 12:34:45 +02:00
Petr Gusev
82f0235e4b test_fencing: add test_fenced_out_on_tablet_migration_while_handling_paxos_verb
This test verifies that the fencing token is checked on replicas
after the local Paxos state is updated. This ensures that if we failed
to drain an LWT request during topology changes the replicas
where paxos verbs got stuck won't contributed to the target CLs.
2025-09-15 12:34:45 +02:00