Commit Graph

11801 Commits

Author SHA1 Message Date
Pavel Emelyanov
daea284072 Merge 'Make compaction module more self contained' from Botond Dénes
There is still some compaction related code left in `sstables/`, move this to `compaction/` to make the compaction module more self-contained.

Code cleanup, no backport.

Closes scylladb/scylladb#26277

* github.com:scylladb/scylladb:
  sstables,compaction: move make_sstable_set() implementations to compactions/
  sstables,compaction: move compaction exceptions to compaction/
2025-09-29 11:38:30 +03:00
Nadav Har'El
1aef733d48 Merge 'Alternator/cache expressions' from Szymon Malewski
Before this patch, every expression in Alternator's requests was parsed from string to adequate structure.
This patch enables caching, where input expression strings are mapped to parsed template structures.
Every new valid (parsable) expression is added to the cache. The cache has limited (configurable) size - when it is reached, the least recently used entry is removed.
When requested expression is in the cache, the copy of the template is returned - individual instances still need to be resolved (placeholders substituted with names and values).
Caching is implemented for all expression types. The cache is per shard - shared for all operations, expression types, tables, users.
Default cache size is 2000 entries per shard and it has configuration option `alternator_max_expression_cache_entries_per_shard` (0 means cache disabled).
Basic metrics (total count of hits and misses for each expression type and number of evicted enries) are implemented.
Cache features are tested in boost unit tests and overall expression caching is tested with Python tests - both mostly rely on metrics.

refs #5023
`perf-alternator` test shows improvement (median):
| test | throughput | instructions_per_op | cpu_cycles_per_op | allocs_per_op |
| ------ | ---------------- | ----------------------------- | --------------------------- | -------------------  |
| read | +6.0% | -8.5% | -7.0% | -4.9% |
| write | +13.4% | -17.6% | -14.7% | -7.4% |
| write(lwt) | +12.7% | -7.9% | -6.9% | -2.8% |
| write_rwm | +5.4% | -10.5% | -7.3% | -4.1% |

"read" had a ProjectionExpression with 10 column names, "write" had a UpdateExpression with 10 column names and "write_rmw" had both ConditionExpression and UpdateExpression.

This patch also includes minor refactoring of other expressions related tests (https://github.com/scylladb/scylladb/issues/22494) - use `test_table_ss` instead of `test_table`.
Fixes #25855.

This is new feature - no backporting.

Closes scylladb/scylladb#25176

* github.com:scylladb/scylladb:
  alternator: use expression caching
  alternator: adds expression cache implementation
  utils: extend lru_string_map
  utils: add lru_string_map
  alternator/expressions: error on parsing empty update expression
  alternator/expressions: fix single value condition expression parsing
  test/alternator: use `test_table_ss` instead of `test_table` in expressions related tests.
2025-09-29 11:36:31 +03:00
Nadav Har'El
8c99f807d6 test/alternator: regression test for DescribeTable's index schema
In issue #5320, we reported a bug where DescribeTable returns the wrong
schema for a GSI - it returned as a sort key an attribute which the user
didn't actually ask to be a sort key, and was only added because of a
requirement of Scylla's materialized-views implementation.

We already had a test, test_gsi_2_describe_table_schema, that reproduces
that issue. But that test only exercised the specific case that we knew
had a bug. There is a risk that the fix to #5320 (which was recently
merged) will actually break other cases - different combinations of base
and GSI keys, or even LSI keys - and we won't have tests reproducing it.

So this patch adds comprehensive regression tests for how DescribeTable
shows GSIs and LSIs for all possible combinations of base keys and
GSI/LSI keys. As we prove in test comments (and in code) we need to
test 15 GSIs and 2 LSIs to test every possible combination. These
tests aren't very slow, because we only need to create three base
tables to test all these combinations.

As usual, the new tests pass on DynamoDB. The new GSI test failed on
Alternator before #5320 was fixed, but now passes. The fact all of
its cases pass shows that the fix to #5320 didn't cause regressions
in other types of GSIs or LSIs.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#26047
2025-09-29 08:50:30 +03:00
Piotr Dulikowski
7b91dd9297 Merge 'vector_store_client: Add support for multiple URIs' from Karol Nowacki
vector_store_client: Add support for multiple URIs

The vector store client now supports a comma-separated list of URIs in
the `vector_store_primary_uri` configuration option.

It uses the vector store nodes from these URIs for load balancing and high
availability, querying the next node if the current one fails.

References: VECTOR-187

No backport is needed as this is a new feature.

Closes scylladb/scylladb#26212

* github.com:scylladb/scylladb:
  vector_store_client: Rename host_port struct to uri
  vector_store_client: Add support for multiple URIs
  vector_store_client: Remove methods used only in tests
2025-09-29 07:40:45 +02:00
Pavel Emelyanov
c029afc6d8 Merge 'test.py: dtest: port cfid_test.py' from Evgeniy Naydanov
As a part of the porting process remove unused markers.

Explicitly enable auto snapshots for the test, as they are required for it.

Enable the test in suite.yaml (run in dev mode only)

Closes scylladb/scylladb#26248

* github.com:scylladb/scylladb:
  test.py: dtest: make cfid_test.py run using test.py As a part of the porting process remove unused markers.
  test.py: dtest: copy unmodified cfid_test.py
2025-09-29 08:26:21 +03:00
Botond Dénes
9c85046f93 sstables,compaction: move compaction exceptions to compaction/
sstables/exceptions.hh still hosts some compaction specific exception
types. Move them over to the new compaction/exceptions.hh, to make the
compaction module more self-contained.
2025-09-29 06:49:14 +03:00
Botond Dénes
2b4a140610 replica: move querier code to replica namespace
The query namespace is used for symbols which span the coordinator and
replica, or that are mostly coordinator side. The querier is mainly in
this namespace due to its similar name, but this is a mistake which
confuses people. Now that the code was moved to replica/, also fix the
namespace to be namespace replica.
2025-09-29 06:44:52 +03:00
Botond Dénes
ee3d2f5b43 root,replica: mv querier to replica/
The querier object is a confusing one. Based on its name it should be in
the query/ module and it is already in the query namespace. But this is
actually a completely replica-side logic, implementing the caching of
the readers on the replica. Move it to the replica module to make this
more clear.
2025-09-29 06:33:53 +03:00
Michał Chojnowski
ff5add4287 sstables/trie: BTI-translate the entire partition key at once
Delaying the BTI encoding of partition keys is a good idea,
because most of the time they don't have to be encoded.
Usually the token alone is enough for indexing purposes.

But for the translation of the `partition_key` part itself,
there's no good reason to make it lazy,
especially after we made the translation of clustering keys
eager in a previous commit. Let's get rid of the `std::generator`
and convert all cells of the partition key in one go.
2025-09-29 04:10:40 +02:00
Michał Chojnowski
3703197c4c test/perf: add perf_bti_key_translation
Add a microbenchmark for translating keys to BTI encoding.
2025-09-29 04:08:00 +02:00
Avi Kivity
5b40d4d52b Merge 'root,replica: mv multishard_mutation_query -> replica/multishard_query' from Botond Dénes
The code in `multishard_mutation_query.cc` implements the replica-side of range scans and as such it belongs in the replica module. Take the opportunity to also rename it to `multishard_query`, the code implements both data and mutation queries for a long time now.

Code cleanup, no backport required.

Closes scylladb/scylladb#26279

* github.com:scylladb/scylladb:
  test/boost: rename multishard_mutation_query_test to multishard_query_test
  replica/multishard_query: move code into namespace replica
  replica/multishard_query.cc: update logger name
  docs/paged-queries.md: update references to readers
  root,replica: move multishard_mutation_query to replica/
2025-09-28 20:24:46 +03:00
Avi Kivity
5b6570be52 Merge 'db/config: Add SSTable compression options for user tables' from Nikos Dragazis
ScyllaDB offers the `compression` DDL property for configuring compression per user table (compression algorithm and chunk size). If not specified, the default compression algorithm is the LZ4Compressor with a 4KiB chunk size. The same default applies to system tables as well.

This series introduces a new configuration option to allow customizing the default for user tables. It also adds some tests for the new functionality.

Fixes #25195.

Closes scylladb/scylladb#26003

* github.com:scylladb/scylladb:
  test/cluster: Add tests for invalid SSTable compression options
  test/boost: Add tests for SSTable compression config options
  main: Validate SSTable compression options from config
  db/config: Add SSTable compression options for user tables
  db/config: Prepare compression_parameters for config system
  compressor: Validate presence of sstable_compression in parameters
  compressor: Add missing space in exception message
2025-09-28 20:23:23 +03:00
Artsiom Mishuta
eedd61f43f test.py: remove 'sudo' from resource_gather.py
The container now runs as root (4c1f4c419c), so sudo it's not needed
anymore

Closes scylladb/scylladb#26294
2025-09-28 16:51:19 +03:00
Szymon Malewski
6ce7843774 alternator: use expression caching
Before this patch, every expression in Alternator's requests was parsed from string to adequate structure.
This patch enables caching - all calls to parse an expression (all types) are proxied through the cache.
New expression is added to the cache, the least recently used entry (above cache size) is removed.
For existing entries the copy of the template is returned - individual instances still need to be resolved (placeholders substituted with names and values).
The cache is per shard - shared for all operations, expression types, tables, users.
Default cache size is 2000 entries per shard and it has configuration option `alternator_max_expression_cache_entries_per_shard` (0 means cache disabled).
Added Python tests are based on metrics.
2025-09-28 04:27:44 +02:00
Szymon Malewski
b75c6c9ef7 alternator: adds expression cache implementation
Every expression in Alternator's requests is parsed from string to adequate structure.
This patch implements a caching structure (input expression strings mapping to parsed 'template' structures), which will be used for handling requests in following commits.
If the reqested expression is valid (parsable) the cache will always return a value - if it is not already in the cache it will be created and stored.
The cache has limited (live configurable) size - when it is reached, the least recently used entry is removed.
The copy of the template in cache is returned - individual instances still need to be resolved (placeholders substituted with names and values).
Invalid requests will have no effect on the cache - the parser throws an exception.
Caching is implemented for all expression types. Internally it is based on helper structure `lru_string_map`.
Basic metrics (total count of hits and misses for each expression type and number of evictions) are implemented.
Metrics are used in boost unit tests.
2025-09-28 04:27:44 +02:00
Szymon Malewski
5332ceb24e utils: add lru_string_map
Adds a lru_string_map definition.
This structure maps a string keys to templated arguments, allowing efficient lookup and adding keys.
Each lookup (and adding) puts the keys on internal LRU list and the entires may be efficiently removed in a LRU order.
It will be a base for the expression cache in Alternator.
2025-09-28 04:06:00 +02:00
Szymon Malewski
be159acc03 alternator/expressions: fix single value condition expression parsing
Primitive conditions usually use operator with two or more values.
The only case of a "single value" condition is a function call -
DynamoDB does not accept other general values (i.e., attribute or value references).
In Alternator single general value was parsed as correct and only failed
later when the calculated value ended up to not be a boolean.
This works, but not when attribute or value actually is boolean.

What is more, when a parsed (but not resolved) expression is cached, this invalid expression could pollute cache.
This would be also the only case where the same string can be parsed both as a condition and a projection expression.

The issue is fixed by explicitly checking this case at primitive condition parsing.
Updated test confirms consistence between Alternator and DynamoDB.

Fixes #25855.
2025-09-28 04:06:00 +02:00
Szymon Malewski
7ed38155a3 test/alternator: use test_table_ss instead of test_table in expressions related tests.
This patch includes minor refactoring of expressions related tests (#22494) - use `test_table_ss` instead of `test_table`.
2025-09-28 04:06:00 +02:00
Botond Dénes
34cc7aafae tools/scylla-sstable: introduce the upgrade command
An offline, scylla-sstable variant of nodetool upgradesstables command.
Applies latest (or selected) sstable version and latest schema.

Closes scylladb/scylladb#26109
2025-09-27 16:53:14 +03:00
Avi Kivity
24b5d08731 Merge 'Remove table::for_all_partitions_slow()' from Pavel Emelyanov
This method was once implemented by calling table::for_all_partitions(), which was supposed to be non-slow version. Then callers of "non-slow" method were updated and the method itself was renamed into "_slow()" one. Nowadays only one test still uses it.

At the same time the method itself mostly consists of a boilerplate code that moves bits around to call lambda on the partitions read from reader. Open-coding the method into the calling test results in much shorter and simpler to follow code.

Code cleanup, no backport needed

Closes scylladb/scylladb#26283

* github.com:scylladb/scylladb:
  test: Fix indentation after previous patch
  test: Opencode for_all_partitions_slow()
  test: Coroutinize test_multiple_memtables_multiple_partitions inner lambda
  table: Move for_all_partitions_slow() to test
2025-09-27 16:26:18 +03:00
Karol Nowacki
27f6459766 vector_store_client: Add support for multiple URIs
The vector store client now supports a comma-separated list of URIs in
the `vector_store_primary_uri` configuration option.

It uses the vector store nodes from these URIs for load balancing and high
availability, querying the next node if the current one fails.
2025-09-27 09:04:45 +02:00
Karol Nowacki
a310cb4c64 vector_store_client: Remove methods used only in tests
The `vector_store_client::port()` and `vector_store_client::host()` methods
were only used in the test code.

Moreover, these tests are no longer needed, as the proper parsing of the
URI is already tested in other tests that perform requests to the
vector store server mock.
2025-09-27 08:47:00 +02:00
Piotr Dulikowski
39145ff1d0 Merge 'vector_store_client: Add support for load balancing' from Karol Nowacki
This change introduces a load balancing mechanism for the vector store client.
The client can now distribute requests across multiple vector store nodes.
The distribution mechanism performs random selection of nodes for each request.

References: VECTOR-187

No backport is needed as this is a new feature.

Closes scylladb/scylladb#26205

* github.com:scylladb/scylladb:
  vector_store_client: Add support for load balancing
  vector_store_client_test: Introduce vs_mock_server
  vector_store_client_test: Relocate to a dedicated directory
2025-09-26 18:55:14 +02:00
Petr Gusev
c1cc52c8c8 lwt: prohibit for tablet-based views and cdc logs
SELECT commands with SERIAL consistency level are historically allowed
for vnode-based views, even though they don't provide linearizability
guarantees. We prohibit LWTs for tablet-based views, but preserve old
behavior for vnode-based view for compatibility. Similar logic is
applied to CDC log tables.

Fixes scylladb/scylladb#26258
2025-09-26 17:06:58 +02:00
Szymon Wasik
ccfe80ab97 cql3: Update error messages to be in line with documentation.
ANN (aproximate nearest neighborhood) is just the name of the type
of algorithm used to perform vector search. For this reason the error
messages should refer to vector queries rather than ANN queries.
2025-09-26 17:01:10 +02:00
Petr Gusev
8adbb6c4dd tablets: disallow chains of colocated tables 2025-09-26 16:52:43 +02:00
Pavel Emelyanov
04a40b08f7 test: Fix indentation after previous patch
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-09-26 16:39:09 +03:00
Pavel Emelyanov
813619f939 test: Opencode for_all_partitions_slow()
The method is a large boilerplate that moves stuff around to do simple
thing -- read mutations from reader in a row and "check" them with a
lambda, optionally breaking the loop if lambda wants it.

The whole thing is much shorter if the caller kicks reader itsown.

One thing to note -- reader is not closed if something throws in
between, but that's test anyway, if something throws, test fails and not
closed reader is not a big deal.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-09-26 16:36:58 +03:00
Pavel Emelyanov
c1ebf987a9 test: Coroutinize test_multiple_memtables_multiple_partitions inner
lambda

The only place where it needs futures is to call the
for_all_partitions_slow() from a table

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-09-26 16:35:59 +03:00
Pavel Emelyanov
f3c57f7dd0 table: Move for_all_partitions_slow() to test
It's now only used by a single test, so move it there and remove from
public table API.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-09-26 16:33:25 +03:00
Karol Nowacki
a0e62ef8de vector_store_client: Add support for load balancing
This change introduces a load balancing mechanism for the vector store client.
The client can now distribute requests across multiple vector store nodes.
The distribution mechanism performs random selection of nodes for each request.
2025-09-26 13:44:28 +02:00
Karol Nowacki
ee90530c31 vector_store_client_test: Introduce vs_mock_server
Introduce the `vs_mock_server` test class, which is capable of counting
incoming requests. This will be used in subsequent tests to verify
load balancing logic.
2025-09-26 12:27:06 +02:00
Nikos Dragazis
8410532fa0 test/cluster: Add tests for invalid SSTable compression options
Complementary to the previous patch. It triggers semantic validation
checks in `compression_parameters::validate()` and expects the server to
exit. The tests examine both command line and YAML options.

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
2025-09-26 12:02:42 +03:00
Nikos Dragazis
6ba0fa20ee test/boost: Add tests for SSTable compression config options
Since patch 03461d6a54, all boost unit tests depending on `cql_test_env`
are compiled into a single executable (`combined_tests`). Add the new
test in there.

Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
2025-09-26 12:02:42 +03:00
Botond Dénes
52c05d89aa test/boost: rename multishard_mutation_query_test to multishard_query_test 2025-09-26 11:15:38 +03:00
Botond Dénes
fb16c0a6d4 root,replica: move multishard_mutation_query to replica/
It belongs there, it is a completely replica-side thing. Also take the
opportunity to rename it to multishard_query.{hh,cc}, it is not just
mutation anymore (data query is also implemented).
2025-09-26 11:15:38 +03:00
Avi Kivity
0f4363cc8d Merge 'sstable: add more complete schema to scylla component' from Botond Dénes
Sstables store a basic schema in the statistics component. The scylla-sstable tool uses this to be able to read and dump sstables in a self-contained manner, without requiring an external schema source.
The problem is that the schema stored int he statistics component is incomplete: it doesn't store column names for key columns, so these have placeholder names in dump outputs where column names are visible.
This is not a disaster but it is confusing and it can cause errors in scripts which want to check the content of sstables, while also knowing the schema and expecting the proper names for key columns.

To make sstables truly self-contained w.r.t. the schema, add a complete schema to the scylla component. This schema contains the names and types of all columns, as well as some basic information about the schema: keyspace name, table name, id and version.
When available, scylla-sstable's schema loader will use this new more complete schema and fall-back to the old method of loading the (incomplete) schema from the statistics component otherwise.

New feature, no backport required.

Closes scylladb/scylladb#24187

* github.com:scylladb/scylladb:
  test/boost/schema_loader_test: add specific test with interesting types
  test/lib/random_schema: add random_schema(schema_ptr) constructor
  test/boost/schema_loader_test: test_load_schema_from_sstable: add fall-back test
  tools/schema_loader: add support for loading from scylla-metadata
  tools/schema_loader: extract code which load schema from statistics
  sstables: scylla_metadata: add schema member
2025-09-26 00:21:17 +03:00
Artsiom Mishuta
f23d19e248 test.py: fix dumping big logs to output
1. Remove dumping cluster logs and print only the link to the log.
2. Fail the test (to fail CI and not ignore the problem) and mark the cluster as dirty (to avoid affecting subsequent tests) in case setup/teardown fails.
3. Add 2 cqlpy tests that fail after applying step 2 to the dirties_cluster list so the cluster is discarded afterward.

Closes scylladb/scylladb#26183
2025-09-25 22:36:46 +03:00
Pavel Emelyanov
78d32c52f2 test: Use map_reduce0 in sstable_directory_test.cc (and coroutinize)
There's a code that tries to accumulate some counter across a sharded
service by hand. Using map_reduce0() looks nicer and avoids the smp-safe
atomic counter.

Also -- coroutinuze the thing while at it

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#26259
2025-09-25 21:22:12 +03:00
Michael Litvak
3c3dd4cf9d auth: add query_state parameter to query functions
add a query_state parameter to several auth functions that execute
internal queries. currently the queries use the
internal_distributed_query_state() query state, and we maintain this as
default, but we want also to be able to pass a query state from the
caller.

in particular, the auth queries currently use a timeout of 5 seconds,
and we will want to set a different timeout when executed in some
different context.
2025-09-25 16:46:50 +02:00
Pavel Emelyanov
6670090581 Merge 'compaction: move code to namespace compaction' from Botond Dénes
The namespace usage in this directory is very inconsistent, with files and classes scattered in:
* global namespace
* namespace compaction
* namespace sstables

With cases, where all three used in the same file. This code used to live in sstables/ and some of it still retains namespace sstables as a heritage of that time. The mismatch between the dir (future module) and the namespace used is confusing, so finish the migration and move all code in compaction/ to namespace compaction too.

This patch, although large, is mechanic and only the following kind of changes are made:
* replace namespace sstable {} with namespace compaction {}
* add namespace compaction {}
* drop/add sstables::
* drop/add compaction::
* move around forward-declarations so they are in the correct namespace context

This refactoring revealed some awkward leftover coupling between sstables and compaction, in sstables/sstable_set.cc, where the make_sstable_set() methods of compaction strategies are implemented.

Code cleanup, no backport.

Closes scylladb/scylladb#26214

* github.com:scylladb/scylladb:
  compaction: remove using namespace {compaction,sstables}
  compaction: move code to namespace compaction
2025-09-25 15:10:11 +03:00
Karol Nowacki
c4e13959ab vector_store_client_test: Relocate to a dedicated directory
The `vector_store_client_test` is moved from `test/boost` to a new
`test/vector_search` directory.

This change prepares a dedicated location for all upcoming tests
related to the vector search feature.
2025-09-25 14:04:28 +02:00
Botond Dénes
1999d8e3d3 compaction: remove using namespace {compaction,sstables}
Some files in compaction/ have using namespace {compaction,sstables}
clauses, some even in headers. This is considered bad practice and
muddies the namespace use. Remove them.
2025-09-25 15:03:57 +03:00
Botond Dénes
86ed627fc4 compaction: move code to namespace compaction
The namespace usage in this directory is very inconsistent, with files
and classes scattered in:
* global namespace
* namespace compaction
* namespace sstables

With cases, where all three used in the same file. This code used to
live in sstables/ and some of it still retains namespace sstables as a
heritage of that time. The mismatch between the dir (future module) and
the namespace used is confusing, so finish the migration and move all
code in compaction/ to namespace compaction too.

This patch, although large, is mechanic and only the following kind of
changes are made:
* replace namespace sstable {} with namespace compaction {}
* add namespace compaction {}
* drop/add sstables::
* drop/add compaction::
* move around forward-declarations so they are in the correct namespace
  context

This refactoring revealed some awkward leftover coupling between
sstables and compaction, in sstables/sstable_set.cc, where the
make_sstable_set() methods of compaction strategies are implemented.
2025-09-25 15:03:56 +03:00
Pavel Emelyanov
b30c8a1f25 test: Add validation of data returned by /storage_service endpoints
The test compares the ranges that are returned from /describe_ring and
/range_to_endpoint_map with the information obtained from system.tablets
and makes sure that

- the number of ranges
- the boundary tokens
- the target replicas (nodes only)

match.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-09-25 14:53:22 +03:00
Nadav Har'El
265660d22f test/cluster: greatly speed up test_localnodes_joining_nodes
The test cluster/test_alternator::test_localnodes_joining_nodes takes
a whopping 2 minutes and 9 seconds to run before this patch. After this
patch, it takes just 7 seconds.

The slowness of this test was caused by booting a second node that hangs
during boot for 2 minutes, deliberately. We never intended for this boot
to finish (the whole point of this test is to run before it finishes),
but unfortunately had to wait for it to avoid all sort of nasty problems
with unwaited futures.

As comments already explained in the code, the solution to this problem
is to *kill* the server at the end of the test - after we kill it, we
can wait for it - this wait will very quickly notice that the server
addition failed, and not need to wait 2 minutes. But until the previous
patch, we had no API to find the server which is starting (not yet
running), or to kill it. After the previous patch, we do have such an
API, and can now use it, and see this test finish in 7 seconds instead
of 2 minutes and 9 seconds.
2025-09-25 14:00:16 +03:00
Nadav Har'El
aa8d6e9e74 test/pylib: add the ability to stop currently-starting servers
Some tests need the ability to abruptly stop a server in the test
cluster before it fully booted - e.g., because the test knows (and
perhaps even expects) that the boot is hung. But before this patch,
manager.server_stop() could only kill servers in "running" state.

This patch adds to pylib tracking of "starting" servers - servers which
we are starting but haven't finished booting - their list can be
returned by the manager.starting_servers(). The manage.server_stop
function can now kill a server which is just starting - not just
"running" servers.

To avoid breaking existing tests, manager.all_servers() continues to
return just running and stopped servers - not "starting" servers.
By the way, when a starting server is killed, it is not listed as stopped -
it just behaves as a normal failure to add the server, and not as a
server which successfully joined the cluster but was later stopped.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2025-09-25 14:00:16 +03:00
Łukasz Paszkowski
62e27e0f77 test_out_of_space_prevention.py: Fix flaky test_node_restart_while_tablet_split test
The test starts a 3-node cluster and immediately creates a big file
on the first nodes in order to trigger the out of space prevention to
disable compaction, including the SPLIT compaction.

In order to trigger a SPLIT compaction, a keyspace with 1 initial tablet
is created followed by alter statement with `tablets = {'min_tablet_count': 2}`.
This triggers a resize decision that should not finalize due to
disabled compaction on the first node.

The test is flaky because, the keyspace is created with RF=1 and there
is no guarantee that the tablet replica will be located on the first node
with critical disk utilization. If that is not the case, the split
is finalized and the test fails, because it expect the split to be
blocked.

Change to RF=3. This ensures there is exactly one tablet replica on
each node, including the one with critical disk utilization. So SPLIT
is blocked until the disk utilization on the first node, drops below
the critical level.

Fixes: https://github.com/scylladb/scylladb/issues/25861

Closes scylladb/scylladb#26225
2025-09-25 11:54:48 +03:00
Nadav Har'El
f65998cd39 alternator: improve error handling of incorrect ARN
Before this patch, if an ARN that is passed to Alternator requests
like TagResource is well-formatted but points to non-existent table,
Alternator returns the unhelpful error:

  (AccessDeniedException) when calling the TagResource operation:
  Incorrect resource identifier

This patch modifies this error to be:

  (ResourceNotFoundException) when calling the TagResource operation:
  ResourceArn 'arn:scylla:alternator:alternator_alternator_Test_
  1758532308880:scylla:table/alternator_Test_1758532308880x' not found

This is the same error type (ResourceNotFoundException) that DynamoDB
returns in that case - and a more helpful error message.

This patch also includes a regression test that checks the error
type in this case. The new test fails on Alternator before this
patch, and passes afterwards (and also passes on DyanamoDB).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#26179
2025-09-25 11:51:17 +03:00
Botond Dénes
80bc1af05a test/boost/schema_loader_test: add specific test with interesting types
Although the existing random schema test should cover all types, it is
good to have targeted tests for more interesting types.
2025-09-25 11:28:35 +03:00