Commit Graph

11801 Commits

Author SHA1 Message Date
Botond Dénes
f10af4b5eb test/lib/random_schema: add random_schema(schema_ptr) constructor
Allow using the convenient random data generation facilities with any
schema.
2025-09-25 11:28:34 +03:00
Botond Dénes
4c9da11bfb test/boost/schema_loader_test: test_load_schema_from_sstable: add fall-back test
The test now tests loading the schema from the scylla component by
default. Force testing the fall-back (read schema from statistics) by
deleting the Scylla.db component.
Also improve the test by comparing the column names and types, to check
that when loaded from the scylla component, the key names are also
correct.
2025-09-25 11:28:34 +03:00
Botond Dénes
50038ef2cc Merge 'alternator: update references to alternator streams issue' from Michael Litvak
Update all references to the issue of tablet support for
alternator streams to point to issue https://github.com/scylladb/scylladb/issues/23838 instead of https://github.com/scylladb/scylladb/issues/16317.

The issue https://github.com/scylladb/scylladb/issues/16317 is about support of CDC with tablets, but it is now
closed and it didn't address alternator streams. The remaining issues
about alternator streams should be addressed as part of https://github.com/scylladb/scylladb/issues/23838, so fix
the references so that they are not missed.

Backport is not needed.

Closes scylladb/scylladb#26178

* github.com:scylladb/scylladb:
  test/cqlpy/test_permissions: unskip test for tablets
  alternator: update references to alternator streams issue
2025-09-25 11:05:52 +03:00
Botond Dénes
f7fd12c2f5 Merge 'test: fix test_one_big_mutation_corrupted_on_startup' from Cezar Moise
The commitlog in the tests with big mutations was corrupted by overwriting 10 chunks of 1KB with random data, which might not be enough due to randomness and the big size of the commitlog (~65MB).

- change `corrupt_file` to overwrite a portion of the file based on a percentage of its size instead of a fixed number of chunks
- fix typos
- cleanup comments for clarity
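
A minimal sketch of size-proportional corruption, for illustration only. The signature `corrupt_file(path, percentage, chunk_size)` and its defaults are assumptions; the helper in the test suite may look different.

```python
import os
import random


def corrupt_file(path, percentage=5.0, chunk_size=1024, rng=None):
    """Overwrite roughly `percentage` percent of the file with random bytes.

    Hypothetical sketch: the number of corrupted chunks scales with the
    file size instead of being a fixed count. Returns the chunk count.
    """
    rng = rng or random.Random()
    size = os.path.getsize(path)
    if size == 0:
        return 0
    # Scale the number of chunks with the file size; corrupt at least one.
    chunks = max(1, int(size * percentage / 100) // chunk_size)
    with open(path, "r+b") as f:
        for _ in range(chunks):
            # Pick an offset that keeps the write inside the file.
            offset = rng.randrange(0, max(1, size - chunk_size + 1))
            f.seek(offset)
            f.write(os.urandom(min(chunk_size, size - offset)))
    return chunks
```

With a ~65MB file and 5%, this overwrites a few thousand 1KB chunks rather than 10, so a random offset is far more likely to hit live data. Offsets may overlap, so the corrupted share is approximate.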

Closes: #25627

Closes scylladb/scylladb#25979

* github.com:scylladb/scylladb:
  test: cleanup big mutation commitlog tests
  test: fix test_one_big_mutation_corrupted_on_startup
2025-09-25 11:05:52 +03:00
Pavel Emelyanov
8f815de1e0 Merge 'treewide: move away from accessing httpd::request::query_parameters' from Botond Dénes
Accessing this member directly is deprecated; migrate code to use {get,set}_query_param() and friends instead.

Fixes: https://github.com/scylladb/scylladb/issues/26023

Preparation for seastar update, no backport required.

Closes scylladb/scylladb#26024

* github.com:scylladb/scylladb:
  treewide: move away from accessing httpd::request::query_parameters
  test/pylib/s3_server_mock.py: better handle empty query params
2025-09-25 11:05:50 +03:00
Łukasz Paszkowski
5f6df4eb97 test/storage: Properly mount/clear volumes
Due to a missing functionality in PythonTest, `unshare` is never used
to mount volumes. As a consequence:
+ volumes are created with sudo which is undesired
+ they are not cleared automatically

Even with the missing support in place, mounting volumes with `unshare`
would not work, because the HTTP server, the pool of clusters, and the
scylla cluster manager are started outside of the new namespace.
The cluster would therefore have no access to volumes created with `unshare`.

The new approach, which works with and without dbuild and does not require
sudo, uses the following three commands to mount a volume:

truncate -s 100M /tmp/mydevice.img
mkfs.ext4 /tmp/mydevice.img
fuse2fs /tmp/mydevice.img test/

Additionally, a proper cleanup is performed, i.e. servers are stopped
gracefully and volumes are unmounted after the tests using them are
completed.

Fixes: https://github.com/scylladb/scylladb/issues/25906

Closes scylladb/scylladb#26065
2025-09-25 11:05:50 +03:00
Evgeniy Naydanov
eea166c809 test.py: dtest: make cfid_test.py run using test.py
As a part of the porting process remove unused markers.

Explicitly enable auto snapshots for the test, as they are required for it.

Enable the test in suite.yaml (run in dev mode only)
2025-09-25 11:04:00 +03:00
Evgeniy Naydanov
18723b41cf test.py: dtest: copy unmodified cfid_test.py 2025-09-25 10:33:18 +03:00
Łukasz Paszkowski
29de947851 test_out_of_space_prevention.py: Fix flaky test_user_writes_rejection test
The test starts a 3-node cluster and immediately creates a big file
on one of the nodes, to trigger the out of space prevention to start
rejecting writes on this node. Then a write is executed, and the test
checks that it did not reach the node with critical disk utilization but
did reach the remaining nodes (it should, since RF=3 is set).

However, when not specified, a default LOCAL_ONE consistency level
is used. This means that only one node is required to acknowledge the
write.

After the write, the test checks if the write
+ did NOT reach the node with critical disk utilization (works)
+ did reach the remaining nodes

This can cause the test to fail sporadically as the write might not
yet be on the last node.

Use CL=QUORUM instead.
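
The arithmetic behind this choice can be sketched as follows. The function name is hypothetical, and the sketch assumes a single datacenter, so LOCAL_ONE and ONE behave alike.

```python
def required_acks(consistency_level, replication_factor):
    """Replica acknowledgements needed before a write is reported as successful.

    Simplified single-datacenter model, for illustration only.
    """
    if consistency_level in ("ONE", "LOCAL_ONE"):
        return 1
    if consistency_level == "QUORUM":
        return replication_factor // 2 + 1
    if consistency_level == "ALL":
        return replication_factor
    raise ValueError(f"unsupported consistency level: {consistency_level}")
```

With RF=3, LOCAL_ONE returns after a single acknowledgement, so the second healthy node may not have the write yet. QUORUM needs two acknowledgements, and since the full node rejects the write, those two can only come from the two remaining nodes.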

Fixes: https://github.com/scylladb/scylladb/issues/26004

Closes scylladb/scylladb#26030
2025-09-25 08:05:45 +03:00
Piotr Dulikowski
5d5244abaf Merge 'vector_store_client: Add support for multiple IPs in DNS responses' from Karol Nowacki
vector_store_client: Add support for multiple IPs in DNS responses

The DNS resolution logic now processes all IP addresses returned in a DNS
response, not just the primary one.

The client will iterate through the list of resolved IPs, attempting to
query the next one if a request fails. This improves high availability
by allowing the client to query other available nodes if one is down.
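
The retry behavior described above can be sketched roughly like this. `query_with_failover` and the `send_request` callable are hypothetical names used for illustration; they are not the `vector_store_client` API.

```python
def query_with_failover(addresses, send_request):
    """Try each resolved address in order and return the first successful response.

    `send_request` is a caller-supplied callable (an assumption of this
    sketch) that raises OSError when a node cannot be reached.
    """
    last_error = None
    for address in addresses:
        try:
            return send_request(address)
        except OSError as exc:
            # This node is unavailable; fall through to the next address.
            last_error = exc
    raise ConnectionError(f"all {len(addresses)} addresses failed") from last_error
```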

References: VECTOR-187

As this is a new feature no backport is needed.

Closes scylladb/scylladb#26055

* github.com:scylladb/scylladb:
  vector_store_client: Rename HTTP_REQUEST_RETRIES to ANN_RETRIES
  vector_store_client: Format with clang-format
  vector_store_client: Add support for multiple IPs in DNS responses
  vector_store_client_test: Extract `make_vs_server` helper function
  vector_store_client_test: Ensure cleanup on exception
  vector_store_client_test: Fix unreliable unavailable port tests
2025-09-24 16:24:19 +02:00
Ferenc Szili
c6c9c316a7 load_balancer: fix std::out_of_bounds when decommissioning with empty nodes
Consider the following:

The tablet load balancer is working on:

- node1: an empty node (no tablets) with a large disk capacity
- node2: an empty node (no tablets) with a lower disk capacity than node1
- node3: is being decommissioned and contains tablet replicas

In load_balancer::make_internode_plan() the initial destination
node/shard is selected like this:

// Pick best target shard.
auto dst = global_shard_id {target, _load_sketch->get_least_loaded_shard(target)};

load_sketch::get_least_loaded_shard(host_id) calls ensure_node() which
adds the host to load_sketch's internal hash maps in case the node was
not yet seen by load_sketch.

Let's assume dst is a shard on node1.

Later in load_balancer::make_internode_plan() we will call
pick_candidate() to try to find a better destination node than the
initial one:

// May choose a different source shard than src.shard or different destination host/shard than dst.
auto candidate = co_await pick_candidate(nodes, src_node_info, target_info, src, dst, nodes_by_load_dst,
                                            drain_skipped);
auto source_tablets = candidate.tablets;
src = candidate.src;
dst = candidate.dst;

If pick_candidate() selects some other empty destination node (due to
larger capacity: node1), and that node has not yet been seen by
load_sketch (because it was empty), a subsequent call to
load_sketch::pick() will search for the node using
std::unordered_map::at(), and because the node is not found it will
throw a std::out_of_range exception, crashing the load balancer.

This problem is fixed by changing load_sketch::populate() to initialize
its internal maps with all the nodes which populate()'s arguments
filter for.
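
A toy Python model of the failure mode and the fix, for illustration only. The class and method names mirror the description above, but the code is an assumption and not the real load_sketch; a Python dict lookup raising KeyError stands in for std::unordered_map::at().

```python
class LoadSketch:
    """Toy model of per-node shard load tracking (illustrative names)."""

    def __init__(self, shards_per_node):
        self._shards_per_node = shards_per_node
        self._load = {}  # node -> list of per-shard tablet counts

    def populate(self, nodes, replicas):
        # The fix: register every node up front, so empty nodes are known too.
        for node in nodes:
            self._load[node] = [0] * self._shards_per_node
        # Replicas are (node, shard) pairs located on the listed nodes.
        for node, shard in replicas:
            self._load[node][shard] += 1

    def pick(self, node):
        # Plain indexing raises KeyError for an unknown node.
        loads = self._load[node]
        shard = loads.index(min(loads))
        loads[shard] += 1
        return shard
```

If populate() only registered nodes that appear in `replicas`, pick() on an empty node would raise, which is the analogue of the crash described above.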

Fixes: #26203

Closes scylladb/scylladb#26207
2025-09-24 15:27:19 +02:00
Pavel Emelyanov
b85673e9b0 test,lib: Add range_to_endpoint_map() method to rest client
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-09-24 15:44:57 +03:00
Alex Dathskovsky
5e89a78c8f raft: refactor can_vote logic and type
This PR refactors the can_vote function in the Raft algorithms for improved clarity and maintainability by providing safer strong boolean types to the raft algorithm.

Fixes: #21937

Backport: No backport required

Closes scylladb/scylladb#25787
2025-09-24 13:55:05 +02:00
Botond Dénes
761a32927e Merge 'scrub: Handle malformed_sstable_exception in scrub skip mode' from Taras Veretilnyk
This PR improves the handling of malformed SSTables during scrub and adds tests to validate the updated behavior.

When scrub is used, there is an increased chance of encountering malformed SSTables. These should not be retried as in regular compaction. Instead, they must be handled according to the selected scrub mode: in skip mode, when a malformed_sstable_exception occurs, the invalid data or the whole SSTable should be removed; in abort and segregate modes, the scrub process should abort.

Previously, all modes treated malformed_sstable_exception the same way, causing scrub to abort even when skip mode was selected. This PR updates the scrub logic to properly handle malformed SSTable exceptions based on the selected mode.

Unit tests are added to verify the intended behavior.

Fixes scylladb/scylladb#19059

Backport is not required, it is an improvement

Closes scylladb/scylladb#25828

* github.com:scylladb/scylladb:
  sstable_compaction_test: add scrub tests for malformed SSTables
  scrub: skip sstable on malformed sstable exception in skip mode
2025-09-24 14:28:43 +03:00
Ernest Zaslavsky
5ba5aec1f8 treewide: Move mutation related files to a mutation directory
As requested in #22104, moved the files and fixed other includes and build system.

Moved files:
 - combine.hh
 - collection_mutation.hh
 - collection_mutation.cc
 - converting_mutation_partition_applier.hh
 - converting_mutation_partition_applier.cc
 - counters.hh
 - counters.cc
 - timestamp.hh

Fixes: #22104

This is a cleanup, no need to backport

Closes scylladb/scylladb#25085
2025-09-24 13:23:38 +03:00
Botond Dénes
1ac7b4c35e treewide: move away from accessing httpd::request::query_parameters
Accessing this member directly is deprecated; migrate code to use
{get,set}_query_param() and friends instead.

Fixes: https://github.com/scylladb/scylladb/issues/26023
2025-09-24 11:52:15 +03:00
Botond Dénes
5891aeb1fb test/pylib/s3_server_mock.py: better handle empty query params
Instead of re-inventing empty param handling, use the built-in
keep_blank_values=True param of urllib.parse.parse_qs().
This correctly handles the case where the `=` is present but no value
follows, which is the syntax used by the new query_params in
seastar::http::request.
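
A short illustration of the standard-library behavior this relies on. The helper name `parse_query` and the sample parameter names are assumptions made for the example, not code from the mock server.

```python
from urllib.parse import parse_qs


def parse_query(query):
    """Parse a query string, keeping parameters that have an empty value."""
    parsed = parse_qs(query, keep_blank_values=True)
    # parse_qs returns a list per key; keep the first value for simplicity.
    return {key: values[0] for key, values in parsed.items()}


# By default, parse_qs silently drops parameters whose value is blank.
default_result = parse_qs("uploads=&partNumber=1")
kept_result = parse_query("uploads=&partNumber=1")
```

Here `default_result` contains only `partNumber`, while `kept_result` also contains `uploads` mapped to an empty string.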

Also add an exception to build_POST_response(). Better than a cryptic
message about encode() not callable on NoneType.
2025-09-24 11:52:15 +03:00
Karol Nowacki
706eeee1bd vector_store_client: Rename HTTP_REQUEST_RETRIES to ANN_RETRIES
Rename `HTTP_REQUEST_RETRIES` to `ANN_RETRIES` in `vector_store_client`,
as it now applies to all vector store nodes, not just HTTP requests.

Also, remove an unused test setter function.
2025-09-24 10:51:43 +02:00
Karol Nowacki
57d1b601a8 vector_store_client: Add support for multiple IPs in DNS responses
The DNS resolution logic now processes all IP addresses returned in a DNS
response, not just the primary one.

The client will iterate through the list of resolved IPs, attempting to
query the next one if a request fails. This improves high availability
by allowing the client to query other available nodes if one is down.
2025-09-24 10:41:37 +02:00
Karol Nowacki
cc616252a4 vector_store_client_test: Extract make_vs_server helper function
The `make_vs_server` function is refactored into a standalone helper
to allow its reuse in upcoming test cases.
2025-09-24 10:41:37 +02:00
Karol Nowacki
6da598fa4a vector_store_client_test: Ensure cleanup on exception
Move the mock/test server shutdown into a `finally()` block to
guarantee cleanup even if the test case throws an exception.
2025-09-24 10:41:37 +02:00
Karol Nowacki
381586f1b8 vector_store_client_test: Fix unreliable unavailable port tests
The `generate_unavailable_localhost_port` function is not robust because it
can suffer from a race condition. It finds an available port but does not
keep it occupied, meaning another process could bind to it before the test
can use it.

The `unavailable_server` helper is a more robust solution. It creates a
server that listens on a port for its entire lifetime and immediately
closes any incoming connections. This guarantees the port remains
unavailable, making the test more reliable.
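
The idea can be sketched in Python as follows. This is an illustrative analogue, not the C++ `unavailable_server` helper; the class name, the polling timeout, and the threading approach are all assumptions.

```python
import socket
import threading


class UnavailableServer:
    """Holds a port for its whole lifetime and closes every connection at once."""

    def __init__(self, host="127.0.0.1"):
        self._sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
        self._sock.bind((host, 0))  # port 0: the OS picks a free port
        self._sock.listen()
        self._sock.settimeout(0.1)  # lets the accept loop notice the stop flag
        self.port = self._sock.getsockname()[1]
        self._stop = threading.Event()
        self._thread = threading.Thread(target=self._serve, daemon=True)
        self._thread.start()

    def _serve(self):
        while not self._stop.is_set():
            try:
                conn, _ = self._sock.accept()
            except socket.timeout:
                continue
            except OSError:
                return
            conn.close()  # drop the client immediately

    def close(self):
        self._stop.set()
        self._thread.join()
        self._sock.close()
```

Because the socket stays bound until close() is called, no other process can take the port in the middle of a test, which is what the find-a-free-port approach could not guarantee.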
2025-09-24 10:23:24 +02:00
Piotr Dulikowski
bfb8e807be Merge 'streaming/stream_blob: generate view updates from staging sstables' from Michał Jadwiszczak
After https://github.com/scylladb/scylladb/pull/22034, staging status of sstables streamed
via file streaming was ignored and view updates were never generated.

This patch fixes it and now staging sstables are registered to
`view_building_worker`. Then, the worker creates view building tasks
for those sstables, so the view building coordinator can schedule them
once the tablet migration is finished.

Fixes https://github.com/scylladb/scylla-enterprise/issues/4572

This fix affects only views on tablets, which are still experimental, so no backport needed.

Closes scylladb/scylladb#25776

* github.com:scylladb/scylladb:
  test/test_view_building_coordinator: add reproducer for file streaming
  streaming/stream_blob: register staging sstables to process them
2025-09-24 09:15:33 +02:00
Wojciech Mitros
eb92f50413 hinted_handoff: drain hints after the target node stops owning tokens
When a node is being replaced, it enters a "left" state while still
owning tokens. Before this patch, this is also the time when we start
draining hints targeted to this node, so the hints may get sent before
the token ownership gets migrated to another replica, and these hints
may get lost.
In this patch we postpone the hint draining for the "left" nodes to
the time when we know that the target nodes no longer hold ownership
of any tokens - so they're no longer referenced in topology. I'm
calling such nodes "released".

Before this change, when we were starting draining hints, we knew the
IP addresses of the target nodes. We lose this information after entering
"left" stage, so when draining hints after a node is "released", we
can't drain the hints targeted to a specific IP instead of host_id.
We may have hints targeted to IPs if the migration from IP-based to
host_ID-based hints didn't happen yet. The migration happens
when enabling a cluster feature since 2024.2.0, so such hints can
only exist if we perform a direct upgrade from a version before
2024.2.0 to a version that has this change (2025.4.0+).
To avoid losing hints completely when such an upgrade is combined
with a node removal/replacement, we still drain hints when the node
enters a "left" state and the migration of hints to host_id wasn't
performed yet. For these drains, the problematic scenario can't
occur because it only affects tablets, and when upgrading from
a version before 2024.2.0, no tablets can exist yet.
If we perform such a drain, we no longer need to drain hints when
entering the "released" state, so we only drain when entering that
state if the migration was already completed.
With this setup, we'll always drain hints at least once when a node
is leaving. However, if the migration to host_ids finishes between
entering the "left" state and the "released" state, we'll attempt
to drain the hints twice. This shouldn't be a problem though because
each `drain_for()` is performed with the `_drain_lock` and after
a `hint_endpoint_manager` is drained, it's removed, so we won't try
to drain it twice.
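
The drain rule described above can be condensed into a small decision function. This is an illustrative sketch with hypothetical names, not the hinted handoff code.

```python
def should_drain_hints(node_state, host_id_migration_done):
    """Decide whether entering `node_state` should trigger a hint drain.

    - "left": drain only if hints have not yet been migrated to host IDs
      (the IP address is still known at this point).
    - "released": drain only if the migration has already completed.
    """
    if node_state == "left":
        return not host_id_migration_done
    if node_state == "released":
        return host_id_migration_done
    return False
```

If the migration completes between the two state changes, both calls return True, which is the double drain that the `_drain_lock` and the removal of the drained manager make harmless.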

This patch also includes a test for verifying that hints are properly
replayed after a node replace.

Fixes https://github.com/scylladb/scylladb/issues/24980

Closes scylladb/scylladb#24981
2025-09-24 07:11:59 +02:00
Botond Dénes
7c6fb131f3 Merge 'compaction: ensure that all compaction executors are stopped' from Aleksandra Martyniuk
Currently, while stopping the compaction_manager, we stop task_manager
compaction module and concurrently run compaction_manager::really_do_stop.
really_do_stop stops and waits for all task_executors that are kept
in compaction_manager::_tasks, but nothing ensures that no more tasks will
be added there. Due to leftover tasks, we trigger on_fatal_internal_error.

Modify the order of compaction_manager::stop. After the change, we stop
compaction tasks in the following order:
- abort module abort source;
- close module gate in the background;
- stop_ongoing_compactions (kept in compaction_manager::_tasks);
- wait until module gate is closed.

Check module abort source before creating compaction executor and
adding it to _tasks.

Thanks to the above, we can be sure that:
- after module::stop there will be no tasks in _tasks;
- compaction_manager::stop aborts all tasks; we don't wait for any whole
  compaction to finish.

Fixes: https://github.com/scylladb/scylladb/issues/25806.

Fixes a shutdown bug; needs backports to all versions

Closes scylladb/scylladb#25885

* github.com:scylladb/scylladb:
  compaction: move _tasks check
  compaction: stop compaction module in really_do_stop
2025-09-24 06:49:52 +03:00
Lakshmi Narayanan Sreethar
82c95699ea types/comparable_bytes: add compatability test data for DateTpe
Byte comparable encoding for DateType was introduced in bf90018b8e. This
PR updates the compatibility test data to include the type in the test
coverage.

Refs #19407

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>

Closes scylladb/scylladb#26208
2025-09-24 06:42:24 +03:00
Aleksandra Martyniuk
48bbe09c8b test: fix test_two_tablets_concurrent_repair_and_migration_repair_writer_level
test_two_tablets_concurrent_repair_and_migration_repair_writer_level waits
for the first node that logs info about repair_writer using asyncio.wait.
The done group is never awaited, so we never learn about the error.

The test itself is incorrect and the log about repair_writer is never
printed. We never learn about that, and the test finishes successfully
after a 10-minute timeout.

Fix the test:
- disable hinted handoff;
- repair tablets of the whole table:
  - new table is added so that concurrent migration is possible;
- use wait_for_first_completed that awaits done group;
- do some cleanups.
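
The difference between a bare asyncio.wait() and a helper that awaits the done group can be sketched as below. The helper shares its name with the one mentioned above, but this body is an assumption and may differ from the one in the test library.

```python
import asyncio


async def wait_for_first_completed(coros):
    """Return the result of the first coroutine to finish, re-raising its error."""
    tasks = [asyncio.ensure_future(coro) for coro in coros]
    done, pending = await asyncio.wait(tasks, return_when=asyncio.FIRST_COMPLETED)
    # Cancel the losers and wait for the cancellations to settle.
    for task in pending:
        task.cancel()
    await asyncio.gather(*pending, return_exceptions=True)
    # asyncio.wait() never raises on behalf of its tasks; calling result()
    # is what surfaces an exception stored in a finished task.
    results = [task.result() for task in done]
    return results[0]
```

Without the `result()` call, a task that failed (for example with a timeout while waiting for a log line) would sit in `done` unnoticed, and the test would carry on as if the wait had succeeded.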

Remove nightly mark.

Fixes: #26148.

Closes scylladb/scylladb#26209
2025-09-24 06:40:45 +03:00
Avi Kivity
2239474a87 Merge 'tablets: scheduler: Balance racks separately when rf_rack_valid_keyspaces is true' from Tomasz Grabiec
Greatly improves performance of plan making, because we don't consider
candidates in other racks, most of which will fail to be selected due
to replication constraints (no rack overload). Also (but minor)
reduces the overhead of candidate evaluation, as we don't have to
evaluate rack load.

Enabled only for rf_rack_valid_keyspaces because such setups guarantee
that we will not need (because we must not) move tablets across racks,
and we don't need to execute the general algorithm for the whole DC.

Tested with perf-load-balancing, which performs a single scale-out
operation on a cluster which initially has 10 nodes 88 shards each, 2
racks, RF=2, 70 tables, 256 tablets per table. Scale out adds 6 new
nodes (same shard count). Time to rebalance the cluster (plan making
only, sum of all iterations, no streaming):

Before:  16 min 25 s
After:    0 min 25 s

Before, plan making cost (single incremental iteration) alternated
between fast (0.1 [s]) and slow (14.1 [s]):

  testlog - Rebalance iteration 7 took 14.156 [s]: mig=88, bad=88, first_bad=17741, eval=93874484, skiplist=0, skip: (load=0, rack=17653, node=0)
  testlog - Rebalance iteration 8 took 0.143 [s]: mig=88, bad=88, first_bad=88, eval=865407, skiplist=0, skip: (load=0, rack=0, node=0)

The slow run chose min and max nodes in different racks, hence the
fast path failed to find any candidates and we switched to exhaustive
search of candidates in other nodes.

After, all iterations are fast (0.1 [s] per rack, 0.2 [s] per plan-making). The plan is twice as large because it combines the output of two subsequent (pre-patch) plan-making calls.

Fixes #26016

Closes scylladb/scylladb#26017

* github.com:scylladb/scylladb:
  test: perf: perf-load-balancing: Add parallel-scaleout scenario
  test: perf: perf-load-balancing: Convert to tool_app_template
  tablets: scheduler: Balance racks separately when rf_rack_valid_keyspaces is true
2025-09-23 22:45:35 +03:00
Tomasz Grabiec
981592bca5 tablet: scheduler: Do not emit conflicting migrations in the plan
Plan-making is invoked independently for different DCs (and in the
future, racks) and then plans are merged. It could be that the same
tablets are selected for migration in different DCs. Only one
migration will prevail and be committed to group0, so it's not a
correctness problem. The next cycle will recognize that the tablet is in
transition, and the plan-maker will not select it. But it makes
plan-making less efficient.

It may also surprise consumers of the plan, like we saw in #25912.

So we should make the plan-maker aware of already scheduled transitions
and not consider those tablets as candidates.
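
One way to picture the change is a plan-maker that shares a set of already scheduled tablets across its per-DC invocations. The function and data shapes below are hypothetical and only illustrate the idea.

```python
def make_plan(candidates, already_scheduled):
    """Build a migration plan, skipping tablets that already have a transition.

    `candidates` is an iterable of (tablet, src, dst) tuples and
    `already_scheduled` is a set shared across plan-making invocations.
    """
    plan = []
    for tablet, src, dst in candidates:
        if tablet in already_scheduled:
            continue  # another plan already moves this tablet
        already_scheduled.add(tablet)
        plan.append((tablet, src, dst))
    return plan


scheduled = set()
plan_dc1 = make_plan([("t1", "n1", "n2"), ("t2", "n1", "n3")], scheduled)
plan_dc2 = make_plan([("t1", "n4", "n5"), ("t3", "n4", "n6")], scheduled)
```

The merged plan then contains each tablet at most once, so no migration is emitted only to be discarded later.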

Fixes #26038

Closes scylladb/scylladb#26048
2025-09-23 22:40:08 +03:00
Michał Jadwiszczak
73ce19939e test/test_view_building_coordinator: add reproducer for file streaming
The test reproduces issue scylladb/scylla-enterprise#4572.

Before the fix, file-streaming didn't respect the staging status of an
sstable and view updates weren't generated, leading to base-view
inconsistency.
The test creates the inconsistency in the view, triggers file-streaming
of staging sstables and verifies that the updates are generated.
2025-09-23 15:34:42 +02:00
Taras Veretilnyk
b81caf3f54 sstable_compaction_test: add scrub tests for malformed SSTables
Add unit tests for scrub behavior with malformed SSTables:
- sstable_scrub_abort_mode_malformed_sstable_test (verifies scrub aborts on malformed SSTables)
- sstable_scrub_skip_mode_malformed_sstable_test (verifies scrub skips malformed SSTables without aborting)
2025-09-23 14:34:09 +02:00
Aleksandra Martyniuk
17707d0e6b compaction: stop compaction module in really_do_stop
Currently, compaction::task_manager_module is stopped in compaction_manager::stop,
concurrently with really_do_stop. We can't predict the order of the two.

Do not set _task_manager_module to nullptr at stop, because
compaction_manager::really_do_stop() may be called before the actual
shutdown, while other components still try to use it.
compaction::task_manager_module does not keep a pointer to compaction_manager,
so we won't end up with a memory leak.

Stop compaction module in really_do_stop, after ongoing compactions
are stopped.

It's a preparation for further patches.
2025-09-23 14:21:15 +02:00
Andrzej Jackowski
c8f45dbbb2 test: speed up test_long_query_timeout_erm
`test_long_query_timeout_erm` is slow because it has many parameterized
variants, and it verifies timeout behavior during ERM operations, which
are slow by nature.

This change speeds the test up by roughly 3× (319s -> 114s) by:
 - Removing two of the five scenarios that were near duplicates.
 - Shortening timeout values to reduce waiting time.
 - Parallelizing waiting on server_log with asyncio.TaskGroup().

The two removed scenarios (`("SELECT", True, False)`,
`("SELECT_WHERE", True, False)`) were near duplicates of the
`("SELECT_COUNT_WHERE", True, False)` scenario, because all three
scenarios use a non-mapreduce query and trigger essentially the same
system behavior. It is sufficient to keep only one of them, so the test
verifies three cases:
 - One with nodes shutdown
 - One with mapreduce query
 - One with non-mapreduce query

Fixes: scylladb/scylla#24127

Closes scylladb/scylladb#25987
2025-09-23 10:28:07 +03:00
Piotr Dulikowski
482ddfb3b4 Merge 'mv: handle mismatched base/view replica count caused by RF change' from Wojciech Mitros
During an ALTER KEYSPACE statement execution where a table with a view
is present, we need to perform tablet migrations for both tables.
These migrations are not synchronized, so at some point the base may
have a different number of non-pending replicas than the view. Because
of that, we can't pair them correctly. If there are more non-pending
base replicas than view replicas, we don't need to do anything because
the view replica that didn't finish migrating is a pending replica
and will get view updates from all base replicas. But if there are more
non-pending view replicas than base replicas, we may currently lose
view updates to the new view replica.

This patch adds a workaround for this scenario. If after one migration
we have more non-pending view replicas than base replicas, we add
the extra view replica to the pending replica list so that it gets an update anyway.

This patch will also take effect if the base and view replica counts
differ due to some other bug. To track that, a new metric is added
to count such occurrences.

This patch also includes a test for this exact scenario, which is enforced by an injection.

Fixes https://github.com/scylladb/scylladb/issues/21492

Closes scylladb/scylladb#24396

* github.com:scylladb/scylladb:
  mv: handle mismatched base/view replica count caused by RF change
  mv: save the nodes used for pairing calculations for later reuse
  mv: move the decision about simple rack-aware pairing later
2025-09-23 08:10:08 +02:00
Dawid Mędrek
35f7d2aec6 db/batchlog: Drop batch if table has been dropped
If there are pending mutations in the batchlog for a table that
has been dropped, we'll keep attempting to replay them but with
no success -- `db::no_such_column_family` exceptions will be thrown,
and we'll keep trying again and again.

To prevent that, we drop the batch in that case just like we do
in the case of a non-existing keyspace.
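
A simplified model of the replay loop after this change. The exception classes and the `replay_batches` function are hypothetical stand-ins written for this example; they are not the batchlog manager's code.

```python
class NoSuchKeyspace(Exception):
    """Stand-in for the error raised when the keyspace no longer exists."""


class NoSuchColumnFamily(Exception):
    """Stand-in for `db::no_such_column_family`."""


def replay_batches(batchlog, apply_mutations):
    """Replay each batch once and return the batches that must be retried."""
    retry = []
    for batch in batchlog:
        try:
            apply_mutations(batch)
        except (NoSuchKeyspace, NoSuchColumnFamily):
            # The target is gone for good: drop the batch instead of retrying.
            continue
        except Exception:
            # Any other failure may be transient, so keep the batch.
            retry.append(batch)
    return retry
```

Before the fix, the dropped-table case fell into the retry branch, which is why replay kept failing indefinitely.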

A reproducer test has been included in the commit. It fails without
the changes in `db/batchlog_manager.cc`, and it succeeds with them.

Fixes scylladb/scylladb#24806

Closes scylladb/scylladb#26057
2025-09-23 07:48:59 +02:00
Tomasz Grabiec
2b03a69065 test: perf: perf-load-balancing: Add parallel-scaleout scenario
Simulates rebalancing on a single scale-out involving simultaneous
addition of multiple nodes per rack.

Default parameters create a cluster with 2 racks, 70 tables, 256
tablets/table, 10 nodes, 88 shards/node.
Adds 6 nodes in parallel (3 per rack).

Current result on my laptop:

  testlog - Rebalance took 21.874 [s] after 82 iteration(s)
2025-09-23 00:31:31 +02:00
Tomasz Grabiec
0dcaaa061e test: perf: perf-load-balancing: Convert to tool_app_template
To support sub-commands for testing different scenarios.

The current scenario is given the name "rolling-add-dec".
2025-09-23 00:30:38 +02:00
Tomasz Grabiec
c9f0a9d0eb tablets: scheduler: Balance racks separately when rf_rack_valid_keyspaces is true
Greatly improves performance of plan making, because we don't consider
candidates in other racks, most of which will fail to be selected due
to replication constraints (no rack overload). Also (but minor)
reduces the overhead of candidate evaluation, as we don't have to
evaluate rack load.

Enabled only for rf_rack_valid_keyspaces because such setups guarantee
that we will not need (because we must not) move tablets across racks,
and we don't need to execute the general algorithm for the whole DC.

Tested with perf-load-balancing, which performs a single scale-out
operation on a cluster which initially has 10 nodes 88 shards each, 2
racks, RF=2, 70 tables, 256 tablets per table. Scale out adds 6 new
nodes (same shard count). Time to rebalance the cluster (plan making
only, sum of all iterations, no streaming):

Before: 16 min 25 s
After: 0 min 25 s

Before, plan making cost (single incremental iteration) alternated
between fast (0.1 [s]) and slow (14.1 [s]):

  Rebalance iteration 7 took 14.156 [s]: mig=88, bad=88, first_bad=17741, eval=93874484, skiplist=0, skip: (load=0, rack=17653, node=0)
  Rebalance iteration 8 took 0.143 [s]: mig=88, bad=88, first_bad=88, eval=865407, skiplist=0, skip: (load=0, rack=0, node=0)

The slow run chose min and max nodes in different racks, hence the
fast path failed to find any candidates and we switched to exhaustive
search of candidates in other nodes.

After, all iterations are fast (0.1 [s] per rack, 0.2 [s] per plan-making).
The plan is twice as large because it combines the output of two subsequent (pre-patch)
plan-making calls.

Fixes #26016
2025-09-23 00:30:37 +02:00
Patryk Jędrzejczak
a56115f77b test: deflake driver reconnections in the recovery procedure tests
All three tests could hit
https://github.com/scylladb/python-driver/issues/295. We use the
standard workaround for this issue: reconnecting the driver after
the rolling restart, and before sending any requests to local tables
(that can fail if the driver closes a connection to the node that
restarted last).

All three tests perform two rolling restarts, but the latter ones
already have the workaround.

Fixes #26005

Closes scylladb/scylladb#26056
2025-09-22 17:21:06 +02:00
Andrzej Jackowski
15e71ee083 test: audit: stop using datetime.datetime.now() in syslog converter
`line_to_row` is a test function that converts `syslog` audit log to
the format of `table` audit log so tests can use the same checks
for both types of audit. Because `syslog` audit doesn't have `date`
information, the field was filled with the current date. This behavior
broke the tests running at 23:59:59 because `line_to_row` returned
different results on different days.

Fixes: scylladb/scylladb#25509

Closes scylladb/scylladb#26101
2025-09-22 15:31:33 +03:00
Pavel Emelyanov
b23aab882a Merge 'test/alternator: multiple fixes for tests so they would pass on DynamoDB' from Nadav Har'El
Issue #26079 noted that multiple Alternator tests are failing when run against DynamoDB. This pull request fixes many of them, in several small patches. In one case we need to avoid a DynamoDB bug that wasn't even the point of the original test (and we create a new test specifically for that DynamoDB bug). Another test exposed a real incompatibility with Alternator (#26103) but didn't need to be exposed in this specific test, so again we split the test into one that passes and another one that xfails on Alternator (not on DynamoDB). A bigger change had to be made to the tags feature test: since August 2024, the TagResource operation has been asynchronous, which broke our tests, so we fix this.

Each of these changes is described in more detail in the individual patches.

Refs #26079. It doesn't fix it completely because there are some tests which remain flaky, and some tests which, surprisingly, pass on us-east-1 but fail on eu-north-1. We'll need to address the rest later.

No backports needed, we only run tests against DynamoDB from master (when we rarely do...), not on old branches.

Closes scylladb/scylladb#26114

* github.com:scylladb/scylladb:
  test/alternator: fix test_list_tables_paginated on DynamoDB
  test/alternator: fix tests in test_tag.py on DynamoDB
  test/alternator: fix test_health_only_works_for_root_path on DynamoDB
  test/alternator: reproducer tests for faux GSI range key problem
  test/alternator: fix test "test_17119a" to pass on DynamoDB
  test/alternator: fix test to pass on DynamoDB
2025-09-22 15:30:40 +03:00
Avi Kivity
29032213c8 test: avoid #include <boost/test/included/...>
The boost/test/included/... directory is apparently internal and not
intended for user consumption.

Including it caused a One-Definition-Rule violation, due to
boost/test/impl/unit_test_parameters.ipp containing code like this:

```c++
namespace runtime_config {

// UTF parameters
std::string btrt_auto_start_dbg    = "auto_start_dbg";
std::string btrt_break_exec_path   = "break_exec_path";
std::string btrt_build_info        = "build_info";
std::string btrt_catch_sys_errors  = "catch_system_errors";
std::string btrt_color_output      = "color_output";
std::string btrt_detect_fp_except  = "detect_fp_exceptions";
std::string btrt_detect_mem_leaks  = "detect_memory_leaks";
std::string btrt_list_content      = "list_content";
```

This is defining variables in a header, and so can (and in fact does)
create duplicate variable definitions, which later cause trouble.

So far, we were protected from this trouble by -fvisibility=hidden, which
caused the duplicate definitions to be in fact not duplicate.

Fix this by correcting the include path away from <boost/test/included/>.
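
For illustration only, here is a minimal sketch of how a header can define string constants without risking duplicate definitions. It is not the fix applied in this commit (the fix only changes the include path), and the namespace and names below are hypothetical:

```c++
// Sketch of a header that is safe to include from several translation units.
// The namespace and variable names are hypothetical, not Boost's.
#include <string>

namespace runtime_config_sketch {

// A plain, non-const `std::string name = "...";` at namespace scope in a
// header is a definition with external linkage, so every translation unit
// that includes the header emits its own definition of the same variable.
//
// Since C++17, `inline` permits the definition to appear in many translation
// units while the program still contains a single object.
inline const std::string auto_start_dbg = "auto_start_dbg";

// A function-local static is an alternative that also works before C++17.
inline const std::string& break_exec_path() {
    static const std::string value = "break_exec_path";
    return value;
}

} // namespace runtime_config_sketch
```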

Closes scylladb/scylladb#26161
2025-09-22 15:26:06 +03:00
Wojciech Mitros
d9b8278178 mv: handle mismatched base/view replica count caused by RF change
During an ALTER KEYSPACE statement execution where a table with a view
is present, we need to perform tablet migrations for both tables.
These migrations are not synchronized, so at some point the base may
have a different number of non-pending replicas than the view. Because
of that, we can't pair them correctly. If there are more non-pending
base replicas than view replicas, we don't need to do anything because
the view replica that didn't finish migrating is a pending replica
and will get view updates from all base replicas. But if there are more
non-pending view replicas than base replicas, we may currently lose
view updates to the new view replica.

This patch adds a workaround for this scenario. If after one migration
we have more non-pending view replicas than base replicas, we add the
extra view replica to the pending replica list so that it gets an update anyway.
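
The workaround can be sketched roughly as follows. This is a simplified model, not the actual patch: `replica_id`, `view_update_targets` and `pair_replicas` are hypothetical names, and treating the trailing view replicas as the "extra" ones is an assumption made only for the example.

```c++
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a replica identifier.
using replica_id = std::string;

struct view_update_targets {
    std::vector<replica_id> paired;   // view replicas paired 1:1 with base replicas
    std::vector<replica_id> pending;  // replicas that get updates from all base replicas
    std::size_t mismatches = 0;       // models the new metric counting count mismatches
};

// Pair view replicas with base replicas. Any view replica left without a
// base partner is appended to the pending list so it is not skipped.
inline view_update_targets pair_replicas(std::size_t base_count,
                                         const std::vector<replica_id>& view_replicas,
                                         std::vector<replica_id> pending) {
    view_update_targets out;
    out.pending = std::move(pending);
    for (std::size_t i = 0; i < view_replicas.size(); ++i) {
        if (i < base_count) {
            out.paired.push_back(view_replicas[i]);
        } else {
            // Assumption: the unpaired (trailing) replica is the extra one.
            out.pending.push_back(view_replicas[i]);
        }
    }
    if (view_replicas.size() != base_count) {
        ++out.mismatches;
    }
    return out;
}
```

With two base replicas and three view replicas, the third view replica ends up in `pending` and the mismatch counter is incremented once.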

This patch will also take effect if the base and view replica counts
differ due to some other bug. To track that, a new metric is added
to count such occurrences.

This patch also includes a test for this exact scenario, which is enforced by an injection.

Fixes https://github.com/scylladb/scylladb/issues/21492
2025-09-22 12:50:16 +02:00
Nadav Har'El
b205e1a3da Merge 'vector_store_client: Extract DNS logic into a dedicated class' from Karol Nowacki
The vector-search-related implementation is moved to a new module, vector_search.
As the vector search functionality is going to be extended, it is better to keep it in a separate module.

The DNS resolution logic and its background task are moved out of the `vector_store_client` and into a new, dedicated class `vector_search::dns`.

This refactoring is the first step towards supporting DNS hostnames that resolve to multiple IP addresses.

References: VECTOR-187

No backport needed as this is refactoring.

Closes scylladb/scylladb#26052

* github.com:scylladb/scylladb:
  vector_store_client_test: Verify DNS is not refreshed when disabled
  vector_store_client: Extract DNS logic into a dedicated class
  vector_search: Apply clang-format
  vector_store_client: Move to vector_search module
2025-09-22 13:24:34 +03:00
Michael Litvak
beb11760e0 test/cqlpy/test_permissions: unskip test for tablets
The test was skipped for tablets because CDC wasn't supported with
tablets, but now it is supported and the issue is closed, so the test
should be unskipped.
2025-09-22 10:03:32 +02:00
Michael Litvak
65351fda29 alternator: update references to alternator streams issue
Update all the references about the issue of tablets support for
alternator streams to issue #23838 instead of #16317.

The issue #16317 is about support of CDC with tablets, but it is now
closed and it didn't address alternator streams. The remaining issues
about alternator streams should be addressed as part of #23838, so fix
the references in order for them not to be missed.
2025-09-22 09:56:23 +02:00
Avi Kivity
1258e7c165 Revert "Merge 'transport: service_level_controller: create and use driver service level' from Andrzej Jackowski"
This reverts commit fe7e63f109, reversing
changes made to b5f3f2f4c5. It is causing
test.py failures around cqlpy.

Fixes #26163

Closes scylladb/scylladb#26174
2025-09-22 09:32:46 +03:00
Piotr Dulikowski
b382531d99 Merge 'cdc: fix create table with cdc if not exists' from Michael Litvak
Fix an issue where executing a CREATE TABLE IF NOT EXISTS statement with
CDC enabled fails with an error if the table already exists. Instead,
the query should succeed and be a no-op.

This regression was introduced by commit fed1048059. Previously, when
executing the query, we would first check if the table exists in
do_prepare_new_column_families_announcement. If it did, we would throw
an already_exists_exception, which was handled correctly; otherwise, we
would continue and create the CDC table in the
before_create_column_families notification.

The order of operations was changed in fed1048059, causing the
regression. Now, we first create the CDC schema and add it to the schema
list for creation, and then check for each of them if they already
exist. The problem is that when we create the CDC schema in
on_pre_create_column_families, it also checks if the CDC table already
exists. If it does, it throws an invalid_request_exception, which is not
caught and handled as expected.

This patch restores the previous order of operations: we first check if
the tables exist, and only then add the CDC schema in pre_create.
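
The restored ordering can be sketched as below. This is a simplified model under stated assumptions, not the actual code: `table_already_exists`, `prepare_new_tables` and `create_table_if_not_exists` are hypothetical names, tables are modelled as plain strings, and the `_scylla_cdc_log` suffix for the CDC table name is assumed for the example.

```c++
#include <set>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical exception type standing in for already_exists_exception.
struct table_already_exists : std::runtime_error {
    using std::runtime_error::runtime_error;
};

// Step 1: check that the base table does not exist yet.
// Step 2: only then add the CDC schema to the list of schemas to create.
inline std::vector<std::string> prepare_new_tables(const std::set<std::string>& existing,
                                                   const std::string& table,
                                                   bool cdc_enabled) {
    if (existing.count(table) != 0) {
        throw table_already_exists(table);
    }
    std::vector<std::string> to_create{table};
    if (cdc_enabled) {
        to_create.push_back(table + "_scylla_cdc_log");  // assumed naming
    }
    return to_create;
}

// IF NOT EXISTS handling: an existing table turns the statement into a no-op.
inline bool create_table_if_not_exists(std::set<std::string>& existing,
                                       const std::string& table,
                                       bool cdc_enabled) {
    try {
        for (const auto& name : prepare_new_tables(existing, table, cdc_enabled)) {
            existing.insert(name);
        }
        return true;   // tables were created
    } catch (const table_already_exists&) {
        return false;  // no-op, the statement still succeeds
    }
}
```

Because the existence check runs before any CDC schema is generated, the second execution of the same statement reaches the handled exception instead of failing inside the CDC hook.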

Fixes https://github.com/scylladb/scylladb/issues/26142

no backport - recent regression, not released yet

Closes scylladb/scylladb#26155

* github.com:scylladb/scylladb:
  test: add test for creating table with CDC enabled if not exists
  cdc: fix create table with cdc if not exists
2025-09-22 08:18:26 +02:00
Piotr Dulikowski
591a67c7e7 Merge 'view_builder: register view on all shards atomically' from Michael Litvak
When the view builder starts to build a new view, each shard registers
itself by writing the shard id and current token to the
scylla_views_builds_in_progress table.

Previously, each shard registered itself independently. We now
register all shards "atomically": when a shard registers itself, it
also registers all other shards with an empty status, if they aren't
registered yet. This ensures that we don't have a partial state in the
table where only some of the shards are registered, but we always have a
status for all shards.
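
A minimal sketch of this registration scheme, assuming the table is modelled as an in-memory map (the real code writes to scylla_views_builds_in_progress; `build_status_table` and `register_shard` are hypothetical names):

```c++
#include <cstdint>
#include <map>
#include <optional>

using shard_id = unsigned;
using token = std::int64_t;

// Models the per-view rows of the table: one entry per shard.
// std::nullopt stands for "registered, but with an empty status".
using build_status_table = std::map<shard_id, std::optional<token>>;

// Register `self` with its current token and, in the same step, make sure
// every other shard has at least an empty entry.
inline void register_shard(build_status_table& table, shard_id self,
                           token current, unsigned shard_count) {
    for (shard_id s = 0; s < shard_count; ++s) {
        // try_emplace leaves already-registered shards untouched.
        table.try_emplace(s, std::nullopt);
    }
    table[self] = current;
}
```

After the first call from any shard, the map holds an entry for every shard, so a restart never observes a partially registered view.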

We want to register all shards atomically because, if only some of the
shards were registered when we restart and load the status from the
table, the loaded state is incomplete, which causes several problems.

One example is that to know how many shards we had previously, we take
the maximum shard id we see in the table. If it's different than the
current shard count, we will execute the reshard code. But of course, if
the last shard is missing from the table because it didn't register
itself, this calculation will be wrong, and we can't know the previous
number of shards.

This is a problem. Suppose we have two shards, and shard 0
finished building the view but shard 1 didn't start. When we come up, we
will think that previously we had only a single shard and that it
completed building everything, when in fact we built only about half
the view. We don't have enough information in the tables to know that.
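
The miscalculation can be reproduced with a small sketch. It assumes zero-based shard ids, so the previous shard count is the maximum registered id plus one; `previous_shard_count` is a hypothetical helper, not the actual function:

```c++
#include <set>

// Derive the previous shard count from the shard ids found in the table,
// as described above: take the maximum id and add one (ids start at 0).
inline unsigned previous_shard_count(const std::set<unsigned>& registered_shards) {
    if (registered_shards.empty()) {
        return 0;
    }
    return *registered_shards.rbegin() + 1;
}
```

If only shard 0 registered on a two-shard node, the helper reports one shard instead of two; once both shards have entries, it reports two.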

There are additional problems related to reshard. In the reshard
function, whether it is executed because we actually resharded the node
or because we calculated the wrong number of previous shards, if the
status of some shard is missing then the calculation of new ranges will
be wrong. When some shard didn't make progress we should start building
the view from scratch. However, this doesn't happen if we don't have a
status for the shard, because the code looks only for shards that have a
status. In effect, this shard is considered complete even though it
didn't start. This could cause the view building to get stuck or
complete without building all token ranges.

Registering all shards atomically should solve the above problems
because we will always have statuses for all shards.

Fixes https://github.com/scylladb/scylladb/issues/22989

backport not needed - the issue is probably not common and there's a workaround

Closes scylladb/scylladb#25790

* github.com:scylladb/scylladb:
  test: mv: add a test for view build interrupt during registration
  view_builder: register view on all shards atomically
2025-09-22 08:03:44 +02:00
Karol Nowacki
6bd1d7db49 vector_store_client_test: Verify DNS is not refreshed when disabled
Extend the `vector_store_client_uri_update_to_empty` test case to
verify that the DNS resolver stops refreshing when the vector store is
disabled.
2025-09-22 08:02:59 +02:00