Commit Graph

47701 Commits

Author SHA1 Message Date
Dawid Mędrek
5c5911d874 test/cluster/test_tablets: Divide rack into two to adjust tests to RF-rack-validity
Three tests in the file use a multi-DC cluster. Unfortunately, they put
all of the nodes in a DC in the same rack and because of that, they fail
when run with the `rf_rack_valid_keyspaces` configuration option enabled.
Since the tests revolve mostly around zero-token nodes and how they
affect replication in a keyspace, this change should have zero impact on
them.

(cherry picked from commit c8c28dae92)
2025-05-12 13:10:12 +00:00
Dawid Mędrek
6a2e52d250 test/cluster/test_tablets: Adjust test_tablet_rf_change to RF-rack-validity
We reduce the number of nodes and the RF values used in the test
to make sure that the test can be run with the `rf_rack_valid_keyspaces`
configuration option. The test doesn't seem to be reliant on the
exact number of nodes, so the reduction should not make any difference.

(cherry picked from commit 04567c28a3)
2025-05-12 13:10:12 +00:00
Dawid Mędrek
f98c83b92f test/cluster/test_tablet_repair_scheduler.py: Adjust to RF-rack-validity
The change boils down to matching the number of created racks to the number
of created nodes in each DC in the auxiliary function `prepare_multi_dc_repair`.
This way, we ensure that the created keyspace will be RF-rack-valid and so
we can run the test file even with the `rf_rack_valid_keyspaces` configuration
option enabled.

The change has no impact on the tests that use the function; the distribution
of nodes across racks does not affect how repair is performed or what the
tests do and verify. Because of that, the change is correct.

(cherry picked from commit d3c0cd6d9d)
2025-05-12 13:10:12 +00:00
Dawid Mędrek
f5cf4a3893 test/pylib/repair.py: Assign nodes to multiple racks in create_table_insert_data_for_repair
We assign the newly created nodes to multiple racks. If RF <= 3,
we create as many racks as the provided RF. We disallow the case
of  RF > 3 to avoid trying to create an RF-rack-invalid keyspace;
note that no existing test calls `create_table_insert_data_for_repair`
providing a higher RF. The rationale for doing this is we want to ensure
that the tests calling the function can be run with the
`rf_rack_valid_keyspaces` configuration option enabled.

(cherry picked from commit 5d1bb8ebc5)
2025-05-12 13:10:12 +00:00
Dawid Mędrek
12f0136b26 test/cluster/test_zero_token_nodes_topology_ops: Adjust to RF-rack-validity
We assign the nodes to the same DC, but multiple racks to ensure that
the created keyspace is RF-rack-valid and we can run the test with
the `rf_rack_valid_keyspaces` configuration option enabled. The changes
do not affect what the test does and verifies.

(cherry picked from commit 92f7d5bf10)
2025-05-12 13:10:12 +00:00
Dawid Mędrek
4e45ceda21 test/cluster/test_zero_token_nodes_no_replication.py: Adjust to RF-rack-validity
We simply assign the nodes used in the test to seprate racks to
ensure that the created keyspace is RF-rack-valid to be able
to run the test with the `rf_rack_valid_keyspaces` configuration
option set to true. The change does not affect what the test
does and verifies -- it only depends on the type of nodes,
whether they are normal token owners or not -- and so the changes
are correct in that sense.

(cherry picked from commit 4c46551c6b)
2025-05-12 13:10:12 +00:00
Dawid Mędrek
2c8b5143ba test/cluster/test_zero_token_nodes_multidc.py: Adjust to RF-rack-validity
We parameterize the test so it's run with and without enforced
RF-rack-valid keyspaces. In the test itself, we introduce a branch
to make sure that we won't run into a situation where we're
attempting to create an RF-rack-invalid keyspace.

Since the `rf_rack_valid_keyspaces` option is not commonly used yet
and because its semantics will most likely change in the future, we
decide to parameterize the test rather than try to get rid of some
of the test cases that are problematic with the option enabled.

(cherry picked from commit 2882b7e48a)
2025-05-12 13:10:12 +00:00
Dawid Mędrek
474de0f048 test/cluster/test_not_enough_token_owners.py: Adjust to RF-rack-validity
We simply assign DC/rack properties to every node used in the test.
We put all of them in the same DC to make sure that the cluster behaves
as closely to how it would before these changes. However, we distribute
them over multiple racks to ensure that the keyspace used in the test
is RF-rack-valid, so we can also run it with the `rf_rack_valid_keyspaces`
configuration option set to true. The distribution of nodes between racks
has no effect on what the test does and verifies, so the changes are
correct in that sense.

(cherry picked from commit 73b22d4f6b)
2025-05-12 13:10:12 +00:00
Dawid Mędrek
5ac07a6c72 test/cluster/test_multidc.py: Adjust to RF-rack-validity
Instead of putting all of the nodes in a DC in the same rack
in `test_putget_2dc_with_rf`, we assign them to different racks.
The distribution of nodes in racks is orthogonal to what the test
is doing and verifying, so the change is correct in that sense.
At the same time, it ensures that the test never violates the
invariant of RF-rack-valid keyspaces, so we can also run it
with `rf_rack_valid_keyspaces` set to true.

(cherry picked from commit 5b83304b38)
2025-05-12 13:10:11 +00:00
Dawid Mędrek
f88d8edcaf test/cluster/object_store/test_backup.py: Adjust to RF-rack-validity
We modify the parameters of `test_restore_with_streaming_scopes`
so that it now represents a pair of values: topology layout and
the value `rf_rack_valid_keyspaces` should be set to.

Two of the already existing parameters violate RF-rack-validity
and so the test would fail when run with `rf_rack_valid_keyspaces: true`.
However, since the option isn't commonly used yet and since the
semantics of RF-rack-valid keyspaces will most likely change in
the future, let's keep those cases and just run them with the
option disabled. This way, we still test everything we can
without running into undesired failures that don't indicate anything.

(cherry picked from commit 9281bff0e3)
2025-05-12 13:10:11 +00:00
Dawid Mędrek
05c70b0820 test/cluster: Adjust simple tests to RF-rack-validity
We adjust all of the simple cases of cluster tests so they work
with `rf_rack_valid_keyspaces: true`. It boils down to assigning
nodes to multiple racks. For most of the changes, we do that by:

* Using `pytest.mark.prepare_3_racks_cluster` instead of
  `pytest.mark.prepare_3_nodes_cluster`.
* Using an additional argument -- `auto_rack_dc` -- when calling
  `ManagerClient::servers_add()`.

In some cases, we need to assign the racks manually, which may be
less obvious, but in every such situation, the tests didn't rely
on that assignment, so that doesn't affect them or what they verify.

(cherry picked from commit dbb8835fdf)
2025-05-12 13:10:11 +00:00
Patryk Jędrzejczak
2b1b4d1dfc Merge '[Backport 2025.2] Correctly skip updating node's own ip address due to oudated gossiper data ' from Scylladb[bot]
Used host id to check if the update is for the node itself. Using IP is unreliable since if a node is restarted with different IP a gossiper message with previous IP can be misinterpreted as belonging to a different node.

Fixes: #22777

Backport to 2025.1 since this fixes a crash. Older version do not have the code.

- (cherry picked from commit a2178b7c31)

- (cherry picked from commit ecd14753c0)

- (cherry picked from commit 7403de241c)

Parent PR: #24000

Closes scylladb/scylladb#24089

* https://github.com/scylladb/scylladb:
  test: add reproducer for #22777
  storage_service: Do not remove gossiper entry on address change
  storage_service: use id to check for local node
2025-05-12 09:31:20 +02:00
Gleb Natapov
827563902c test: add reproducer for #22777
Add sleep before starting gossiper to increase a chance of getting old
gossiper entry about yourself before updating local gossiper info with
new IP address.

(cherry picked from commit 7403de241c)
2025-05-09 12:56:15 +00:00
Gleb Natapov
ccf194bd89 storage_service: Do not remove gossiper entry on address change
When gossiper indexed entries by ip an old entry had to be removed on an
address change, but the index is id based, so even if ip was change the
entry should stay. Gossiper simply updates an ip address there.

(cherry picked from commit ecd14753c0)
2025-05-09 12:56:15 +00:00
Gleb Natapov
9b735bb4dc storage_service: use id to check for local node
IP may change and an old gossiper message with previous IP may be
processed when it shouldn't.

Fixes: #22777
(cherry picked from commit a2178b7c31)
2025-05-09 12:56:15 +00:00
Michał Chojnowski
f29b87970a test/boost/mvcc_test: fix an overly-strong assertion in test_snapshot_cursor_is_consistent_with_merging
The test checks that merging the partition versions on-the-fly using the
cursor gives the same results as merging them destructively with apply_monotonically.

In particular, it tests that the continuity of both results is equal.
However, there's a subtlety which makes this not true.
The cursor puts empty dummy rows (i.e. dummies shadowed by the partition
tombstone) in the output.
But the destructive merge is allowed (as an expection to the general
rule, for optimization reasons), to remove those dummies and thus reduce
the continuity.

So after this patch we instead check that the output of the cursor
has continuity equal to the merged continuities of version.
(Rather than to the continuity of merged versions, which can be
smaller as described above).

Refs https://github.com/scylladb/scylladb/pull/21459, a patch which did
the same in a different test.
Fixes https://github.com/scylladb/scylladb/issues/13642

Closes scylladb/scylladb#24044

(cherry picked from commit 746ec1d4e4)

Closes scylladb/scylladb#24083
2025-05-09 13:00:34 +02:00
Botond Dénes
17a76b6264 Merge '[Backport 2025.2] test/cluster/test_read_repair.py: improve trace logging test (again)' from Scylladb[bot]
The test test_read_repair_with_trace_logging wants to test read repair with trace logging. Turns out that node restart + trace-level logging + debug mode is too much and even with 1 minute timeout, the read repair     times out sometimes. Refactor the test to use injection point instead of restart. To make sure the test still tests what it supposed to test, use tracing to assert that read repair did indeed happen.

Fixes: scylladb/scylladb#23968

Needs backport to 2025.1 and 6.2, both have the flaky test

- (cherry picked from commit 51025de755)

- (cherry picked from commit 29eedaa0e5)

Parent PR: #23989

Closes scylladb/scylladb#24051

* github.com:scylladb/scylladb:
  test/cluster/test_read_repair.py: improve trace logging test (again)
  test/cluster: extract execute_with_tracing() into pylib/util.py
2025-05-08 11:01:18 +03:00
Aleksandra Martyniuk
ab45df1aa1 streaming: skip dropped tables
Currently, stream_session::prepare throws when a table in requests
or summaries is dropped. However, we do not want to fail streaming
if the table is dropped.

Delete table checks from stream_session::prepare. Further streaming
steps can handle the dropped table and finish the streaming successfully.

Fixes: #15257.

Closes scylladb/scylladb#23915

(cherry picked from commit 20c2d6210e)

Closes scylladb/scylladb#24053
2025-05-08 11:00:27 +03:00
Botond Dénes
97f0f312e0 test/cluster/test_read_repair.py: improve trace logging test (again)
The test test_read_repair_with_trace_logging wants to test read repair
with trace logging. Turns out that node restart + trace-level logging
+ debug mode is too much and even with 1 minute timeout, the read repair
times out sometimes.
Refactor the test to use injection point instead of restart. To make
sure the test still tests what it supposed to test, use tracing to
assert that read repair did indeed happen.

(cherry picked from commit 29eedaa0e5)
2025-05-07 13:26:08 +00:00
Botond Dénes
4df6a17d30 test/cluster: extract execute_with_tracing() into pylib/util.py
To allow reuse in other tests.

(cherry picked from commit 51025de755)
2025-05-07 13:26:08 +00:00
Anna Mikhlin
b3dbfaf27a Update ScyllaDB version to: 2025.2.0-rc0 scylla-2025.2.0-rc0-candidate-20250508122345 scylla-2025.2.0-rc0-candidate-20250508120337 scylla-2025.2.0-rc0 scylla-2025.2.0-rc0-candidate-20250508124206 2025-05-07 11:41:33 +03:00
Botond Dénes
0a9ca52cfd replica/database: memtable_list: save ref to memtable_table_shared_data
This is passed by reference to the constructor, but a copy is saved into
the _table_shared_data member. A reference to this member is passed down
to all memtable readers. Because of the copy, the memtable readers save
a reference to the memtable_list's member, which goes away together with
the memtable_list when the storage_group is destroyed.
This causes use-after-free when a storage group is destroyed while a
memtable read is still ongoing. The memtable reader keeps the memtable
alive, but its reference to the memtable_table_shared_data becomes
stale.
Fix by saving a reference in the memtable_list too, so memtable readers
receive a reference pointing to the original replica::table member,
which is stable accross tablet migrations and merges.
The copy was introduced by 2a76065e3d.
There was a copy even before this commit, but in the previous vnode-only
world this was fine -- there was one memtable_list per table and it was
around until the table itself was. In the tablet world, this is no
longer given, but the above commit didn't account for this.

A test is included, which reproduces the use-after-free on memtable
migration. The test is somewhat artificial in that the use-after-free
would be prevented by holding on to an ERM, but this is done
intentionaly to keep the test simple. Migration -- unlike merge where
this use-after-free was originally observed -- is easy to trigger from
unit tests.

Fixes: #23762

Closes scylladb/scylladb#23984
2025-05-06 22:13:17 +03:00
David Garcia
b1ee0e2a6a docs: fix AttributeError with 'myst_enable_extensions' in publication workflow
Rolled back some dependencies in `poetry.lock` to previous versions while we investigate how to make the extension `sphinx_scylladb_markdown` compatible with the latest versions.

This should fix the error in https://github.com/scylladb/scylladb/actions/runs/14708656912/job/41275115239, which currently prevents publishing new versions of https://opensource.docs.scylladb.com/

Closes scylladb/scylladb#23969
2025-05-06 16:33:00 +03:00
Pavel Emelyanov
1b5bbc2433 Merge 'test.py: split boost pytest integration' from Andrei Chekun
This PR contains changes that do not add new functionality, and have small refactoring of the existing code.
The most significant change is the refactoring of resource gathering, so it will not create another cgroup to put itself in. So there will be no nested redundant 'initial' groups, e.x. `/sys/fs/cgroup/user.slice/user-1000.slice/user@1000.service/initial/initial/initial.../initial`
This is part two of splitting the original PR.

This PR is an extraction of several commits from https://github.com/scylladb/scylladb/pull/22894 as reviewer https://github.com/scylladb/scylladb/pull/22894?notification_referrer_id=NT_kwDOACiLR7MxNDg0ODk2MDU1MjoyNjU3MDk1&notifications_query=reason%3Aparticipating#pullrequestreview-2778582278.

Closes scylladb/scylladb#23882

* github.com:scylladb/scylladb:
  test.py: add awareness of extra_scylla_cmdline_options
  test.py: increase timeout for C++ tests in pytest
  test.py: switch method of finding the root repo directory
  test.py: move get_combined_tests to the correct facade
  test.py: add common directory for reports
  test.py: add the possibility to provide additional env vars
  test.py: move setup cgroups to the generic method
  test.py: refactor resource_gather.py
2025-05-06 16:22:49 +03:00
Botond Dénes
3c3f6ca233 tools/scylla-sstable: scrub: use UUID sstable identifiers
Much easier to avoid sstable collisions. Makes it possible to scrub
multiple sstables, with multiple calls to scylla-sstable, reusing the
same output directory. Previously, each new call to scylla-sstable
scrub, would start from generation 0, guaranteeing collision.

Remove the unit test for generation clash -- with UUID generations, this
is no longer possible to reproduce in practice.

Refs: #21387

Closes scylladb/scylladb#23990
2025-05-06 15:09:53 +03:00
Patryk Jędrzejczak
7f843e0a5c Merge 'raft: make sure to retain the existing voters including the current leader (topology coordinator)' from Emil Maskovsky
Fix an issue in the voter calculator where existing voters were not retained across data centers and racks in certain scenarios. This occurred when voters were distributed across more data centers and racks than the maximum allowed number of voters.
Previously, the prioritization logic for data centers and racks did not consider the number of existing assigned voters. It only prioritized nodes within a single data center or rack, which could result in unnecessary reassignment of voters.
Improved the prioritization logic to account for the number of existing assigned voters in each data center and rack.

Additionally, the limited voters feature did not account for the existing topology coordinator (Raft leader) when selecting voters to be removed. As a result, the limited voters calculator could inadvertently remove the votership of the topology coordinator, triggering unnecessary Raft leader re-election.
To address this, the topology coordinator's votership status is now preserved unless absolutely necessary. When choosing between otherwise equivalent voters, the node other than the existing topology coordinator is prioritized for removal.

This change ensures a more stable voter distribution and reduces unnecessary voter reassignments.

The limited voters calculator is refactored to use a priority queue for sorting nodes by their priorities. This change simplifies the voter selection logic and makes it more extensible for future enhancements, such as supporting more complex priority calculations.

Fixes: scylladb/scylladb#23950
Fixes: scylladb/scylladb#23588
Fixes: scylladb/scylladb#23786

No backport: The limited voters feature is currently only present in master.

Closes scylladb/scylladb#23888

* https://github.com/scylladb/scylladb:
  raft: ensure topology coordinator retains votership
  raft: retain existing voters across data centers and racks
  raft: refactor limited voters calculator to prioritize nodes
  raft: replace pointer with reference for non-null output parameter
  raft: reduce code duplication in group0 voter handler
  raft: unify and optimize datacenter and rack info creation
2025-05-06 13:49:55 +02:00
Nadav Har'El
252c5b5c9d Merge 'Alternator batch_write_item wcu' from Amnon Heiman
This series adds support for WCU tracking in batch_write_item and tests it.

The patches include:

Switch the metrics (RCU and WCU) to count units vs half-units as they were, to make the metrics clearer for users.

Adding a public static get_half_units function to wcu_consumed_capacity_counter for use by batch write item, which cannot directly use the counter object.

Adding WCU calculation support to batch_write_item, based on item size for puts and a fixed 1 WCU for deletes. WCU metrics are updated, and consumed capacity is returned per table when requested.

The return handling was refactored to be coroutine-like for easier management of the consumed capacity array.

Adding tests that validate WCU calculation for batch put requests on a single table and across multiple tables, ensuring delete operations are counted correctly.

Adding a test that validates that WCU metrics are updated correctly during batch write item operations, ensuring the WCU of each item is calculated independently.

**Need backport, WCU is partially supported, and is missing from batch_write_item**

Fixes #23940

Closes scylladb/scylladb#23941

* github.com:scylladb/scylladb:
  alternator/test_metrics.py: batch_write validate WCU
  alternator/test_returnconsumedcapacity.py: Add tests for batch write WCU
  alternator/executor: add WCU for batch_write_items
  alternator/consumed_capacity: make wcu get_units public
  Alternator: Change the WCU/RCU to use units
2025-05-06 13:31:53 +03:00
Avi Kivity
fc2204cea0 Merge ' test/boost/multishard_mutation_query_test: fix test_read_with_partition_row_limits' from Botond Dénes
This test has multiple problems:
* has 3 embedded loops to run different scenarios, ignores variable from 2 of these, running with hardcoded settings instead
* initializes misses and lookups to 0 at the start of each scenario, this throws off per-page increment checks, when the previous scenario moved these metrics and they don't start from 0; this causes the test to sometimes fail
* duplicate check of drops == 0 (just cosmetic)

Fix all three problems, the second is especially important because it made the test flaky.
Additionally, ensure the test will keep using vnodes in the future, by explicitly creating a vnodes keyspace for them.

Fixes: #16794

Test fix, not a backport candidate normally, we can backport to 2025.1 if the test becomes too unstable there

Closes scylladb/scylladb#23783

* github.com:scylladb/scylladb:
  test/boost/multishard_mutation_query_test: ensure test runs with vnodes
  test/boost/multishard_mutation_query_test: fix test_read_with_partition_row_limits
2025-05-05 20:49:03 +03:00
Emil Maskovsky
24dfd2034b raft: ensure topology coordinator retains votership
The limited voters feature did not account for the existing topology
coordinator (Raft leader) when selecting voters to be removed.
As a result, the limited voters calculator could inadvertently remove
the votership of the current topology coordinator, triggering
an unnecessary Raft leader re-election.

This change ensures that the existing topology coordinator's votership
status is preserved unless absolutely necessary. When choosing between
otherwise equivalent voters, the node other than the topology coordinator
is prioritized for removal. This helps maintain stability in the cluster
by avoiding unnecessary leader re-elections.

Additionally, only the alive leader node is considered relevant for this
logic. A dead existing leader (topology coordinator) is excluded from
consideration, as it is already in the process of losing leadership.

Fixes: scylladb/scylladb#23588
Fixes: scylladb/scylladb#23786
2025-05-05 16:58:34 +02:00
Emil Maskovsky
2ae59e8a87 raft: retain existing voters across data centers and racks
Fix an issue in the voter calculator where existing voters were not
retained across data centers and racks in certain scenarios. This
occurred when voters were distributed across more data centers and racks
than the maximum allowed number of voters.

Previously, the prioritization logic for data centers and racks did not
consider the number of existing assigned voters. It only prioritized
nodes within a single data center or rack, which could result in
unnecessary reassignment of voters.

Improved the prioritization logic to account for the number of existing
voters in each data center and rack.

This change ensures a more stable voter distribution and reduces
unnecessary voter reassignments.

Fixes: scylladb/scylladb#23950
2025-05-05 16:51:48 +02:00
Emil Maskovsky
018fb63305 raft: refactor limited voters calculator to prioritize nodes
Refactor the limited voters calculator to use a priority queue for
sorting nodes by their priorities. This change simplifies the voter
selection logic and makes it more extensible for future enhancements,
such as supporting more complex priority calculations.

The priority value is determined based on the node's existing status,
including whether it is alive, a voter, or any further criteria.
2025-05-05 16:36:17 +02:00
Emil Maskovsky
26fdc7b8f8 raft: replace pointer with reference for non-null output parameter
The output parameter cannot be `null`. Previously, a pointer was used to
make it explicit that the parameter is an output parameter being
modified. However, this is unnecessary, as references are more
appropriate for parameters that cannot be `null`.

Switching to a reference improves code readability and ensures the
parameter's non-null constraint is enforced at the type level.
2025-05-05 16:12:00 +02:00
Emil Maskovsky
f0468860a3 raft: reduce code duplication in group0 voter handler
Refactor the group0 voter handler by introducing a helper lambda to
handle the common logic for adding a node. This eliminates unnecessary
code duplication.

This refactor does not introduce any functional changes but prepares
the codebase for easier future modifications.
2025-05-05 16:09:53 +02:00
Botond Dénes
855411caad test/boost/multishard_mutation_query_test: ensure test runs with vnodes
All tests in this suite use the default "ks" keyspace from cql_test_env.
This keyspace has tablet support and at any time we might decide to make
it use tablets by default. This would make all these tests use the
tablet path in multishard_mutation_query.cc. These tests were created to
test the vastly more complex vnodes code path in said file. The tablet
path is much simpler and it is only used by SELECT * FROM
MUTATION_FRAGMENTS() and which has its own correctness tests.
So explicitely create a vnodes keyspace and use it in all the tests to
restore the test functionality.
2025-05-05 09:22:54 -04:00
Botond Dénes
1175e1ed49 test/boost/multishard_mutation_query_test: fix test_read_with_partition_row_limits
This test has multiple problems:
* has 3 embedded loops to run different scenarios, ignores variable from
  2 of these, running with hardcoded settings instead
* initializes misses and lookups to 0 at the start of each scenario,
  this throws off per-page increment checks, when the previous scenario
  moved these metrics and they don't start from 0; this causes the test
  to sometimes fail
* duplicate check of drops == 0 (just cosmetic)

Fix all three problems, the second is especially important because it
made the test flaky.
2025-05-05 09:22:53 -04:00
Emil Maskovsky
2ef654149f raft: unify and optimize datacenter and rack info creation
Refactor the code to use a consistent pattern for creating the
datacenter info list and the rack info list.

Both now use a map of vectors, which improves efficiency by reducing
temporary conversions to maps/sets during node list processing.

Also ensure the node descriptor is passed by reference instead of by
copy, leveraging the guaranteed lifetime of the descriptors.
2025-05-05 15:15:17 +02:00
Pavel Emelyanov
cf1ffd6086 Merge 'sstables_loader: fix the racing between get_progress() and release_resources()' from Kefu Chai
This change addresses a critical race condition in the sstables_loader where `get_progress()` could access invalid `progress_holder` instances after `release_resources()` destroyed them.

Problem:
- Progress tracking uses two components: `_progress_state` (tracks state) and `_progress_per_shard` (sharded service with actual progress data)
- `get_progress()` first checks if `_progress_state` is initialized, then accumulates progress from `_progress_per_shard`
- As both functions are coroutines, `get_progress()` could be preempted after state check but before accessing `_progress_per_shard`
- If `release_resources()` runs during this preemption, it destroys the `progress_holder` instances in `_progress_per_shard`, causing `get_progress()` to access invalid memory.

Solution:
- Implemented shared/exclusive locking to protect access to both state and sharded progress data
- Multiple `get_progress()` calls can execute in parallel (shared access)
- `release_resources()` acquires exclusive access before modifying resources
- This prevents potential memory corruption and ensures consistent progress reporting

Fixes #23801

---

this change addresses a racing related to tracking the restore progress from S3 using scylla's native API, which is not used in production yet, hence no need to backport.

Closes scylladb/scylladb#23808

* github.com:scylladb/scylladb:
  sstables_loader: fix the indent
  sstables_loader: fix the racing between get_progress() and release_resources()
2025-05-05 15:45:15 +03:00
Avi Kivity
e688e89430 tools: toolchain: clear .cache and .cargo directories
The .cache and .cargo directories are used during pip and rust builds
when preparing the toolchain, but aren't useful afterwards. Remove them
to save a bit of space.

Closes scylladb/scylladb#23955
2025-05-05 14:43:14 +03:00
Avi Kivity
4c1f4c419c tools: toolchain: dbuild: run as root in container under podman
Running as root enables nested containers under podman without
trouble from uid remapping. Unlike docker, under podman uid 0 in
the container is remapped to the host uid for bind mounts, so writes
to the build directory do not end up owned by root on the host.

Nested containers will allow us to consume opensearch, cassandra-stress,
and minio as containers rather than embedding them into the frozen
toolchain.

Closes scylladb/scylladb#23954
2025-05-05 14:40:43 +03:00
Amnon Heiman
2ab99d7a07 alternator/test_metrics.py: batch_write validate WCU
This patch adds a test that verifies the WCU metrics are updated
correctly during a batch_write_item operation.
It ensures that the WCU of each item is calculated independently.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2025-05-05 13:20:24 +03:00
Amnon Heiman
14570f1bb5 alternator/test_returnconsumedcapacity.py: Add tests for batch write WCU
This patch adds two tests:
A test that validates WCU calculation for batch put requests on a single table.

A test that validates WCU calculation for batch requests across multiple
tables, including ensuring that delete operations are counted as 1 WCU.

Both tests verify that the consumed capacity is reported correctly
according to the WCU rules.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2025-05-05 13:20:23 +03:00
Amnon Heiman
68db77643f alternator/executor: add WCU for batch_write_items
This patch adds consumed capacity unit support to batch_write_item.

It calculates the WCU based on an item's length (for put) or a static 1
WCU (for delete), for each item on each table.

The WCU metrics are always updated. if the user requests consumed
capacity, a vector of consumed capacity is returned with an entry for
each of the tables.

For code simplicity, the return part of batch_write_item was updated to
be coroutine-like; this makes it easier to manage the life cycle of the
returned consumed_capacity array.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2025-05-05 13:20:14 +03:00
Amnon Heiman
f2ade71f4f alternator/consumed_capacity: make wcu get_units public
This patch adds a public static get_units function to
wcu_consumed_capacity_counter.  It will be used by the batch write item
implementation, which cannot use the wcu_consumed_capacity_counter
directly.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>

consume_capacity need merge
2025-05-05 13:19:04 +03:00
Amnon Heiman
5ae11746fa Alternator: Change the WCU/RCU to use units
This patch changes the RCU/WCU Alternator metrics to use whole units
instead of half units. The change includes the following:

Change the metrics documentation. Keep the RCU counter internally in
half units, but return the actual (whole unit) value.
Change the RCU name to be rcu_half_units_total to indicates that it
counts half units.
Change the WCU to count in whole units instead of half units.

Update the tests accordingly.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2025-05-05 13:18:09 +03:00
Anna Stuchlik
851a433663 doc: add a link to the previous Enterprise documentation
This commit adds a link to the docs for previous Enterprise versions
at https://enterprise.docs.scylladb.com/ to the left menu.

As we still support versions 2024.1 and 2024.2, we need to ensure
easier access to those docs sets.

Fixes https://github.com/scylladb/scylladb/issues/23870

Closes scylladb/scylladb#23945
2025-05-05 12:16:47 +03:00
Avi Kivity
04fb2c026d config: decrease default large allocation warning threshold to 128k
Back in 2017 (5a2439e702), we introduced a check for large
allocations as they can stall the memory allocator. The warning
threshold was set at 1 MB. Since then many fixes for large allocations
went in and it is now time to reduce the threshold further.

We reduce it here to 128 kB, the natural allocation size for the
system. A quick run showed no warnings.

Closes scylladb/scylladb#23975
2025-05-05 12:13:48 +03:00
Pavel Emelyanov
b56d6fbb84 Merge 'sstables: Fix quadratic space complexity in partitioned_sstable_set' from Raphael Raph Carvalho
Interval map is very susceptible to quadratic space behavior when it's flooded with many entries overlapping all (or most of) intervals, since each such entry will have presence on all intervals it overlaps with.

A trigger we observed was memtable flush storm, which creates many small "L0" sstables that spans roughly the entire token range.

Since we cannot rely on insertion order, solution will be about storing sstables with such wide ranges in a vector (unleveled).

There should be no consequence for single-key reads, since upper layer applies an additional filtering based on token of key being queried.
And for range scans, there can be an increase in memory usage, but not significant because the sstables span an wide range and would have been selected in the combined reader if the range of scan overlaps with them.

Anyway, this is a protection against storm of memtable flushes and shouldn't be the common scenario.

It works both with tablets and vnodes, by adjusting the token range spanned by compaction group accordingly.

Fixes #23634.

We can backport this into 2024.2, 2025.1, but we should let this cook in master for 1 month or so.

Closes scylladb/scylladb#23806

* github.com:scylladb/scylladb:
  test: Verify partitioned set store split and unsplit correctly
  sstables: Fix quadratic space complexity in partitioned_sstable_set
  compaction: Wire table_state into make_sstable_set()
  compaction: Introduce token_range() to table_state
  dht: Add overlap_ratio() for token range
2025-05-05 11:28:38 +03:00
David Garcia
4ba7182515 docs: fix md redirections for multiversion support
This change resolves an issue where selecting a version from the multiversion dropdown on Markdown pages (e.g. https://docs.scylladb.com/manual/stable/alternator/getting-started.html) incorrectly redirected users to the main page instead of the corresponding versioned page.

The underlying cause was that the `multiversion` extension relies on `source_suffix` to identify available pages for URL mapping. Without this configuration, proper redirection fails for `.md` files.

This fix should be backported to `2025.1` to ensure correct behavior. Otherwise, the fix will only take effect in future releases.

Testing locally is non-trivial: clone the repository, apply the changes to each relevant branch, set `smv_remote_whitelist` to "", then run `make multiversionpreview`. Afterward, switch between versions in the dropdown to verify behavior. I've tested it locally, so the best next step is to merge and confirm that it works as expected in the live environment.

Closes scylladb/scylladb#23957
2025-05-05 10:39:39 +03:00
Pavel Emelyanov
7b786d9398 topology_coordinator: Use this->_feature_service directly
This dependency is already there, topology coordinator doesn't need
to use database reference to get to the features.

Previous patch of the same kind: b79137eaa4

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#23777
2025-05-05 09:37:29 +02:00
Piotr Dulikowski
05c797795f Merge 'Simplify test/sstable_assertions class API' from Pavel Emelyanov
It had recently been patched to re-use the sstables::test class functionality (scylladb/scylladb#23697), now it can be put on some more strict diet.

Closes scylladb/scylladb#23815

* github.com:scylladb/scylladb:
  test: Remove sstable_assertions::get_stats_metadata()
  test: Add sstable_assertions::operator->()
2025-05-05 09:33:45 +02:00