Almost all of the tests have been adjusted to be able to be run with
the `rf_rack_valid_keyspaces` configuration option enabled, while
the rest, a minority, create nodes with it disabled. Thanks to that,
we can enable it by default, so let's do that.
Some of the tests in the test suite have proven to be more problematic
in adjusting to RF-rack-validity. Since we'd like to run as many tests
as possible with the `rf_rack_valid_keyspaces` configuration option
enabled, let's disable it in those. In the following commit, we'll enable
it by default.
Three tests in the file use a multi-DC cluster. Unfortunately, they put
all of the nodes in a DC in the same rack and because of that, they fail
when run with the `rf_rack_valid_keyspaces` configuration option enabled.
Since the tests revolve mostly around zero-token nodes and how they
affect replication in a keyspace, this change should have zero impact on
them.
We reduce the number of nodes and the RF values used in the test
to make sure that the test can be run with the `rf_rack_valid_keyspaces`
configuration option. The test doesn't seem to be reliant on the
exact number of nodes, so the reduction should not make any difference.
The change boils down to matching the number of created racks to the number
of created nodes in each DC in the auxiliary function `prepare_multi_dc_repair`.
This way, we ensure that the created keyspace will be RF-rack-valid and so
we can run the test file even with the `rf_rack_valid_keyspaces` configuration
option enabled.
The change has no impact on the tests that use the function; the distribution
of nodes across racks does not affect how repair is performed or what the
tests do and verify. Because of that, the change is correct.
We assign the nodes to the same DC, but multiple racks to ensure that
the created keyspace is RF-rack-valid and we can run the test with
the `rf_rack_valid_keyspaces` configuration option enabled. The changes
do not affect what the test does and verifies.
We simply assign the nodes used in the test to seprate racks to
ensure that the created keyspace is RF-rack-valid to be able
to run the test with the `rf_rack_valid_keyspaces` configuration
option set to true. The change does not affect what the test
does and verifies -- it only depends on the type of nodes,
whether they are normal token owners or not -- and so the changes
are correct in that sense.
We parameterize the test so it's run with and without enforced
RF-rack-valid keyspaces. In the test itself, we introduce a branch
to make sure that we won't run into a situation where we're
attempting to create an RF-rack-invalid keyspace.
Since the `rf_rack_valid_keyspaces` option is not commonly used yet
and because its semantics will most likely change in the future, we
decide to parameterize the test rather than try to get rid of some
of the test cases that are problematic with the option enabled.
We simply assign DC/rack properties to every node used in the test.
We put all of them in the same DC to make sure that the cluster behaves
as closely to how it would before these changes. However, we distribute
them over multiple racks to ensure that the keyspace used in the test
is RF-rack-valid, so we can also run it with the `rf_rack_valid_keyspaces`
configuration option set to true. The distribution of nodes between racks
has no effect on what the test does and verifies, so the changes are
correct in that sense.
Instead of putting all of the nodes in a DC in the same rack
in `test_putget_2dc_with_rf`, we assign them to different racks.
The distribution of nodes in racks is orthogonal to what the test
is doing and verifying, so the change is correct in that sense.
At the same time, it ensures that the test never violates the
invariant of RF-rack-valid keyspaces, so we can also run it
with `rf_rack_valid_keyspaces` set to true.
We modify the parameters of `test_restore_with_streaming_scopes`
so that it now represents a pair of values: topology layout and
the value `rf_rack_valid_keyspaces` should be set to.
Two of the already existing parameters violate RF-rack-validity
and so the test would fail when run with `rf_rack_valid_keyspaces: true`.
However, since the option isn't commonly used yet and since the
semantics of RF-rack-valid keyspaces will most likely change in
the future, let's keep those cases and just run them with the
option disabled. This way, we still test everything we can
without running into undesired failures that don't indicate anything.
We adjust all of the simple cases of cluster tests so they work
with `rf_rack_valid_keyspaces: true`. It boils down to assigning
nodes to multiple racks. For most of the changes, we do that by:
* Using `pytest.mark.prepare_3_racks_cluster` instead of
`pytest.mark.prepare_3_nodes_cluster`.
* Using an additional argument -- `auto_rack_dc` -- when calling
`ManagerClient::servers_add()`.
In some cases, we need to assign the racks manually, which may be
less obvious, but in every such situation, the tests didn't rely
on that assignment, so that doesn't affect them or what they verify.
Some background:
When merge happens, a background fiber wakes up to merge compaction
groups of sibling tablets into main one. It cannot happen when
rebuilding the storage group list, since token metadata update is
not preemptable. So a storage group, post merge, has the main
compaction group and two other groups to be merged into the main.
When the merge happens, those two groups are empty and will be
freed.
Consider this scenario:
1) merge happens, from 2 to 1 tablet
2) produces a single storage group, containing main and two
other compaction groups to be merged into main.
3) take_storage_snapshot(), triggered by migration post merge,
gets a list of pointer to all compaction groups.
4) t__s__s() iterates first on main group, yields.
5) background fiber wakes up, moves the data into main
and frees the two groups
6) t__s__s() advances to other groups that are now freed,
since step 5.
7) segmentation fault
In addition to memory corruption, there's also a potential for
data to escape the iteration in take_storage_snapshot(), since
data can be moved across compaction groups in background, all
belonging to the same storage group. That could result in
data loss.
Readers should all operate on storage group level since it can
provide a view on all the data owned by a tablet replica.
The movement of sstable from group A to B is atomic, but
iteration first on A, then later on B, might miss data that
was moved from B to A, before the iteration reached B.
By switching to storage group in the interface that retrieves
groups by token range, we guarantee that all data of a given
replica can be found regardless of which compaction group they
sit on.
Fixes#23162.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closesscylladb/scylladb#24058
Used host id to check if the update is for the node itself. Using IP is unreliable since if a node is restarted with different IP a gossiper message with previous IP can be misinterpreted as belonging to a different node.
Fixes: #22777
Backport to 2025.1 since this fixes a crash. Older version do not have the code.
Closesscylladb/scylladb#24000
* https://github.com/scylladb/scylladb:
test: add reproducer for #22777
storage_service: Do not remove gossiper entry on address change
storage_service: use id to check for local node
Before this change, if a read executor had just enough targets to
achieve query's CL, and there was a connection drop (e.g. node failure),
the read executor waited for the entire request timeout to give drivers
time to execute a speculative read in a meantime. Such behavior don't
work well when a very long query timeout (e.g. 1800s) is set, because
the unfinished request blocks topology changes.
This change implements a mechanism to thrown a new
read_failure_exception_with_timeout in the aforementioned scenario.
The exception is caught by CQL server which conducts the waiting, after
ERM is released. The new exception inherits from read_failure_exception,
because layers that don't catch the exception (such as mapreduce
service) should handle the exception just a regular read_failure.
However, when CQL server catch the exception, it returns
read_timeout_exception to the client because after additional waiting
such an error message is more appropriate (read_timeout_exception was
also returned before this change was introduced).
This change:
- Rewrite cql_server::connection::process_request_one to use
seastar::futurize_invoke and try_catch<> instead of utils::result_try
- Add new read_failure_exception_with_timeout and throws it in storage_proxy
- Add sleep in CQL server when the new exception is caught
- Catch local exceptions in Mapreduce Service and convert them
to std::runtime_error.
- Add get_cql_exclusive to manager_client.py
- Add test_long_query_timeout_erm
No backport needed - minor issue fix.
Closesscylladb/scylladb#23156
* github.com:scylladb/scylladb:
test: add test_long_query_timeout_erm
test: add get_cql_exclusive to manager_client.py
mapreduce: catch local read_failure_exception_with_timeout
transport: storage_proxy: release ERM when waiting for query timeout
transport: remove redundant references in process_request_one
transport: fix the indentation in process_request_one
transport: add futures in CQL server exception handling
When schema is changed, sstable set is updated according to the compaction strategy of the new schema (no changes to set are actually made, just the underlying set type is updated), but the problem is that it happens without a lock, causing a use-after-free when running concurrently to another set update.
Example:
1) A: sstable set is being updated on compaction completion
2) B: schema change updates the set (it's non deferring, so it happens in one go) and frees the set used by A.
3) when A resumes, system will likely crash since the set is freed already.
ASAN screams about it:
SUMMARY: AddressSanitizer: heap-use-after-free sstables/sstable_set.cc ...
Fix is about deferring update of the set on schema change to compaction, which is triggered after new schema is set. Only strategy state and backlog tracker are updated immediately, which is fine since strategy doesn't depend on any particular implementation of sstable set.
Fixes#22040.
Closesscylladb/scylladb#23680
* github.com:scylladb/scylladb:
replica: Fix use-after-free with concurrent schema change and sstable set update
sstables: Implement sstable_set_impl::all_sstable_runs()
test_tablet_repair_hosts_filter checks whether the host filter
specfied for tablet repair is correctly persisted. To check this,
we need to ensure that the repair is still ongoing and its data
is kept. The test achieves that by failing the repair on replica
side - as the failed repair is going to be retried.
However, if the filter does not contain any host (included_host_count = 0),
the repair is started on no replica, so the request succeeds
and its data is deleted. The test fails if it checks the filter
after repair request data is removed.
Fail repair on topology coordinator side, so the request is ongoing
regardless of the specified hosts.
Fixes: #23986.
Closesscylladb/scylladb#24003
The test test_read_repair_with_trace_logging wants to test read repair with trace logging. Turns out that node restart + trace-level logging + debug mode is too much and even with 1 minute timeout, the read repair times out sometimes. Refactor the test to use injection point instead of restart. To make sure the test still tests what it supposed to test, use tracing to assert that read repair did indeed happen.
Fixes: scylladb/scylladb#23968
Needs backport to 2025.1 and 6.2, both have the flaky test
Closesscylladb/scylladb#23989
* github.com:scylladb/scylladb:
test/cluster/test_read_repair.py: improve trace logging test (again)
test/cluster: extract execute_with_tracing() into pylib/util.py
When schema is changed, sstable set is updated according to the compaction
strategy of the new schema (no changes to set are actually made, just
the underlying set type is updated), but the problem is that it happens
without a lock, causing a use-after-free when running concurrently to
another set update.
Example:
1) A: sstable set is being updated on compaction completion
2) B: schema change updates the set (it's non deferring, so it
happens in one go) and frees the set used by A.
3) when A resumes, system will likely crash since the set is freed
already.
ASAN screams about it:
SUMMARY: AddressSanitizer: heap-use-after-free sstables/sstable_set.cc ...
Fix is about deferring update of the set on schema change to compaction,
which is triggered after new schema is set. Only strategy state and
backlog tracker are updated immediately, which is fine since strategy
doesn't depend on any particular implementation of sstable set, since
patch "sstables: Implement sstable_set_impl::all_sstable_runs()".
Fixes#22040.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Add sleep before starting gossiper to increase a chance of getting old
gossiper entry about yourself before updating local gossiper info with
new IP address.
The test test_read_repair_with_trace_logging wants to test read repair
with trace logging. Turns out that node restart + trace-level logging
+ debug mode is too much and even with 1 minute timeout, the read repair
times out sometimes.
Refactor the test to use injection point instead of restart. To make
sure the test still tests what it supposed to test, use tracing to
assert that read repair did indeed happen.
The limited voters feature did not account for the existing topology
coordinator (Raft leader) when selecting voters to be removed.
As a result, the limited voters calculator could inadvertently remove
the votership of the current topology coordinator, triggering
an unnecessary Raft leader re-election.
This change ensures that the existing topology coordinator's votership
status is preserved unless absolutely necessary. When choosing between
otherwise equivalent voters, the node other than the topology coordinator
is prioritized for removal. This helps maintain stability in the cluster
by avoiding unnecessary leader re-elections.
Additionally, only the alive leader node is considered relevant for this
logic. A dead existing leader (topology coordinator) is excluded from
consideration, as it is already in the process of losing leadership.
Fixes: scylladb/scylladb#23588Fixes: scylladb/scylladb#23786
Interval map is very susceptible to quadratic space behavior when it's flooded with many entries overlapping all (or most of) intervals, since each such entry will have presence on all intervals it overlaps with.
A trigger we observed was memtable flush storm, which creates many small "L0" sstables that spans roughly the entire token range.
Since we cannot rely on insertion order, solution will be about storing sstables with such wide ranges in a vector (unleveled).
There should be no consequence for single-key reads, since upper layer applies an additional filtering based on token of key being queried.
And for range scans, there can be an increase in memory usage, but not significant because the sstables span an wide range and would have been selected in the combined reader if the range of scan overlaps with them.
Anyway, this is a protection against storm of memtable flushes and shouldn't be the common scenario.
It works both with tablets and vnodes, by adjusting the token range spanned by compaction group accordingly.
Fixes#23634.
We can backport this into 2024.2, 2025.1, but we should let this cook in master for 1 month or so.
Closesscylladb/scylladb#23806
* github.com:scylladb/scylladb:
test: Verify partitioned set store split and unsplit correctly
sstables: Fix quadratic space complexity in partitioned_sstable_set
compaction: Wire table_state into make_sstable_set()
compaction: Introduce token_range() to table_state
dht: Add overlap_ratio() for token range
In ScyllaDB, schema modification operations use "optimistic locking":
A schema operation reads the current schema, decides what it wants to do
and prepares changes to the schema, and then attempts to commit those
changes - but only if the schema hasn't changed since the first read.
If the schema has already been changed by some other node - we need to
try again. In a loop.
In Alternator, there are six operations that perform schema modification:
CreateTable, DeleteTable, UpdateTable, TagResource, UntagResource and
UpdateTimeToLive. All of them were missing this loop. We knew about
this - and even had FIXME in all places. So all these operations,
when facing contention of concurrent schema modifications on different
nodes may fail one of these operations with an error like:
Internal server error: service::group0_concurrent_modification
(Failed to apply group 0 change due to concurrent modification).
This problem had very minor effect, if any, on real users because the
DynamoDB SDK automatically retries operations that fail with retryable
errors - like this "Internal server error" - and most likely the schema
operation will succeed upon retry. However, as shown in issue #13152
these failures were annoying in our CI, where tests - which disable
request retries - failed on these errors.
This patch fixes all six operations (the last three operations all
use one common function, db::modify_tags(), so are fixed by one
change) to add the missing loop.
The patch also includes reproducing tests for all these operations -
the new tests all fail before this patch, and pass with it.
These new tests are much more reliable reproducers than the dtests
we had that only sometimes - very rarely - reproduced the problem.
Moreover, the new tests reproduces the bug seperately for each of the
six operations, so if we forget to fix one of the six operations, one
of the tests would have continued to fail. Of course I checked this
during development.
The new tests are in the test/cluster framework, not test/alternator,
because this problem can only be reproduced in a multi-node cluster:
On a single node, it serializes its schema modifications on its own;
The collisions only happen when more than one node attempts schema
modifications at the same time.
Fixes#13152
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#23827
Currently, test_tablet_resize_revoked tries to trigger split revoke
by deleting some rows. This method isn't deterministic and so a test
is flaky.
Use error injection to trigger resize revoke.
Fixes: #22570.
Closesscylladb/scylladb#23966
There were CI runs in which the training happened as planned,
but it was too slow to fit within the timeout.
Raise the timeout to pacify the CI.
Fixesscylladb/scylladb#23964Closesscylladb/scylladb#23965
In case when dht::boot_strapper::get_boostrap_tokens fail to parse the
tokens, the topology coordinator handles the exception and schedules a
rollback. However, the current code tries to continue with the topology
coordinator logic even if an exception occurs, leaving boostrap_tokens
empty. This does not make sense and can actually cause issues,
specifically in prepare_and_broadcast_cdc_generation_data which
implicitly expect that the bootstrap_tokens of the first node in the
cluster will not be empty.
Fix this by adding the missing break.
Fixes: scylladb/scylladb#23897
From the code inspection alone it looks like 2025.1 and 6.2 have this problem, so marking for backport to both of them.
Closesscylladb/scylladb#23914
* https://github.com/scylladb/scylladb:
test: cluster: add test_bad_initial_token
topology coordinator: do not proceed further on invalid boostrap tokens
cdc: add sanity check for generating an empty generation
Check whether a node is alive before making an rpc that gathers children
infos from the whole cluster in virtual_task::impl::get_children.
Fixes: https://github.com/scylladb/scylladb/issues/22514.
Needs backport to 2025.1 and 6.2 as they contain the bug.
Closesscylladb/scylladb#23787
* github.com:scylladb/scylladb:
test: add test for getting tasks children
tasks: check whether a node is alive before rpc
Currently, the base_info may or may not be set in view schemas.
Even when it's set, it may be modified. This necessitates extra
checks when handling view schemas, as we'll as potentially causing
errors when we forget to set it at some point.
Instead, we want to make the base info an immutable member of view
schemas (inside view_info). To achieve this, in this series we remove
all base_info members that can change due to a base schema update,
and we calculate the remaining values during view update generation,
using the most up-to-date base schema version.
To calculate the values that depend on the base schema version, we
need to iterate over the view primary key and find the corresponding
columns, which adds extra overhead for each batch of view updates.
However, this overhead should be relatively small, as when creating
a view update, we need to prepare each of its columns anyway. And
if we need to read the old value of the base row, the relative
overhead is even lower.
After this change, the base info in view schemas stays the same
for all base schema updates, so we'll no longer get issues with
base_info being incompatible with a base schema version. Additionally,
it's a step towards making the schema objects immutable, which
we sometimes incorrectly assumed in the past (they're still not
completely immutable yet, as some other fields in view_info other
than base_info are initialized lazily and may depend on the base
schema version).
Fixes https://github.com/scylladb/scylladb/issues/9059
Fixes https://github.com/scylladb/scylladb/issues/21292
Fixes https://github.com/scylladb/scylladb/issues/22194
Fixes https://github.com/scylladb/scylladb/issues/22410Closesscylladb/scylladb#23337
* github.com:scylladb/scylladb:
test: remove flakiness from test_schema_is_recovered_after_dying
mv: add a test for dropping an index while it's building
base_info: remove the lw_shared_ptr variant
view_info: don't re-set base_info after construction
base_info: remove base_info snapshot semantics
base_info: remove base schema from the base_info
schema_registry: store base info instead of base schema for view entries
base_info: make members non-const
view_info: move the base info to a separate header
view_info: move computation of view pk columns not in base pk to view_updates
view_info: move base-dependent variables into base_info
view_info: set base info on construction
This test shuts down a node and then replaces it with another one while
continuously writing to the cluster. The test has been observed to take
a lot of time in debug mode and time out on the replace operation.
Replace takes very long because rebuilding tablets on the new node is
very slow, and the slowest part is memtable flush which happens at the
beginning of streaming. The slowness seems to be specific to the debug
mode.
Turn off the test in debug mode to deflake the CI. As a follow-up, the
test is planned to be reworked into an quicker error injection test so
that the code path tested by this test will be again exercised in debug
unit tests (scylladb/scylladb#23898)
Fixes: scylladb/scylladb#20316Closesscylladb/scylladb#23900
Wait for cql after rolling restart in test_two_tablets_concurrent_repair_and_migration_repair_writer_level
to prevent failing queries.
Fixes: #23620.
Closesscylladb/scylladb#23796
Providing IP of an ignored node during removenode made the test flaky.
It could happen that the address map contained mappings of two
nodes with the same IP:
1. the node being ignored,
2. the node that expectedly failed replacing earlier in the test.
So, `address_map::find_by_addr()` called in `find_raft_nodes_from_hoeps`
could return the host ID of the second node instead of the first node
and cause removenode to fail.
We fix flakiness in this patch by providing the host ID of the ignored
node instead of its IP. We would have to do it anyway sooner or later
because providing IP is deprecated.
The bug in `find_raft_nodes_from_hoeps` is tracked by
scylladb/scylladb#23846.
The test became flaky because of f0af3f261e.
That patch is not present in 2025.1, so the test isn't flaky outside
master, and hence there is no reason to backport this patch.
Fixesscylladb/scylladb#23499Closesscylladb/scylladb#23863
Currently, flush throws no_such_column_family if a table is dropped. Skip the flush of dropped table instead.
Fixes: #16095.
Needs backport to 2025.1 and 6.2 as they contain the bug
Closesscylladb/scylladb#23876
* github.com:scylladb/scylladb:
test: test table drop during flush
replica: skip flush of dropped table
Dropping an index is a schema change of its base table and
a schema drop of the index's materialized view. This combination
of schema changes used to cause issues during view building, because
when a view schema was dropped, it wasn't getting updated with the
new version of the base schema, and while the view building was
in progress, we would update the base schema for the base table
mutation reader and try generating updates with a view schema that
wasn't compatible with the base schema, failing on an `on_internal_error`.
In this patch we add a test for this scenario. We create an index,
halt its view building process using an injection, and drop it.
If no errors are thrown, the test succeeds.
The test was failing before https://github.com/scylladb/scylladb/pull/23337
and is passing afterwards.
The query may fail also on a no_such_keyspace
exception, which generates the following cql error:
```
Error from server: code=2200 [Invalid query] message="Can\'t find a keyspace test_1745198244144_qoohq"
```
Extend the pytest.raises match expression to include
this error as well.
Fixes#23812
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closesscylladb/scylladb#23875
The test populates a table with 50k rows, creates a view on that table
and then compares the time spent in streaming vs. gossip scheduling
groups. It only takes 10s in dev mode on my machine, but is much slower
in debug mode in CI - building the view doesn't finish within 2 minutes.
The bigger the view to build, the more accurrate the measurement;
moreover, the test scenario isn't interesting enough to be worth running
it in debug mode as this should be covered by other tests. Therefore,
just skip this test in debug mode.
Fixes: scylladb/scylladb#23862Closesscylladb/scylladb#23866
This commit adds a test to verify that a query with long timeout
doesn't block ERM on failure. The motivation for the test is
fixing scylladb#21831.
This commit:
- add test_long_query_timeout_erm
The test `test_mv_write_to_dead_node` currently uses a timeout of 60
seconds for remove_node, after it was increased from 30 seconds to fix
scylladb/scylladb#22953. Apparently it is still too low, and it was
observed to fail in debug mode.
Normally remove_node uses a default timeout of TOPOLOGY_TIMEOUT = 1000
seconds, but the test requires a timeout which is shorter than 5
minutes, because it is a regression test for an issue where MV updates
hold topology changes for more than 5 minutes, and we want to verify in
the test that the topology change completes in less than 5 minutes.
To resolve the issue, we set the test to skip in debug mode, because the
remove node operation is unpredictably slow, and we increase the timeout
to 180 seconds which is hopefully enough time for remove_node in
non-debug modes, and still sufficient to satisfy the test requirements.
Fixesscylladb/scylladb#22530Closesscylladb/scylladb#23833
Changing DC or rack on a node which was already bootstrapped is, in
case of vnodes, very unsafe (almost guaranteed to cause data loss or
unavailability), and is outright not supported if the cluster has
a tablet-backed keyspaces. Moreover, the possibility of doing that
makes it impossible to uphold some of the invariants promised by
the RF-rack-valid flag, which is eventually going to become
unconditionally enabled.
Get rid of the above problems by removing the possibility of changing
the DC / rack of a node. A node will now fail to start if its snitch
reports a different DC or rack than the one that was reported during the
first boot.
Fixes: scylladb/scylladb#23278Fixes: scylladb/scylladb#22869
Marking for backport to 2025.1, as this is a necessary part of the RF-rack-valid saga
Closesscylladb/scylladb#23800
* github.com:scylladb/scylladb:
doc: changing topology when changing snitches is no longer supported
test: cluster: introduce test_no_dc_rack_change
storage_service: don't update DC/rack in update_topology_with_local_metadata
main: make dc and rack immutable after bootstrap
test: cluster: remove test_snitch_change
Currently log files have information about run_id twice:
cluster.object_store_test_backup.10.test_abort_restore_with_rpc_error.dev.10_cluster.log
However, sometimes the first run_id can be incorrect:
cluster.object_store_test_backup.1.test_abort_restore_with_rpc_error.dev.10_cluster.log
Removing first run_id in the name to not face this issue and because
it's actually redundant.
Removing creation empty file for scylla manager log, since it redundant
and was done as incorrect assumption on the root cause of the fail.
Add extension to the stacktrace file, so it will be opened in the
browser in Jenkins in the new tab instead of downloading it.
Fixes: https://github.com/scylladb/scylladb/issues/23731Closesscylladb/scylladb#23797
Update the regular expression in `check_node_log_for_failed_mutations` to avoid
false test failures when DEBUG-level logging is enabled.
Fixesscylladb/scylladb#23688Closesscylladb/scylladb#23658
This PR enhances S3 throughput by leveraging every available shard to upload backup files concurrently. By distributing the load across multiple shards, we significantly improve the upload performance. Each shard retrieves an SSTable and processes its files sequentially, ensuring efficient, file-by-file uploads.
To prevent uncontrolled fiber creation and potential resource exhaustion, the backup task employs a directory semaphore from the sstables_manager. This mechanism helps regulate concurrency at the directory level, ensuring stable and predictable performance during large-scale backup operations.
Refs #22460fixes: #22520
```
===========================================
Release build, master, smp-16, mem-32GiB
Bytes: 2342880184, backup time: 9.51 s
===========================================
Release build, this PR, smp-16, mem-32GiB
Bytes: 2342891015, backup time: 1.23 s
===========================================
```
Looks like it is faster at least x7.7
No backport needed since it (native backup) is still unused functionality
Closesscylladb/scylladb#23727
* github.com:scylladb/scylladb:
backup: Add test for invalid endpoint
backup_task: upload on all shards
backup_task: integrate sharded storage manager for upload
* During the development phase, the backup functionality broke because we lacked a test that runs backup with an invalid endpoint. This commit adds a test to cover that scenario.
* Add checking for the expected error to be propagated from failing/aborted backup