Commit Graph

118 Commits

Author SHA1 Message Date
Taras Veretilnyk
ca9abcdcbc sstables_loader: synchronously unlink streamed sstables before returning
mark_for_deletion() only set an in-memory flag; the actual file
deletion ran lazily when the last shared_sstable reference dropped,
leaving a window in which a follow-up scan of the upload directory
(e.g. a second 'nodetool refresh --load-and-stream') could observe a
partially-deleted sstable and fail with malformed_sstable_exception.

Force the unlink to complete before stream() returns. For tablet
streaming, partially-contained sstables span multiple per-tablet
batches, so a defer_unlinking flag postpones the unlink until after
all sstables are streamed; for vnodes and fully-contained sstables are streamed
only once and could be removed just after being streamed.

Added a FIXME on object_storage_base::wipe and strengthened the doc on storage::wipe to
make the never-fails contract explicit

(cherry picked from commit 784127c40b)
2026-05-11 18:44:27 +00:00
Ernest Zaslavsky
e5e6608f20 sstables_loader: prevent use-after-free on table drop during streaming
sstables_loader::load_and_stream holds a replica::table& reference via
the sstable_streamer for the entire streaming operation.  If the table
is dropped concurrently (e.g. DROP TABLE or DROP KEYSPACE), the
reference becomes dangling and the next access crashes with SEGV.

This was observed in a longevity-50gb-12h-master test run where a
keyspace was dropped while load_and_stream was still streaming SSTables
from a previous batch.

Fix by acquiring a stream_in_progress() phaser guard in load_and_stream
before creating the streamer.  table::stop() calls
_pending_streams_phaser.close() which blocks until all outstanding
guards are released, keeping the table alive for the duration of the
streaming operation.

Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1352

Closes scylladb/scylladb#29403
2026-04-20 07:39:51 +03:00
Pavel Emelyanov
4d352c7cf5 sstables: Remove ignore_component_digest_mismatch from sstable_open_config
The ignore_component_digest_mismatch flag is now initialized at sstable construction
time from sstables_manager::config (which is populated from db::config at boot time).
Remove the flag from sstable_open_config struct and all call sites that were setting
it explicitly.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
2026-04-16 13:49:14 +03:00
Avi Kivity
0ae22a09d4 LICENSE: Update to version 1.1
Updated terms of non-commercial use (must be a never-customer).
2026-04-12 19:46:33 +03:00
Raphael S. Carvalho
05b11a3b82 sstables_loader: use new sstable add path
Use add_new_sstable_and_update_cache() when attaching SSTables
downloaded by the node-scoped local loader.

This is the correct variant for new SSTables: it can unlink the
SSTable on failure to add it, and it can split the SSTable if a
tablet split is in progress. The older
add_sstable_and_update_cache() helper is intended for preexisting
SSTables that are already stable on disk.

Additionally, downloaded SSTables are now left unsealed (TemporaryTOC)
until they are successfully added to the table's SSTable set. The
download path (download_fully_contained_sstables) passes
leave_unsealed=true to create_stream_sink, and attach_sstable opens
the SSTable with unsealed_sstable=true and seals it only inside the
on_add callback — matching the pattern used by stream_blob.cc and
storage_service.cc for tablet streaming.

This prevents a data-resurrection hazard: previously, if the process
crashed between download and attach_sstable, or if attach_sstable
failed mid-loop, sealed (TOC) SSTables would remain in the table
directory and be reloaded by distributed_loader on restart. With
TemporaryTOC, sstable_directory automatically cleans them up on
restart instead.

Fixes  https://scylladb.atlassian.net/browse/SCYLLADB-1085.

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#29072
2026-03-23 10:33:04 +03:00
Taras Veretilnyk
7214f5a0b6 sstables: propagate ignore_component_digest_mismatch config to all load sites
Add ignore_component_digest_mismatch option to db::config (default false).
When set, sstable loading logs a warning instead of throwing on component
digest mismatches, allowing a node to start up despite corrupted non-vital
components or bugs in digest calculation.

Propagate the config to all production sstable load paths:
- distributed_loader (node startup, upload dir processing)
- storage_service (tablet storage cloning)
- sstables_loader (load-and-stream, download tasks, attach)
- stream_blob (tablet streaming)
2026-03-10 19:24:05 +01:00
Ernest Zaslavsky
13fb605edb streaming: enable direct download of contained sstables
Instead of streaming fully contained sstables, download them directly
and attach them to their corresponding table. This simplifies the
process and avoids unnecessary streaming overhead.
2026-01-25 13:27:44 +02:00
Ernest Zaslavsky
757e9d0f52 streaming: keep sharded database reference on tablet_sstable_streamer 2026-01-21 16:40:12 +02:00
Ernest Zaslavsky
4ffa070715 streaming: skip streaming empty sstable sets
Avoid invoking streaming when the sstable set is empty. This prevents
unnecessary calls and simplifies the streaming logic.
2026-01-21 16:40:12 +02:00
Ernest Zaslavsky
0fcd369ef2 streaming: inline streamer instance creation
Remove the `make_sstable_streamer` function and inline its usage where
needed. This change allows passing different sets of arguments
directly at the call sites.
2026-01-21 16:40:12 +02:00
Pavel Emelyanov
fb32e1c7ee Merge 'streaming: tablet_sstable_streamer::stream refactoring' from Ernest Zaslavsky
Refactor the way we decide the sstable belong to a tablet, fully or partially to simplify the flow and make it more readable. Also extract the logic and make it testable, add tests to cover changes

The change is purely aesthetic, no need to backport

Closes scylladb/scylladb#27101

* github.com:scylladb/scylladb:
  streaming: remove unnecessary lambda creating sstable token range
  streaming: simplify get_sstables_for_tablets logic
  streaming: switch to range-based for loop
  streaming: drop sstable skip microoptimization in tablet loop
  streaming: replace reverse iterators with reverse view in sstables scan
  streaming: return from get_sstables_for_tablets earlier
  streaming: add get_sstables_by_tablet_range tests
  test,sstables: add helper to set sstable first and last keys
  streaming: refactor get_sstables_for_tablets to make it accessible
  streaming: refactor get_sstables_for_tablets to make it testable
  streaming: refactor tablet_sstable_streamer::stream by extracting SST filtering logic
2025-12-09 10:53:57 +03:00
Ernest Zaslavsky
71834ce7dd streaming: remove unnecessary lambda creating sstable token range
The `sstable_token_range` lambda was only used once to create a token
range for an SSTable. Inline the construction directly where needed,
removing the extra lambda. This simplifies the code without changing
behavior.
2025-12-08 12:30:24 +02:00
Ernest Zaslavsky
df21112c39 streaming: simplify get_sstables_for_tablets logic
Remove the use of the `overlaps` helper and unnest nested conditionals
in get_sstables_for_tablets. Straightforward `before` and `after` checks
are sufficient to decide how each SSTable should be handled.
2025-12-08 12:30:24 +02:00
Ernest Zaslavsky
bd339cc4d8 streaming: switch to range-based for loop
Replace the explicit iterator loop with a range-based for loop. This
simplifies the code, enforces constness, and avoids the unnecessary
use of postfix increment. The behavior remains unchanged,but
readability and maintainability are improved.
2025-12-08 12:30:23 +02:00
Ernest Zaslavsky
91bf23eea1 streaming: drop sstable skip microoptimization in tablet loop
Remove the microoptimization that advanced over SSTables ending before a
tablet range. This approach is misleading since SSTables are not sorted
by their end token, and the extra logic adds complexity with little to
no benefit. The streaming path here is not performance‑critical, so the
simpler loop is preferable.
2025-12-08 12:30:23 +02:00
Ernest Zaslavsky
f925ed176b streaming: replace reverse iterators with reverse view in sstables scan
Use a reverse view over the SSTables vector instead of reverse iterators.
This avoids awkward rbegin/rend usage and the mental overhead of tracking
inverted sort order. With a view, we can use standard begin/end iteration
while preserving the intended scan direction.
2025-12-08 12:30:23 +02:00
Ernest Zaslavsky
68dcd1b1b2 streaming: return from get_sstables_for_tablets earlier
Check if tablets or sstables list is empty and if so, return immediately
2025-12-08 12:30:23 +02:00
Ernest Zaslavsky
6ef7ad9b5a streaming: refactor get_sstables_for_tablets to make it accessible
Create `get_sstables_for_tablets_for_tests` friend free function
for testing purposes. Adding this free function allows
direct testing without requiring the full streamer context.
2025-12-08 12:30:23 +02:00
Ernest Zaslavsky
581b8ace83 streaming: refactor get_sstables_for_tablets to make it testable
Make the `get_sstables_for_tablets` member function `static`. This
is a step toward improved testability, allowing the function to be
invoked directly without requiring a full instance of the streamer.
2025-12-08 12:30:23 +02:00
Pavel Emelyanov
6c115c691f sstables_loader: Provide endpoint type for get_sstables_from_object_store()
Currently the method scans db::config to find one. It has some
drawbacks. First, it's not very nice. Second, it needs to handle the
case when the endpoint is missing, while it relally never is. Third, the
type in config entry is not necessarily set.

It's nicer to get the type from storage manager.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-12-02 11:18:32 +03:00
Ernest Zaslavsky
f0e2941e34 streaming: refactor tablet_sstable_streamer::stream by extracting SST filtering logic
Extract the SST filtering logic into a dedicated member function. This
prepares the code for independent testing without requiring the entire
streamer to be initialized.
2025-11-30 18:27:15 +02:00
Ernest Zaslavsky
1d5f60baac streaming:: add more logging
Start logging all missed streaming options like `scope`, `primary_replica` and `skip_reshape` flags

Fixes: https://github.com/scylladb/scylladb/issues/27299

Closes scylladb/scylladb#27311
2025-11-28 12:50:33 +01:00
Piotr Dulikowski
22f22d183f Merge 'Refine sstables_loader vs database dependency' from Pavel Emelyanov
There are two issues with it.

First, a small RPC helper struct carries database reference on board just to get feature state from it.
Second, sstable_streamer uses database as proxy to feature service.

This PR improves both.

Services dependencies improvement, not need to backport

Closes scylladb/scylladb#26989

* github.com:scylladb/scylladb:
  sstables_loader: Get LOAD_AND_STREAM_ABORT_RPC_MESSAGE from messaging
  sstables_loader: Keep bool on send_meta, not database reference
2025-11-20 16:13:16 +01:00
Ernest Zaslavsky
dedc8bdf71 streaming: fix loop break condition in tablet_sstable_streamer::stream
Correct the loop termination logic that previously caused
certain SSTables to be prematurely excluded, resulting in
lost mutations. This change ensures all relevant SSTables
are properly streamed and their mutations preserved.
2025-11-19 17:32:49 +02:00
Pavel Emelyanov
4d5f7a57ea sstables_loader: Get LOAD_AND_STREAM_ABORT_RPC_MESSAGE from messaging
The feature in question is about the way streaming sink-and-source
operate. Since sink-and-source itself are obtained from messaging
service, the feature is better be fetched from it too.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-11-19 09:35:54 +03:00
Pavel Emelyanov
64e099f03b sstables_loader: Keep bool on send_meta, not database reference
The send_meta helper needs database reference to get feature_service
from it (to check some feature state). That's too much, features are
"immutable" throug the loader lifetime, it's enough to keep the boolean
member on send_meta.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-11-19 09:32:56 +03:00
Robert Bindar
817fdadd49 Improve choice distribution for primary replica
I noticed during tests that `maybe_get_primary_replica`
would not distribute uniformly the choice of primary replica
because `info.replicas` on some shards would have an order whilst
on others it'd be ordered differently, thus making the function choose
a node as primary replica multiple times when it clearly could've
chosen a different nodes.

This patch sorts the replica set before passing it through the
scope filter.

Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>
2025-11-11 09:18:01 +02:00
Robert Bindar
136b45d657 Enable scoped primary replica only streaming
This patch removes the restriction for streaming
to primary replica only within a scope.
Node scope streaming to primary replica is dissallowed.

Fixes #26584

Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>
2025-11-11 09:18:01 +02:00
Robert Bindar
965a16ce6f Support primary_replica_only for native restore API
Current native restore does not support primary_replica_only, it is
hard-coded disabled and this may lead to data amplification issues.

This patch extends the restore REST API to accept a
primary_replica_only parameter and propagates it to
sstables_loader so it gets correctly passed to
load_and_stream.

Fixes #26584

Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>

Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>
2025-11-11 09:17:52 +02:00
Raphael S. Carvalho
7f34366b9d sstables_loader: Don't bypass synchronization with busy topology
The patch c543059f86 fixed the synchronization issue between tablet
split and load-and-stream. The synchronization worked only with
raft topology, and therefore was disabled with gossip.
To do the check, storage_service::raft_topology_change_enabled()
but the topology kind is only available/set on shard 0, so it caused
the synchronization to be bypassed when load-and-stream runs on
any shard other than 0.

The reason the reproducer didn't catch it is that it was restricted
to single cpu. It will now run with multi cpu and catch the
problem observed.

Fixes #22707

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>

Closes scylladb/scylladb#26730
2025-11-03 18:10:08 +01:00
Pavel Emelyanov
ac1d709709 Merge 'tablet_sstable_streamer: defer SSTable unlinking until fully streamed' from Taras Veretilnyk
When streaming SSTables across tablets, a single SSTable may be streamed to multiple tablets. The previous implementation unlinked SSTables immediately after streaming them for the first tablet, potentially making them partially unavailable for subsequent tablets. This patch replaces unlink() with mark_for_deletion() deferring actual unlinking till sstable::close_files.

test_tablets2::test_tablet_load_and_stream  was enhanced to also verify that SSTables are removed after being streamed.

Fixes #26606

Backport is not required, although it is a bug fix, but it isn't visible. This is more of a preparatory fix for https://github.com/scylladb/scylladb/pull/26444.

Closes scylladb/scylladb#26622

* github.com:scylladb/scylladb:
  test_tablets2: verify SSTable cleanup after tablet load and stream
  tablet_sstable_streamer: replace unlink() call with mark_for_deletion()
2025-10-28 13:25:40 +03:00
Aleksandra Martyniuk
910cd0918b locator: use get_primary_replica for get_primary_endpoints
Currently, tablet_sstable_streamer::get_primary_endpoints is out of
sync with tablet_map::get_primary_replica. The get_primary_replica
optimizes the choice of the replica so that the work is fairly
distributes among nodes. Meanwhile, get_primary_endpoints always
chooses the first replica.

Use get_primary_replica for get_primary_endpoints.

Fixes: https://github.com/scylladb/scylladb/issues/21883.

Closes scylladb/scylladb#26385
2025-10-28 09:56:08 +02:00
Taras Veretilnyk
517a4dc4df tablet_sstable_streamer: replace unlink() call with mark_for_deletion()
When streaming SSTables across tablets, a single SSTable may be streamed to multiple tablets.
The previous implementation unlinked SSTables immediately after streaming them for the first tablet,
potentially making them partially unavailable for subsequent tablets.
This patches replaces unlink() call with mark_for_deletion()
2025-10-27 16:30:05 +01:00
Botond Dénes
c543059f86 Merge 'Synchronize tablet split and load-and-stream' from Raphael Raph Carvalho
Load-and-stream is broken when running concurrently to the finalization step of tablet split.

Consider this:
1) split starts
2) split finalization executes barrier and succeed
3) load-and-stream runs now, starts writing sstable (pre-split)
4) split finalization publishes changes to tablet metadata
5) load-and-stream finishes writing sstable
6) sstable cannot be loaded since it spans two tablets

two possible fixes (maybe both):

1) load-and-stream awaits for topology to quiesce
2) perform split compaction on sstable that spans both sibling tablets

This patch implements # 1. By awaiting for topology to quiesce,
we guarantee that load-and-stream only starts when there's no
chance coordinator is handling some topology operation like
split finalization.

Fixes https://github.com/scylladb/scylladb/issues/26455.

Closes scylladb/scylladb#26456

* github.com:scylladb/scylladb:
  test: Add reproducer for l-a-s and split synchronization issue
  sstables_loader: Synchronize tablet split and load-and-stream
2025-10-21 09:43:38 +03:00
Raphael S. Carvalho
3abc66da5a sstables_loader: Synchronize tablet split and load-and-stream
Load-and-stream is broken when running concurrently to the
finalization step of tablet split.

Consider this:
1) split starts
2) split finalization executes barrier and succeed
3) load-and-stream runs now, starts writing sstable (pre-split)
4) split finalization publishes changes to tablet metadata
5) load-and-stream finishes writing sstable
6) sstable cannot be loaded since it spans two tablets

two possible fixes (maybe both):

1) load-and-stream awaits for topology to quiesce
2) perform split compaction on sstable that spans both sibling tablets

This patch implements #1. By awaiting for topology to quiesce,
we guarantee that load-and-stream only starts when there's no
chance coordinator is handling some topology operation like
split finalization.

Fixes #26455.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2025-10-20 19:17:22 -03:00
Michał Chojnowski
55c4b89b88 sstables: make sstable::estimated_keys_for_range asynchronous
Currently, `sstable::estimated_keys_for_range` works by
checking what fraction of Summary is covered by the given
range, and multiplying this fraction to the number of all keys.
Since computing things on Summary doesn't involve I/O (because Summary
is always kept in RAM), this is synchronous.

In a later patch, we will modify `sstable::estimated_keys_for_range`
so that it can deal with sstables that don't have a Summary
(because they use BTI indexes instead of BIG indexes).
In that case, the function is going to compute the relevant fraction
by using the index instead of Summary. This will require making
the function asynchronous. This is what we do in this patch.

(The actual change to the logic of `sstable::estimated_keys_for_range`
will come in the next patch. In this one, we only make it asynchronous).
2025-09-29 13:01:21 +02:00
Michał Jadwiszczak
233f4dcee3 db/view/view_building_worker: register staging sstable to view building coordinator when needed
Change return type of `check_needs_view_update_path()`. Instead of
retrning bool which tells whether to use staging directory (and register
to `view_update_generator`) or use normal directory.

Now the function returns enum with possible values:
- `normal_directory` - use normal directory for the sstable
- `staging_directly_to_generator` - use staging directory and register
      to `view_update_generator`
- `staging_managed_by_vbc` - use staging directory but don't register it
      to `view_update_generator` but create view building tasks for
      later

The third option is new, it's used when the table has any view which is
in building process currrently. In this case, registering it to `view_update_generator`
prematurely may lead to base-view inconsistency
(for example when a replica is in a pending state).
2025-08-27 10:23:03 +02:00
Robert Bindar
ca1a9c8d01 Add support for nodetool refresh --skip-reshape
This patch adds the new option in nodetool, patches the
load_new_ss_tables REST request with a new parameter and
skips the reshape step in refresh if this flag is passed.

Signed-off-by: Robert Bindar <robert.bindar@scylladb.com>

Closes scylladb/scylladb#24409
Fixes: #24365
2025-06-10 12:52:13 +03:00
Pavel Emelyanov
2eed2e94ea sstables_loader: Extend logging with recently added skip-cleanup
When starting, the loader prints all its arguments into logs. Recently
added skip-cleanup one is not included, but it's good to have one too.

refs: #24139

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#24206
2025-05-28 11:20:27 +03:00
Ernest Zaslavsky
7d0d3ec1c8 load_and_stream: Add abortion flow to mutation streaming
* The new abort command explicitly represents the abortion flow in
mutation streaming, clearly identifying operations that are
intentionally aborted. This reduces ambiguity around failures in
streaming operations.
* In the error-handling section, aborted operations are now
explicitly marked as the cause of the streaming failure. This allows
us to differentiate them from genuine errors and appropriately adjust
log severity to reduce unnecessary alarm caused by aborted streaming
failures.
* To avoid alarming users with excessive error logs, log severity for
streaming failures caused by aborted operations has been downgraded.
This helps keep logs cleaner and prevents unnecessary concerns.
* A new feature has been added to ensure mixed clusters during updates
do not receive unsupported RPC messages, improving compatibility and
stability.

fixes: https://github.com/scylladb/scylladb/issues/23076

Closes scylladb/scylladb#23214
2025-05-27 14:21:58 +03:00
Pavel Emelyanov
bd3bd089e1 sstables_loader: Fix load-and-stream vs skip-cleanup check
The intention was to fail the REST API call in case --skip-cleanup is
requested for --load-and-stream loading. The corresponding if expression
is checking something else :( despite log message is correct.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#24208
2025-05-27 12:01:01 +03:00
Pavel Emelyanov
1b1f653699 api: Introduce skip_cleanup query parameter
Just copy the load_and_stream and primary_replica_only logic, this new
option is the same in this sense.

Throw if it's specified with the load_and_stream one.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-05-13 17:06:28 +03:00
Pavel Emelyanov
4ab049ac8d code: Push bool skip_cleanup flag around
Just put the boolean into the callstack between API and distributed
loader to reduce the churn in the next patches. No functional changes,
flag is false and unused.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-05-13 16:51:21 +03:00
Pavel Emelyanov
cf1ffd6086 Merge 'sstables_loader: fix the racing between get_progress() and release_resources()' from Kefu Chai
This change addresses a critical race condition in the sstables_loader where `get_progress()` could access invalid `progress_holder` instances after `release_resources()` destroyed them.

Problem:
- Progress tracking uses two components: `_progress_state` (tracks state) and `_progress_per_shard` (sharded service with actual progress data)
- `get_progress()` first checks if `_progress_state` is initialized, then accumulates progress from `_progress_per_shard`
- As both functions are coroutines, `get_progress()` could be preempted after state check but before accessing `_progress_per_shard`
- If `release_resources()` runs during this preemption, it destroys the `progress_holder` instances in `_progress_per_shard`, causing `get_progress()` to access invalid memory.

Solution:
- Implemented shared/exclusive locking to protect access to both state and sharded progress data
- Multiple `get_progress()` calls can execute in parallel (shared access)
- `release_resources()` acquires exclusive access before modifying resources
- This prevents potential memory corruption and ensures consistent progress reporting

Fixes #23801

---

this change addresses a racing related to tracking the restore progress from S3 using scylla's native API, which is not used in production yet, hence no need to backport.

Closes scylladb/scylladb#23808

* github.com:scylladb/scylladb:
  sstables_loader: fix the indent
  sstables_loader: fix the racing between get_progress() and release_resources()
2025-05-05 15:45:15 +03:00
Raphael S. Carvalho
c77f710a0c sstables: Fix quadratic space complexity in partitioned_sstable_set
Interval map is very susceptible to quadratic space behavior when
it's flooded with many entries overlapping all (or most of)
intervals, since each such entry will have presence on all
intervals it overlaps with.

A trigger we observed was memtable flush storm, which creates many
small "L0" sstables that spans roughly the entire token range.

Since we cannot rely on insertion order, solution will be about
storing sstables with such wide ranges in a vector (unleveled).

There should be no consequence for single-key reads, since upper
layer applies an additional filtering based on token of key being
queried.
And for range scans, there can be an increase in memory usage,
but not significant because the sstables span an wide range and
would have been selected in the combined reader if the range of
scan overlaps with them.

Anyway, this is a protection against storm of memtable flushes
and shouldn't be the common scenario.

It works both with tablets and vnodes, by adjusting the token
range spanned by compaction group accordingly.

Fixes #23634.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2025-04-29 15:47:33 -03:00
Kefu Chai
a2b46cbf45 sstables_loader: fix the indent
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2025-04-22 12:05:55 +08:00
Kefu Chai
6b3ecad467 sstables_loader: fix the racing between get_progress() and release_resources()
This change addresses a critical race condition in the sstables_loader
where `get_progress()` could access invalid `progress_holder` instances
after `release_resources()` destroyed them.

Problem:
- Progress tracking uses two components: `_progress_state` (tracks state)
  and `_progress_per_shard` (sharded service with actual progress data)
- `get_progress()` first checks if `_progress_state` is initialized, then
  accumulates progress from `_progress_per_shard`
- As both functions are coroutines, `get_progress()` could be preempted
  after state check but before accessing `_progress_per_shard`
- If `release_resources()` runs during this preemption, it destroys the
  `progress_holder` instances in `_progress_per_shard`, causing
  `get_progress()` to access invalid memory.

Solution:
- Implemented shared/exclusive locking to protect access to both state
  and sharded progress data
- Multiple `get_progress()` calls can execute in parallel (shared access)
- `release_resources()` acquires exclusive access before modifying resources
- This prevents potential memory corruption and ensures consistent
  progress reporting

Fixes #23801

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2025-04-22 12:05:54 +08:00
Benny Halevy
7969293dcf sstables_loader: use named gate
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2025-04-12 11:47:00 +03:00
Pavel Emelyanov
832d83ae4b sstables_loader: Do not stop sharded<progress_monitor> unconditionally
The member in question is unconditionally .stop()-ed in task's
release_resources() method, however, it may happen that the thing wasn't
.start()-ed in the first place. Start happens in the middle of the
task's .run() method and there can be several reasons why it can be
skipped -- e.g. the task is aborted early, or collecting sstables from
S3 throws.

fixes: #23231

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Closes scylladb/scylladb#23483
2025-04-02 12:09:02 +03:00
Pavel Emelyanov
24e5c30cc8 sstables: Do not print component filenames on load-and-stream wrap-up
When load-and-stream finishes it may call sstable::unlink() method to
drop the loaded (and streamed) sstable. Before calling it it prints a
log message about its intention that includes component_filenames()
vector. This log message is ugly in several ways.

First, it prints only recognized components, while unlink() method
unlinks all of them, so it's sort of misleading (it doesn't seem that
anyone ever read this message IRL though)

Next, that's the only place that is _that_ verbose about sstable
unlinking. "Common" unlinking paths don't print that much info.

Finally, the log message happen in debug level, so it's hardly ever
appears in any logs, but collecting several filenames takes time.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2025-03-19 12:45:21 +03:00