test_ssl: fix indentation

generic_server: improve logging of broken TLS connections
Previously, we logged a broken TLS connection and then the same failure was logged again later. Now, instead of logging, we construct an exception whose message is extended with TLS info; it is caught later and logged once, with its full message.
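
A minimal sketch of the pattern the message above describes, not the actual generic_server code. The names `tls_info`, `log_sink`, `fail_connection` and `serve` are hypothetical and exist only for illustration. The failure site throws an exception enriched with the TLS details instead of logging, and a single outer handler logs it.

```cpp
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical stand-in for the TLS details known about a connection.
struct tls_info {
    std::string protocol;
    std::string cipher_suite;
};

// Hypothetical logger that records lines so the behavior can be inspected.
struct log_sink {
    std::vector<std::string> lines;
    void error(const std::string& msg) { lines.push_back(msg); }
};

// Instead of logging here (which would duplicate the later log entry),
// throw a new exception whose message carries the TLS info.
[[noreturn]] void fail_connection(const std::exception& cause, const tls_info& tls) {
    throw std::runtime_error(
        std::string(cause.what()) + " (TLS protocol: " + tls.protocol +
        ", cipher suite: " + tls.cipher_suite + ")");
}

// The only place where the failure is logged, with the full message.
void serve(log_sink& log, const tls_info& tls) {
    try {
        try {
            // Simulated low-level failure of the TLS connection.
            throw std::runtime_error("connection reset during handshake");
        } catch (const std::exception& e) {
            fail_connection(e, tls);
        }
    } catch (const std::exception& e) {
        log.error(std::string("connection failed: ") + e.what());
    }
}
```

With this shape, calling `serve()` once produces exactly one log line, and that line contains both the original cause and the TLS details.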
2026-01-09 10:27:17 +01:00 · 2026-01-09 10:24:55 +01:00 · 2026-01-09 10:22:19 +01:00 · 2025-12-29 09:34:08 +01:00 · 2025-12-10 14:53:38 +02:00 · 2025-12-10 10:53:30 +02:00
131 changed files with 2004 additions and 3011 deletions
--- a/.github/CODEOWNERS
+++ b/.github/CODEOWNERS
@@ -57,7 +57,6 @@ repair/* @tgrabiec @asias

 # SCHEMA MANAGEMENT
 db/schema_tables* @tgrabiec
-db/legacy_schema_migrator* @tgrabiec
 service/migration* @tgrabiec
 schema* @tgrabiec

--- a/.github/scripts/auto-backport.py
+++ b/.github/scripts/auto-backport.py
@@ -62,7 +62,7 @@ def create_pull_request(repo, new_branch_name, base_branch_name, pr, backport_pr
        if is_draft:
            labels_to_add.append("conflicts")
            pr_comment = f"@{pr.user.login} - This PR was marked as draft because it has conflicts\n"
-            pr_comment += "Please resolve them and mark this PR as ready for review"
+            pr_comment += "Please resolve them and remove the 'conflicts' label. The PR will be made ready for review automatically."
            backport_pr.create_issue_comment(pr_comment)
        
        # Apply all labels at once if we have any
--- a/SCYLLA_ASSERT_CONVERSION_SUMMARY.md
+++ b/SCYLLA_ASSERT_CONVERSION_SUMMARY.md
@@ -1,182 +0,0 @@
-# SCYLLA_ASSERT to scylla_assert() Conversion Summary
-
-## Objective
-
-Replace crash-inducing `SCYLLA_ASSERT` with exception-throwing `scylla_assert()` to prevent cluster-wide crashes and maintain availability.
-
-## What Was Done
-
-### 1. Infrastructure Implementation ✓
-
-Created new `scylla_assert()` macro in `utils/assert.hh`:
- Based on `on_internal_error()` for exception-based error handling
- Supports optional custom error messages via variadic arguments
- Uses `seastar::format()` for string formatting
- Compatible with C++23 standard (uses `__VA_OPT__`)
-
-**Key difference from SCYLLA_ASSERT:**
-```cpp
-// Old: Crashes the process immediately
-SCYLLA_ASSERT(condition);
-
-// New: Throws exception (or aborts based on config)
-scylla_assert(condition);
-scylla_assert(condition, "custom error message: {}", value);
-```
-
-### 2. Comprehensive Analysis ✓
-
-Analyzed entire codebase to identify safe vs unsafe conversion locations:
-
-**Statistics:**
- Total SCYLLA_ASSERT usages: ~1307 (including tests)
- Non-test usages: ~886
- **Unsafe to convert**: 223 usages (25%)
-  - In noexcept functions: 187 usages across 50 files
-  - In destructors: 36 usages across 25 files
- **Safe to convert**: ~668 usages (75%)
- **Converted in this PR**: 112 usages (16.8% of safe conversions)
-
-### 3. Documentation ✓
-
-Created comprehensive documentation:
-
-1. **Conversion Guide** (`docs/dev/scylla_assert_conversion.md`)
-   - Explains safe vs unsafe contexts
-   - Provides conversion strategy
-   - Lists all completed conversions
-   - Includes testing guidance
-
-2. **Unsafe Locations Report** (`docs/dev/unsafe_scylla_assert_locations.md`)
-   - Detailed listing of 223 unsafe locations
-   - Organized by file with line numbers
-   - Separated into noexcept and destructor categories
-
-### 4. Sample Conversions ✓
-
-Converted 112 safe SCYLLA_ASSERT usages across 32 files as demonstration:
-
-| File | Conversions | Context |
-|------|------------|---------|
-| db/large_data_handler.{cc,hh} | 5 | Future-returning functions |
-| db/schema_applier.cc | 1 | Coroutine function |
-| db/system_distributed_keyspace.cc | 1 | Regular function |
-| db/commitlog/commitlog_replayer.cc | 1 | Coroutine function |
-| db/view/row_locking.cc | 2 | Regular function |
-| db/size_estimates_virtual_reader.cc | 1 | Lambda in coroutine |
-| db/corrupt_data_handler.cc | 2 | Lambdas in future-returning function |
-| raft/tracker.cc | 2 | Unreachable code (switch defaults) |
-| service/topology_coordinator.cc | 11 | Coroutine functions (topology operations) |
-| service/storage_service.cc | 28 | Critical node lifecycle operations |
-| sstables/* (22 files) | 58 | SSTable operations (read/write/compress/index) |
-
-All conversions were in **safe contexts** (non-noexcept, non-destructor functions). 3 assertions in storage_service.cc remain as SCYLLA_ASSERT (in noexcept functions).
-
-## Why These Cannot Be Converted
-
-### Unsafe Context #1: noexcept Functions (187 usages)
-
-**Problem**: Throwing from noexcept causes `std::terminate()`, same as crash.
-
-**Example** (from `locator/production_snitch_base.hh`):
-```cpp
-virtual bool prefer_local() const noexcept override {
-    SCYLLA_ASSERT(_backreference != nullptr);  // Cannot convert!
-    return _backreference->prefer_local();
-}
-```
-
-**Solution for these**: Keep as SCYLLA_ASSERT or use `on_fatal_internal_error()`.
-
-### Unsafe Context #2: Destructors (36 usages)
-
-**Problem**: Destructors are implicitly noexcept, throwing causes `std::terminate()`.
-
-**Example** (from `utils/file_lock.cc`):
-```cpp
-~file_lock() noexcept {
-    if (_fd.get() != -1) {
-        SCYLLA_ASSERT(_fd.get() != -1);  // Cannot convert!
-        auto r = ::flock(_fd.get(), LOCK_UN);
-        SCYLLA_ASSERT(r == 0);  // Cannot convert!
-    }
-}
-```
-
-**Solution for these**: Keep as SCYLLA_ASSERT.
-
-## Benefits of scylla_assert()
-
-1. **Prevents Cluster-Wide Crashes**
-   - Exception can be caught and handled gracefully
-   - Failed node doesn't bring down entire cluster
-
-2. **Maintains Availability**
-   - Service can continue with degraded functionality
-   - Better than complete crash
-
-3. **Better Error Reporting**
-   - Includes backtrace via `on_internal_error()`
-   - Supports custom error messages
-   - Configurable abort-on-error for testing
-
-4. **Backward Compatible**
-   - SCYLLA_ASSERT still exists for unsafe contexts
-   - Can be gradually adopted
-
-## Testing
-
- Created manual test in `test/manual/test_scylla_assert.cc`
- Verifies passing and failing assertions
- Tests custom error messages
- Code review passed with improvements made
-
-## Next Steps (Future Work)
-
-1. **Gradual Conversion**
-   - Convert remaining ~653 safe SCYLLA_ASSERT usages incrementally
-   - Prioritize high-impact code paths first
-
-2. **Review noexcept Functions**
-   - Evaluate if some can be made non-noexcept
-   - Consider using `on_fatal_internal_error()` where appropriate
-
-3. **Integration Testing**
-   - Run full test suite with conversions
-   - Monitor for any unexpected behavior
-   - Validate exception propagation
-
-4. **Automated Analysis Tool**
-   - Create tool to identify safe conversion candidates
-   - Generate conversion patches automatically
-   - Track conversion progress
-
-## Files Modified in This PR
-
-### Core Implementation
- `utils/assert.hh` - Added scylla_assert() macro
-
-### Conversions
- `db/large_data_handler.cc`
- `db/large_data_handler.hh`
- `db/schema_applier.cc`
- `db/system_distributed_keyspace.cc`
- `db/commitlog/commitlog_replayer.cc`
- `db/view/row_locking.cc`
- `db/size_estimates_virtual_reader.cc`
- `db/corrupt_data_handler.cc`
- `raft/tracker.cc`
- `service/topology_coordinator.cc`
- `service/storage_service.cc`
- `sstables/` (22 files across trie/, mx/, and core sstables)
-
-### Documentation
- `docs/dev/scylla_assert_conversion.md`
- `docs/dev/unsafe_scylla_assert_locations.md`
- `test/manual/test_scylla_assert.cc`
-
-## Conclusion
-
-This PR establishes the infrastructure and methodology for replacing SCYLLA_ASSERT with scylla_assert() to improve cluster availability. The sample conversions demonstrate the approach, while comprehensive documentation enables future work.
-
-**Key Achievement**: Provided a safe path forward for converting 75% (~668) of SCYLLA_ASSERT usages to exception-based assertions, while clearly documenting the 25% (~223) that must remain as crash-inducing assertions due to language constraints. Converted 112 usages as demonstration (16.8% of safe conversions), prioritizing critical files like storage_service.cc (node lifecycle) and all sstables files (data persistence), with ~556 remaining.
--- a/alternator/executor.cc
+++ b/alternator/executor.cc
@@ -2223,12 +2223,12 @@ void validate_value(const rjson::value& v, const char* caller) {

 // The put_or_delete_item class builds the mutations needed by the PutItem and
 // DeleteItem operations - either as stand-alone commands or part of a list
-// of commands in BatchWriteItems.
+// of commands in BatchWriteItem.
 // put_or_delete_item splits each operation into two stages: Constructing the
 // object parses and validates the user input (throwing exceptions if there
 // are input errors). Later, build() generates the actual mutation, with a
 // specified timestamp. This split is needed because of the peculiar needs of
-// BatchWriteItems and LWT. BatchWriteItems needs all parsing to happen before
+// BatchWriteItem and LWT. BatchWriteItem needs all parsing to happen before
 // any writing happens (if one of the commands has an error, none of the
 // writes should be done). LWT makes it impossible for the parse step to
 // generate "mutation" objects, because the timestamp still isn't known.
@@ -2739,7 +2739,7 @@ future<executor::request_return_type> rmw_operation::execute(service::storage_pr
    auto read_command = needs_read_before_write ?
            previous_item_read_command(proxy, schema(), _ck, selection) :
            nullptr;
-    return proxy.cas(schema(), std::move(*cas_shard), shared_from_this(), read_command, to_partition_ranges(*schema(), _pk),
+    return proxy.cas(schema(), std::move(*cas_shard), *this, read_command, to_partition_ranges(*schema(), _pk),
            {timeout, std::move(permit), client_state, trace_state},
            db::consistency_level::LOCAL_SERIAL, db::consistency_level::LOCAL_QUORUM, timeout, timeout, true, std::move(cdc_opts)).then([this, read_command, &wcu_total] (bool is_applied) mutable {
        if (!is_applied) {
@@ -3026,17 +3026,20 @@ struct primary_key_equal {
 };

 // This is a cas_request subclass for applying given put_or_delete_items to
-// one partition using LWT as part as BatchWriteItems. This is a write-only
+// one partition using LWT as part as BatchWriteItem. This is a write-only
 // operation, not needing the previous value of the item (the mutation to be
 // done is known prior to starting the operation). Nevertheless, we want to
 // do this mutation via LWT to ensure that it is serialized with other LWT
 // mutations to the same partition.
+// 
+// The std::vector<put_or_delete_item> must remain alive until the
+// storage_proxy::cas() future is resolved.
 class put_or_delete_item_cas_request : public service::cas_request {
    schema_ptr schema;
-    std::vector<put_or_delete_item> _mutation_builders;
+    const std::vector<put_or_delete_item>& _mutation_builders;
 public:
-    put_or_delete_item_cas_request(schema_ptr s, std::vector<put_or_delete_item>&& b) :
-        schema(std::move(s)), _mutation_builders(std::move(b)) { }
+    put_or_delete_item_cas_request(schema_ptr s, const std::vector<put_or_delete_item>& b) :
+        schema(std::move(s)), _mutation_builders(b) { }
    virtual ~put_or_delete_item_cas_request() = default;
    virtual std::optional<mutation> apply(foreign_ptr<lw_shared_ptr<query::result>> qr, const query::partition_slice& slice, api::timestamp_type ts, cdc::per_request_options& cdc_opts) override {
        std::optional<mutation> ret;
@@ -3052,20 +3055,48 @@ public:
    }
 };

-static future<> cas_write(service::storage_proxy& proxy, schema_ptr schema, service::cas_shard cas_shard, dht::decorated_key dk, std::vector<put_or_delete_item>&& mutation_builders,
-        service::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit) {
+future<> executor::cas_write(schema_ptr schema, service::cas_shard cas_shard, const dht::decorated_key& dk,
+        const std::vector<put_or_delete_item>& mutation_builders, service::client_state& client_state,
+        tracing::trace_state_ptr trace_state, service_permit permit)
+{
+    if (!cas_shard.this_shard()) {
+        _stats.shard_bounce_for_lwt++;
+        return container().invoke_on(cas_shard.shard(), _ssg,
+                    [cs = client_state.move_to_other_shard(),
+                    &mb = mutation_builders,
+                    &dk,
+                    ks = schema->ks_name(),
+                    cf = schema->cf_name(),
+                    gt = tracing::global_trace_state_ptr(trace_state),
+                    permit = std::move(permit)]
+                    (executor& self) mutable {
+            return do_with(cs.get(), [&mb, &dk, ks = std::move(ks), cf = std::move(cf),
+                                    trace_state = tracing::trace_state_ptr(gt), &self]
+                                    (service::client_state& client_state) mutable {
+                auto schema = self._proxy.data_dictionary().find_schema(ks, cf);
+                service::cas_shard cas_shard(*schema, dk.token());
+
+                //FIXME: Instead of passing empty_service_permit() to the background operation,
+                // the current permit's lifetime should be prolonged, so that it's destructed
+                // only after all background operations are finished as well.
+                return self.cas_write(schema, std::move(cas_shard), dk, mb, client_state, std::move(trace_state), empty_service_permit());
+            });
+        });
+    }
+
    auto timeout = executor::default_timeout();
-    auto op = seastar::make_shared<put_or_delete_item_cas_request>(schema, std::move(mutation_builders));
+    auto op = std::make_unique<put_or_delete_item_cas_request>(schema, mutation_builders);
+    auto* op_ptr = op.get();
    auto cdc_opts = cdc::per_request_options{
        .alternator = true,
        .alternator_streams_increased_compatibility =
-                schema->cdc_options().enabled() && proxy.data_dictionary().get_config().alternator_streams_increased_compatibility(),
+                schema->cdc_options().enabled() && _proxy.data_dictionary().get_config().alternator_streams_increased_compatibility(),
    };
-    return proxy.cas(schema, std::move(cas_shard), op, nullptr, to_partition_ranges(dk),
+    return _proxy.cas(schema, std::move(cas_shard), *op_ptr, nullptr, to_partition_ranges(dk),
            {timeout, std::move(permit), client_state, trace_state},
            db::consistency_level::LOCAL_SERIAL, db::consistency_level::LOCAL_QUORUM,
-            timeout, timeout, true, std::move(cdc_opts)).discard_result();
-    // We discarded cas()'s future value ("is_applied") because BatchWriteItems
+            timeout, timeout, true, std::move(cdc_opts)).finally([op = std::move(op)]{}).discard_result();
+    // We discarded cas()'s future value ("is_applied") because BatchWriteItem
    // does not need to support conditional updates.
 }

@@ -3087,13 +3118,11 @@ struct schema_decorated_key_equal {

 // FIXME: if we failed writing some of the mutations, need to return a list
 // of these failed mutations rather than fail the whole write (issue #5650).
-static future<> do_batch_write(service::storage_proxy& proxy,
-        smp_service_group ssg,
+future<> executor::do_batch_write(
        std::vector<std::pair<schema_ptr, put_or_delete_item>> mutation_builders,
        service::client_state& client_state,
        tracing::trace_state_ptr trace_state,
-        service_permit permit,
-        stats& stats) {
+        service_permit permit) {
    if (mutation_builders.empty()) {
        return make_ready_future<>();
    }
@@ -3115,7 +3144,7 @@ static future<> do_batch_write(service::storage_proxy& proxy,
            mutations.push_back(b.second.build(b.first, now));
            any_cdc_enabled |= b.first->cdc_options().enabled();
        }
-        return proxy.mutate(std::move(mutations),
+        return _proxy.mutate(std::move(mutations),
                db::consistency_level::LOCAL_QUORUM,
                executor::default_timeout(),
                trace_state,
@@ -3124,55 +3153,48 @@ static future<> do_batch_write(service::storage_proxy& proxy,
                false,
                cdc::per_request_options{
                    .alternator = true,
-                    .alternator_streams_increased_compatibility = any_cdc_enabled && proxy.data_dictionary().get_config().alternator_streams_increased_compatibility(),
+                    .alternator_streams_increased_compatibility = any_cdc_enabled && _proxy.data_dictionary().get_config().alternator_streams_increased_compatibility(),
                });
    } else {
        // Do the write via LWT:
        // Multiple mutations may be destined for the same partition, adding
        // or deleting different items of one partition. Join them together
        // because we can do them in one cas() call.
-        std::unordered_map<schema_decorated_key, std::vector<put_or_delete_item>, schema_decorated_key_hash, schema_decorated_key_equal>
-            key_builders(1, schema_decorated_key_hash{}, schema_decorated_key_equal{});
-        for (auto& b : mutation_builders) {
-            auto dk = dht::decorate_key(*b.first, b.second.pk());
-            auto [it, added] = key_builders.try_emplace(schema_decorated_key{b.first, dk});
+        using map_type = std::unordered_map<schema_decorated_key, 
+            std::vector<put_or_delete_item>, 
+            schema_decorated_key_hash, 
+            schema_decorated_key_equal>;
+        auto key_builders = std::make_unique<map_type>(1, schema_decorated_key_hash{}, schema_decorated_key_equal{});
+        for (auto&& b : std::move(mutation_builders)) {
+            auto [it, added] = key_builders->try_emplace(schema_decorated_key {
+                .schema = b.first,
+                .dk = dht::decorate_key(*b.first, b.second.pk())
+            });
            it->second.push_back(std::move(b.second));
        }
-        return parallel_for_each(std::move(key_builders), [&proxy, &client_state, &stats, trace_state, ssg, permit = std::move(permit)] (auto& e) {
-            stats.write_using_lwt++;
+        auto* key_builders_ptr = key_builders.get();
+        return parallel_for_each(*key_builders_ptr, [this, &client_state, trace_state, permit = std::move(permit)] (const auto& e) {
+            _stats.write_using_lwt++;
            auto desired_shard = service::cas_shard(*e.first.schema, e.first.dk.token());
-            if (desired_shard.this_shard()) {
-                return cas_write(proxy, e.first.schema, std::move(desired_shard), e.first.dk, std::move(e.second), client_state, trace_state, permit);
-            } else {
-                stats.shard_bounce_for_lwt++;
-                return proxy.container().invoke_on(desired_shard.shard(), ssg,
-                            [cs = client_state.move_to_other_shard(),
-                             mb = e.second,
-                             dk = e.first.dk,
-                             ks = e.first.schema->ks_name(),
-                             cf = e.first.schema->cf_name(),
-                             gt =  tracing::global_trace_state_ptr(trace_state),
-                             permit = std::move(permit)]
-                            (service::storage_proxy& proxy) mutable {
-                    return do_with(cs.get(), [&proxy, mb = std::move(mb), dk = std::move(dk), ks = std::move(ks), cf = std::move(cf),
-                                              trace_state = tracing::trace_state_ptr(gt)]
-                                              (service::client_state& client_state) mutable {
-                        auto schema = proxy.data_dictionary().find_schema(ks, cf);
+            auto s = e.first.schema;

-                        // The desired_shard on the original shard remains alive for the duration
-                        // of cas_write on this shard and prevents any tablet operations.
-                        // However, we need a local instance of cas_shard on this shard
-                        // to pass it to sp::cas, so we just create a new one.
-                        service::cas_shard cas_shard(*schema, dk.token());
-
-                        //FIXME: Instead of passing empty_service_permit() to the background operation,
-                        // the current permit's lifetime should be prolonged, so that it's destructed
-                        // only after all background operations are finished as well.
-                        return cas_write(proxy, schema, std::move(cas_shard), dk, std::move(mb), client_state, std::move(trace_state), empty_service_permit());
-                    });
-                }).finally([desired_shard = std::move(desired_shard)]{});
-            }
-        });
+            static const auto* injection_name = "alternator_executor_batch_write_wait";
+            return utils::get_local_injector().inject(injection_name, [s = std::move(s)] (auto& handler) -> future<> {
+                const auto ks = handler.get("keyspace");
+                const auto cf = handler.get("table");
+                const auto shard = std::atoll(handler.get("shard")->data());
+                if (ks == s->ks_name() && cf == s->cf_name() && shard == this_shard_id()) {
+                    elogger.info("{}: hit", injection_name);
+                    co_await handler.wait_for_message(std::chrono::steady_clock::now() + std::chrono::minutes{5});
+                    elogger.info("{}: continue", injection_name);
+                }
+            }).then([&e, desired_shard = std::move(desired_shard),
+                 &client_state, trace_state = std::move(trace_state), permit = std::move(permit), this]() mutable
+            {
+                return cas_write(e.first.schema, std::move(desired_shard), e.first.dk,
+                    std::move(e.second), client_state, std::move(trace_state), std::move(permit));
+            });
+        }).finally([key_builders = std::move(key_builders)]{});
    }
 }

@@ -3319,7 +3341,7 @@ future<executor::request_return_type> executor::batch_write_item(client_state& c
    _stats.wcu_total[stats::DELETE_ITEM] += wcu_delete_units;
    _stats.api_operations.batch_write_item_batch_total += total_items;
    _stats.api_operations.batch_write_item_histogram.add(total_items);
-    co_await do_batch_write(_proxy, _ssg, std::move(mutation_builders), client_state, trace_state, std::move(permit), _stats);
+    co_await do_batch_write(std::move(mutation_builders), client_state, trace_state, std::move(permit));
    // FIXME: Issue #5650: If we failed writing some of the updates,
    // need to return a list of these failed updates in UnprocessedItems
    // rather than fail the whole write (issue #5650).
--- a/alternator/executor.hh
+++ b/alternator/executor.hh
@@ -40,6 +40,7 @@ namespace cql3::selection {

 namespace service {
    class storage_proxy;
+    class cas_shard;
 }

 namespace cdc {
@@ -57,6 +58,7 @@ class schema_builder;
 namespace alternator {

 class rmw_operation;
+class put_or_delete_item;

 schema_ptr get_table(service::storage_proxy& proxy, const rjson::value& request);
 bool is_alternator_keyspace(const sstring& ks_name);
@@ -219,6 +221,16 @@ private:

    static void describe_key_schema(rjson::value& parent, const schema&, std::unordered_map<std::string,std::string> * = nullptr, const std::map<sstring, sstring> *tags = nullptr);

+    future<> do_batch_write(
+        std::vector<std::pair<schema_ptr, put_or_delete_item>> mutation_builders,
+        service::client_state& client_state,
+        tracing::trace_state_ptr trace_state,
+        service_permit permit);
+
+    future<> cas_write(schema_ptr schema, service::cas_shard cas_shard, const dht::decorated_key& dk,
+        const std::vector<put_or_delete_item>& mutation_builders, service::client_state& client_state,
+        tracing::trace_state_ptr trace_state, service_permit permit);
+
 public:
    static void describe_key_schema(rjson::value& parent, const schema& schema, std::unordered_map<std::string,std::string>&, const std::map<sstring, sstring> *tags = nullptr);

--- a/alternator/server.cc
+++ b/alternator/server.cc
@@ -979,9 +979,8 @@ client_data server::ongoing_request::make_client_data() const {
    // and keep "driver_version" unset.
    cd.driver_name = _user_agent;
    // Leave "protocol_version" unset, it has no meaning in Alternator.
-    // Leave "hostname", "ssl_protocol" and "ssl_cipher_suite" unset.
-    // As reported in issue #9216, we never set these fields in CQL
-    // either (see cql_server::connection::make_client_data()).
+    // Leave "hostname", "ssl_protocol" and "ssl_cipher_suite" unset for Alternator.
+    // Note: CQL sets ssl_protocol and ssl_cipher_suite via generic_server::connection base class.
    return cd;
 }

--- a/api/api-doc/storage_service.json
+++ b/api/api-doc/storage_service.json
@@ -729,6 +729,14 @@
                     "allowMultiple":false,
                     "type":"boolean",
                     "paramType":"query"
+                  },
+                  {
+                     "name":"use_sstable_identifier",
+                     "description":"Use the sstable identifier UUID, if available, rather than the sstable generation.",
+                     "required":false,
+                     "allowMultiple":false,
+                     "type":"boolean",
+                     "paramType":"query"
                  }
               ]
            },
--- a/api/storage_service.cc
+++ b/api/storage_service.cc
@@ -2020,12 +2020,16 @@ void set_snapshot(http_context& ctx, routes& r, sharded<db::snapshot_ctl>& snap_
        auto tag = req->get_query_param("tag");
        auto column_families = split(req->get_query_param("cf"), ",");
        auto sfopt = req->get_query_param("sf");
-        auto sf = db::snapshot_ctl::skip_flush(strcasecmp(sfopt.c_str(), "true") == 0);
+        auto usiopt = req->get_query_param("use_sstable_identifier");
+        db::snapshot_options opts = {
+            .skip_flush = strcasecmp(sfopt.c_str(), "true") == 0,
+            .use_sstable_identifier = strcasecmp(usiopt.c_str(), "true") == 0
+        };

        std::vector<sstring> keynames = split(req->get_query_param("kn"), ",");
        try {
            if (column_families.empty()) {
-                co_await snap_ctl.local().take_snapshot(tag, keynames, sf);
+                co_await snap_ctl.local().take_snapshot(tag, keynames, opts);
            } else {
                if (keynames.empty()) {
                    throw httpd::bad_param_exception("The keyspace of column families must be specified");
@@ -2033,7 +2037,7 @@ void set_snapshot(http_context& ctx, routes& r, sharded<db::snapshot_ctl>& snap_
                if (keynames.size() > 1) {
                    throw httpd::bad_param_exception("Only one keyspace allowed when specifying a column family");
                }
-                co_await snap_ctl.local().take_column_family_snapshot(keynames[0], column_families, tag, sf);
+                co_await snap_ctl.local().take_column_family_snapshot(keynames[0], column_families, tag, opts);
            }
            co_return json_void();
        } catch (...) {
@@ -2068,7 +2072,8 @@ void set_snapshot(http_context& ctx, routes& r, sharded<db::snapshot_ctl>& snap_
        auto info = parse_scrub_options(ctx, std::move(req));

        if (!info.snapshot_tag.empty()) {
-            co_await snap_ctl.local().take_column_family_snapshot(info.keyspace, info.column_families, info.snapshot_tag, db::snapshot_ctl::skip_flush::no);
+            db::snapshot_options opts = {.skip_flush = false, .use_sstable_identifier = false};
+            co_await snap_ctl.local().take_column_family_snapshot(info.keyspace, info.column_families, info.snapshot_tag, opts);
        }

        compaction::compaction_stats stats;
--- a/api/tasks.cc
+++ b/api/tasks.cc
@@ -146,7 +146,8 @@ void set_tasks_compaction_module(http_context& ctx, routes& r, sharded<service::
        auto info = parse_scrub_options(ctx, std::move(req));

        if (!info.snapshot_tag.empty()) {
-            co_await snap_ctl.local().take_column_family_snapshot(info.keyspace, info.column_families, info.snapshot_tag, db::snapshot_ctl::skip_flush::no);
+            db::snapshot_options opts = {.skip_flush = false, .use_sstable_identifier = false};
+            co_await snap_ctl.local().take_column_family_snapshot(info.keyspace, info.column_families, info.snapshot_tag, opts);
        }

        auto& compaction_module = db.local().get_compaction_manager().get_task_manager_module();
--- a/auth/certificate_authenticator.cc
+++ b/auth/certificate_authenticator.cc
@@ -8,6 +8,7 @@
 */

 #include "auth/certificate_authenticator.hh"
+#include "auth/cache.hh"

 #include <boost/regex.hpp>
 #include <fmt/ranges.h>
@@ -34,13 +35,14 @@ static const class_registrator<auth::authenticator
    , cql3::query_processor&
    , ::service::raft_group0_client&
    , ::service::migration_manager&
+    , auth::cache&
    , utils::alien_worker&> cert_auth_reg(CERT_AUTH_NAME);

 enum class auth::certificate_authenticator::query_source {
    subject, altname
 };

-auth::certificate_authenticator::certificate_authenticator(cql3::query_processor& qp, ::service::raft_group0_client&, ::service::migration_manager&, utils::alien_worker&)
+auth::certificate_authenticator::certificate_authenticator(cql3::query_processor& qp, ::service::raft_group0_client&, ::service::migration_manager&, auth::cache&, utils::alien_worker&)
    : _queries([&] {
        auto& conf = qp.db().get_config();
        auto queries = conf.auth_certificate_role_queries();
--- a/auth/certificate_authenticator.hh
+++ b/auth/certificate_authenticator.hh
@@ -26,13 +26,15 @@ class raft_group0_client;

 namespace auth {

+class cache;
+
 extern const std::string_view certificate_authenticator_name;

 class certificate_authenticator : public authenticator {
    enum class query_source;
    std::vector<std::pair<query_source, boost::regex>> _queries;
 public:
-    certificate_authenticator(cql3::query_processor&, ::service::raft_group0_client&, ::service::migration_manager&, utils::alien_worker&);
+    certificate_authenticator(cql3::query_processor&, ::service::raft_group0_client&, ::service::migration_manager&, cache&, utils::alien_worker&);
    ~certificate_authenticator();

    future<> start() override;
--- a/configure.py
+++ b/configure.py
@@ -1062,7 +1062,6 @@ scylla_core = (['message/messaging_service.cc',
                'db/hints/resource_manager.cc',
                'db/hints/sync_point.cc',
                'db/large_data_handler.cc',
-                'db/legacy_schema_migrator.cc',
                'db/marshal/type_parser.cc',
                'db/per_partition_rate_limit_options.cc',
                'db/rate_limiter.cc',
--- a/cql3/statements/batch_statement.cc
+++ b/cql3/statements/batch_statement.cc
@@ -331,7 +331,7 @@ future<shared_ptr<cql_transport::messages::result_message>> batch_statement::exe
    if (!cl_for_paxos) [[unlikely]] {
        return make_exception_future<shared_ptr<cql_transport::messages::result_message>>(std::move(cl_for_paxos).assume_error());
    }
-    seastar::shared_ptr<cas_request> request;
+    std::unique_ptr<cas_request> request;
    schema_ptr schema;

    db::timeout_clock::time_point now = db::timeout_clock::now();
@@ -354,9 +354,9 @@ future<shared_ptr<cql_transport::messages::result_message>> batch_statement::exe
        if (keys.empty()) {
            continue;
        }
-        if (request.get() == nullptr) {
+        if (!request) {
            schema = statement.s;
-            request = seastar::make_shared<cas_request>(schema, std::move(keys));
+            request = std::make_unique<cas_request>(schema, std::move(keys));
        } else if (keys.size() != 1 || keys.front().equal(request->key().front(), dht::ring_position_comparator(*schema)) == false) {
            throw exceptions::invalid_request_exception("BATCH with conditions cannot span multiple partitions");
        }
@@ -366,7 +366,7 @@ future<shared_ptr<cql_transport::messages::result_message>> batch_statement::exe

        request->add_row_update(statement, std::move(ranges), std::move(json_cache), statement_options);
    }
-    if (request.get() == nullptr) {
+    if (!request) {
        throw exceptions::invalid_request_exception(format("Unrestricted partition key in a conditional BATCH"));
    }

@@ -377,9 +377,10 @@ future<shared_ptr<cql_transport::messages::result_message>> batch_statement::exe
            );
    }

-    return qp.proxy().cas(schema, std::move(cas_shard), request, request->read_command(qp), request->key(),
+    auto* request_ptr = request.get();
+    return qp.proxy().cas(schema, std::move(cas_shard), *request_ptr, request->read_command(qp), request->key(),
            {read_timeout, qs.get_permit(), qs.get_client_state(), qs.get_trace_state()},
-            std::move(cl_for_paxos).assume_value(), cl_for_learn, batch_timeout, cas_timeout).then([this, request] (bool is_applied) {
+            std::move(cl_for_paxos).assume_value(), cl_for_learn, batch_timeout, cas_timeout).then([this, request = std::move(request)] (bool is_applied) {
        return request->build_cas_result_set(_metadata, _columns_of_cas_result_set, is_applied);
    });
 }
--- a/cql3/statements/modification_statement.cc
+++ b/cql3/statements/modification_statement.cc
@@ -401,7 +401,8 @@ modification_statement::execute_with_condition(query_processor& qp, service::que
                    type.is_update() ? "update" : "deletion"));
    }

-    auto request = seastar::make_shared<cas_request>(s, std::move(keys));
+    auto request = std::make_unique<cas_request>(s, std::move(keys));
+    auto* request_ptr = request.get();
    // cas_request can be used for batches as well single statements; Here we have just a single
    // modification in the list of CAS commands, since we're handling single-statement execution.
    request->add_row_update(*this, std::move(ranges), std::move(json_cache), options);
@@ -427,9 +428,9 @@ modification_statement::execute_with_condition(query_processor& qp, service::que
        tablet_info = erm->check_locality(token);
    }

-    return qp.proxy().cas(s, std::move(cas_shard), request, request->read_command(qp), request->key(),
+    return qp.proxy().cas(s, std::move(cas_shard), *request_ptr, request->read_command(qp), request->key(),
            {read_timeout, qs.get_permit(), qs.get_client_state(), qs.get_trace_state()},
-            std::move(cl_for_paxos).assume_value(), cl_for_learn, statement_timeout, cas_timeout).then([this, request, tablet_replicas = std::move(tablet_info->tablet_replicas), token_range = tablet_info->token_range] (bool is_applied) {
+            std::move(cl_for_paxos).assume_value(), cl_for_learn, statement_timeout, cas_timeout).then([this, request = std::move(request), tablet_replicas = std::move(tablet_info->tablet_replicas), token_range = tablet_info->token_range] (bool is_applied) {
        auto result = request->build_cas_result_set(_metadata, _columns_of_cas_result_set, is_applied);
        result->add_tablet_info(tablet_replicas, token_range);
        return result;
--- a/db/CMakeLists.txt
+++ b/db/CMakeLists.txt
@@ -10,7 +10,6 @@ target_sources(db
    schema_applier.cc
    schema_tables.cc
    cql_type_parser.cc
-    legacy_schema_migrator.cc
    commitlog/commitlog.cc
    commitlog/commitlog_replayer.cc
    commitlog/commitlog_entry.cc
--- a/db/commitlog/commitlog_replayer.cc
+++ b/db/commitlog/commitlog_replayer.cc
@@ -165,7 +165,7 @@ future<> db::commitlog_replayer::impl::init() {

 future<db::commitlog_replayer::impl::stats>
 db::commitlog_replayer::impl::recover(const commitlog::descriptor& d, const commitlog::replay_state& rpstate) const {
-    scylla_assert(_column_mappings.local_is_initialized());
+    SCYLLA_ASSERT(_column_mappings.local_is_initialized());

    replay_position rp{d};
    auto gp = min_pos(rp.shard_id());
--- a/db/corrupt_data_handler.cc
+++ b/db/corrupt_data_handler.cc
@@ -10,7 +10,6 @@
 #include "reader_concurrency_semaphore.hh"
 #include "replica/database.hh"
 #include "utils/UUID_gen.hh"
-#include "utils/assert.hh"

 static logging::logger corrupt_data_logger("corrupt_data");

@@ -76,14 +75,14 @@ future<corrupt_data_handler::entry_id> system_table_corrupt_data_handler::do_rec

    auto set_cell_raw = [this, &entry_row, &corrupt_data_schema, timestamp] (const char* cell_name, managed_bytes cell_value) {
        auto cdef = corrupt_data_schema->get_column_definition(cell_name);
-        scylla_assert(cdef);
+        SCYLLA_ASSERT(cdef);

        entry_row.cells().apply(*cdef, atomic_cell::make_live(*cdef->type, timestamp, cell_value, _entry_ttl));
    }; 

    auto set_cell = [this, &entry_row, &corrupt_data_schema, timestamp] (const char* cell_name, data_value cell_value) {
        auto cdef = corrupt_data_schema->get_column_definition(cell_name);
-        scylla_assert(cdef);
+        SCYLLA_ASSERT(cdef);

        entry_row.cells().apply(*cdef, atomic_cell::make_live(*cdef->type, timestamp, cell_value.serialize_nonnull(), _entry_ttl));
    };
--- a/db/large_data_handler.cc
+++ b/db/large_data_handler.cc
@@ -39,7 +39,7 @@ large_data_handler::large_data_handler(uint64_t partition_threshold_bytes, uint6
 }

 future<large_data_handler::partition_above_threshold> large_data_handler::maybe_record_large_partitions(const sstables::sstable& sst, const sstables::key& key, uint64_t partition_size, uint64_t rows, uint64_t range_tombstones, uint64_t dead_rows) {
-    scylla_assert(running());
+    SCYLLA_ASSERT(running());
    partition_above_threshold above_threshold{partition_size > _partition_threshold_bytes, rows > _rows_count_threshold};
    static_assert(std::is_same_v<decltype(above_threshold.size), bool>);
    _stats.partitions_bigger_than_threshold += above_threshold.size; // increment if true
@@ -83,7 +83,7 @@ sstring large_data_handler::sst_filename(const sstables::sstable& sst) {
 }

 future<> large_data_handler::maybe_delete_large_data_entries(sstables::shared_sstable sst) {
-    scylla_assert(running());
+    SCYLLA_ASSERT(running());
    auto schema = sst->get_schema();
    auto filename = sst_filename(*sst);
    using ldt = sstables::large_data_type;
@@ -247,7 +247,7 @@ future<> cql_table_large_data_handler::record_large_rows(const sstables::sstable

 future<> cql_table_large_data_handler::delete_large_data_entries(const schema& s, sstring sstable_name, std::string_view large_table_name) const {
    auto sys_ks = _sys_ks.get_permit();
-    scylla_assert(sys_ks);
+    SCYLLA_ASSERT(sys_ks);
    const sstring req =
            seastar::format("DELETE FROM system.{} WHERE keyspace_name = ? AND table_name = ? AND sstable_name = ?",
                    large_table_name);
--- a/db/large_data_handler.hh
+++ b/db/large_data_handler.hh
@@ -80,7 +80,7 @@ public:

    future<bool> maybe_record_large_rows(const sstables::sstable& sst, const sstables::key& partition_key,
            const clustering_key_prefix* clustering_key, uint64_t row_size) {
-        scylla_assert(running());
+        SCYLLA_ASSERT(running());
        if (row_size > _row_threshold_bytes) [[unlikely]] {
            return with_sem([&sst, &partition_key, clustering_key, row_size, this] {
                return record_large_rows(sst, partition_key, clustering_key, row_size);
@@ -100,7 +100,7 @@ public:

    future<bool> maybe_record_large_cells(const sstables::sstable& sst, const sstables::key& partition_key,
            const clustering_key_prefix* clustering_key, const column_definition& cdef, uint64_t cell_size, uint64_t collection_elements) {
-        scylla_assert(running());
+        SCYLLA_ASSERT(running());
        if (cell_size > _cell_threshold_bytes || collection_elements > _collection_elements_count_threshold) [[unlikely]] {
            return with_sem([&sst, &partition_key, clustering_key, &cdef, cell_size, collection_elements, this] {
                return record_large_cells(sst, partition_key, clustering_key, cdef, cell_size, collection_elements);
--- a/db/legacy_schema_migrator.cc
+++ b/db/legacy_schema_migrator.cc
@@ -1,602 +0,0 @@
-/*
- * Modified by ScyllaDB
- * Copyright (C) 2017-present ScyllaDB
- */
-
-/*
- * SPDX-License-Identifier: (LicenseRef-ScyllaDB-Source-Available-1.0 and Apache-2.0)
- */
-
-// Since Scylla 2.0, we use system tables whose schemas were introduced in
-// Cassandra 3. If Scylla boots to find a data directory with system tables
-// with older schemas - produced by pre-2.0 Scylla or by pre-3.0 Cassandra,
-// we need to migrate these old tables to the new format.
-//
-// We provide here a function, db::legacy_schema_migrator::migrate(),
-// for a one-time migration from old to new system tables. The function
-// reads old system tables, write them back in the new format, and finally
-// delete the old system tables. Scylla's main should call this function and
-// wait for the returned future, before starting to serve the database.
-
-#include <boost/iterator/filter_iterator.hpp>
-#include <seastar/core/future-util.hh>
-#include <seastar/util/log.hh>
-#include <map>
-#include <unordered_set>
-#include <chrono>
-
-#include "replica/database.hh"
-#include "legacy_schema_migrator.hh"
-#include "system_keyspace.hh"
-#include "schema_tables.hh"
-#include "schema/schema_builder.hh"
-#include "service/storage_proxy.hh"
-#include "utils/rjson.hh"
-#include "cql3/query_processor.hh"
-#include "cql3/untyped_result_set.hh"
-#include "cql3/util.hh"
-#include "cql3/statements/property_definitions.hh"
-
-static seastar::logger mlogger("legacy_schema_migrator");
-
-namespace db {
-namespace legacy_schema_migrator {
-
-// local data carriers
-
-class migrator {
-public:
-    static const std::unordered_set<sstring> legacy_schema_tables;
-
-    migrator(sharded<service::storage_proxy>& sp, sharded<replica::database>& db, sharded<db::system_keyspace>& sys_ks, cql3::query_processor& qp)
-                    : _sp(sp), _db(db), _sys_ks(sys_ks), _qp(qp) {
-    }
-    migrator(migrator&&) = default;
-
-    typedef db_clock::time_point time_point;
-
-    // TODO: we don't support triggers.
-    // this is a placeholder.
-    struct trigger {
-        time_point timestamp;
-        sstring name;
-        std::unordered_map<sstring, sstring> options;
-    };
-
-    struct table {
-        time_point timestamp;
-        schema_ptr metadata;
-        std::vector<trigger> triggers;
-    };
-
-    struct type {
-        time_point timestamp;
-        user_type metadata;
-    };
-
-    struct function {
-        time_point timestamp;
-        sstring ks_name;
-        sstring fn_name;
-        std::vector<sstring> arg_names;
-        std::vector<sstring> arg_types;
-        sstring return_type;
-        bool called_on_null_input;
-        sstring language;
-        sstring body;
-    };
-
-    struct aggregate {
-        time_point timestamp;
-        sstring ks_name;
-        sstring fn_name;
-        std::vector<sstring> arg_names;
-        std::vector<sstring> arg_types;
-        sstring return_type;
-        sstring final_func;
-        sstring initcond;
-        sstring state_func;
-        sstring state_type;
-    };
-
-    struct keyspace {
-        time_point timestamp;
-        sstring name;
-        bool durable_writes;
-        std::map<sstring, sstring> replication_params;
-
-        std::vector<table> tables;
-        std::vector<type> types;
-        std::vector<function> functions;
-        std::vector<aggregate> aggregates;
-    };
-
-    class unsupported_feature : public std::runtime_error {
-    public:
-        using runtime_error::runtime_error;
-    };
-
-    static sstring fmt_query(const char* fmt, const char* table) {
-        return fmt::format(fmt::runtime(fmt), db::system_keyspace::NAME, table);
-    }
-
-    typedef ::shared_ptr<cql3::untyped_result_set> result_set_type;
-    typedef const cql3::untyped_result_set::row row_type;
-
-    future<> read_table(keyspace& dst, sstring cf_name, time_point timestamp) {
-        auto fmt = "SELECT * FROM {}.{} WHERE keyspace_name = ? AND columnfamily_name = ?";
-        auto tq = fmt_query(fmt, db::system_keyspace::legacy::COLUMNFAMILIES);
-        auto cq = fmt_query(fmt, db::system_keyspace::legacy::COLUMNS);
-        auto zq = fmt_query(fmt, db::system_keyspace::legacy::TRIGGERS);
-
-        typedef std::tuple<future<result_set_type>, future<result_set_type>, future<result_set_type>, future<db::schema_tables::legacy::schema_mutations>> result_tuple;
-
-        return when_all(_qp.execute_internal(tq, { dst.name, cf_name }, cql3::query_processor::cache_internal::yes),
-                        _qp.execute_internal(cq, { dst.name, cf_name }, cql3::query_processor::cache_internal::yes),
-                        _qp.execute_internal(zq, { dst.name, cf_name }, cql3::query_processor::cache_internal::yes),
-                        db::schema_tables::legacy::read_table_mutations(_sp, dst.name, cf_name, db::system_keyspace::legacy::column_families()))
-                    .then([&dst, cf_name, timestamp](result_tuple&& t) {
-
-            result_set_type tables = std::get<0>(t).get();
-            result_set_type columns = std::get<1>(t).get();
-            result_set_type triggers = std::get<2>(t).get();
-            db::schema_tables::legacy::schema_mutations sm = std::get<3>(t).get();
-
-            row_type& td = tables->one();
-
-            auto ks_name = td.get_as<sstring>("keyspace_name");
-            auto cf_name = td.get_as<sstring>("columnfamily_name");
-            auto id = table_id(td.get_or("cf_id", generate_legacy_id(ks_name, cf_name).uuid()));
-
-            schema_builder builder(dst.name, cf_name, id);
-
-            builder.with_version(sm.digest());
-
-            cf_type cf = sstring_to_cf_type(td.get_or("type", sstring("standard")));
-            if (cf == cf_type::super) {
-                fail(unimplemented::cause::SUPER);
-            }
-
-            auto comparator = td.get_as<sstring>("comparator");
-            bool is_compound = cell_comparator::check_compound(comparator);
-            builder.set_is_compound(is_compound);
-            cell_comparator::read_collections(builder, comparator);
-
-            bool filter_sparse = false;
-
-            data_type default_validator = {};
-            if (td.has("default_validator")) {
-                default_validator = db::schema_tables::parse_type(td.get_as<sstring>("default_validator"));
-                if (default_validator->is_counter()) {
-                    builder.set_is_counter(true);
-                }
-                builder.set_default_validation_class(default_validator);
-            }
-
-            /*
-             * Determine whether or not the table is *really* dense
-             * We cannot trust is_dense value of true (see CASSANDRA-11502, that fixed the issue for 2.2 only, and not retroactively),
-             * but we can trust is_dense value of false.
-             */
-            auto is_dense = td.get_opt<bool>("is_dense");
-            if (!is_dense || *is_dense) {
-                is_dense = [&] {
-                    /*
-                     * As said above, this method is only here because we need to deal with thrift upgrades.
-                     * Once a CF has been "upgraded", i.e. we've rebuilt and save its CQL3 metadata at least once,
-                     * then we'll have saved the "is_dense" value and will be good to go.
-                     *
-                     * But non-upgraded thrift CF (and pre-7744 CF) will have no value for "is_dense", so we need
-                     * to infer that information without relying on it in that case. And for the most part this is
-                     * easy, a CF that has at least one REGULAR definition is not dense. But the subtlety is that not
-                     * having a REGULAR definition may not mean dense because of CQL3 definitions that have only the
-                     * PRIMARY KEY defined.
-                     *
-                     * So we need to recognize those special case CQL3 table with only a primary key. If we have some
-                     * clustering columns, we're fine as said above. So the only problem is that we cannot decide for
-                     * sure if a CF without REGULAR columns nor CLUSTERING_COLUMN definition is meant to be dense, or if it
-                     * has been created in CQL3 by say:
-                     *    CREATE TABLE test (k int PRIMARY KEY)
-                     * in which case it should not be dense. However, we can limit our margin of error by assuming we are
-                     * in the latter case only if the comparator is exactly CompositeType(UTF8Type).
-                     */
-                    std::optional<column_id> max_cl_idx;
-                    const cql3::untyped_result_set::row * regular = nullptr;
-                    for (auto& row : *columns) {
-                        auto kind_str = row.get_as<sstring>("type");
-                        if (kind_str == "compact_value") {
-                            continue;
-                        }
-
-                        auto kind = db::schema_tables::deserialize_kind(kind_str);
-
-                        if (kind == column_kind::regular_column) {
-                            if (regular != nullptr) {
-                                return false;
-                            }
-                            regular = &row;
-                            continue;
-                        }
-                        if (kind == column_kind::clustering_key) {
-                            max_cl_idx = std::max(column_id(row.get_or("component_index", 0)), max_cl_idx.value_or(column_id()));
-                        }
-                    }
-
-                    auto is_cql3_only_pk_comparator = [](const sstring& comparator) {
-                        if (!cell_comparator::check_compound(comparator)) {
-                            return false;
-                        }
-                        // CMH. We don't have composites, nor a parser for it. This is a simple way of c
-                        // checking the same.
-                        auto comma = comparator.find(',');
-                        if (comma != sstring::npos) {
-                            return false;
-                        }
-                        auto off = comparator.find('(');
-                        auto end = comparator.find(')');
-
-                        return comparator.compare(off, end - off, utf8_type->name()) == 0;
-                    };
-
-                    if (max_cl_idx) {
-                        auto n = std::count(comparator.begin(), comparator.end(), ','); // num comp - 1
-                        return *max_cl_idx == n;
-                    }
-
-                    if (regular) {
-                        return false;
-                    }
-
-                    return !is_cql3_only_pk_comparator(comparator);
-
-                }();
-
-                // now, if switched to sparse, remove redundant compact_value column and the last clustering column,
-                // directly copying CASSANDRA-11502 logic. See CASSANDRA-11315.
-
-                filter_sparse = !*is_dense;
-            }
-            builder.set_is_dense(*is_dense);
-
-            auto is_cql = !*is_dense && is_compound;
-            auto is_static_compact = !*is_dense && !is_compound;
-
-            // org.apache.cassandra.schema.LegacySchemaMigrator#isEmptyCompactValueColumn
-            auto is_empty_compact_value = [](const cql3::untyped_result_set::row& column_row) {
-                auto kind_str = column_row.get_as<sstring>("type");
-                // Cassandra only checks for "compact_value", but Scylla generates "regular" instead (#2586)
-                return (kind_str == "compact_value" || kind_str == "regular")
-                       && column_row.get_as<sstring>("column_name").empty();
-            };
-
-            for (auto& row : *columns) {
-                auto kind_str = row.get_as<sstring>("type");
-                auto kind = db::schema_tables::deserialize_kind(kind_str);
-                auto component_index = kind > column_kind::clustering_key ? 0 : column_id(row.get_or("component_index", 0));
-                auto name = row.get_or<sstring>("column_name", sstring());
-                auto validator = db::schema_tables::parse_type(row.get_as<sstring>("validator"));
-
-                if (is_empty_compact_value(row)) {
-                    continue;
-                }
-
-                if (filter_sparse) {
-                    if (kind_str == "compact_value") {
-                        continue;
-                    }
-                    if (kind == column_kind::clustering_key) {
-                        if (cf == cf_type::super && component_index != 0) {
-                            continue;
-                        }
-                        if (cf != cf_type::super && !is_compound) {
-                            continue;
-                        }
-                    }
-                }
-
-                std::optional<index_metadata_kind> index_kind;
-                sstring index_name;
-                index_options_map options;
-                if (row.has("index_type")) {
-                    index_kind = schema_tables::deserialize_index_kind(row.get_as<sstring>("index_type"));
-                }
-                if (row.has("index_name")) {
-                    index_name = row.get_as<sstring>("index_name");
-                }
-                if (row.has("index_options")) {
-                    sstring index_options_str = row.get_as<sstring>("index_options");
-                    options = rjson::parse_to_map<index_options_map>(std::string_view(index_options_str));
-                    sstring type;
-                    auto i = options.find("index_keys");
-                    if (i != options.end()) {
-                        options.erase(i);
-                        type = "KEYS";
-                    }
-                    i = options.find("index_keys_and_values");
-                    if (i != options.end()) {
-                        options.erase(i);
-                        type = "KEYS_AND_VALUES";
-                    }
-                    if (type.empty()) {
-                        if (validator->is_collection() && validator->is_multi_cell()) {
-                            type = "FULL";
-                        } else {
-                            type = "VALUES";
-                        }
-                    }
-                    auto column = cql3::util::maybe_quote(name);
-                    options["target"] = validator->is_collection()
-                                    ? type + "(" + column + ")"
-                                    : column;
-                }
-                if (index_kind) {
-                    // Origin assumes index_name is always set, so let's do the same
-                    builder.with_index(index_metadata(index_name, options, *index_kind, index_metadata::is_local_index::no));
-                }
-
-                data_type column_name_type = [&] {
-                    if (is_static_compact && kind == column_kind::regular_column) {
-                        return db::schema_tables::parse_type(comparator);
-                    }
-                    return utf8_type;
-                }();
-                auto column_name = [&] {
-                    try {
-                        return column_name_type->from_string(name);
-                    } catch (marshal_exception&) {
-                        // #2597: Scylla < 2.0 writes names in serialized form, try to recover
-                        column_name_type->validate(to_bytes_view(name));
-                        return to_bytes(name);
-                    }
-                }();
-                builder.with_column_ordered(column_definition(std::move(column_name), std::move(validator), kind, component_index));
-            }
-
-            if (is_static_compact) {
-                builder.set_regular_column_name_type(db::schema_tables::parse_type(comparator));
-            }
-
-            if (td.has("gc_grace_seconds")) {
-                builder.set_gc_grace_seconds(td.get_as<int32_t>("gc_grace_seconds"));
-            }
-            if (td.has("min_compaction_threshold")) {
-                builder.set_min_compaction_threshold(td.get_as<int32_t>("min_compaction_threshold"));
-            }
-            if (td.has("max_compaction_threshold")) {
-                builder.set_max_compaction_threshold(td.get_as<int32_t>("max_compaction_threshold"));
-            }
-            if (td.has("comment")) {
-                builder.set_comment(td.get_as<sstring>("comment"));
-            }
-            if (td.has("memtable_flush_period_in_ms")) {
-                builder.set_memtable_flush_period(td.get_as<int32_t>("memtable_flush_period_in_ms"));
-            }
-            if (td.has("caching")) {
-                builder.set_caching_options(caching_options::from_sstring(td.get_as<sstring>("caching")));
-            }
-            if (td.has("default_time_to_live")) {
-                builder.set_default_time_to_live(gc_clock::duration(td.get_as<int32_t>("default_time_to_live")));
-            }
-            if (td.has("speculative_retry")) {
-                builder.set_speculative_retry(td.get_as<sstring>("speculative_retry"));
-            }
-            if (td.has("compaction_strategy_class")) {
-                auto strategy = td.get_as<sstring>("compaction_strategy_class");
-                try {
-                    builder.set_compaction_strategy(compaction::compaction_strategy::type(strategy));
-                } catch (const exceptions::configuration_exception& e) {
-                    // If compaction strategy class isn't supported, fallback to incremental.
-                    mlogger.warn("Falling back to incremental compaction strategy after the problem: {}", e.what());
-                    builder.set_compaction_strategy(compaction::compaction_strategy_type::incremental);
-                }
-            }
-            if (td.has("compaction_strategy_options")) {
-                sstring strategy_options_str = td.get_as<sstring>("compaction_strategy_options");
-                builder.set_compaction_strategy_options(rjson::parse_to_map<std::map<sstring, sstring>>(std::string_view(strategy_options_str)));
-            }
-            auto comp_param = td.get_as<sstring>("compression_parameters");
-            compression_parameters cp(rjson::parse_to_map<std::map<sstring, sstring>>(std::string_view(comp_param)));
-            builder.set_compressor_params(cp);
-
-            if (td.has("min_index_interval")) {
-                builder.set_min_index_interval(td.get_as<int32_t>("min_index_interval"));
-            } else if (td.has("index_interval")) { // compatibility
-                builder.set_min_index_interval(td.get_as<int32_t>("index_interval"));
-            }
-            if (td.has("max_index_interval")) {
-                builder.set_max_index_interval(td.get_as<int32_t>("max_index_interval"));
-            }
-            if (td.has("bloom_filter_fp_chance")) {
-                builder.set_bloom_filter_fp_chance(td.get_as<double>("bloom_filter_fp_chance"));
-            } else {
-                builder.set_bloom_filter_fp_chance(builder.get_bloom_filter_fp_chance());
-            }
-            if (td.has("dropped_columns")) {
-                auto map = td.get_map<sstring, int64_t>("dropped_columns");
-                for (auto&& e : map) {
-                    builder.without_column(e.first, api::timestamp_type(e.second));
-                };
-            }
-
-            // ignore version. we're transient
-            if (!triggers->empty()) {
-                throw unsupported_feature("triggers");
-            }
-
-            dst.tables.emplace_back(table{timestamp, builder.build() });
-        });
-    }
-
-    future<> read_tables(keyspace& dst) {
-        auto query = fmt_query("SELECT columnfamily_name, writeTime(type) AS timestamp FROM {}.{} WHERE keyspace_name = ?",
-                        db::system_keyspace::legacy::COLUMNFAMILIES);
-        return _qp.execute_internal(query, {dst.name}, cql3::query_processor::cache_internal::yes).then([this, &dst](result_set_type result) {
-            return parallel_for_each(*result, [this, &dst](row_type& row) {
-                return read_table(dst, row.get_as<sstring>("columnfamily_name"), row.get_as<time_point>("timestamp"));
-            }).finally([result] {});
-        });
-    }
-
-    future<time_point> read_type_timestamp(keyspace& dst, sstring type_name) {
-        // TODO: Unfortunately there is not a single REGULAR column in system.schema_usertypes, so annoyingly we cannot
-        // use the writeTime() CQL function, and must resort to a lower level.
-        // Origin digs up the actual cells of target partition and gets timestamp from there.
-        // We should do the same, but g-dam that's messy. Lets give back dung value for now.
-        return make_ready_future<time_point>(dst.timestamp);
-    }
-
-    future<> read_types(keyspace& dst) {
-        auto query = fmt_query("SELECT * FROM {}.{} WHERE keyspace_name = ?", db::system_keyspace::legacy::USERTYPES);
-        return _qp.execute_internal(query, {dst.name}, cql3::query_processor::cache_internal::yes).then([this, &dst](result_set_type result) {
-            return parallel_for_each(*result, [this, &dst](row_type& row) {
-                auto name = row.get_blob_unfragmented("type_name");
-                auto columns = row.get_list<bytes>("field_names");
-                auto types = row.get_list<sstring>("field_types");
-                std::vector<data_type> field_types;
-                for (auto&& value : types) {
-                    field_types.emplace_back(db::schema_tables::parse_type(value));
-                }
-                auto ut = user_type_impl::get_instance(dst.name, name, columns, field_types, false);
-                return read_type_timestamp(dst, value_cast<sstring>(utf8_type->deserialize(name))).then([ut = std::move(ut), &dst](time_point timestamp) {
-                    dst.types.emplace_back(type{timestamp, ut});
-                });
-            }).finally([result] {});
-        });
-    }
-
-    future<> read_functions(keyspace& dst) {
-        auto query = fmt_query("SELECT * FROM {}.{} WHERE keyspace_name = ?", db::system_keyspace::legacy::FUNCTIONS);
-        return _qp.execute_internal(query, {dst.name}, cql3::query_processor::cache_internal::yes).then([](result_set_type result) {
-            if (!result->empty()) {
-                throw unsupported_feature("functions");
-            }
-        });
-    }
-
-    future<> read_aggregates(keyspace& dst) {
-        auto query = fmt_query("SELECT * FROM {}.{} WHERE keyspace_name = ?", db::system_keyspace::legacy::AGGREGATES);
-        return _qp.execute_internal(query, {dst.name}, cql3::query_processor::cache_internal::yes).then([](result_set_type result) {
-            if (!result->empty()) {
-                throw unsupported_feature("aggregates");
-            }
-        });
-    }
-
-    future<keyspace> read_keyspace(sstring ks_name, bool durable_writes, sstring strategy_class, sstring strategy_options, time_point timestamp) {
-        auto map = rjson::parse_to_map<std::map<sstring, sstring>>(std::string_view(strategy_options));
-        map.emplace("class", std::move(strategy_class));
-        auto ks = ::make_lw_shared<keyspace>(keyspace{timestamp, std::move(ks_name), durable_writes, std::move(map) });
-
-        return read_tables(*ks).then([this, ks] {
-            //Collection<Type> types = readTypes(keyspaceName);
-            return read_types(*ks);
-        }).then([this, ks] {
-            return read_functions(*ks);
-        }).then([this, ks] {
-            return read_aggregates(*ks);
-        }).then([ks] {
-            return make_ready_future<keyspace>(std::move(*ks));
-        });
-    }
-
-    future<> read_all_keyspaces() {
-        static auto ks_filter = [](row_type& row) {
-            auto ks_name = row.get_as<sstring>("keyspace_name");
-            return ks_name != db::system_keyspace::NAME && ks_name != db::schema_tables::v3::NAME;
-        };
-
-        auto query = fmt_query("SELECT keyspace_name, durable_writes, strategy_options, strategy_class, writeTime(durable_writes) AS timestamp FROM {}.{}",
-                        db::system_keyspace::legacy::KEYSPACES);
-
-        return _qp.execute_internal(query, cql3::query_processor::cache_internal::yes).then([this](result_set_type result) {
-            auto i = boost::make_filter_iterator(ks_filter, result->begin(), result->end());
-            auto e = boost::make_filter_iterator(ks_filter, result->end(), result->end());
-            return parallel_for_each(i, e, [this](row_type& row) {
-                return read_keyspace(row.get_as<sstring>("keyspace_name")
-                                , row.get_as<bool>("durable_writes")
-                                , row.get_as<sstring>("strategy_class")
-                                , row.get_as<sstring>("strategy_options")
-                                , row.get_as<db_clock::time_point>("timestamp")
-                                ).then([this](keyspace ks) {
-                    _keyspaces.emplace_back(std::move(ks));
-                   });
-            }).finally([result] {});
-        });
-    }
-
-    future<> drop_legacy_tables() {
-        mlogger.info("Dropping legacy schema tables");
-        auto with_snapshot = !_keyspaces.empty();
-        for (const sstring& cfname : legacy_schema_tables) {
-            co_await replica::database::legacy_drop_table_on_all_shards(_db, _sys_ks, db::system_keyspace::NAME, cfname, with_snapshot);
-        }
-    }
-
-    future<> store_keyspaces_in_new_schema_tables() {
-        mlogger.info("Moving {} keyspaces from legacy schema tables to the new schema keyspace ({})",
-                        _keyspaces.size(), db::schema_tables::v3::NAME);
-
-        utils::chunked_vector<mutation> mutations;
-
-        for (auto& ks : _keyspaces) {
-            auto ksm = ::make_lw_shared<keyspace_metadata>(ks.name
-                            , ks.replication_params["class"] // TODO, make ksm like c3?
-                            , cql3::statements::property_definitions::to_extended_map(ks.replication_params)
-                            , std::nullopt
-                            , std::nullopt
-                            , ks.durable_writes);
-
-            // we want separate time stamps for tables/types, so cannot bulk them into the ksm.
-            for (auto&& m : db::schema_tables::make_create_keyspace_mutations(schema_features::full(), ksm, ks.timestamp.time_since_epoch().count(), false)) {
-                mutations.emplace_back(std::move(m));
-            }
-            for (auto& t : ks.tables) {
-                db::schema_tables::add_table_or_view_to_schema_mutation(t.metadata, t.timestamp.time_since_epoch().count(), true, mutations);
-            }
-            for (auto& t : ks.types) {
-                db::schema_tables::add_type_to_schema_mutation(t.metadata, t.timestamp.time_since_epoch().count(), mutations);
-            }
-        }
-        return _qp.proxy().mutate_locally(std::move(mutations), tracing::trace_state_ptr());
-    }
-
-    future<> flush_schemas() {
-        auto& db = _qp.db().real_database().container();
-        return replica::database::flush_tables_on_all_shards(db, db::schema_tables::all_table_infos(schema_features::full()));
-    }
-
-    future<> migrate() {
-        return read_all_keyspaces().then([this]() {
-            // write metadata to the new schema tables
-            return store_keyspaces_in_new_schema_tables()
-                                                .then(std::bind(&migrator::flush_schemas, this))
-                                                .then(std::bind(&migrator::drop_legacy_tables, this))
-                                                .then([] { mlogger.info("Completed migration of legacy schema tables"); });
-        });
-    }
-
-    sharded<service::storage_proxy>& _sp;
-    sharded<replica::database>& _db;
-    sharded<db::system_keyspace>& _sys_ks;
-    cql3::query_processor& _qp;
-    std::vector<keyspace> _keyspaces;
-};
-
-const std::unordered_set<sstring> migrator::legacy_schema_tables = {
-                db::system_keyspace::legacy::KEYSPACES,
-                db::system_keyspace::legacy::COLUMNFAMILIES,
-                db::system_keyspace::legacy::COLUMNS,
-                db::system_keyspace::legacy::TRIGGERS,
-                db::system_keyspace::legacy::USERTYPES,
-                db::system_keyspace::legacy::FUNCTIONS,
-                db::system_keyspace::legacy::AGGREGATES,
-};
-
-}
-}
-
-future<>
-db::legacy_schema_migrator::migrate(sharded<service::storage_proxy>& sp, sharded<replica::database>& db, sharded<db::system_keyspace>& sys_ks, cql3::query_processor& qp) {
-    return do_with(migrator(sp, db, sys_ks, qp), std::bind(&migrator::migrate, std::placeholders::_1));
-}
-
--- a/db/legacy_schema_migrator.hh
+++ b/db/legacy_schema_migrator.hh
@@ -1,37 +0,0 @@
-/*
- * Modified by ScyllaDB
- * Copyright (C) 2017-present ScyllaDB
- */
-
-/*
- * SPDX-License-Identifier: (LicenseRef-ScyllaDB-Source-Available-1.0 and Apache-2.0)
- */
-
-#pragma once
-
-#include <seastar/core/future.hh>
-#include <seastar/core/sharded.hh>
-
-#include "seastarx.hh"
-
-namespace replica {
-class database;
-}
-
-namespace cql3 {
-class query_processor;
-}
-
-namespace service {
-class storage_proxy;
-}
-
-namespace db {
-class system_keyspace;
-
-namespace legacy_schema_migrator {
-
-future<> migrate(sharded<service::storage_proxy>&, sharded<replica::database>& db, sharded<db::system_keyspace>& sys_ks, cql3::query_processor&);
-
-}
-}
--- a/db/partition_snapshot_row_cursor.hh
+++ b/db/partition_snapshot_row_cursor.hh
@@ -542,6 +542,7 @@ public:
    // Returns the range tombstone for the key range adjacent to the cursor's position from the side of smaller keys.
    // Excludes the range for the row itself. That information is returned by range_tombstone_for_row().
    // It's possible that range_tombstone() is empty and range_tombstone_for_row() is not empty.
+    // Note that this is different from the meaning of rows_entry::range_tombstone(), which includes the row itself.
    tombstone range_tombstone() const { return _range_tombstone; }

    // Can be called when cursor is pointing at a row.
--- a/db/row_cache.cc
+++ b/db/row_cache.cc
@@ -1287,6 +1287,15 @@ row_cache::row_cache(schema_ptr s, snapshot_source src, cache_tracker& tracker,
    , _partitions(dht::raw_token_less_comparator{})
    , _underlying(src())
    , _snapshot_source(std::move(src))
+    , _update_section(abstract_formatter([this] (fmt::context& ctx) {
+        fmt::format_to(ctx.out(), "cache.update {}.{}", _schema->ks_name(), _schema->cf_name());
+    }))
+    , _populate_section(abstract_formatter([this] (fmt::context& ctx) {
+        fmt::format_to(ctx.out(), "cache.populate {}.{}", _schema->ks_name(), _schema->cf_name());
+    }))
+    , _read_section(abstract_formatter([this] (fmt::context& ctx) {
+        fmt::format_to(ctx.out(), "cache.read {}.{}", _schema->ks_name(), _schema->cf_name());
+    }))
 {
  try {
    with_allocator(_tracker.allocator(), [this, cont] {
--- a/db/schema_applier.cc
+++ b/db/schema_applier.cc
@@ -1121,7 +1121,7 @@ future<> schema_applier::commit() {
    // Run func first on shard 0
    // to allow "seeding" of the effective_replication_map
    // with a new e_r_m instance.
-    scylla_assert(this_shard_id() == 0);
+    SCYLLA_ASSERT(this_shard_id() == 0);
    commit_on_shard(sharded_db.local());
    co_await sharded_db.invoke_on_others([this] (replica::database& db) {
        commit_on_shard(db);
--- a/db/schema_tables.cc
+++ b/db/schema_tables.cc
@@ -404,10 +404,7 @@ const std::unordered_set<table_id>& schema_tables_holding_schema_mutations() {
                computed_columns(),
                dropped_columns(),
                indexes(),
-                scylla_tables(),
-                db::system_keyspace::legacy::column_families(),
-                db::system_keyspace::legacy::columns(),
-                db::system_keyspace::legacy::triggers()}) {
+                scylla_tables()}) {
            SCYLLA_ASSERT(s->clustering_key_size() > 0);
            auto&& first_column_name = s->clustering_column_at(0).name_as_text();
            SCYLLA_ASSERT(first_column_name == "table_name"
@@ -2840,26 +2837,6 @@ void check_no_legacy_secondary_index_mv_schema(replica::database& db, const view
 }


-namespace legacy {
-
-table_schema_version schema_mutations::digest() const {
-    md5_hasher h;
-    const db::schema_features no_features;
-    db::schema_tables::feed_hash_for_schema_digest(h, _columnfamilies, no_features);
-    db::schema_tables::feed_hash_for_schema_digest(h, _columns, no_features);
-    return table_schema_version(utils::UUID_gen::get_name_UUID(h.finalize()));
-}
-
-future<schema_mutations> read_table_mutations(sharded<service::storage_proxy>& proxy,
-    sstring keyspace_name, sstring table_name, schema_ptr s)
-{
-    mutation cf_m = co_await read_schema_partition_for_table(proxy, s, keyspace_name, table_name);
-    mutation col_m = co_await read_schema_partition_for_table(proxy, db::system_keyspace::legacy::columns(), keyspace_name, table_name);
-    co_return schema_mutations{std::move(cf_m), std::move(col_m)};
-}
-
-} // namespace legacy
-
 static auto GET_COLUMN_MAPPING_QUERY = format("SELECT column_name, clustering_order, column_name_bytes, kind, position, type FROM system.{} WHERE cf_id = ? AND schema_version = ?",
    db::schema_tables::SCYLLA_TABLE_SCHEMA_HISTORY);

--- a/db/schema_tables.hh
+++ b/db/schema_tables.hh
@@ -155,24 +155,6 @@ schema_ptr scylla_table_schema_history();
 const std::unordered_set<table_id>& schema_tables_holding_schema_mutations();
 }

-namespace legacy {
-
-class schema_mutations {
-    mutation _columnfamilies;
-    mutation _columns;
-public:
-    schema_mutations(mutation columnfamilies, mutation columns)
-        : _columnfamilies(std::move(columnfamilies))
-        , _columns(std::move(columns))
-    { }
-    table_schema_version digest() const;
-};
-
-future<schema_mutations> read_table_mutations(sharded<service::storage_proxy>& proxy,
-    sstring keyspace_name, sstring table_name, schema_ptr s);
-
-}
-
 struct qualified_name {
    sstring keyspace_name;
    sstring table_name;
--- a/db/size_estimates_virtual_reader.cc
+++ b/db/size_estimates_virtual_reader.cc
@@ -187,7 +187,7 @@ static future<std::vector<token_range>> get_local_ranges(replica::database& db,
        auto ranges = db.get_token_metadata().get_primary_ranges_for(std::move(tokens));
        std::vector<token_range> local_ranges;
        auto to_bytes = [](const std::optional<dht::token_range::bound>& b) {
-            scylla_assert(b);
+            SCYLLA_ASSERT(b);
            return utf8_type->decompose(b->value().to_sstring());
        };
        // We merge the ranges to be compatible with how Cassandra shows it's size estimates table.
--- a/db/snapshot-ctl.cc
+++ b/db/snapshot-ctl.cc
@@ -65,7 +65,7 @@ future<> snapshot_ctl::run_snapshot_modify_operation(noncopyable_function<future
    });
 }

-future<> snapshot_ctl::take_snapshot(sstring tag, std::vector<sstring> keyspace_names, skip_flush sf) {
+future<> snapshot_ctl::take_snapshot(sstring tag, std::vector<sstring> keyspace_names, snapshot_options opts) {
    if (tag.empty()) {
        throw std::runtime_error("You must supply a snapshot name.");
    }
@@ -74,21 +74,21 @@ future<> snapshot_ctl::take_snapshot(sstring tag, std::vector<sstring> keyspace_
        std::ranges::copy(_db.local().get_keyspaces() | std::views::keys, std::back_inserter(keyspace_names));
    };

-    return run_snapshot_modify_operation([tag = std::move(tag), keyspace_names = std::move(keyspace_names), sf, this] () mutable {
-        return do_take_snapshot(std::move(tag), std::move(keyspace_names), sf);
+    return run_snapshot_modify_operation([tag = std::move(tag), keyspace_names = std::move(keyspace_names), opts, this] () mutable {
+        return do_take_snapshot(std::move(tag), std::move(keyspace_names), opts);
    });
 }

-future<> snapshot_ctl::do_take_snapshot(sstring tag, std::vector<sstring> keyspace_names, skip_flush sf) {
+future<> snapshot_ctl::do_take_snapshot(sstring tag, std::vector<sstring> keyspace_names, snapshot_options opts) {
    co_await coroutine::parallel_for_each(keyspace_names, [tag, this] (const auto& ks_name) {
        return check_snapshot_not_exist(ks_name, tag);
    });
-    co_await coroutine::parallel_for_each(keyspace_names, [this, tag = std::move(tag), sf] (const auto& ks_name) {
-        return replica::database::snapshot_keyspace_on_all_shards(_db, ks_name, tag, bool(sf));
+    co_await coroutine::parallel_for_each(keyspace_names, [this, tag = std::move(tag), opts] (const auto& ks_name) {
+        return replica::database::snapshot_keyspace_on_all_shards(_db, ks_name, tag, opts);
    });
 }

-future<> snapshot_ctl::take_column_family_snapshot(sstring ks_name, std::vector<sstring> tables, sstring tag, skip_flush sf) {
+future<> snapshot_ctl::take_column_family_snapshot(sstring ks_name, std::vector<sstring> tables, sstring tag, snapshot_options opts) {
    if (ks_name.empty()) {
        throw std::runtime_error("You must supply a keyspace name");
    }
@@ -99,14 +99,14 @@ future<> snapshot_ctl::take_column_family_snapshot(sstring ks_name, std::vector<
        throw std::runtime_error("You must supply a snapshot name.");
    }

-    return run_snapshot_modify_operation([this, ks_name = std::move(ks_name), tables = std::move(tables), tag = std::move(tag), sf] () mutable {
-        return do_take_column_family_snapshot(std::move(ks_name), std::move(tables), std::move(tag), sf);
+    return run_snapshot_modify_operation([this, ks_name = std::move(ks_name), tables = std::move(tables), tag = std::move(tag), opts] () mutable {
+        return do_take_column_family_snapshot(std::move(ks_name), std::move(tables), std::move(tag), opts);
    });
 }

-future<> snapshot_ctl::do_take_column_family_snapshot(sstring ks_name, std::vector<sstring> tables, sstring tag, skip_flush sf) {
+future<> snapshot_ctl::do_take_column_family_snapshot(sstring ks_name, std::vector<sstring> tables, sstring tag, snapshot_options opts) {
    co_await check_snapshot_not_exist(ks_name, tag, tables);
-    co_await replica::database::snapshot_tables_on_all_shards(_db, ks_name, std::move(tables), std::move(tag), bool(sf));
+    co_await replica::database::snapshot_tables_on_all_shards(_db, ks_name, std::move(tables), std::move(tag), opts);
 }

 future<> snapshot_ctl::clear_snapshot(sstring tag, std::vector<sstring> keyspace_names, sstring cf_name) {
--- a/db/snapshot-ctl.hh
+++ b/db/snapshot-ctl.hh
@@ -38,10 +38,13 @@ class backup_task_impl;

 } // snapshot namespace

+struct snapshot_options {
+    bool skip_flush = false;
+    bool use_sstable_identifier = false;
+};
+
 class snapshot_ctl : public peering_sharded_service<snapshot_ctl> {
 public:
-    using skip_flush = bool_class<class skip_flush_tag>;
-
    struct table_snapshot_details {
        int64_t total;
        int64_t live;
@@ -70,8 +73,8 @@ public:
     *
     * @param tag the tag given to the snapshot; may not be null or empty
     */
-    future<> take_snapshot(sstring tag, skip_flush sf = skip_flush::no) {
-        return take_snapshot(tag, {}, sf);
+    future<> take_snapshot(sstring tag, snapshot_options opts = {}) {
+        return take_snapshot(tag, {}, opts);
    }

    /**
@@ -80,7 +83,7 @@ public:
     * @param tag the tag given to the snapshot; may not be null or empty
     * @param keyspace_names the names of the keyspaces to snapshot; empty means "all"
     */
-    future<> take_snapshot(sstring tag, std::vector<sstring> keyspace_names, skip_flush sf = skip_flush::no);
+    future<> take_snapshot(sstring tag, std::vector<sstring> keyspace_names, snapshot_options opts = {});

    /**
     * Takes the snapshot of multiple tables. A snapshot name must be specified.
@@ -89,7 +92,7 @@ public:
     * @param tables a vector of tables names to snapshot
     * @param tag the tag given to the snapshot; may not be null or empty
     */
-    future<> take_column_family_snapshot(sstring ks_name, std::vector<sstring> tables, sstring tag, skip_flush sf = skip_flush::no);
+    future<> take_column_family_snapshot(sstring ks_name, std::vector<sstring> tables, sstring tag, snapshot_options opts = {});

    /**
     * Remove the snapshot with the given name from the given keyspaces.
@@ -127,8 +130,8 @@ private:

    friend class snapshot::backup_task_impl;

-    future<> do_take_snapshot(sstring tag, std::vector<sstring> keyspace_names, skip_flush sf = skip_flush::no);
-    future<> do_take_column_family_snapshot(sstring ks_name, std::vector<sstring> tables, sstring tag, skip_flush sf = skip_flush::no);
+    future<> do_take_snapshot(sstring tag, std::vector<sstring> keyspace_names, snapshot_options opts = {});
+    future<> do_take_column_family_snapshot(sstring ks_name, std::vector<sstring> tables, sstring tag, snapshot_options opts = {});
 };

 }
--- a/db/system_distributed_keyspace.cc
+++ b/db/system_distributed_keyspace.cc
@@ -231,7 +231,7 @@ static schema_ptr get_current_service_levels(data_dictionary::database db) {
 }

 static schema_ptr get_updated_service_levels(data_dictionary::database db, bool workload_prioritization_enabled) {
-    scylla_assert(this_shard_id() == 0);
+    SCYLLA_ASSERT(this_shard_id() == 0);
    auto schema = get_current_service_levels(db);
    schema_builder b(schema);
    for (const auto& col : new_service_levels_columns(workload_prioritization_enabled)) {
--- a/db/system_keyspace.cc
+++ b/db/system_keyspace.cc
@@ -137,6 +137,8 @@ namespace {
                system_keyspace::ROLE_PERMISSIONS,
                system_keyspace::DICTS,
                system_keyspace::VIEW_BUILDING_TASKS,
+                // repair tasks
+                system_keyspace::REPAIR_TASKS,
            };
            if (ks_name == system_keyspace::NAME && tables.contains(cf_name)) {
                props.is_group0_table = true;
@@ -462,6 +464,24 @@ schema_ptr system_keyspace::repair_history() {
    return schema;
 }

+schema_ptr system_keyspace::repair_tasks() {
+    static thread_local auto schema = [] {
+        auto id = generate_legacy_id(NAME, REPAIR_TASKS);
+        return schema_builder(NAME, REPAIR_TASKS, std::optional(id))
+            .with_column("task_uuid", uuid_type, column_kind::partition_key)
+            .with_column("operation", utf8_type, column_kind::clustering_key)
+            // First and last token of the tablet
+            .with_column("first_token", long_type, column_kind::clustering_key)
+            .with_column("last_token", long_type, column_kind::clustering_key)
+            .with_column("timestamp", timestamp_type)
+            .with_column("table_uuid", uuid_type, column_kind::static_column)
+            .set_comment("Record tablet repair tasks")
+            .with_hash_version()
+            .build();
+    }();
+    return schema;
+}
+
 schema_ptr system_keyspace::built_indexes() {
    static thread_local auto built_indexes = [] {
        schema_builder builder(generate_legacy_id(NAME, BUILT_INDEXES), NAME, BUILT_INDEXES,
@@ -847,8 +867,6 @@ schema_ptr system_keyspace::corrupt_data() {
    return corrupt_data;
 }

-static constexpr auto schema_gc_grace = std::chrono::duration_cast<std::chrono::seconds>(days(7)).count();
-
 /*static*/ schema_ptr system_keyspace::scylla_local() {
    static thread_local auto scylla_local = [] {
        schema_builder builder(generate_legacy_id(NAME, SCYLLA_LOCAL), NAME, SCYLLA_LOCAL,
@@ -1360,289 +1378,6 @@ schema_ptr system_keyspace::role_permissions() {
    return schema;
 }

-schema_ptr system_keyspace::legacy::hints() {
-    static thread_local auto schema = [] {
-        schema_builder builder(generate_legacy_id(NAME, HINTS), NAME, HINTS,
-        // partition key
-        {{"target_id", uuid_type}},
-        // clustering key
-        {{"hint_id", timeuuid_type}, {"message_version", int32_type}},
-        // regular columns
-        {{"mutation", bytes_type}},
-        // static columns
-        {},
-        // regular column name type
-        utf8_type,
-        // comment
-        "*DEPRECATED* hints awaiting delivery"
-       );
-       builder.set_gc_grace_seconds(0);
-       builder.set_compaction_strategy(compaction::compaction_strategy_type::incremental);
-       builder.set_compaction_strategy_options({{"enabled", "false"}});
-       builder.with(schema_builder::compact_storage::yes);
-       builder.with_hash_version();
-       return builder.build();
-    }();
-    return schema;
-}
-
-schema_ptr system_keyspace::legacy::batchlog() {
-    static thread_local auto schema = [] {
-        schema_builder builder(generate_legacy_id(NAME, BATCHLOG), NAME, BATCHLOG,
-        // partition key
-        {{"id", uuid_type}},
-        // clustering key
-        {},
-        // regular columns
-        {{"data", bytes_type}, {"version", int32_type}, {"written_at", timestamp_type}},
-        // static columns
-        {},
-        // regular column name type
-        utf8_type,
-        // comment
-        "*DEPRECATED* batchlog entries"
-       );
-       builder.set_gc_grace_seconds(0);
-       builder.set_compaction_strategy(compaction::compaction_strategy_type::incremental);
-       builder.set_compaction_strategy_options({{"min_threshold", "2"}});
-       builder.with(schema_builder::compact_storage::no);
-       builder.with_hash_version();
-       return builder.build();
-    }();
-    return schema;
-}
-
-schema_ptr system_keyspace::legacy::keyspaces() {
-    static thread_local auto schema = [] {
-        schema_builder builder(generate_legacy_id(NAME, KEYSPACES), NAME, KEYSPACES,
-        // partition key
-        {{"keyspace_name", utf8_type}},
-        // clustering key
-        {},
-        // regular columns
-        {
-         {"durable_writes", boolean_type},
-         {"strategy_class", utf8_type},
-         {"strategy_options", utf8_type}
-        },
-        // static columns
-        {},
-        // regular column name type
-        utf8_type,
-        // comment
-        "*DEPRECATED* keyspace definitions"
-       );
-       builder.set_gc_grace_seconds(schema_gc_grace);
-       builder.with(schema_builder::compact_storage::yes);
-       builder.with_hash_version();
-       return builder.build();
-    }();
-    return schema;
-}
-
-schema_ptr system_keyspace::legacy::column_families() {
-    static thread_local auto schema = [] {
-        schema_builder builder(generate_legacy_id(NAME, COLUMNFAMILIES), NAME, COLUMNFAMILIES,
-        // partition key
-        {{"keyspace_name", utf8_type}},
-        // clustering key
-        {{"columnfamily_name", utf8_type}},
-        // regular columns
-        {
-         {"bloom_filter_fp_chance", double_type},
-         {"caching", utf8_type},
-         {"cf_id", uuid_type},
-         {"comment", utf8_type},
-         {"compaction_strategy_class", utf8_type},
-         {"compaction_strategy_options", utf8_type},
-         {"comparator", utf8_type},
-         {"compression_parameters", utf8_type},
-         {"default_time_to_live", int32_type},
-         {"default_validator", utf8_type},
-         {"dropped_columns",  map_type_impl::get_instance(utf8_type, long_type, true)},
-         {"gc_grace_seconds", int32_type},
-         {"is_dense", boolean_type},
-         {"key_validator", utf8_type},
-         {"max_compaction_threshold", int32_type},
-         {"max_index_interval", int32_type},
-         {"memtable_flush_period_in_ms", int32_type},
-         {"min_compaction_threshold", int32_type},
-         {"min_index_interval", int32_type},
-         {"speculative_retry", utf8_type},
-         {"subcomparator", utf8_type},
-         {"type", utf8_type},
-         // The following 4 columns are only present up until 2.1.8 tables
-         {"key_aliases", utf8_type},
-         {"value_alias", utf8_type},
-         {"column_aliases", utf8_type},
-         {"index_interval", int32_type},},
-        // static columns
-        {},
-        // regular column name type
-        utf8_type,
-        // comment
-        "*DEPRECATED* table definitions"
-       );
-       builder.set_gc_grace_seconds(schema_gc_grace);
-       builder.with(schema_builder::compact_storage::no);
-       builder.with_hash_version();
-       return builder.build();
-    }();
-    return schema;
-}
-
-schema_ptr system_keyspace::legacy::columns() {
-    static thread_local auto schema = [] {
-        schema_builder builder(generate_legacy_id(NAME, COLUMNS), NAME, COLUMNS,
-        // partition key
-        {{"keyspace_name", utf8_type}},
-        // clustering key
-        {{"columnfamily_name", utf8_type}, {"column_name", utf8_type}},
-        // regular columns
-        {
-            {"component_index", int32_type},
-            {"index_name", utf8_type},
-            {"index_options", utf8_type},
-            {"index_type", utf8_type},
-            {"type", utf8_type},
-            {"validator", utf8_type},
-        },
-        // static columns
-        {},
-        // regular column name type
-        utf8_type,
-        // comment
-        "column definitions"
-        );
-        builder.set_gc_grace_seconds(schema_gc_grace);
-        builder.with(schema_builder::compact_storage::no);
-        builder.with_hash_version();
-        return builder.build();
-    }();
-    return schema;
-}
-
-schema_ptr system_keyspace::legacy::triggers() {
-    static thread_local auto schema = [] {
-        schema_builder builder(generate_legacy_id(NAME, TRIGGERS), NAME, TRIGGERS,
-        // partition key
-        {{"keyspace_name", utf8_type}},
-        // clustering key
-        {{"columnfamily_name", utf8_type}, {"trigger_name", utf8_type}},
-        // regular columns
-        {
-            {"trigger_options",  map_type_impl::get_instance(utf8_type, utf8_type, true)},
-        },
-        // static columns
-        {},
-        // regular column name type
-        utf8_type,
-        // comment
-        "trigger definitions"
-        );
-        builder.set_gc_grace_seconds(schema_gc_grace);
-        builder.with(schema_builder::compact_storage::no);
-        builder.with_hash_version();
-        return builder.build();
-    }();
-    return schema;
-}
-
-schema_ptr system_keyspace::legacy::usertypes() {
-    static thread_local auto schema = [] {
-        schema_builder builder(generate_legacy_id(NAME, USERTYPES), NAME, USERTYPES,
-        // partition key
-        {{"keyspace_name", utf8_type}},
-        // clustering key
-        {{"type_name", utf8_type}},
-        // regular columns
-        {
-            {"field_names", list_type_impl::get_instance(utf8_type, true)},
-            {"field_types", list_type_impl::get_instance(utf8_type, true)},
-        },
-        // static columns
-        {},
-        // regular column name type
-        utf8_type,
-        // comment
-        "user defined type definitions"
-        );
-        builder.set_gc_grace_seconds(schema_gc_grace);
-        builder.with(schema_builder::compact_storage::no);
-        builder.with_hash_version();
-        return builder.build();
-    }();
-    return schema;
-}
-
-schema_ptr system_keyspace::legacy::functions() {
-    /**
-     * Note: we have our own "legacy" version of this table (in schema_tables),
-     * but it is (afaik) not used, and differs slightly from the origin one.
-     * This is based on the origin schema, since we're more likely to encounter
-     * installations of that to migrate, rather than our own (if we dont use the table).
-     */
-    static thread_local auto schema = [] {
-        schema_builder builder(generate_legacy_id(NAME, FUNCTIONS), NAME, FUNCTIONS,
-        // partition key
-        {{"keyspace_name", utf8_type}},
-        // clustering key
-        {{"function_name", utf8_type},{"signature", list_type_impl::get_instance(utf8_type, false)}},
-        // regular columns
-        {
-            {"argument_names", list_type_impl::get_instance(utf8_type, true)},
-            {"argument_types", list_type_impl::get_instance(utf8_type, true)},
-            {"body", utf8_type},
-            {"language", utf8_type},
-            {"return_type", utf8_type},
-            {"called_on_null_input", boolean_type},
-        },
-        // static columns
-        {},
-        // regular column name type
-        utf8_type,
-        // comment
-        "*DEPRECATED* user defined type definitions"
-        );
-        builder.set_gc_grace_seconds(schema_gc_grace);
-        builder.with(schema_builder::compact_storage::no);
-        builder.with_hash_version();
-        return builder.build();
-    }();
-    return schema;
-}
-
-schema_ptr system_keyspace::legacy::aggregates() {
-    static thread_local auto schema = [] {
-        schema_builder builder(generate_legacy_id(NAME, AGGREGATES), NAME, AGGREGATES,
-        // partition key
-        {{"keyspace_name", utf8_type}},
-        // clustering key
-        {{"aggregate_name", utf8_type},{"signature", list_type_impl::get_instance(utf8_type, false)}},
-        // regular columns
-        {
-            {"argument_types", list_type_impl::get_instance(utf8_type, true)},
-            {"final_func", utf8_type},
-            {"initcond", bytes_type},
-            {"return_type", utf8_type},
-            {"state_func", utf8_type},
-            {"state_type", utf8_type},
-        },
-        // static columns
-        {},
-        // regular column name type
-        utf8_type,
-        // comment
-        "*DEPRECATED* user defined aggregate definition"
-        );
-        builder.set_gc_grace_seconds(schema_gc_grace);
-        builder.with(schema_builder::compact_storage::no);
-        builder.with_hash_version();
-        return builder.build();
-    }();
-    return schema;
-}
-
 schema_ptr system_keyspace::dicts() {
    static thread_local auto schema = [] {
        auto id = generate_legacy_id(NAME, DICTS);
@@ -2596,6 +2331,7 @@ std::vector<schema_ptr> system_keyspace::all_tables(const db::config& cfg) {
                    corrupt_data(),
                    scylla_local(), db::schema_tables::scylla_table_schema_history(),
                    repair_history(),
+                    repair_tasks(),
                    v3::views_builds_in_progress(), v3::built_views(),
                    v3::scylla_views_builds_in_progress(),
                    v3::truncated(),
@@ -2615,13 +2351,6 @@ std::vector<schema_ptr> system_keyspace::all_tables(const db::config& cfg) {
    if (cfg.check_experimental(db::experimental_features_t::feature::KEYSPACE_STORAGE_OPTIONS)) {
        r.insert(r.end(), {sstables_registry()});
    }
-    // legacy schema
-    r.insert(r.end(), {
-                    // TODO: once we migrate hints/batchlog and add converter
-                    // legacy::hints(), legacy::batchlog(),
-                    legacy::keyspaces(), legacy::column_families(),
-                    legacy::columns(), legacy::triggers(), legacy::usertypes(),
-                    legacy::functions(), legacy::aggregates(), });

    return r;
 }
@@ -2844,6 +2573,32 @@ future<> system_keyspace::get_repair_history(::table_id table_id, repair_history
    });
 }

+future<utils::chunked_vector<canonical_mutation>> system_keyspace::get_update_repair_task_mutations(const repair_task_entry& entry, api::timestamp_type ts) {
+    // By default, repair task entries expire after 10 days; this should be enough time for management tools to query them.
+    constexpr int ttl = 10 * 24 * 3600;
+    sstring req = format("INSERT INTO system.{} (task_uuid, operation, first_token, last_token, timestamp, table_uuid) VALUES (?, ?, ?, ?, ?, ?) USING TTL {}", REPAIR_TASKS, ttl);
+    auto muts = co_await _qp.get_mutations_internal(req, internal_system_query_state(), ts,
+            {entry.task_uuid.uuid(), repair_task_operation_to_string(entry.operation),
+            entry.first_token, entry.last_token, entry.timestamp, entry.table_uuid.uuid()});
+    utils::chunked_vector<canonical_mutation> cmuts = {muts.begin(), muts.end()};
+    co_return cmuts;
+}
+
+future<> system_keyspace::get_repair_task(tasks::task_id task_uuid, repair_task_consumer f) {
+    sstring req = format("SELECT * from system.{} WHERE task_uuid = {}", REPAIR_TASKS, task_uuid);
+    co_await _qp.query_internal(req, [&f] (const cql3::untyped_result_set::row& row) mutable -> future<stop_iteration> {
+        repair_task_entry ent;
+        ent.task_uuid = tasks::task_id(row.get_as<utils::UUID>("task_uuid"));
+        ent.operation = repair_task_operation_from_string(row.get_as<sstring>("operation"));
+        ent.first_token = row.get_as<int64_t>("first_token");
+        ent.last_token = row.get_as<int64_t>("last_token");
+        ent.timestamp = row.get_as<db_clock::time_point>("timestamp");
+        ent.table_uuid = ::table_id(row.get_as<utils::UUID>("table_uuid"));
+        co_await f(std::move(ent));
+        co_return stop_iteration::no;
+    });
+}
+
 future<gms::generation_type> system_keyspace::increment_and_get_generation() {
    auto req = format("SELECT gossip_generation FROM system.{} WHERE key='{}'", LOCAL, LOCAL);
    auto rs = co_await _qp.execute_internal(req, cql3::query_processor::cache_internal::yes);
@@ -4015,4 +3770,35 @@ future<> system_keyspace::apply_mutation(mutation m) {
    return _qp.proxy().mutate_locally(m, {}, db::commitlog::force_sync(m.schema()->static_props().wait_for_sync_to_commitlog), db::no_timeout);
 }

+// These names are persisted in system tables, so they must not be changed.
+static const std::unordered_map<system_keyspace::repair_task_operation, sstring> repair_task_operation_to_name = {
+    {system_keyspace::repair_task_operation::requested, "requested"},
+    {system_keyspace::repair_task_operation::finished, "finished"},
+};
+
+static const std::unordered_map<sstring, system_keyspace::repair_task_operation> repair_task_operation_from_name = std::invoke([] {
+    std::unordered_map<sstring, system_keyspace::repair_task_operation> result;
+    for (auto&& [v, s] : repair_task_operation_to_name) {
+        result.emplace(s, v);
+    }
+    return result;
+});
+
+sstring system_keyspace::repair_task_operation_to_string(system_keyspace::repair_task_operation op) {
+    auto i = repair_task_operation_to_name.find(op);
+    if (i == repair_task_operation_to_name.end()) {
+        on_internal_error(slogger, format("Invalid repair task operation: {}", static_cast<int>(op)));
+    }
+    return i->second;
+}
+
+system_keyspace::repair_task_operation system_keyspace::repair_task_operation_from_string(const sstring& name) {
+    return repair_task_operation_from_name.at(name);
+}
+
 } // namespace db
+
+auto fmt::formatter<db::system_keyspace::repair_task_operation>::format(const db::system_keyspace::repair_task_operation& op, fmt::format_context& ctx) const
+        -> decltype(ctx.out()) {
+    return fmt::format_to(ctx.out(), "{}", db::system_keyspace::repair_task_operation_to_string(op));
+}
--- a/db/system_keyspace.hh
+++ b/db/system_keyspace.hh
@@ -57,6 +57,8 @@ namespace paxos {
 struct topology_request_state;

 class group0_guard;
+
+class raft_group0_client;
 }

 namespace netw {
@@ -184,6 +186,7 @@ public:
    static constexpr auto RAFT_SNAPSHOTS = "raft_snapshots";
    static constexpr auto RAFT_SNAPSHOT_CONFIG = "raft_snapshot_config";
    static constexpr auto REPAIR_HISTORY = "repair_history";
+    static constexpr auto REPAIR_TASKS = "repair_tasks";
    static constexpr auto GROUP0_HISTORY = "group0_history";
    static constexpr auto DISCOVERY = "discovery";
    static constexpr auto BROADCAST_KV_STORE = "broadcast_kv_store";
@@ -241,28 +244,6 @@ public:
        static schema_ptr cdc_local();
    };

-    struct legacy {
-        static constexpr auto HINTS = "hints";
-        static constexpr auto BATCHLOG = "batchlog";
-        static constexpr auto KEYSPACES = "schema_keyspaces";
-        static constexpr auto COLUMNFAMILIES = "schema_columnfamilies";
-        static constexpr auto COLUMNS = "schema_columns";
-        static constexpr auto TRIGGERS = "schema_triggers";
-        static constexpr auto USERTYPES = "schema_usertypes";
-        static constexpr auto FUNCTIONS = "schema_functions";
-        static constexpr auto AGGREGATES = "schema_aggregates";
-
-        static schema_ptr keyspaces();
-        static schema_ptr column_families();
-        static schema_ptr columns();
-        static schema_ptr triggers();
-        static schema_ptr usertypes();
-        static schema_ptr functions();
-        static schema_ptr aggregates();
-        static schema_ptr hints();
-        static schema_ptr batchlog();
-    };
-
    // Partition estimates for a given range of tokens.
    struct range_estimates {
        schema_ptr schema;
@@ -282,6 +263,7 @@ public:
    static schema_ptr raft();
    static schema_ptr raft_snapshots();
    static schema_ptr repair_history();
+    static schema_ptr repair_tasks();
    static schema_ptr group0_history();
    static schema_ptr discovery();
    static schema_ptr broadcast_kv_store();
@@ -420,6 +402,22 @@ public:
        int64_t range_end;
    };

+    enum class repair_task_operation {
+        requested,
+        finished,
+    };
+    static sstring repair_task_operation_to_string(repair_task_operation op);
+    static repair_task_operation repair_task_operation_from_string(const sstring& name);
+
+    struct repair_task_entry {
+        tasks::task_id task_uuid;
+        repair_task_operation operation;
+        int64_t first_token;
+        int64_t last_token;
+        db_clock::time_point timestamp;
+        table_id table_uuid;
+    };
+
    struct topology_requests_entry {
        utils::UUID id;
        utils::UUID initiating_host;
@@ -441,6 +439,10 @@ public:
    using repair_history_consumer = noncopyable_function<future<>(const repair_history_entry&)>;
    future<> get_repair_history(table_id, repair_history_consumer f);

+    future<utils::chunked_vector<canonical_mutation>> get_update_repair_task_mutations(const repair_task_entry& entry, api::timestamp_type ts);
+    using repair_task_consumer = noncopyable_function<future<>(const repair_task_entry&)>;
+    future<> get_repair_task(tasks::task_id task_uuid, repair_task_consumer f);
+
    future<> save_truncation_record(const replica::column_family&, db_clock::time_point truncated_at, db::replay_position);
    future<replay_positions> get_truncated_positions(table_id);
    future<> drop_truncation_rp_records();
@@ -748,3 +750,8 @@ public:
 }; // class system_keyspace

 } // namespace db
+
+template <>
+struct fmt::formatter<db::system_keyspace::repair_task_operation> : fmt::formatter<string_view> {
+    auto format(const db::system_keyspace::repair_task_operation&, fmt::format_context& ctx) const -> decltype(ctx.out());
+};
--- a/db/view/row_locking.cc
+++ b/db/view/row_locking.cc
@@ -153,14 +153,14 @@ row_locker::unlock(const dht::decorated_key* pk, bool partition_exclusive,
            mylog.error("column_family::local_base_lock_holder::~local_base_lock_holder() can't find lock for partition", *pk);
            return;
        }
-        scylla_assert(&pli->first == pk);
+        SCYLLA_ASSERT(&pli->first == pk);
        if (cpk) {
            auto rli = pli->second._row_locks.find(*cpk);
            if (rli == pli->second._row_locks.end()) {
                mylog.error("column_family::local_base_lock_holder::~local_base_lock_holder() can't find lock for row", *cpk);
                return;
            }
-            scylla_assert(&rli->first == cpk);
+            SCYLLA_ASSERT(&rli->first == cpk);
            mylog.debug("releasing {} lock for row {} in partition {}", (row_exclusive ? "exclusive" : "shared"), *cpk, *pk);
            auto& lock = rli->second;
            if (row_exclusive) {
--- a/db/view/view.cc
+++ b/db/view/view.cc
@@ -1744,6 +1744,115 @@ bool should_generate_view_updates_on_this_shard(const schema_ptr& base, const lo
        && std::ranges::contains(shards, this_shard_id());
 }

+static endpoints_to_update get_view_natural_endpoint_vnodes(
+        locator::host_id me,
+        std::vector<std::reference_wrapper<const locator::node>> base_nodes,
+        std::vector<std::reference_wrapper<const locator::node>> view_nodes,
+        locator::endpoint_dc_rack my_location,
+        const locator::network_topology_strategy* network_topology,
+        replica::cf_stats& cf_stats) {
+    using node_vector = std::vector<std::reference_wrapper<const locator::node>>;
+    node_vector base_endpoints, view_endpoints;
+    auto& my_datacenter = my_location.dc;
+
+    auto process_candidate = [&] (node_vector& nodes, std::reference_wrapper<const locator::node> node) {
+        if (!network_topology || node.get().dc() == my_datacenter) {
+            nodes.emplace_back(node);
+        }
+    };
+
+    for (auto&& base_node : base_nodes) {
+        process_candidate(base_endpoints, base_node);
+    }
+
+    for (auto&& view_node : view_nodes) {
+        auto it = std::ranges::find(base_endpoints, view_node.get().host_id(), std::mem_fn(&locator::node::host_id));
+        // If this base replica is also one of the view replicas, we use
+        // ourselves as the view replica.
+        // We don't return an extra endpoint, as it's only needed when
+        // using tablets, and this function handles only vnodes.
+        if (view_node.get().host_id() == me && it != base_endpoints.end()) {
+            return {.natural_endpoint = me};
+        }
+
+        // We have to remove any endpoint which is shared between the base
+        // and the view, as it will select itself and throw off the counts
+        // otherwise.
+        if (it != base_endpoints.end()) {
+            base_endpoints.erase(it);
+        } else if (!network_topology || view_node.get().dc() == my_datacenter) {
+            view_endpoints.push_back(view_node);
+        }
+    }
+
+    auto base_it = std::ranges::find(base_endpoints, me, std::mem_fn(&locator::node::host_id));
+    if (base_it == base_endpoints.end()) {
+        // This node is not a base replica of this key, so we return empty
+        // FIXME: This case shouldn't happen, and if it happens, a view update
+        // would be lost.
+        ++cf_stats.total_view_updates_on_wrong_node;
+        vlogger.warn("Could not find {} in base_endpoints={}", me,
+                base_endpoints | std::views::transform(std::mem_fn(&locator::node::host_id)));
+        return {};
+    }
+    size_t idx = base_it - base_endpoints.begin();
+    return {.natural_endpoint = view_endpoints[idx].get().host_id()};
+}
+
+static std::optional<locator::host_id> get_unpaired_view_endpoint(
+        std::vector<std::reference_wrapper<const locator::node>> base_nodes,
+        std::vector<std::reference_wrapper<const locator::node>> view_nodes,
+        replica::cf_stats& cf_stats) {
+    std::unordered_set<locator::endpoint_dc_rack> base_dc_racks;
+    for (auto&& base_node : base_nodes) {
+        if (base_dc_racks.contains(base_node.get().dc_rack())) {
+            // We can't do rack-aware pairing if there are multiple replicas in the same rack.
+            ++cf_stats.total_view_updates_failed_pairing;
+            vlogger.warn("Can't perform base-view pairing in this topology. There are multiple base table replicas in the same dc/rack({}/{}):",
+                    base_node.get().dc(), base_node.get().rack());
+            return std::nullopt;
+        }
+        base_dc_racks.insert(base_node.get().dc_rack());
+    }
+
+    std::unordered_set<locator::endpoint_dc_rack> paired_view_dc_racks;
+    std::unordered_map<locator::endpoint_dc_rack, locator::host_id> unpaired_view_dc_rack_replicas;
+    for (auto&& view_node : view_nodes) {
+        if (paired_view_dc_racks.contains(view_node.get().dc_rack()) || unpaired_view_dc_rack_replicas.contains(view_node.get().dc_rack())) {
+            // We can't do rack-aware pairing if there are multiple replicas in the same rack.
+            ++cf_stats.total_view_updates_failed_pairing;
+            vlogger.warn("Can't perform base-view pairing in this topology. There are multiple view table replicas in the same dc/rack({}/{}):",
+                    view_node.get().dc(), view_node.get().rack());
+            return std::nullopt;
+        }
+        // Track whether this view replica's rack is paired with a base replica or left unpaired
+        if (base_dc_racks.contains(view_node.get().dc_rack())) {
+            paired_view_dc_racks.insert(view_node.get().dc_rack());
+        } else {
+            unpaired_view_dc_rack_replicas.insert({view_node.get().dc_rack(), view_node.get().host_id()});
+        }
+    }
+
+    if (unpaired_view_dc_rack_replicas.size() > 0) {
+        // There are view replicas that can't be paired with any base replica
+        // This can happen as a result of an RF change when the view replica finishes streaming
+        // before the base replica.
+        // Because of this, a view replica might not get paired with any base replica, so we need
+        // to send an additional update to it.
+        ++cf_stats.total_view_updates_due_to_replica_count_mismatch;
+        auto extra_replica = unpaired_view_dc_rack_replicas.begin()->second;
+        unpaired_view_dc_rack_replicas.erase(unpaired_view_dc_rack_replicas.begin());
+        if (unpaired_view_dc_rack_replicas.size() > 0) {
+            // We only expect one extra replica to appear due to an RF change. If there are more, that's an error,
+            // but we'll still perform updates to the paired and last replicas to minimize degradation.
+            vlogger.warn("There are too many view endpoints for base-view pairing. View updates may get lost on view_endpoints={}",
+                    unpaired_view_dc_rack_replicas | std::views::values);
+        }
+        return extra_replica;
+    }
+    return std::nullopt;
+}
+
 // Calculate the node ("natural endpoint") to which this node should send
 // a view update.
 //
@@ -1756,29 +1865,19 @@ bool should_generate_view_updates_on_this_shard(const schema_ptr& base, const lo
 // of this function is to find, assuming that this node is one of the base
 // replicas for a given partition, the paired view replica.
 //
-// In the past, we used an optimization called "self-pairing" that if a single
-// node was both a base replica and a view replica for a write, the pairing is
-// modified so that this node would send the update to itself. This self-
-// pairing optimization could cause the pairing to change after view ranges
-// are moved between nodes, so currently we only use it if
-// use_legacy_self_pairing is set to true. When using tablets - where range
-// movements are common - it is strongly recommended to set it to false.
+// When using vnodes, we have an optimization called "self-pairing" - if a single
+// node is both a base replica and a view replica for a write, the pairing is
+// modified so that this node sends the update to itself and this node is removed
+// from the lists of nodes paired by index. This self-pairing optimization can
+// cause the pairing to change after view ranges are moved between nodes.
 //
 // If the keyspace's replication strategy is a NetworkTopologyStrategy,
 // we pair only nodes in the same datacenter.
 //
-// When use_legacy_self_pairing is enabled, if one of the base replicas
-// also happens to be a view replica, it is paired with itself
-// (with the other nodes paired by order in the list
-// after taking this node out).
-//
-// If the table uses tablets and the replication strategy is NetworkTopologyStrategy
-// and the replication factor in the node's datacenter is a multiple of the number
-// of racks in the datacenter, then pairing is rack-aware.  In this case,
-// all racks have the same number of replicas, and those are never migrated
-// outside their racks. Therefore, the base replicas are naturally paired with the
-// view replicas that are in the same rack, based on the ordinal position.
-// Note that typically, there is a single replica per rack and pairing is trivial.
+// If the table uses tablets, then pairing is rack-aware. In this case, in each
+// rack where we have a base replica there is also one replica of each view tablet.
+// Therefore, the base replicas are naturally paired with the view replicas that
+// are in the same rack.
 //
 // If the assumption that the given base token belongs to this replica
 // does not hold, we return an empty optional.
@@ -1806,19 +1905,12 @@ endpoints_to_update get_view_natural_endpoint(
        const locator::abstract_replication_strategy& replication_strategy,
        const dht::token& base_token,
        const dht::token& view_token,
-        bool use_legacy_self_pairing,
-        bool use_tablets_rack_aware_view_pairing,
+        bool use_tablets,
        replica::cf_stats& cf_stats) {
    auto& topology = base_erm->get_token_metadata_ptr()->get_topology();
    auto& view_topology = view_erm->get_token_metadata_ptr()->get_topology();
    auto& my_location = topology.get_location(me);
-    auto& my_datacenter = my_location.dc;
    auto* network_topology = dynamic_cast<const locator::network_topology_strategy*>(&replication_strategy);
-    auto rack_aware_pairing = use_tablets_rack_aware_view_pairing && network_topology;
-    bool simple_rack_aware_pairing = false;
-    using node_vector = std::vector<std::reference_wrapper<const locator::node>>;
-    node_vector orig_base_endpoints, orig_view_endpoints;
-    node_vector base_endpoints, view_endpoints;

    auto resolve = [&] (const locator::topology& topology, const locator::host_id& ep, bool is_view) -> const locator::node& {
        if (auto* np = topology.find_node(ep)) {
@@ -1829,6 +1921,7 @@ endpoints_to_update get_view_natural_endpoint(

    // We need to use get_replicas() for pairing to be stable in case base or view tablet
    // is rebuilding a replica which has left the ring. get_natural_endpoints() filters such replicas.
+    using node_vector = std::vector<std::reference_wrapper<const locator::node>>;
    auto base_nodes = base_erm->get_replicas(base_token) | std::views::transform([&] (const locator::host_id& ep) -> const locator::node& {
        return resolve(topology, ep, false);
    }) | std::ranges::to<node_vector>();
@@ -1852,231 +1945,43 @@ endpoints_to_update get_view_natural_endpoint(
                // note that the recursive call will not recurse again because leaving_base is in base_nodes.
                auto leaving_base = it->get().host_id();
                return get_view_natural_endpoint(leaving_base, base_erm, view_erm, replication_strategy, base_token,
-                        view_token, use_legacy_self_pairing, use_tablets_rack_aware_view_pairing, cf_stats);
+                        view_token, use_tablets, cf_stats);
            }
        }
    }

-    std::function<bool(const locator::node&)> is_candidate;
-    if (network_topology) {
-        is_candidate = [&] (const locator::node& node) { return node.dc() == my_datacenter; };
-    } else {
-        is_candidate = [&] (const locator::node&) { return true; };
-    }
-    auto process_candidate = [&] (node_vector& nodes, std::reference_wrapper<const locator::node> node) {
-        if (is_candidate(node)) {
-            nodes.emplace_back(node);
-        }
-    };
-
-    for (auto&& base_node : base_nodes) {
-        process_candidate(base_endpoints, base_node);
+    if (!use_tablets) {
+        return get_view_natural_endpoint_vnodes(
+                me,
+                base_nodes,
+                view_nodes,
+                my_location,
+                network_topology,
+                cf_stats);
    }

-    if (use_legacy_self_pairing) {
-        for (auto&& view_node : view_nodes) {
-            auto it = std::ranges::find(base_endpoints, view_node.get().host_id(), std::mem_fn(&locator::node::host_id));
-            // If this base replica is also one of the view replicas, we use
-            // ourselves as the view replica.
-            // We don't return an extra endpoint, as it's only needed when
-            // using tablets (so !use_legacy_self_pairing)
-            if (view_node.get().host_id() == me && it != base_endpoints.end()) {
-                return {.natural_endpoint = me};
-            }
-
-            // We have to remove any endpoint which is shared between the base
-            // and the view, as it will select itself and throw off the counts
-            // otherwise.
-            if (it != base_endpoints.end()) {
-                base_endpoints.erase(it);
-            } else if (is_candidate(view_node)) {
-                view_endpoints.push_back(view_node);
-            }
-        }
-    } else {
-        for (auto&& view_node : view_nodes) {
-            process_candidate(view_endpoints, view_node);
+    std::optional<locator::host_id> paired_replica;
+    for (auto&& view_node : view_nodes) {
+        if (view_node.get().dc_rack() == my_location) {
+            paired_replica = view_node.get().host_id();
+            break;
        }
    }
-
-    // Try optimizing for simple rack-aware pairing
-    // If the numbers of base and view replica differ, that means an RF change is taking place
-    // and we can't use simple rack-aware pairing.
-    if (rack_aware_pairing && base_endpoints.size() == view_endpoints.size()) {
-        auto dc_rf = network_topology->get_replication_factor(my_datacenter);
-        const auto& racks = topology.get_datacenter_rack_nodes().at(my_datacenter);
-        // Simple rack-aware pairing is possible when the datacenter replication factor
-        // is a multiple of the number of racks in the datacenter.
-        if (dc_rf % racks.size() == 0) {
-            simple_rack_aware_pairing = true;
-            size_t rack_rf = dc_rf / racks.size();
-            // If any rack doesn't have enough nodes to satisfy the per-rack rf
-            // simple rack-aware pairing is disabled.
-            for (const auto& [rack, nodes] : racks) {
-                if (nodes.size() < rack_rf) {
-                    simple_rack_aware_pairing = false;
-                    break;
-                }
-            }
-        }
-        if (dc_rf != base_endpoints.size()) {
-            // If the datacenter replication factor is not equal to the number of base replicas,
-            // we're in progress of a RF change and we can't use simple rack-aware pairing.
-            simple_rack_aware_pairing = false;
-        }
-        if (simple_rack_aware_pairing) {
-            std::erase_if(base_endpoints, [&] (const locator::node& node) { return node.dc_rack() != my_location; });
-            std::erase_if(view_endpoints, [&] (const locator::node& node) { return node.dc_rack() != my_location; });
-        }
+    if (paired_replica && base_nodes.size() == view_nodes.size()) {
+        // We don't need to find any extra replicas, so we can return early
+        return {.natural_endpoint = paired_replica};
    }
-
-    orig_base_endpoints = base_endpoints;
-    orig_view_endpoints = view_endpoints;
-
-    // For the complex rack_aware_pairing case, nodes are already filtered by datacenter
-    // Use best-match, for the minimum number of base and view replicas in each rack,
-    // and ordinal match for the rest.
-    std::optional<std::reference_wrapper<const locator::node>> paired_replica;
-    if (rack_aware_pairing && !simple_rack_aware_pairing) {
-        struct indexed_replica {
-            size_t idx;
-            std::reference_wrapper<const locator::node> node;
-        };
-        std::unordered_map<sstring, std::vector<indexed_replica>> base_racks, view_racks;
-
-        // First, index all replicas by rack
-        auto index_replica_set = [] (std::unordered_map<sstring, std::vector<indexed_replica>>& racks, const node_vector& replicas) {
-            size_t idx = 0;
-            for (const auto& r: replicas) {
-                racks[r.get().rack()].emplace_back(idx++, r);
-            }
-        };
-        index_replica_set(base_racks, base_endpoints);
-        index_replica_set(view_racks, view_endpoints);
-
-        // Try optimistically pairing `me` first
-        const auto& my_base_replicas = base_racks[my_location.rack];
-        auto base_it = std::ranges::find(my_base_replicas, me, [] (const indexed_replica& ir) { return ir.node.get().host_id(); });
-        if (base_it == my_base_replicas.end()) {
-            return {};
-        }
-        const auto& my_view_replicas = view_racks[my_location.rack];
-        size_t idx = base_it - my_base_replicas.begin();
-        if (idx < my_view_replicas.size()) {
-            if (orig_view_endpoints.size() <= orig_base_endpoints.size()) {
-                return {.natural_endpoint = my_view_replicas[idx].node.get().host_id()};
-            } else {
-                // If the number of view replicas is larger than the number of base replicas,
-                // we need to find the unpaired view replica, so we can't return yet.
-                paired_replica = my_view_replicas[idx].node;
-            }
-        }
-
-        // Collect all unpaired base and view replicas,
-        // where the number of replicas in the base rack is different than the respective view rack
-        std::vector<indexed_replica> unpaired_base_replicas, unpaired_view_replicas;
-        for (const auto& [rack, base_replicas] : base_racks) {
-            const auto& view_replicas = view_racks[rack];
-            for (auto i = view_replicas.size(); i < base_replicas.size(); ++i) {
-                unpaired_base_replicas.emplace_back(base_replicas[i]);
-            }
-        }
-        for (const auto& [rack, view_replicas] : view_racks) {
-            const auto& base_replicas = base_racks[rack];
-            for (auto i = base_replicas.size(); i < view_replicas.size(); ++i) {
-                unpaired_view_replicas.emplace_back(view_replicas[i]);
-            }
-        }
-
-        // Sort by the original ordinality, and copy the sorted results
-        // back into {base,view}_endpoints, for backward compatible processing below.
-        std::ranges::sort(unpaired_base_replicas, std::less(), std::mem_fn(&indexed_replica::idx));
-        base_endpoints.clear();
-        std::ranges::transform(unpaired_base_replicas, std::back_inserter(base_endpoints), std::mem_fn(&indexed_replica::node));
-
-        std::ranges::sort(unpaired_view_replicas, std::less(), std::mem_fn(&indexed_replica::idx));
-        view_endpoints.clear();
-        std::ranges::transform(unpaired_view_replicas, std::back_inserter(view_endpoints), std::mem_fn(&indexed_replica::node));
-    }
-
-    auto base_it = std::ranges::find(base_endpoints, me, std::mem_fn(&locator::node::host_id));
-    if (!paired_replica && base_it == base_endpoints.end()) {
-        // This node is not a base replica of this key, so we return empty
-        // FIXME: This case shouldn't happen, and if it happens, a view update
-        // would be lost.
-        ++cf_stats.total_view_updates_on_wrong_node;
-        vlogger.warn("Could not find {} in base_endpoints={}", me,
-                orig_base_endpoints | std::views::transform(std::mem_fn(&locator::node::host_id)));
-        return {};
-    }
-    size_t idx = base_it - base_endpoints.begin();
-    std::optional<std::reference_wrapper<const locator::node>> no_pairing_replica;
-    if (!paired_replica && idx >= view_endpoints.size()) {
-        // There are fewer view replicas than base replicas
-        // FIXME: This might still happen when reducing replication factor with tablets,
-        // see https://github.com/scylladb/scylladb/issues/21492
-        ++cf_stats.total_view_updates_failed_pairing;
-        vlogger.warn("Could not pair {}: rack_aware={} base_endpoints={} view_endpoints={}", me,
-                rack_aware_pairing ? (simple_rack_aware_pairing ? "simple" : "complex") : "none",
-                orig_base_endpoints | std::views::transform(std::mem_fn(&locator::node::host_id)),
-                orig_view_endpoints | std::views::transform(std::mem_fn(&locator::node::host_id)));
-        return {};
-    } else if (base_endpoints.size() < view_endpoints.size()) {
-        // There are fewer base replicas than view replicas.
-        // This can happen as a result of an RF change when the view replica finishes streaming
-        // before the base replica.
-        // Because of this, a view replica might not get paired with any base replica, so we need
-        // to send an additional update to it.
-        ++cf_stats.total_view_updates_due_to_replica_count_mismatch;
-        no_pairing_replica = view_endpoints.back();
-        if (base_endpoints.size() < view_endpoints.size() - 1) {
-            // We only expect one extra replica to appear due to an RF change. If there's more, that's an error,
-            // but we'll still perform updates to the paired and last replicas to minimize degradation.
-            vlogger.warn("There are too many view endpoints for base-view pairing. View updates may get lost on view_endpoints={}",
-                    std::span(view_endpoints.begin() + base_endpoints.size(), view_endpoints.end() - 1) | std::views::transform(std::mem_fn(&locator::node::host_id)));
-        }
-    }
-
    if (!paired_replica) {
-        paired_replica = view_endpoints[idx];
+        // We couldn't find any view replica in our rack
+        ++cf_stats.total_view_updates_failed_pairing;
+        vlogger.warn("Could not find a view replica in the same rack as base replica {} for base_endpoints={} view_endpoints={}",
+                me,
+                base_nodes | std::views::transform(std::mem_fn(&locator::node::host_id)),
+                view_nodes | std::views::transform(std::mem_fn(&locator::node::host_id)));
    }
-    if (!no_pairing_replica && base_nodes.size() < view_nodes.size()) {
-        // This can happen when the view replica with no pairing is in another DC.
-        // We need to send an update to it if there are no base replicas in that DC yet,
-        // as it won't receive updates otherwise.
-        std::unordered_set<sstring> dcs_with_base_replicas;
-        for (const auto& base_node : base_nodes) {
-            dcs_with_base_replicas.insert(base_node.get().dc());
-        }
-        for (const auto& view_node : view_nodes) {
-            if (!dcs_with_base_replicas.contains(view_node.get().dc())) {
-                ++cf_stats.total_view_updates_due_to_replica_count_mismatch;
-                no_pairing_replica = view_node;
-                break;
-            }
-        }
-    }
-    // https://github.com/scylladb/scylladb/issues/19439
-    // With tablets, a node being replaced might transition to "left" state
-    // but still be kept as a replica.
-    // As of writing this hints are not prepared to handle nodes that are left
-    // but are still replicas. Therefore, there is no other sensible option
-    // right now but to give up attempt to send the update or write a hint
-    // to the paired, permanently down replica.
-    // We use the same workaround for the extra replica.
-    auto return_host_id_if_not_left = [] (const auto& replica) -> std::optional<locator::host_id> {
-        if (!replica) {
-            return std::nullopt;
-        }
-        const auto& node = replica->get();
-        if (!node.left()) {
-            return node.host_id();
-        } else {
-            return std::nullopt;
-        }
-    };
-    return {.natural_endpoint = return_host_id_if_not_left(paired_replica),
-            .endpoint_with_no_pairing = return_host_id_if_not_left(no_pairing_replica)};
+    std::optional<locator::host_id> no_pairing_replica = get_unpaired_view_endpoint(base_nodes, view_nodes, cf_stats);
+    return {.natural_endpoint = paired_replica,
+            .endpoint_with_no_pairing = no_pairing_replica};
 }

 static future<> apply_to_remote_endpoints(service::storage_proxy& proxy, locator::effective_replication_map_ptr ermp,
@@ -2136,12 +2041,6 @@ future<> view_update_generator::mutate_MV(
 {
    auto& ks = _db.find_keyspace(base->ks_name());
    auto& replication = ks.get_replication_strategy();
-    // We set legacy self-pairing for old vnode-based tables (for backward
-    // compatibility), and unset it for tablets - where range movements
-    // are more frequent and backward compatibility is less important.
-    // TODO: Maybe allow users to set use_legacy_self_pairing explicitly
-    // on a view, like we have the synchronous_updates_flag.
-    bool use_legacy_self_pairing = !ks.uses_tablets();
    std::unordered_map<table_id, locator::effective_replication_map_ptr> erms;
    auto get_erm = [&] (table_id id) {
        auto it = erms.find(id);
@@ -2154,10 +2053,6 @@ future<> view_update_generator::mutate_MV(
    for (const auto& mut : view_updates) {
        (void)get_erm(mut.s->id());
    }
-    // Enable rack-aware view updates pairing for tablets
-    // when the cluster feature is enabled so that all replicas agree
-    // on the pairing algorithm.
-    bool use_tablets_rack_aware_view_pairing = _db.features().tablet_rack_aware_view_pairing && ks.uses_tablets();
    auto me = base_ermp->get_topology().my_host_id();
    static constexpr size_t max_concurrent_updates = 128;
    co_await utils::get_local_injector().inject("delay_before_get_view_natural_endpoint", 8000ms);
@@ -2165,7 +2060,7 @@ future<> view_update_generator::mutate_MV(
        auto view_token = dht::get_token(*mut.s, mut.fm.key());
        auto view_ermp = erms.at(mut.s->id());
        auto [target_endpoint, no_pairing_endpoint] = get_view_natural_endpoint(me, base_ermp, view_ermp, replication, base_token, view_token,
-                use_legacy_self_pairing, use_tablets_rack_aware_view_pairing, cf_stats);
+                ks.uses_tablets(), cf_stats);
        auto remote_endpoints = view_ermp->get_pending_replicas(view_token);
        auto memory_units = seastar::make_lw_shared<db::timeout_semaphore_units>(pending_view_update_memory_units.split(memory_usage_of(mut)));
        if (no_pairing_endpoint) {
--- a/db/view/view.hh
+++ b/db/view/view.hh
@@ -305,8 +305,7 @@ endpoints_to_update get_view_natural_endpoint(
    const locator::abstract_replication_strategy& replication_strategy,
    const dht::token& base_token,
    const dht::token& view_token,
-    bool use_legacy_self_pairing,
-    bool use_tablets_basic_rack_aware_view_pairing,
+    bool use_tablets,
    replica::cf_stats& cf_stats);

 /// Verify that the provided keyspace is eligible for storing materialized views.
--- a/dist/common/sysconfig/scylla-node-exporter
+++ b/dist/common/sysconfig/scylla-node-exporter
@@ -1 +1 @@
-SCYLLA_NODE_EXPORTER_ARGS="--collector.interrupts --no-collector.hwmon --no-collector.bcache --no-collector.btrfs --no-collector.fibrechannel --no-collector.infiniband --no-collector.ipvs --no-collector.nfs --no-collector.nfsd --no-collector.powersupplyclass --no-collector.rapl --no-collector.tapestats --no-collector.thermal_zone --no-collector.udp_queues --no-collector.zfs"
+SCYLLA_NODE_EXPORTER_ARGS="--collector.interrupts --collector.ethtool.metrics-include='(bw_in_allowance_exceeded|bw_out_allowance_exceeded|conntrack_allowance_exceeded|conntrack_allowance_available|linklocal_allowance_exceeded)' --collector.ethtool --no-collector.hwmon --no-collector.bcache --no-collector.btrfs --no-collector.fibrechannel --no-collector.infiniband --no-collector.ipvs --no-collector.nfs --no-collector.nfsd --no-collector.powersupplyclass --no-collector.rapl --no-collector.tapestats --no-collector.thermal_zone --no-collector.udp_queues --no-collector.zfs"
--- a/docs/dev/scylla_assert_conversion.md
+++ b/docs/dev/scylla_assert_conversion.md
@@ -1,198 +0,0 @@
-# SCYLLA_ASSERT to scylla_assert() Conversion Guide
-
-## Overview
-
-This document tracks the conversion of `SCYLLA_ASSERT` to the new `scylla_assert()` macro based on `on_internal_error()`. The new macro throws exceptions instead of crashing the process, preventing cluster-wide crashes and loss of availability.
-
-## Status Summary
-
- **Total SCYLLA_ASSERT usages**: ~1307 (including tests)
- **Non-test usages**: ~886
- **Unsafe conversions (noexcept)**: ~187
- **Unsafe conversions (destructors)**: ~36
- **Safe conversions possible**: ~668
- **Converted so far**: 112
-
-## Safe vs Unsafe Contexts
-
-### Safe to Convert ✓
- Regular functions (non-noexcept)
- Coroutine functions (returning `future<T>`)
- Member functions without noexcept specifier
- Functions where exception propagation is acceptable
-
-### Unsafe to Convert ✗
-1. **noexcept functions** - throwing exceptions from noexcept causes `std::terminate()`
-2. **Destructors** - destructors are implicitly noexcept
-3. **noexcept lambdas and callbacks**
-4. **Code with explicit exception-safety requirements** that cannot handle exceptions
-
-## Files with Unsafe Conversions
-
-### Files with SCYLLA_ASSERT in noexcept contexts (examples)
-
-1. **reader_concurrency_semaphore.cc**
-   - Lines with noexcept functions containing SCYLLA_ASSERT
-   - Must remain as SCYLLA_ASSERT
-
-2. **db/large_data_handler.cc**
-   - Line 86: `maybe_delete_large_data_entries()` - marked noexcept but contains SCYLLA_ASSERT
-   - Analysis shows this is actually safe (not truly noexcept)
-
-3. **db/row_cache.cc**
-   - Multiple SCYLLA_ASSERT usages in noexcept member functions
-
-4. **db/schema_tables.cc**
-   - SCYLLA_ASSERT in noexcept contexts
-
-5. **raft/server.cc**
-   - Multiple noexcept functions with SCYLLA_ASSERT
-
-### Files with SCYLLA_ASSERT in destructors
-
-1. **reader_concurrency_semaphore.cc**
-   - Line 1116: SCYLLA_ASSERT in destructor
-
-2. **api/column_family.cc**
-   - Line 102: SCYLLA_ASSERT in destructor
-
-3. **utils/logalloc.cc**
-   - Line 1991: SCYLLA_ASSERT in destructor
-
-4. **utils/file_lock.cc**
-   - Lines 34, 36: SCYLLA_ASSERT in destructor
-
-5. **utils/disk_space_monitor.cc**
-   - Line 66: SCYLLA_ASSERT in destructor
-
-## Conversion Strategy
-
-### Phase 1: Infrastructure (Completed)
- Created `scylla_assert()` macro in `utils/assert.hh`
- Uses `on_internal_error()` for exception-based error handling
- Supports optional message parameters
-
-### Phase 2: Safe Conversions
-Convert SCYLLA_ASSERT to scylla_assert in contexts where:
- Function is not noexcept
- Not in a destructor
- Exception propagation is safe
-
-### Phase 3: Document Remaining Uses
-For contexts that cannot be converted:
- Add comments explaining why SCYLLA_ASSERT must remain
- Consider alternative approaches (e.g., using `on_fatal_internal_error()` in noexcept)
-
-## Converted Files
-
-### Completed Conversions
-
-1. **db/large_data_handler.cc** (3 conversions)
-   - Line 42: `maybe_record_large_partitions()`
-   - Line 86: `maybe_delete_large_data_entries()`
-   - Line 250: `delete_large_data_entries()`
-
-2. **db/large_data_handler.hh** (2 conversions)
-   - Line 83: `maybe_record_large_rows()`
-   - Line 103: `maybe_record_large_cells()`
-
-3. **db/schema_applier.cc** (1 conversion)
-   - Line 1124: `commit()` coroutine
-
-4. **db/system_distributed_keyspace.cc** (1 conversion)
-   - Line 234: `get_updated_service_levels()`
-
-5. **db/commitlog/commitlog_replayer.cc** (1 conversion)
-   - Line 168: `recover()` coroutine
-
-6. **db/view/row_locking.cc** (2 conversions)
-   - Line 156: `unlock()` - partition lock check
-   - Line 163: `unlock()` - row lock check
-
-7. **db/size_estimates_virtual_reader.cc** (1 conversion)
-   - Line 190: Lambda in `get_local_ranges()`
-
-8. **db/corrupt_data_handler.cc** (2 conversions)
-   - Line 78: `set_cell_raw` lambda
-   - Line 85: `set_cell` lambda
-
-9. **raft/tracker.cc** (2 conversions)
-   - Line 49: Switch default case with descriptive error
-   - Line 90: Switch default case with descriptive error
-
-10. **service/topology_coordinator.cc** (11 conversions)
-    - Line 363: Node lookup assertion in `retake_node()`
-    - Line 2313: Bootstrapping state ring check
-    - Line 2362: Replacing state ring check
-    - Line 2365: Normal nodes lookup assertion
-    - Line 2366: Node ring and state validation
-    - Line 3025: Join request ring check
-    - Line 3036: Leave request ring check
-    - Line 3049: Remove request ring check
-    - Line 3061: Replace request ring check
-    - Line 3166: Transition nodes empty check
-    - Line 4016: Barrier validation in `stop()`
-
-11. **service/storage_service.cc** (28 conversions, 3 unsafe kept as SCYLLA_ASSERT)
-    - Lines 603, 691, 857, 901, 969: Core service operations
-    - Lines 1523, 1575, 1844, 2086, 2170, 2195: Bootstrap and join operations
-    - Lines 2319, 2352, 2354: Replacement operations
-    - Lines 3003, 3028, 3228: Cluster join and drain operations
-    - Lines 3995, 4047, 4353: Decommission and removenode operations
-    - Lines 4473, 5787, 5834, 5958: CDC and topology change operations
-    - Lines 6490, 6491: Tablet streaming operations
-    - Line 7512: Join node response handler
-    - **Unsafe (kept as SCYLLA_ASSERT)**: Lines 3398, 5760, 5775 (noexcept functions)
-
-12. **sstables/** (58 conversions across 22 files)
-    - **sstables/trie/bti_node_reader.cc** (6): Node reading operations
-    - **sstables/mx/writer.cc** (6): MX format writing
-    - **sstables/sstable_set.cc** (5): SSTable set management
-    - **sstables/compressor.cc** (5): Compression/decompression
-    - **sstables/trie/trie_writer.hh** (4): Trie writing
-    - **sstables/downsampling.hh** (4): Downsampling operations
-    - **sstables/storage.{cc,hh}** (6): Storage operations
-    - **sstables/sstables_manager.{cc,hh}** (6): SSTable lifecycle management
-    - **sstables/trie/writer_node.{hh,impl.hh}** (4): Trie node writing
-    - **sstables/trie/bti_key_translation.cc** (2): Key translation
-    - **sstables/sstable_directory.cc** (2): Directory management
-    - **sstables/trie/trie_writer.cc** (1): Trie writer implementation
-    - **sstables/trie/trie_traversal.hh** (1): Trie traversal
-    - **sstables/sstables.cc** (1): Core SSTable operations
-    - **sstables/partition_index_cache.hh** (1): Index caching
-    - **sstables/generation_type.hh** (1): Generation management
-    - **sstables/compress.{cc,hh}** (2): Compression utilities
-    - **sstables/exceptions.hh** (1): Comment update
-
-## Testing
-
-### Manual Testing
-Created `test/manual/test_scylla_assert.cc` to verify:
- Passing assertions succeed
- Failing assertions throw exceptions
- Custom messages are properly formatted
-
-### Integration Testing
- Run existing test suite with converted assertions
- Verify no regressions in error handling
- Confirm exception propagation works correctly
-
-## Future Work
-
-1. **Automated Analysis Tool**
-   - Create tool to identify safe vs unsafe conversion contexts
-   - Generate reports of remaining conversions
-
-2. **Gradual Conversion**
-   - Convert additional safe usages incrementally
-   - Monitor for any unexpected issues
-
-3. **noexcept Review**
-   - Review functions marked noexcept that contain SCYLLA_ASSERT
-   - Consider if they should use `on_fatal_internal_error()` instead
-
-## References
-
- `utils/assert.hh` - Implementation of both SCYLLA_ASSERT and scylla_assert
- `utils/on_internal_error.hh` - Exception-based error handling infrastructure
- GitHub Issue: [Link to original issue tracking this work]
--- a/docs/dev/unsafe_scylla_assert_locations.md
+++ b/docs/dev/unsafe_scylla_assert_locations.md
@@ -1,614 +0,0 @@
-# Unsafe SCYLLA_ASSERT Locations
-
-This document lists specific locations where SCYLLA_ASSERT cannot be safely converted to scylla_assert().
-
-## Summary
-
- Files with noexcept SCYLLA_ASSERT: 50
- Files with destructor SCYLLA_ASSERT: 25
- Total unsafe SCYLLA_ASSERT in noexcept: 187
- Total unsafe SCYLLA_ASSERT in destructors: 36
-
-## SCYLLA_ASSERT in noexcept Functions
-
-### auth/cache.cc
-
- Line 118: `SCYLLA_ASSERT(this_shard_id() == 0);`
-
-Total: 1 usages
-
-### db/cache_mutation_reader.hh
-
- Line 309: `SCYLLA_ASSERT(sr->is_static_row());`
-
-Total: 1 usages
-
-### db/commitlog/commitlog.cc
-
- Line 531: `SCYLLA_ASSERT(!*this);`
- Line 544: `SCYLLA_ASSERT(!*this);`
- Line 662: `SCYLLA_ASSERT(_iter != _end);`
- Line 1462: `SCYLLA_ASSERT(i->second >= count);`
-
-Total: 4 usages
-
-### db/hints/manager.hh
-
- Line 167: `SCYLLA_ASSERT(_ep_managers.empty());`
-
-Total: 1 usages
-
-### db/partition_snapshot_row_cursor.hh
-
- Line 384: `SCYLLA_ASSERT(_latest_it);`
-
-Total: 1 usages
-
-### db/row_cache.cc
-
- Line 1365: `SCYLLA_ASSERT(it->is_last_dummy());`
-
-Total: 1 usages
-
-### db/schema_tables.cc
-
- Line 774: `SCYLLA_ASSERT(this_shard_id() == 0);`
-
-Total: 1 usages
-
-### db/view/view.cc
-
- Line 3623: `SCYLLA_ASSERT(thread::running_in_thread());`
-
-Total: 1 usages
-
-### gms/gossiper.cc
-
- Line 876: `SCYLLA_ASSERT(ptr->pid == _permit_id);`
-
-Total: 1 usages
-
-### locator/production_snitch_base.hh
-
- Line 77: `SCYLLA_ASSERT(_backreference != nullptr);`
- Line 82: `SCYLLA_ASSERT(_backreference != nullptr);`
- Line 87: `SCYLLA_ASSERT(_backreference != nullptr);`
-
-Total: 3 usages
-
-### locator/topology.cc
-
- Line 135: `SCYLLA_ASSERT(_shard == this_shard_id());`
-
-Total: 1 usages
-
-### mutation/counters.hh
-
- Line 314: `SCYLLA_ASSERT(_cell.is_live());`
- Line 315: `SCYLLA_ASSERT(!_cell.is_counter_update());`
-
-Total: 2 usages
-
-### mutation/mutation_partition_v2.hh
-
- Line 271: `SCYLLA_ASSERT(s.version() == _schema_version);`
-
-Total: 1 usages
-
-### mutation/partition_version.cc
-
- Line 364: `SCYLLA_ASSERT(!_snapshot->is_locked());`
- Line 701: `SCYLLA_ASSERT(!rows.empty());`
- Line 703: `SCYLLA_ASSERT(last_dummy.is_last_dummy());`
- Line 746: `SCYLLA_ASSERT(!_snapshot->is_locked());`
- Line 770: `SCYLLA_ASSERT(at_latest_version());`
- Line 777: `SCYLLA_ASSERT(at_latest_version());`
-
-Total: 6 usages
-
-### mutation/partition_version.hh
-
- Line 211: `SCYLLA_ASSERT(_schema);`
- Line 217: `SCYLLA_ASSERT(_schema);`
- Line 254: `SCYLLA_ASSERT(!_version->_backref);`
- Line 282: `SCYLLA_ASSERT(_version);`
- Line 286: `SCYLLA_ASSERT(_version);`
- Line 290: `SCYLLA_ASSERT(_version);`
- Line 294: `SCYLLA_ASSERT(_version);`
-
-Total: 7 usages
-
-### mutation/partition_version_list.hh
-
- Line 36: `SCYLLA_ASSERT(!_head->is_referenced_from_entry());`
- Line 42: `SCYLLA_ASSERT(!_tail->is_referenced_from_entry());`
- Line 70: `SCYLLA_ASSERT(!_head->is_referenced_from_entry());`
-
-Total: 3 usages
-
-### mutation/range_tombstone_list.cc
-
- Line 412: `SCYLLA_ASSERT (it != rt_list.end());`
- Line 422: `SCYLLA_ASSERT (it != rt_list.end());`
-
-Total: 2 usages
-
-### raft/server.cc
-
- Line 1720: `SCYLLA_ASSERT(_non_joint_conf_commit_promise);`
-
-Total: 1 usages
-
-### reader_concurrency_semaphore.cc
-
- Line 109: `SCYLLA_ASSERT(_permit == o._permit);`
- Line 432: `SCYLLA_ASSERT(_need_cpu_branches);`
- Line 455: `SCYLLA_ASSERT(_awaits_branches);`
- Line 1257: `SCYLLA_ASSERT(!_stopped);`
- Line 1585: `SCYLLA_ASSERT(_stats.need_cpu_permits);`
- Line 1587: `SCYLLA_ASSERT(_stats.need_cpu_permits >= _stats.awaits_permits);`
- Line 1593: `SCYLLA_ASSERT(_stats.need_cpu_permits >= _stats.awaits_permits);`
- Line 1598: `SCYLLA_ASSERT(_stats.awaits_permits);`
-
-Total: 8 usages
-
-### readers/multishard.cc
-
- Line 296: `SCYLLA_ASSERT(!_irh);`
-
-Total: 1 usages
-
-### repair/repair.cc
-
- Line 1073: `SCYLLA_ASSERT(table_names().size() == table_ids.size());`
-
-Total: 1 usages
-
-### replica/database.cc
-
- Line 3299: `SCYLLA_ASSERT(!_cf_lock.try_write_lock()); // lock should be acquired before the`
- Line 3304: `SCYLLA_ASSERT(!_cf_lock.try_write_lock()); // lock should be acquired before the`
-
-Total: 2 usages
-
-### replica/database.hh
-
- Line 1971: `SCYLLA_ASSERT(_user_sstables_manager);`
- Line 1976: `SCYLLA_ASSERT(_system_sstables_manager);`
-
-Total: 2 usages
-
-### replica/dirty_memory_manager.cc
-
- Line 67: `SCYLLA_ASSERT(!child->_heap_handle);`
-
-Total: 1 usages
-
-### replica/dirty_memory_manager.hh
-
- Line 261: `SCYLLA_ASSERT(_shutdown_requested);`
-
-Total: 1 usages
-
-### replica/memtable.cc
-
- Line 563: `SCYLLA_ASSERT(_mt._flushed_memory <= static_cast<int64_t>(_mt.occupancy().total_`
- Line 860: `SCYLLA_ASSERT(!reclaiming_enabled());`
-
-Total: 2 usages
-
-### replica/table.cc
-
- Line 2829: `SCYLLA_ASSERT(!trange.start()->is_inclusive() && trange.end()->is_inclusive());`
-
-Total: 1 usages
-
-### schema/schema.hh
-
- Line 1022: `SCYLLA_ASSERT(_schema->is_view());`
-
-Total: 1 usages
-
-### schema/schema_registry.cc
-
- Line 257: `SCYLLA_ASSERT(_state >= state::LOADED);`
- Line 262: `SCYLLA_ASSERT(_state >= state::LOADED);`
- Line 329: `SCYLLA_ASSERT(o._cpu_of_origin == current);`
-
-Total: 3 usages
-
-### service/direct_failure_detector/failure_detector.cc
-
- Line 628: `SCYLLA_ASSERT(alive != endpoint_liveness.marked_alive);`
-
-Total: 1 usages
-
-### service/storage_service.cc
-
- Line 3398: `SCYLLA_ASSERT(this_shard_id() == 0);`
- Line 5760: `SCYLLA_ASSERT(this_shard_id() == 0);`
- Line 5775: `SCYLLA_ASSERT(this_shard_id() == 0);`
- Line 5787: `SCYLLA_ASSERT(this_shard_id() == 0);`
-
-Total: 4 usages
-
-### sstables/generation_type.hh
-
- Line 132: `SCYLLA_ASSERT(bool(gen));`
-
-Total: 1 usages
-
-### sstables/partition_index_cache.hh
-
- Line 62: `SCYLLA_ASSERT(!ready());`
-
-Total: 1 usages
-
-### sstables/sstables_manager.hh
-
- Line 244: `SCYLLA_ASSERT(_sstables_registry && "sstables_registry is not plugged");`
-
-Total: 1 usages
-
-### sstables/storage.hh
-
- Line 86: `SCYLLA_ASSERT(false && "Changing directory not implemented");`
- Line 89: `SCYLLA_ASSERT(false && "Direct links creation not implemented");`
- Line 92: `SCYLLA_ASSERT(false && "Direct move not implemented");`
-
-Total: 3 usages
-
-### sstables_loader.cc
-
- Line 735: `SCYLLA_ASSERT(p);`
-
-Total: 1 usages
-
-### tasks/task_manager.cc
-
- Line 56: `SCYLLA_ASSERT(inserted);`
- Line 76: `SCYLLA_ASSERT(child->get_status().progress_units == progress_units);`
- Line 454: `SCYLLA_ASSERT(this_shard_id() == 0);`
-
-Total: 3 usages
-
-### tools/schema_loader.cc
-
- Line 281: `SCYLLA_ASSERT(p);`
-
-Total: 1 usages
-
-### utils/UUID.hh
-
- Line 59: `SCYLLA_ASSERT(is_timestamp());`
-
-Total: 1 usages
-
-### utils/bptree.hh
-
- Line 289: `SCYLLA_ASSERT(n.is_leftmost());`
- Line 301: `SCYLLA_ASSERT(n.is_rightmost());`
- Line 343: `SCYLLA_ASSERT(leaf->is_leaf());`
- Line 434: `SCYLLA_ASSERT(d->attached());`
- Line 453: `SCYLLA_ASSERT(n._num_keys > 0);`
- Line 505: `SCYLLA_ASSERT(n->is_leftmost());`
- Line 511: `SCYLLA_ASSERT(n->is_rightmost());`
- Line 517: `SCYLLA_ASSERT(n->is_root());`
- Line 557: `SCYLLA_ASSERT(!is_end());`
- Line 566: `SCYLLA_ASSERT(!is_end());`
- Line 613: `SCYLLA_ASSERT(n->_num_keys > 0);`
- Line 833: `SCYLLA_ASSERT(_left->_num_keys > 0);`
- Line 926: `SCYLLA_ASSERT(rl == rb);`
- Line 927: `SCYLLA_ASSERT(rl <= nr);`
- Line 1037: `SCYLLA_ASSERT(is_leaf());`
- Line 1042: `SCYLLA_ASSERT(is_leaf());`
- Line 1047: `SCYLLA_ASSERT(is_leaf());`
- Line 1052: `SCYLLA_ASSERT(is_leaf());`
- Line 1062: `SCYLLA_ASSERT(t->_right == this);`
- Line 1083: `SCYLLA_ASSERT(t->_left == this);`
- Line 1091: `SCYLLA_ASSERT(t->_right == this);`
- Line 1103: `SCYLLA_ASSERT(false);`
- Line 1153: `SCYLLA_ASSERT(i <= _num_keys);`
- Line 1212: `SCYLLA_ASSERT(off <= _num_keys);`
- Line 1236: `SCYLLA_ASSERT(from._num_keys > 0);`
- Line 1389: `SCYLLA_ASSERT(!is_root());`
- Line 1450: `SCYLLA_ASSERT(_num_keys == NodeSize);`
- Line 1563: `SCYLLA_ASSERT(_num_keys < NodeSize);`
- Line 1577: `SCYLLA_ASSERT(i != 0 || left_kid_sorted(k, less));`
- Line 1647: `SCYLLA_ASSERT(nodes.empty());`
- Line 1684: `SCYLLA_ASSERT(_num_keys > 0);`
- Line 1686: `SCYLLA_ASSERT(p._kids[i].n == this);`
- Line 1788: `SCYLLA_ASSERT(_num_keys == 0);`
- Line 1789: `SCYLLA_ASSERT(is_root() || !is_leaf() || (get_prev() == this && get_next() == th`
- Line 1821: `SCYLLA_ASSERT(_parent->_kids[i].n == &other);`
- Line 1841: `SCYLLA_ASSERT(i <= _num_keys);`
- Line 1856: `SCYLLA_ASSERT(!_nodes.empty());`
- Line 1938: `SCYLLA_ASSERT(!attached());`
- Line 1943: `SCYLLA_ASSERT(attached());`
-
-Total: 39 usages
-
-### utils/cached_file.hh
-
- Line 104: `SCYLLA_ASSERT(!_use_count);`
-
-Total: 1 usages
-
-### utils/compact-radix-tree.hh
-
- Line 1026: `SCYLLA_ASSERT(check_capacity(head, ni));`
- Line 1027: `SCYLLA_ASSERT(!_data.has(ni));`
- Line 1083: `SCYLLA_ASSERT(next_cap > head._capacity);`
- Line 1149: `SCYLLA_ASSERT(capacity != 0);`
- Line 1239: `SCYLLA_ASSERT(i < Size);`
- Line 1240: `SCYLLA_ASSERT(_idx[i] == unused_node_index);`
- Line 1470: `SCYLLA_ASSERT(kid != nullptr);`
- Line 1541: `SCYLLA_ASSERT(ret.first != nullptr);`
- Line 1555: `SCYLLA_ASSERT(leaf_depth >= depth);`
- Line 1614: `SCYLLA_ASSERT(n->check_prefix(key, depth));`
- Line 1850: `SCYLLA_ASSERT(_root.is(nil_root));`
-
-Total: 11 usages
-
-### utils/cross-shard-barrier.hh
-
- Line 134: `SCYLLA_ASSERT(w.has_value());`
-
-Total: 1 usages
-
-### utils/double-decker.hh
-
- Line 200: `SCYLLA_ASSERT(!hint.match);`
- Line 366: `SCYLLA_ASSERT(nb == end._bucket);`
-
-Total: 2 usages
-
-### utils/intrusive-array.hh
-
- Line 217: `SCYLLA_ASSERT(!is_single_element());`
- Line 218: `SCYLLA_ASSERT(pos < max_len);`
- Line 225: `SCYLLA_ASSERT(pos > 0);`
- Line 238: `SCYLLA_ASSERT(train_len < max_len);`
- Line 329: `SCYLLA_ASSERT(idx < max_len); // may the force be with us...`
-
-Total: 5 usages
-
-### utils/intrusive_btree.hh
-
- Line 148: `SCYLLA_ASSERT(to.num_keys == 0);`
- Line 157: `SCYLLA_ASSERT(!attached());`
- Line 227: `SCYLLA_ASSERT(n->is_inline());`
- Line 232: `SCYLLA_ASSERT(n->is_inline());`
- Line 288: `SCYLLA_ASSERT(n.is_root());`
- Line 294: `SCYLLA_ASSERT(n.is_leftmost());`
- Line 302: `SCYLLA_ASSERT(n.is_rightmost());`
- Line 368: `SCYLLA_ASSERT(_root->is_leaf());`
- Line 371: `SCYLLA_ASSERT(_inline.empty());`
- Line 601: `SCYLLA_ASSERT(n->is_leaf());`
- Line 673: `SCYLLA_ASSERT(!is_end());`
- Line 674: `SCYLLA_ASSERT(h->attached());`
- Line 677: `SCYLLA_ASSERT(_idx < cur.n->_base.num_keys);`
- Line 679: `SCYLLA_ASSERT(_hook->attached());`
- Line 690: `SCYLLA_ASSERT(!is_end());`
- Line 764: `SCYLLA_ASSERT(n->num_keys > 0);`
- Line 994: `SCYLLA_ASSERT(!_it.is_end());`
- Line 1178: `SCYLLA_ASSERT(is_leaf());`
- Line 1183: `SCYLLA_ASSERT(is_root());`
- Line 1261: `SCYLLA_ASSERT(!is_root());`
- Line 1268: `SCYLLA_ASSERT(p->_base.num_keys > 0 && p->_kids[0] == this);`
- Line 1275: `SCYLLA_ASSERT(p->_base.num_keys > 0 && p->_kids[p->_base.num_keys] == this);`
- Line 1286: `SCYLLA_ASSERT(false);`
- Line 1291: `SCYLLA_ASSERT(!nb->is_inline());`
- Line 1296: `SCYLLA_ASSERT(!nb->is_inline());`
- Line 1338: `SCYLLA_ASSERT(_base.num_keys == 0);`
- Line 1373: `SCYLLA_ASSERT(!(is_leftmost() || is_rightmost()));`
- Line 1378: `SCYLLA_ASSERT(p->_kids[i] != this);`
- Line 1396: `SCYLLA_ASSERT(!is_leaf());`
- Line 1537: `SCYLLA_ASSERT(src != _base.num_keys); // need more keys for the next leaf`
- Line 1995: `SCYLLA_ASSERT(_parent.n->_base.num_keys > 0);`
- Line 2135: `SCYLLA_ASSERT(is_leaf());`
- Line 2144: `SCYLLA_ASSERT(_base.num_keys != 0);`
- Line 2160: `SCYLLA_ASSERT(_base.num_keys != 0);`
- Line 2172: `SCYLLA_ASSERT(!empty());`
- Line 2198: `SCYLLA_ASSERT(leaf == ret->is_leaf());`
-
-Total: 36 usages
-
-### utils/loading_shared_values.hh
-
- Line 203: `SCYLLA_ASSERT(!_set.size());`
-
-Total: 1 usages
-
-### utils/logalloc.cc
-
- Line 544: `SCYLLA_ASSERT(!_background_reclaimer);`
- Line 926: `SCYLLA_ASSERT(idx < _segments.size());`
- Line 933: `SCYLLA_ASSERT(idx < _segments.size());`
- Line 957: `SCYLLA_ASSERT(i != _segments.end());`
- Line 1323: `SCYLLA_ASSERT(_lsa_owned_segments_bitmap.test(idx_from_segment(seg)));`
- Line 1366: `SCYLLA_ASSERT(desc._region);`
- Line 1885: `SCYLLA_ASSERT(desc._buf_pointers.empty());`
- Line 1911: `SCYLLA_ASSERT(&desc == old_ptr->_desc);`
- Line 2105: `SCYLLA_ASSERT(seg);`
- Line 2116: `SCYLLA_ASSERT(seg);`
- Line 2341: `SCYLLA_ASSERT(pool.current_emergency_reserve_goal() >= n_segments);`
-
-Total: 11 usages
-
-### utils/logalloc.hh
-
- Line 307: `SCYLLA_ASSERT(this_shard_id() == _cpu);`
-
-Total: 1 usages
-
-### utils/reusable_buffer.hh
-
- Line 60: `SCYLLA_ASSERT(_refcount == 0);`
-
-Total: 1 usages
-
-
-## SCYLLA_ASSERT in Destructors
-
-### api/column_family.cc
-
- Line 102: `SCYLLA_ASSERT(this_shard_id() == 0);`
-
-Total: 1 usages
-
-### cdc/generation.cc
-
- Line 846: `SCYLLA_ASSERT(_stopped);`
-
-Total: 1 usages
-
-### cdc/log.cc
-
- Line 173: `SCYLLA_ASSERT(_stopped);`
-
-Total: 1 usages
-
-### compaction/compaction_manager.cc
-
- Line 1074: `SCYLLA_ASSERT(_state == state::none || _state == state::stopped);`
-
-Total: 1 usages
-
-### db/hints/internal/hint_endpoint_manager.cc
-
- Line 188: `SCYLLA_ASSERT(stopped());`
-
-Total: 1 usages
-
-### mutation/partition_version.cc
-
- Line 347: `SCYLLA_ASSERT(!_snapshot->is_locked());`
-
-Total: 1 usages
-
-### reader_concurrency_semaphore.cc
-
- Line 1116: `SCYLLA_ASSERT(!_stats.waiters);`
- Line 1125: `SCYLLA_ASSERT(_inactive_reads.empty() && !_close_readers_gate.get_count() && !_p`
-
-Total: 2 usages
-
-### repair/row_level.cc
-
- Line 3647: `SCYLLA_ASSERT(_state == state::none || _state == state::stopped);`
-
-Total: 1 usages
-
-### replica/cell_locking.hh
-
- Line 371: `SCYLLA_ASSERT(_partitions.empty());`
-
-Total: 1 usages
-
-### replica/distributed_loader.cc
-
- Line 305: `SCYLLA_ASSERT(_sstable_directories.empty());`
-
-Total: 1 usages
-
-### schema/schema_registry.cc
-
- Line 45: `SCYLLA_ASSERT(!_schema);`
-
-Total: 1 usages
-
-### service/direct_failure_detector/failure_detector.cc
-
- Line 378: `SCYLLA_ASSERT(_ping_fiber.available());`
- Line 379: `SCYLLA_ASSERT(_notify_fiber.available());`
- Line 701: `SCYLLA_ASSERT(_shard_workers.empty());`
- Line 702: `SCYLLA_ASSERT(_destroy_subscriptions.available());`
- Line 703: `SCYLLA_ASSERT(_update_endpoint_fiber.available());`
- Line 707: `SCYLLA_ASSERT(!_impl);`
-
-Total: 6 usages
-
-### service/load_broadcaster.hh
-
- Line 37: `SCYLLA_ASSERT(_stopped);`
-
-Total: 1 usages
-
-### service/paxos/paxos_state.cc
-
- Line 323: `SCYLLA_ASSERT(_stopped);`
-
-Total: 1 usages
-
-### service/storage_proxy.cc
-
- Line 281: `SCYLLA_ASSERT(_stopped);`
- Line 3207: `SCYLLA_ASSERT(!_remote);`
-
-Total: 2 usages
-
-### service/tablet_allocator.cc
-
- Line 3288: `SCYLLA_ASSERT(_stopped);`
-
-Total: 1 usages
-
-### sstables/compressor.cc
-
- Line 1271: `SCYLLA_ASSERT(thread::running_in_thread());`
-
-Total: 1 usages
-
-### sstables/sstables_manager.cc
-
- Line 58: `SCYLLA_ASSERT(_closing);`
- Line 59: `SCYLLA_ASSERT(_active.empty());`
- Line 60: `SCYLLA_ASSERT(_undergoing_close.empty());`
-
-Total: 3 usages
-
-### sstables/sstables_manager.hh
-
- Line 188: `SCYLLA_ASSERT(_storage != nullptr);`
-
-Total: 1 usages
-
-### utils/cached_file.hh
-
- Line 477: `SCYLLA_ASSERT(_cache.empty());`
-
-Total: 1 usages
-
-### utils/disk_space_monitor.cc
-
- Line 66: `SCYLLA_ASSERT(_poller_fut.available());`
-
-Total: 1 usages
-
-### utils/file_lock.cc
-
- Line 34: `SCYLLA_ASSERT(_fd.get() != -1);`
- Line 36: `SCYLLA_ASSERT(r == 0);`
-
-Total: 2 usages
-
-### utils/logalloc.cc
-
- Line 1991: `SCYLLA_ASSERT(desc.is_empty());`
- Line 1996: `SCYLLA_ASSERT(segment_pool().descriptor(_active).is_empty());`
-
-Total: 2 usages
-
-### utils/lru.hh
-
- Line 41: `SCYLLA_ASSERT(!_lru_link.is_linked());`
-
-Total: 1 usages
-
-### utils/replicator.hh
-
- Line 221: `SCYLLA_ASSERT(_stopped);`
-
-Total: 1 usages
-
--- a/docs/operating-scylla/nodetool-commands/setlogginglevel.rst
+++ b/docs/operating-scylla/nodetool-commands/setlogginglevel.rst
@@ -110,7 +110,6 @@ To display the log classes (output changes with each version so your display may
   keys
   keyspace_utils
   large_data
-   legacy_schema_migrator
   lister
   load_balancer
   load_broadcaster
--- a/docs/operating-scylla/nodetool-commands/snapshot.rst
+++ b/docs/operating-scylla/nodetool-commands/snapshot.rst
@@ -17,7 +17,7 @@ SYNOPSIS
                   [(-u <username> | --username <username>)] snapshot
                   [(-cf <table> | --column-family <table> | --table <table>)]
                   [(-kc <kclist> | --kc.list <kclist>)]
-                   [(-sf | --skip-flush)] [(-t <tag> | --tag <tag>)] [--] [<keyspaces...>]
+                   [(-sf | --skip-flush)] [--use-sstable-identifier] [(-t <tag> | --tag <tag>)] [--] [<keyspaces...>]

 OPTIONS
 .......
@@ -37,6 +37,8 @@ Parameter                                                             Descriptio
 --------------------------------------------------------------------  -------------------------------------------------------------------------------------
 -sf / --skip-flush                                                    Do not flush memtables before snapshotting (snapshot will not contain unflushed data)
 --------------------------------------------------------------------  -------------------------------------------------------------------------------------
+--use-sstable-identifier                                              Use the sstable identifier UUID, if available, rather than the sstable generation.
+--------------------------------------------------------------------  -------------------------------------------------------------------------------------
 -t <tag> / --tag <tag>                                                The name of the snapshot
 ====================================================================  =====================================================================================

--- a/docs/poetry.lock
+++ b/docs/poetry.lock
@@ -1018,14 +1018,14 @@ sphinx-markdown-tables = "0.0.17"

 [[package]]
 name = "sphinx-scylladb-theme"
-version = "1.8.9"
+version = "1.8.10"
 description = "A Sphinx Theme for ScyllaDB documentation projects"
 optional = false
 python-versions = "<4.0,>=3.10"
 groups = ["main"]
 files = [
-    {file = "sphinx_scylladb_theme-1.8.9-py3-none-any.whl", hash = "sha256:f8649a7753a29494fd2b417d1cb855035dddb9ebd498ea033fd73f5f9338271e"},
-    {file = "sphinx_scylladb_theme-1.8.9.tar.gz", hash = "sha256:ab7cda4c10a0d067c5c3a45f7b1f68cb8ebefe135a0be0738bfa282a344769b6"},
+    {file = "sphinx_scylladb_theme-1.8.10-py3-none-any.whl", hash = "sha256:8b930f33bec7308ccaa92698ebb5ad85059bcbf93a463f92917aeaf473fce632"},
+    {file = "sphinx_scylladb_theme-1.8.10.tar.gz", hash = "sha256:8a78a9b692d9a946be2c4a64aa472fd82204cc8ea0b1ee7f60de6db35b356326"},
 ]

 [package.dependencies]
@@ -1603,4 +1603,4 @@ files = [
 [metadata]
 lock-version = "2.1"
 python-versions = "^3.10"
-content-hash = "74912627a3f424290ed7889451c0bdb1a862ab85b1d07c85f4f3b8c34f32a020"
+content-hash = "0ae673106f45d3465cbdabbf511e165ca44feadd34d7753f2e68093afaa95c79"
--- a/docs/pyproject.toml
+++ b/docs/pyproject.toml
@@ -9,7 +9,7 @@ package-mode = false
 python = "^3.10"
 pygments = "^2.18.0"
 redirects_cli ="^0.1.3"
-sphinx-scylladb-theme = "^1.8.9"
+sphinx-scylladb-theme = "^1.8.10"
 sphinx-sitemap = "^2.6.0"
 sphinx-autobuild = "^2024.4.19"
 Sphinx = "^7.3.7"
--- a/gms/feature_service.hh
+++ b/gms/feature_service.hh
@@ -143,6 +143,7 @@ public:

    gms::feature tablet_incremental_repair { *this, "TABLET_INCREMENTAL_REPAIR"sv };
    gms::feature tablet_repair_scheduler { *this, "TABLET_REPAIR_SCHEDULER"sv };
+    gms::feature tablet_repair_tasks_table { *this, "TABLET_REPAIR_TASKS_TABLE"sv };
    gms::feature tablet_merge { *this, "TABLET_MERGE"sv };
    gms::feature tablet_rack_aware_view_pairing { *this, "TABLET_RACK_AWARE_VIEW_PAIRING"sv };

--- a/idl/raft.idl.hh
+++ b/idl/raft.idl.hh
@@ -129,6 +129,6 @@ struct direct_fd_ping_reply {
    std::variant<std::monostate, service::wrong_destination, service::group_liveness_info> result;
 };

-verb [[with_client_info, cancellable]] direct_fd_ping (raft::server_id dst_id) -> service::direct_fd_ping_reply;
+verb [[with_client_info, with_timeout, cancellable]] direct_fd_ping (raft::server_id dst_id) -> service::direct_fd_ping_reply;

 } // namespace service
--- a/install-dependencies.sh
+++ b/install-dependencies.sh
@@ -38,6 +38,7 @@ debian_base_packages=(
    python3-aiohttp
    python3-pyparsing
    python3-colorama
+    python3-dev
    python3-tabulate
    python3-pytest
    python3-pytest-asyncio
@@ -65,6 +66,7 @@ debian_base_packages=(
    git-lfs
    e2fsprogs
    fuse3
+    libev-dev # for python driver
 )

 fedora_packages=(
@@ -90,6 +92,7 @@ fedora_packages=(
    patchelf
    python3
    python3-aiohttp
+    python3-devel
    python3-pip
    python3-file-magic
    python3-colorama
@@ -154,6 +157,8 @@ fedora_packages=(
    https://github.com/scylladb/cassandra-stress/releases/download/v3.18.1/cassandra-stress-java21-3.18.1-1.noarch.rpm
    elfutils
    jq
+
+    libev-devel # for python driver
 )

 fedora_python3_packages=(
--- a/main.cc
+++ b/main.cc
@@ -39,7 +39,6 @@
 #include "api/api_init.hh"
 #include "db/config.hh"
 #include "db/extensions.hh"
-#include "db/legacy_schema_migrator.hh"
 #include "service/storage_service.hh"
 #include "service/migration_manager.hh"
 #include "service/tablet_allocator.hh"
@@ -1641,7 +1640,7 @@ To start the scylla server proper, simply invoke as: scylla server (or just scyl
            fd.start(
                std::ref(fd_pinger), std::ref(fd_clock),
                service::direct_fd_clock::base::duration{std::chrono::milliseconds{100}}.count(),
-                service::direct_fd_clock::base::duration{std::chrono::milliseconds{cfg->direct_failure_detector_ping_timeout_in_ms()}}.count()).get();
+                service::direct_fd_clock::base::duration{std::chrono::milliseconds{cfg->direct_failure_detector_ping_timeout_in_ms()}}.count(), dbcfg.gossip_scheduling_group).get();

            auto stop_fd = defer_verbose_shutdown("direct_failure_detector", [] {
                fd.stop().get();
@@ -1851,8 +1850,6 @@ To start the scylla server proper, simply invoke as: scylla server (or just scyl
            group0_client.init().get();

            checkpoint(stop_signal, "initializing system schema");
-            // schema migration, if needed, is also done on shard 0
-            db::legacy_schema_migrator::migrate(proxy, db, sys_ks, qp.local()).get();
            db::schema_tables::save_system_schema(qp.local()).get();
            db::schema_tables::recalculate_schema_version(sys_ks, proxy, feature_service.local()).get();

--- a/message/messaging_service.cc
+++ b/message/messaging_service.cc
@@ -686,6 +686,7 @@ static constexpr unsigned do_get_rpc_client_idx(messaging_verb verb) {
    case messaging_verb::RAFT_MODIFY_CONFIG:
    case messaging_verb::RAFT_PULL_SNAPSHOT:
    case messaging_verb::NOTIFY_BANNED:
+    case messaging_verb::DIRECT_FD_PING:
        // See comment above `TOPOLOGY_INDEPENDENT_IDX`.
        // DO NOT put any 'hot' (e.g. data path) verbs in this group,
        // only verbs which are 'rare' and 'cheap'.
@@ -747,7 +748,6 @@ static constexpr unsigned do_get_rpc_client_idx(messaging_verb verb) {
    case messaging_verb::PAXOS_ACCEPT:
    case messaging_verb::PAXOS_LEARN:
    case messaging_verb::PAXOS_PRUNE:
-    case messaging_verb::DIRECT_FD_PING:
        return 2;
    case messaging_verb::MUTATION_DONE:
    case messaging_verb::MUTATION_FAILED:
--- a/mutation/partition_version.cc
+++ b/mutation/partition_version.cc
@@ -575,10 +575,15 @@ utils::coroutine partition_entry::apply_to_incomplete(const schema& s,
                        }
                        res.row.set_range_tombstone(cur.range_tombstone_for_row() + src_cur.range_tombstone());

+                        if (need_preempt()) {
+                            lb = position_in_partition(cur.position());
+                            ++tracker.get_stats().rows_covered_by_range_tombstones_from_memtable;
+                            return stop_iteration::no;
+                        }
+
                        // FIXME: Compact the row
                        ++tracker.get_stats().rows_covered_by_range_tombstones_from_memtable;
                        cur.next();
-                        // FIXME: preempt
                    }
                }
                {
--- a/raft/tracker.cc
+++ b/raft/tracker.cc
@@ -46,7 +46,7 @@ bool follower_progress::is_stray_reject(const append_reply::rejected& rejected)
        // any reject during snapshot transfer is stray one
        return true;
    default:
-        scylla_assert(false, "invalid follower_progress state: {}", static_cast<int>(state));
+        SCYLLA_ASSERT(false);
    }
    return false;
 }
@@ -87,7 +87,7 @@ bool follower_progress::can_send_to() {
        // before starting to sync the log.
        return false;
    }
-    scylla_assert(false, "invalid follower_progress state in can_send_to: {}", static_cast<int>(state));
+    SCYLLA_ASSERT(false);
    return false;
 }

--- a/repair/row_level.cc
+++ b/repair/row_level.cc
@@ -3844,3 +3844,83 @@ future<uint32_t> repair_service::get_next_repair_meta_id() {
 locator::host_id repair_service::my_host_id() const noexcept {
    return _gossiper.local().my_host_id();
 }
+
+future<size_t> count_finished_tablets(utils::chunked_vector<tablet_token_range> ranges1, utils::chunked_vector<tablet_token_range> ranges2) {
+    if (ranges1.empty() || ranges2.empty()) {
+        co_return 0;
+    }
+
+    auto sort = [] (utils::chunked_vector<tablet_token_range>& ranges) {
+        std::sort(ranges.begin(), ranges.end(), [] (const auto& a, const auto& b) {
+            if (a.first_token != b.first_token) {
+                return a.first_token < b.first_token;
+            }
+            return a.last_token < b.last_token;
+        });
+    };
+
+    // First, merge overlapping and adjacent ranges in ranges2.
+    sort(ranges2);
+    utils::chunked_vector<tablet_token_range> merged;
+    merged.push_back(ranges2[0]);
+    for (size_t i = 1; i < ranges2.size(); ++i) {
+        co_await coroutine::maybe_yield();
+        // To avoid overflow with max() + 1, we check adjacency with `a - 1 <= b` instead of `a <= b + 1`
+        if (ranges2[i].first_token - 1 <= merged.back().last_token) {
+            merged.back().last_token = std::max(merged.back().last_token, ranges2[i].last_token);
+        } else {
+            merged.push_back(ranges2[i]);
+        }
+    }
+
+    // Count covered ranges using a linear scan
+    size_t covered_count = 0;
+    auto it = merged.begin();
+    auto end = merged.end();
+    sort(ranges1);
+    for (const auto& r1 : ranges1) {
+        co_await coroutine::maybe_yield();
+        // Advance the merged iterator only if the current merged range ends
+        // before the current r1 starts.
+        while (it != end && it->last_token < r1.first_token) {
+            co_await coroutine::maybe_yield();
+            ++it;
+        }
+        // If we have exhausted the merged ranges, no further r1 can be covered
+        if (it == end) {
+            break;
+        }
+        // Check if the current merged range covers r1.
+        if (it->first_token <= r1.first_token && r1.last_token <= it->last_token) {
+            covered_count++;
+        }
+    }
+
+    co_return covered_count;
+}
+
+future<std::optional<repair_task_progress>> repair_service::get_tablet_repair_task_progress(tasks::task_id task_uuid) {
+    utils::chunked_vector<tablet_token_range> requested_tablets;
+    utils::chunked_vector<tablet_token_range> finished_tablets;
+    table_id tid;
+    if (!_db.local().features().tablet_repair_tasks_table) {
+        co_return std::nullopt;
+    }
+    co_await _sys_ks.local().get_repair_task(task_uuid, [&tid, &requested_tablets, &finished_tablets] (const db::system_keyspace::repair_task_entry& entry) -> future<> {
+        rlogger.debug("repair_task_progress: Get entry operation={} first_token={} last_token={}", entry.operation, entry.first_token, entry.last_token);
+        if (entry.operation == db::system_keyspace::repair_task_operation::requested) {
+            requested_tablets.push_back({entry.first_token, entry.last_token});
+        } else if (entry.operation == db::system_keyspace::repair_task_operation::finished) {
+            finished_tablets.push_back({entry.first_token, entry.last_token});
+        }
+        tid = entry.table_uuid;
+        co_return;
+    });
+    auto requested = requested_tablets.size();
+    auto finished_nomerge = finished_tablets.size();
+    auto finished = co_await count_finished_tablets(std::move(requested_tablets), std::move(finished_tablets));
+    auto progress = repair_task_progress{requested, finished, tid};
+    rlogger.debug("repair_task_progress: task_uuid={} table_uuid={} requested_tablets={} finished_tablets={} progress={} finished_nomerge={}",
+            task_uuid, tid, requested, finished, progress.progress(), finished_nomerge);
+    co_return progress;
+}
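
Note on `count_finished_tablets` above: the function merges the finished ranges and then counts the requested ranges with a single linear scan. Below is a minimal standalone sketch of that idea, not the patch itself. It assumes plain `std::vector` and no coroutines, and the names `token_range` and `count_covered` are hypothetical. One deliberate difference: the sketch tests for overlap before testing adjacency, so `first_token - 1` is evaluated only when `first_token` is strictly greater than the previous `last_token`, and the subtraction cannot wrap at the low end of `int64_t`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for tablet_token_range: an inclusive [first, last] range.
struct token_range {
    int64_t first_token;
    int64_t last_token;
};

// Counts how many ranges in `requested` are fully covered by the union of
// `finished`. Overlapping and adjacent ranges in `finished` are merged first.
std::size_t count_covered(std::vector<token_range> requested,
                          std::vector<token_range> finished) {
    if (requested.empty() || finished.empty()) {
        return 0;
    }
    auto by_start = [](const token_range& a, const token_range& b) {
        if (a.first_token != b.first_token) {
            return a.first_token < b.first_token;
        }
        return a.last_token < b.last_token;
    };

    // Step 1: sort and merge the finished ranges.
    std::sort(finished.begin(), finished.end(), by_start);
    std::vector<token_range> merged{finished[0]};
    for (std::size_t i = 1; i < finished.size(); ++i) {
        const auto& f = finished[i];
        auto& back = merged.back();
        // Overlap is checked first; the subtraction runs only when
        // f.first_token > back.last_token, so it cannot underflow.
        if (f.first_token <= back.last_token || f.first_token - 1 == back.last_token) {
            back.last_token = std::max(back.last_token, f.last_token);
        } else {
            merged.push_back(f);
        }
    }

    // Step 2: linear scan over the sorted requested ranges.
    std::sort(requested.begin(), requested.end(), by_start);
    std::size_t covered = 0;
    auto it = merged.begin();
    for (const auto& r : requested) {
        // Skip merged ranges that end before r starts.
        while (it != merged.end() && it->last_token < r.first_token) {
            ++it;
        }
        if (it == merged.end()) {
            break;
        }
        if (it->first_token <= r.first_token && r.last_token <= it->last_token) {
            ++covered;
        }
    }
    return covered;
}
```

For example, with requested ranges `[0,9] [10,19] [20,29] [40,49]` and finished ranges `[0,4] [5,19] [40,49]`, the finished set merges to `[0,19] [40,49]` and three of the four requested ranges count as covered. `repair_task_progress::progress()` would report that as 0.75.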
--- a/repair/row_level.hh
+++ b/repair/row_level.hh
@@ -99,6 +99,15 @@ public:

 using host2ip_t = std::function<future<gms::inet_address> (locator::host_id)>;

+struct repair_task_progress {
+    size_t requested;
+    size_t finished;
+    table_id table_uuid;
+    float progress() const {
+        return requested == 0 ? 1.0 : float(finished) / requested;
+    }
+};
+
 class repair_service : public seastar::peering_sharded_service<repair_service> {
    sharded<service::topology_state_machine>& _tsm;
    sharded<gms::gossiper>& _gossiper;
@@ -222,6 +231,9 @@ private:
 public:
    future<gc_clock::time_point> repair_tablet(gms::gossip_address_map& addr_map, locator::tablet_metadata_guard& guard, locator::global_tablet_id gid, tasks::task_info global_tablet_repair_task_info, service::frozen_topology_guard topo_guard, std::optional<locator::tablet_replica_set> rebuild_replicas, locator::tablet_transition_stage stage);

+
+    future<std::optional<repair_task_progress>> get_tablet_repair_task_progress(tasks::task_id task_uuid);
+
 private:

    future<repair_update_system_table_response> repair_update_system_table_handler(
@@ -326,3 +338,12 @@ future<std::list<repair_row>> to_repair_rows_list(repair_rows_on_wire rows,
        schema_ptr s, uint64_t seed, repair_master is_master,
        reader_permit permit, repair_hasher hasher);
 void flush_rows(schema_ptr s, std::list<repair_row>& rows, lw_shared_ptr<repair_writer>& writer, std::optional<small_table_optimization_params> small_table_optimization = std::nullopt, repair_meta* rm = nullptr);
+
+// A struct to hold the first and last token of a tablet.
+struct tablet_token_range {
+    int64_t first_token;
+    int64_t last_token;
+};
+
+// Counts the ranges in ranges1 that are fully covered by the union of ranges2 (overlapping and adjacent ranges in ranges2 are merged first).
+future<size_t> count_finished_tablets(utils::chunked_vector<tablet_token_range> ranges1, utils::chunked_vector<tablet_token_range> ranges2);
--- a/replica/database.cc
+++ b/replica/database.cc
@@ -2810,26 +2810,26 @@ future<> database::drop_cache_for_keyspace_on_all_shards(sharded<database>& shar
    });
 }

-future<> database::snapshot_table_on_all_shards(sharded<database>& sharded_db, table_id uuid, sstring tag, bool skip_flush) {
-    if (!skip_flush) {
+future<> database::snapshot_table_on_all_shards(sharded<database>& sharded_db, table_id uuid, sstring tag, db::snapshot_options opts) {
+    if (!opts.skip_flush) {
        co_await flush_table_on_all_shards(sharded_db, uuid);
    }
    auto table_shards = co_await get_table_on_all_shards(sharded_db, uuid);
-    co_await table::snapshot_on_all_shards(sharded_db, table_shards, tag);
+    co_await table::snapshot_on_all_shards(sharded_db, table_shards, tag, opts);
 }

-future<> database::snapshot_tables_on_all_shards(sharded<database>& sharded_db, std::string_view ks_name, std::vector<sstring> table_names, sstring tag, bool skip_flush) {
-    return parallel_for_each(table_names, [&sharded_db, ks_name, tag = std::move(tag), skip_flush] (auto& table_name) {
+future<> database::snapshot_tables_on_all_shards(sharded<database>& sharded_db, std::string_view ks_name, std::vector<sstring> table_names, sstring tag, db::snapshot_options opts) {
+    return parallel_for_each(table_names, [&sharded_db, ks_name, tag = std::move(tag), opts] (auto& table_name) {
        auto uuid = sharded_db.local().find_uuid(ks_name, table_name);
-        return snapshot_table_on_all_shards(sharded_db, uuid, tag, skip_flush);
+        return snapshot_table_on_all_shards(sharded_db, uuid, tag, opts);
    });
 }

-future<> database::snapshot_keyspace_on_all_shards(sharded<database>& sharded_db, std::string_view ks_name, sstring tag, bool skip_flush) {
+future<> database::snapshot_keyspace_on_all_shards(sharded<database>& sharded_db, std::string_view ks_name, sstring tag, db::snapshot_options opts) {
    auto& ks = sharded_db.local().find_keyspace(ks_name);
-    co_await coroutine::parallel_for_each(ks.metadata()->cf_meta_data(), [&, tag = std::move(tag), skip_flush] (const auto& pair) -> future<> {
+    co_await coroutine::parallel_for_each(ks.metadata()->cf_meta_data(), [&, tag = std::move(tag), opts] (const auto& pair) -> future<> {
        auto uuid = pair.second->id();
-        co_await snapshot_table_on_all_shards(sharded_db, uuid, tag, skip_flush);
+        co_await snapshot_table_on_all_shards(sharded_db, uuid, tag, opts);
    });
 }

@@ -2951,7 +2951,12 @@ future<> database::truncate_table_on_all_shards(sharded<database>& sharded_db, s
        auto truncated_at = truncated_at_opt.value_or(db_clock::now());
        auto name = snapshot_name_opt.value_or(
            format("{:d}-{}", truncated_at.time_since_epoch().count(), cf.schema()->cf_name()));
-        co_await table::snapshot_on_all_shards(sharded_db, table_shards, name);
+        // Use the sstable identifier in snapshot names to allow de-duplication of sstables
+        // at backup time even if they were migrated across shards or nodes and were renamed and given a new generation.
+        // We hard-code that here since we have no way to pass this option to auto-snapshot and
+        // it is always safe to use the sstable identifier for the sstable generation.
+        auto opts = db::snapshot_options{.use_sstable_identifier = true};
+        co_await table::snapshot_on_all_shards(sharded_db, table_shards, name, opts);
    }

    co_await sharded_db.invoke_on_all([&] (database& db) {
--- a/replica/database.hh
+++ b/replica/database.hh
@@ -1040,12 +1040,12 @@ public:
 private:
    using snapshot_file_set = foreign_ptr<std::unique_ptr<std::unordered_set<sstring>>>;

-    future<snapshot_file_set> take_snapshot(sstring jsondir);
+    future<snapshot_file_set> take_snapshot(sstring jsondir, db::snapshot_options opts);
    // Writes the table schema and the manifest of all files in the snapshot directory.
    future<> finalize_snapshot(const global_table_ptr& table_shards, sstring jsondir, std::vector<snapshot_file_set> file_sets);
    static future<> seal_snapshot(sstring jsondir, std::vector<snapshot_file_set> file_sets);
 public:
-    static future<> snapshot_on_all_shards(sharded<database>& sharded_db, const global_table_ptr& table_shards, sstring name);
+    static future<> snapshot_on_all_shards(sharded<database>& sharded_db, const global_table_ptr& table_shards, sstring name, db::snapshot_options opts);

    future<std::unordered_map<sstring, snapshot_details>> get_snapshot_details();
    static future<snapshot_details> get_snapshot_details(std::filesystem::path snapshot_dir, std::filesystem::path datadir);
@@ -2009,9 +2009,9 @@ public:
    static future<> drop_cache_for_table_on_all_shards(sharded<database>& sharded_db, table_id id);
    static future<> drop_cache_for_keyspace_on_all_shards(sharded<database>& sharded_db, std::string_view ks_name);

-    static future<> snapshot_table_on_all_shards(sharded<database>& sharded_db, table_id id, sstring tag, bool skip_flush);
-    static future<> snapshot_tables_on_all_shards(sharded<database>& sharded_db, std::string_view ks_name, std::vector<sstring> table_names, sstring tag, bool skip_flush);
-    static future<> snapshot_keyspace_on_all_shards(sharded<database>& sharded_db, std::string_view ks_name, sstring tag, bool skip_flush);
+    static future<> snapshot_table_on_all_shards(sharded<database>& sharded_db, table_id id, sstring tag, db::snapshot_options opts);
+    static future<> snapshot_tables_on_all_shards(sharded<database>& sharded_db, std::string_view ks_name, std::vector<sstring> table_names, sstring tag, db::snapshot_options opts);
+    static future<> snapshot_keyspace_on_all_shards(sharded<database>& sharded_db, std::string_view ks_name, sstring tag, db::snapshot_options opts);

 public:
    bool update_column_family(schema_ptr s);
--- a/replica/distributed_loader.cc
+++ b/replica/distributed_loader.cc
@@ -234,18 +234,12 @@ distributed_loader::get_sstables_from_upload_dir(sharded<replica::database>& db,
 }

 future<std::tuple<table_id, std::vector<std::vector<sstables::shared_sstable>>>>
-distributed_loader::get_sstables_from_object_store(sharded<replica::database>& db, sstring ks, sstring cf, std::vector<sstring> sstables, sstring endpoint, sstring bucket, sstring prefix, sstables::sstable_open_config cfg, std::function<seastar::abort_source*()> get_abort_src) {
-    return get_sstables_from(db, ks, cf, cfg, [bucket, endpoint, prefix, sstables=std::move(sstables), &get_abort_src, &db] (auto& global_table, auto& directory) {
+distributed_loader::get_sstables_from_object_store(sharded<replica::database>& db, sstring ks, sstring cf, std::vector<sstring> sstables, sstring endpoint, sstring type, sstring bucket, sstring prefix, sstables::sstable_open_config cfg, std::function<seastar::abort_source*()> get_abort_src) {
+    return get_sstables_from(db, ks, cf, cfg, [bucket, endpoint, type, prefix, sstables=std::move(sstables), &get_abort_src] (auto& global_table, auto& directory) {
        return directory.start(global_table.as_sharded_parameter(),
-            sharded_parameter([bucket, endpoint, prefix, &get_abort_src, &db] {
-                auto eps = db.local().get_config().object_storage_endpoints() 
-                    | std::views::filter([&endpoint](auto& ep) { return ep.key() == endpoint; })
-                    ;
-                if (eps.empty()) {
-                    throw std::invalid_argument(fmt::format("Undefined endpoint {}", endpoint));
-                }
+            sharded_parameter([bucket, endpoint, type, prefix, &get_abort_src] {
                seastar::abort_source* as = get_abort_src ? get_abort_src() : nullptr;
-                auto opts = data_dictionary::make_object_storage_options(endpoint, eps.front().type(), bucket, prefix, as);
+                auto opts = data_dictionary::make_object_storage_options(endpoint, type, bucket, prefix, as);
                return make_lw_shared<const data_dictionary::storage_options>(std::move(opts));
            }),
            sstables,
--- a/replica/distributed_loader.hh
+++ b/replica/distributed_loader.hh
@@ -92,7 +92,7 @@ public:
    static future<std::tuple<table_id, std::vector<std::vector<sstables::shared_sstable>>>>
            get_sstables_from_upload_dir(sharded<replica::database>& db, sstring ks, sstring cf, sstables::sstable_open_config cfg);
    static future<std::tuple<table_id, std::vector<std::vector<sstables::shared_sstable>>>>
-            get_sstables_from_object_store(sharded<replica::database>& db, sstring ks, sstring cf, std::vector<sstring> sstables, sstring endpoint, sstring bucket, sstring prefix, sstables::sstable_open_config cfg, std::function<seastar::abort_source*()> = {});
+            get_sstables_from_object_store(sharded<replica::database>& db, sstring ks, sstring cf, std::vector<sstring> sstables, sstring endpoint, sstring type, sstring bucket, sstring prefix, sstables::sstable_open_config cfg, std::function<seastar::abort_source*()> = {});
    static future<> process_upload_dir(sharded<replica::database>& db, sharded<db::view::view_builder>& vb, sharded<db::view::view_building_worker>& vbw, sstring ks_name, sstring cf_name, bool skip_cleanup, bool skip_reshape);
 };

--- a/replica/table.cc
+++ b/replica/table.cc
@@ -3268,7 +3268,7 @@ future<> table::write_schema_as_cql(const global_table_ptr& table_shards, sstrin
 }

 // Runs the orchestration code on an arbitrary shard to balance the load.
-future<> table::snapshot_on_all_shards(sharded<database>& sharded_db, const global_table_ptr& table_shards, sstring name) {
+future<> table::snapshot_on_all_shards(sharded<database>& sharded_db, const global_table_ptr& table_shards, sstring name, db::snapshot_options opts) {
    auto* so = std::get_if<storage_options::local>(&table_shards->get_storage_options().value);
    if (so == nullptr) {
        throw std::runtime_error("Snapshotting non-local tables is not implemented");
@@ -3291,7 +3291,7 @@ future<> table::snapshot_on_all_shards(sharded<database>& sharded_db, const glob
        co_await io_check([&jsondir] { return recursive_touch_directory(jsondir); });
        co_await coroutine::parallel_for_each(smp::all_cpus(), [&] (unsigned shard) -> future<> {
            file_sets.emplace_back(co_await smp::submit_to(shard, [&] {
-                return table_shards->take_snapshot(jsondir);
+                return table_shards->take_snapshot(jsondir, opts);
            }));
        });
        co_await io_check(sync_directory, jsondir);
@@ -3300,19 +3300,22 @@ future<> table::snapshot_on_all_shards(sharded<database>& sharded_db, const glob
    });
 }

-future<table::snapshot_file_set> table::take_snapshot(sstring jsondir) {
-    tlogger.trace("take_snapshot {}", jsondir);
+future<table::snapshot_file_set> table::take_snapshot(sstring jsondir, db::snapshot_options opts) {
+    tlogger.trace("take_snapshot {}: use_sstable_identifier={}", jsondir, opts.use_sstable_identifier);

    auto sstable_deletion_guard = co_await get_sstable_list_permit();

    auto tables = *_sstables->all() | std::ranges::to<std::vector<sstables::shared_sstable>>();
    auto table_names = std::make_unique<std::unordered_set<sstring>>();

-    co_await _sstables_manager.dir_semaphore().parallel_for_each(tables, [&jsondir, &table_names] (sstables::shared_sstable sstable) {
-        table_names->insert(sstable->component_basename(sstables::component_type::Data));
-        return io_check([sstable, &dir = jsondir] {
-            return sstable->snapshot(dir);
+    auto& ks_name = schema()->ks_name();
+    auto& cf_name = schema()->cf_name();
+    co_await _sstables_manager.dir_semaphore().parallel_for_each(tables, [&, opts] (sstables::shared_sstable sstable) -> future<> {
+        auto gen = co_await io_check([sstable, &dir = jsondir, opts] {
+            return sstable->snapshot(dir, opts.use_sstable_identifier);
        });
+        auto fname = sstable->component_basename(ks_name, cf_name, sstable->get_version(), gen, sstable->get_format(), sstables::component_type::Data);
+        table_names->insert(fname);
    });
    co_return make_foreign(std::move(table_names));
 }
--- a/scripts/pull_github_pr.sh
+++ b/scripts/pull_github_pr.sh
@@ -38,8 +38,9 @@ for required in jq curl; do
 	fi
 done

-FORCE=0
 ALLOW_SUBMODULE=0
+ALLOW_UNSTABLE=0
+ALLOW_ANY_BRANCH=0

 function print_usage {
 cat << EOF
@@ -60,12 +61,18 @@ Options:
 -h
    Print this help message and exit.

--force
-    Do not check current branch to be next*
-    Do not check jenkins job status
-
 --allow-submodule
    Allow a PR to update a submudule
+
+--allow-unstable
+    Do not check jenkins job status
+
+--allow-any-branch
+    Do not check that the current branch is the next* branch matching the PR's base
+
+--force
+    Sets all above --allow-* options
+
 EOF
 }

@@ -73,13 +80,23 @@ while [[ $# -gt 0 ]]
 do
    case $1 in
        "--force"|"-f")
-            FORCE=1
+            ALLOW_UNSTABLE=1
+            ALLOW_SUBMODULE=1
+            ALLOW_ANY_BRANCH=1
            shift 1
            ;;
        --allow-submodule)
            ALLOW_SUBMODULE=1
            shift
            ;;
+        --allow-unstable)
+            ALLOW_UNSTABLE=1
+            shift
+            ;;
+        --allow-any-branch)
+            ALLOW_ANY_BRANCH=1
+            shift
+            ;;
        +([0-9]))
            PR_NUM=$1
            shift 1
@@ -147,7 +164,7 @@ check_jenkins_job_status() {
  fi
 }

-if [[ $FORCE -eq 0 ]]; then
+if [[ $ALLOW_UNSTABLE -eq 0 ]]; then
  check_jenkins_job_status
 fi

@@ -179,17 +196,19 @@ echo -n "Fetching full name of author $PR_LOGIN... "
 USER_NAME=$(curl -s "https://api.github.com/users/$PR_LOGIN" | jq -r .name)
 echo "$USER_NAME"

-BASE_BRANCH=$(jq -r .base.ref <<< $PR_DATA)
-CURRENT_BRANCH=$(git rev-parse --abbrev-ref HEAD)
-TARGET_BASE="unknown"
-if [[ ${BASE_BRANCH} == master ]]; then
-    TARGET_BASE="next"
-elif [[ ${BASE_BRANCH}  == branch-* ]]; then
-    TARGET_BASE=${BASE_BRANCH//branch/next}
-fi
-if [[ "${CURRENT_BRANCH}" != "${TARGET_BASE}" ]]; then
-    echo "Merging into wrong next, want ${TARGET_BASE}, have ${CURRENT_BRANCH}"
-    exit 1
+if [[ $ALLOW_ANY_BRANCH -eq 0 ]]; then
+    BASE_BRANCH=$(jq -r .base.ref <<< $PR_DATA)
+    CURRENT_BRANCH=$(git rev-parse --abbrev-ref HEAD)
+    TARGET_BASE="unknown"
+    if [[ ${BASE_BRANCH} == master ]]; then
+        TARGET_BASE="next"
+    elif [[ ${BASE_BRANCH}  == branch-* ]]; then
+        TARGET_BASE=${BASE_BRANCH//branch/next}
+    fi
+    if [[ "${CURRENT_BRANCH}" != "${TARGET_BASE}" ]]; then
+        echo "Merging into wrong next, want ${TARGET_BASE}, have ${CURRENT_BRANCH}. Use --allow-any-branch or --force to skip this check"
+        exit 1
+    fi
 fi

 git fetch "$REMOTE" pull/$PR_NUM/head
--- a/service/direct_failure_detector/failure_detector.cc
+++ b/service/direct_failure_detector/failure_detector.cc
@@ -6,6 +6,7 @@
 * SPDX-License-Identifier: LicenseRef-ScyllaDB-Source-Available-1.0
 */

+#include <seastar/core/scheduling.hh>
 #include "utils/assert.hh"
 #include <unordered_set>

@@ -17,6 +18,7 @@
 #include <seastar/core/condition-variable.hh>
 #include <seastar/coroutine/parallel_for_each.hh>
 #include <seastar/util/defer.hh>
+#include <seastar/coroutine/switch_to.hh>

 #include "utils/log.hh"

@@ -118,7 +120,7 @@ struct failure_detector::impl {

    // Fetches endpoint updates from _endpoint_queue and performs the add/remove operation.
    // Runs on shard 0 only.
-    future<> update_endpoint_fiber();
+    future<> update_endpoint_fiber(seastar::scheduling_group sg);
    future<> _update_endpoint_fiber = make_ready_future<>();

    // Workers running on this shard.
@@ -140,7 +142,7 @@ struct failure_detector::impl {
    // The unregistering process requires cross-shard operations which we perform on this fiber.
    future<> _destroy_subscriptions = make_ready_future<>();

-    impl(failure_detector& parent, pinger&, clock&, clock::interval_t ping_period, clock::interval_t ping_timeout);
+    impl(failure_detector& parent, pinger&, clock&, clock::interval_t ping_period, clock::interval_t ping_timeout, seastar::scheduling_group sg);
    ~impl();

    // Inform update_endpoint_fiber() about an added/removed endpoint.
@@ -177,19 +179,19 @@ struct failure_detector::impl {
 };

 failure_detector::failure_detector(
-    pinger& pinger, clock& clock, clock::interval_t ping_period, clock::interval_t ping_timeout)
-        : _impl(std::make_unique<impl>(*this, pinger, clock, ping_period, ping_timeout))
+    pinger& pinger, clock& clock, clock::interval_t ping_period, clock::interval_t ping_timeout, seastar::scheduling_group sg)
+        : _impl(std::make_unique<impl>(*this, pinger, clock, ping_period, ping_timeout, sg))
 {}

 failure_detector::impl::impl(
-    failure_detector& parent, pinger& pinger, clock& clock, clock::interval_t ping_period, clock::interval_t ping_timeout)
+    failure_detector& parent, pinger& pinger, clock& clock, clock::interval_t ping_period, clock::interval_t ping_timeout, seastar::scheduling_group sg)
        : _parent(parent), _pinger(pinger), _clock(clock), _ping_period(ping_period), _ping_timeout(ping_timeout) {
    if (this_shard_id() != 0) {
        return;
    }

    _num_workers.resize(smp::count, 0);
-    _update_endpoint_fiber = update_endpoint_fiber();
+    _update_endpoint_fiber = update_endpoint_fiber(sg);
 }

 void failure_detector::impl::send_update_endpoint(pinger::endpoint_id ep, endpoint_update update) {
@@ -205,9 +207,9 @@ void failure_detector::impl::send_update_endpoint(pinger::endpoint_id ep, endpoi
    _endpoint_changed.signal();
 }

-future<> failure_detector::impl::update_endpoint_fiber() {
+future<> failure_detector::impl::update_endpoint_fiber(seastar::scheduling_group sg) {
    SCYLLA_ASSERT(this_shard_id() == 0);
-
+    co_await coroutine::switch_to(sg);
    while (true) {
        co_await _endpoint_changed.wait([this] { return !_endpoint_updates.empty(); });

@@ -480,7 +482,7 @@ static future<bool> ping_with_timeout(pinger::endpoint_id id, clock::timepoint_t
        }
    });

-    auto f = pinger.ping(id, timeout_as);
+    auto f = pinger.ping(id, timeout, timeout_as, c);
    auto sleep_and_abort = [] (clock::timepoint_t timeout, abort_source& timeout_as, clock& c) -> future<> {
        co_await c.sleep_until(timeout, timeout_as).then_wrapped([&timeout_as] (auto&& f) {
            // Avoid throwing if sleep was aborted.
--- a/service/direct_failure_detector/failure_detector.hh
+++ b/service/direct_failure_detector/failure_detector.hh
@@ -19,26 +19,6 @@ class abort_source;

 namespace direct_failure_detector {

-class pinger {
-public:
-    // Opaque endpoint ID.
-    // A specific implementation of `pinger` maps those IDs to 'real' addresses.
-    using endpoint_id = utils::UUID;
-
-    // Send a message to `ep` and wait until it responds.
-    // The wait can be aborted using `as`.
-    // Abort should be signalized with `abort_requested_exception`.
-    //
-    // If the ping fails in an expected way (e.g. the endpoint is down and refuses to connect),
-    // returns `false`. If it succeeds, returns `true`.
-    virtual future<bool> ping(endpoint_id ep, abort_source& as) = 0;
-
-protected:
-    // The `pinger` object must not be destroyed through the `pinger` interface.
-    // `failure_detector` does not take ownership of `pinger`, only a non-owning reference.
-    ~pinger() = default;
-};
-
 // A clock that uses abstract units to measure time.
 // The implementation is responsible for periodically advancing the clock.
 //
@@ -60,12 +40,33 @@ public:
    // Aborts should be signalized using `seastar::sleep_aborted`.
    virtual future<> sleep_until(timepoint_t tp, abort_source& as) = 0;

+    virtual std::chrono::milliseconds to_milliseconds(timepoint_t tp) const = 0;
 protected:
    // The `clock` object must not be destroyed through the `clock` interface.
    // `failure_detector` does not take ownership of `clock`, only a non-owning reference.
    ~clock() = default;
 };

+class pinger {
+public:
+    // Opaque endpoint ID.
+    // A specific implementation of `pinger` maps those IDs to 'real' addresses.
+    using endpoint_id = utils::UUID;
+
+    // Send a message to `ep` and wait until it responds.
+    // The wait can be aborted using `as`.
+    // Abort should be signalized with `abort_requested_exception`.
+    //
+    // If the ping fails in an expected way (e.g. the endpoint is down and refuses to connect),
+    // returns `false`. If it succeeds, returns `true`.
+    virtual future<bool> ping(endpoint_id ep, clock::timepoint_t timeout, abort_source& as, clock& c) = 0;
+
+protected:
+    // The `pinger` object must not be destroyed through the `pinger` interface.
+    // `failure_detector` does not take ownership of `pinger`, only a non-owning reference.
+    ~pinger() = default;
+};
+
 class listener {
 public:
    // Called when an endpoint in the detected set (added by `failure_detector::add_endpoint`) responds to a ping
@@ -127,7 +128,10 @@ public:

        // Duration after which a ping is aborted, so that next ping can be started
        // (pings are sent sequentially).
-        clock::interval_t ping_timeout
+        clock::interval_t ping_timeout,
+
+        // Scheduling group used for fibers inside the failure detector.
+        seastar::scheduling_group sg
    );

    ~failure_detector();
--- a/service/raft/raft_group_registry.cc
+++ b/service/raft/raft_group_registry.cc
@@ -18,6 +18,7 @@
 #include "utils/error_injection.hh"
 #include "seastar/core/shared_future.hh"

+#include <chrono>
 #include <seastar/core/coroutine.hh>
 #include <seastar/core/when_all.hh>
 #include <seastar/core/sleep.hh>
@@ -202,8 +203,11 @@ void raft_group_registry::init_rpc_verbs() {
    });

    ser::raft_rpc_verbs::register_direct_fd_ping(&_ms,
-            [this] (const rpc::client_info&, raft::server_id dst) -> future<direct_fd_ping_reply> {
-        // XXX: update address map here as well?
+            [this] (const rpc::client_info&, rpc::opt_time_point timeout, raft::server_id dst) -> future<direct_fd_ping_reply> {
+
+        if (timeout && *timeout <= netw::messaging_service::clock_type::now()) {
+            throw timed_out_error{};
+        }

        if (_my_id != dst) {
            return make_ready_future<direct_fd_ping_reply>(direct_fd_ping_reply {
@@ -213,19 +217,10 @@ void raft_group_registry::init_rpc_verbs() {
            });
        }

-        return container().invoke_on(0, [] (raft_group_registry& me) -> future<direct_fd_ping_reply> {
-            bool group0_alive = false;
-            if (me._group0_id) {
-                auto* group0_server = me.find_server(*me._group0_id);
-                if (group0_server && group0_server->is_alive()) {
-                    group0_alive = true;
-                }
+        return make_ready_future<direct_fd_ping_reply>(direct_fd_ping_reply {
+            .result = service::group_liveness_info{
+                .group0_alive = _group0_is_alive,
            }
-            co_return direct_fd_ping_reply {
-                .result = service::group_liveness_info{
-                    .group0_alive = group0_alive,
-                }
-            };
        });
    });
 }
@@ -380,6 +375,12 @@ future<> raft_group_registry::start_server_for_group(raft_server_for_group new_g
        co_await server.abort();
        std::rethrow_exception(ex);
    }
+
+    if (gid == _group0_id) {
+        co_await container().invoke_on_all([] (raft_group_registry& rg) {
+            rg._group0_is_alive = true;
+        });
+    }
 }

 future<> raft_group_registry::abort_server(raft::group_id gid, sstring reason) {
@@ -389,14 +390,18 @@ future<> raft_group_registry::abort_server(raft::group_id gid, sstring reason) {
    if (const auto it = _servers.find(gid); it != _servers.end()) {
        auto& [gid, s] = *it;
        if (!s.aborted) {
+            if (gid == _group0_id) {
+                co_await container().invoke_on_all([] (raft_group_registry& rg) {
+                    rg._group0_is_alive = false;
+                });
+            }
            s.aborted = s.server->abort(std::move(reason))
                .handle_exception([gid] (std::exception_ptr ex) {
                    rslog.warn("Failed to abort raft group server {}: {}", gid, ex);
                });
        }
-        return s.aborted->get_future();
+        co_await s.aborted->get_future();
    }
-    return make_ready_future<>();
 }

 unsigned raft_group_registry::shard_for_group(const raft::group_id& gid) const {
@@ -517,11 +522,13 @@ future<> raft_server_with_timeouts::read_barrier(seastar::abort_source* as, std:
    }, "read_barrier", as, timeout);
 }

-future<bool> direct_fd_pinger::ping(direct_failure_detector::pinger::endpoint_id id, abort_source& as) {
+future<bool> direct_fd_pinger::ping(direct_failure_detector::pinger::endpoint_id id, direct_failure_detector::clock::timepoint_t timeout, abort_source& as, direct_failure_detector::clock& c) {
    auto dst_id = raft::server_id{id};

    try {
-        auto reply = co_await ser::raft_rpc_verbs::send_direct_fd_ping(&_ms, locator::host_id{id}, as, dst_id);
+        std::chrono::milliseconds timeout_ms = c.to_milliseconds(timeout);
+        netw::messaging_service::clock_type::time_point deadline = netw::messaging_service::clock_type::now() + timeout_ms;
+        auto reply = co_await ser::raft_rpc_verbs::send_direct_fd_ping(&_ms, locator::host_id{id}, deadline, as, dst_id);
        if (auto* wrong_dst = std::get_if<wrong_destination>(&reply.result)) {
            // FIXME: after moving to host_id based verbs we will not get `wrong_destination`
            //        any more since the connection will fail
@@ -554,4 +561,11 @@ future<> direct_fd_clock::sleep_until(direct_failure_detector::clock::timepoint_
    return sleep_abortable(t - n, as);
 }

+std::chrono::milliseconds direct_fd_clock::to_milliseconds(direct_failure_detector::clock::timepoint_t tp) const {
+    auto t = base::time_point{base::duration{tp}};
+    auto n = base::now();
+    return std::chrono::duration_cast<std::chrono::milliseconds>(t - n);
+}
+
+
 } // end of namespace service
--- a/service/raft/raft_group_registry.hh
+++ b/service/raft/raft_group_registry.hh
@@ -127,6 +127,7 @@ private:
    // My Raft ID. Shared between different Raft groups.
    raft::server_id _my_id;

+    bool _group0_is_alive = false;
 public:
    raft_group_registry(raft::server_id my_id, netw::messaging_service& ms,
            direct_failure_detector::failure_detector& fd);
@@ -181,6 +182,9 @@ public:
    unsigned shard_for_group(const raft::group_id& gid) const;
    shared_ptr<raft::failure_detector> failure_detector();
    direct_failure_detector::failure_detector& direct_fd() { return _direct_fd; }
+    bool is_group0_alive() const {
+        return _group0_is_alive;
+    }
 };

 // Implementation of `direct_failure_detector::pinger` which uses DIRECT_FD_PING verb for pinging.
@@ -198,7 +202,7 @@ public:
    direct_fd_pinger(const direct_fd_pinger&) = delete;
    direct_fd_pinger(direct_fd_pinger&&) = delete;

-    future<bool> ping(direct_failure_detector::pinger::endpoint_id id, abort_source& as) override;
+    future<bool> ping(direct_failure_detector::pinger::endpoint_id id, direct_failure_detector::clock::timepoint_t timeout, abort_source& as, direct_failure_detector::clock& c) override;
 };

 // XXX: find a better place to put this?
@@ -207,6 +211,7 @@ struct direct_fd_clock : public direct_failure_detector::clock {

    direct_failure_detector::clock::timepoint_t now() noexcept override;
    future<> sleep_until(direct_failure_detector::clock::timepoint_t tp, abort_source& as) override;
+    std::chrono::milliseconds to_milliseconds(direct_failure_detector::clock::timepoint_t tp) const override;
 };

 } // end of namespace service
--- a/service/storage_proxy.cc
+++ b/service/storage_proxy.cc
@@ -6688,10 +6688,11 @@ storage_proxy::do_query_with_paxos(schema_ptr s,
        }
    };

-    auto request = seastar::make_shared<read_cas_request>();
+    auto request = std::make_unique<read_cas_request>();
+    auto* request_ptr = request.get();

-    return cas(std::move(s), std::move(cas_shard), request, cmd, std::move(partition_ranges), std::move(query_options),
-            cl, db::consistency_level::ANY, timeout, cas_timeout, false).then([request] (bool is_applied) mutable {
+    return cas(std::move(s), std::move(cas_shard), *request_ptr, cmd, std::move(partition_ranges), std::move(query_options),
+            cl, db::consistency_level::ANY, timeout, cas_timeout, false).then([request = std::move(request)] (bool is_applied) mutable {
        return make_ready_future<coordinator_query_result>(std::move(request->res));
    });
 }
@@ -6754,11 +6755,13 @@ static mutation_write_failure_exception read_failure_to_write(read_failure_excep
 * NOTE: `cmd` argument can be nullptr, in which case it's guaranteed that this function would not perform
 * any reads of committed values (in case user of the function is not interested in them).
 *
+ * NOTE: The `request` object must be guaranteed to be alive until the returned future is resolved.
+ *
 * WARNING: the function must be called on a shard that owns the key cas() operates on.
 * The cas_shard must be created *before* selecting the shard, to protect against
 * concurrent tablet migrations.
 */
-future<bool> storage_proxy::cas(schema_ptr schema, cas_shard cas_shard, shared_ptr<cas_request> request, lw_shared_ptr<query::read_command> cmd,
+future<bool> storage_proxy::cas(schema_ptr schema, cas_shard cas_shard, cas_request& request, lw_shared_ptr<query::read_command> cmd,
        dht::partition_range_vector partition_ranges, storage_proxy::coordinator_query_options query_options,
        db::consistency_level cl_for_paxos, db::consistency_level cl_for_learn,
        clock_type::time_point write_timeout, clock_type::time_point cas_timeout, bool write, cdc::per_request_options cdc_opts) {
@@ -6859,7 +6862,7 @@ future<bool> storage_proxy::cas(schema_ptr schema, cas_shard cas_shard, shared_p
                qr = std::move(cqr.query_result);
            }

-            auto mutation = request->apply(std::move(qr), cmd->slice, utils::UUID_gen::micros_timestamp(ballot), cdc_opts);
+            auto mutation = request.apply(std::move(qr), cmd->slice, utils::UUID_gen::micros_timestamp(ballot), cdc_opts);
            condition_met = true;
            if (!mutation) {
                if (write) {
--- a/service/storage_proxy.hh
+++ b/service/storage_proxy.hh
@@ -829,7 +829,7 @@ public:
        clock_type::time_point timeout,
        tracing::trace_state_ptr trace_state = nullptr);

-    future<bool> cas(schema_ptr schema, cas_shard cas_shard, shared_ptr<cas_request> request, lw_shared_ptr<query::read_command> cmd,
+    future<bool> cas(schema_ptr schema, cas_shard cas_shard, cas_request& request, lw_shared_ptr<query::read_command> cmd,
            dht::partition_range_vector partition_ranges, coordinator_query_options query_options,
            db::consistency_level cl_for_paxos, db::consistency_level cl_for_learn,
            clock_type::time_point write_timeout, clock_type::time_point cas_timeout, bool write = true, cdc::per_request_options cdc_opts = {});
--- a/service/storage_service.cc
+++ b/service/storage_service.cc
@@ -600,7 +600,7 @@ future<storage_service::nodes_to_notify_after_sync> storage_service::sync_raft_t
            co_await update_topology_change_info(tmptr, ::format("{} {}/{}", rs.state, id, ip));
            break;
        case node_state::replacing: {
-            scylla_assert(_topology_state_machine._topology.req_param.contains(id));
+            SCYLLA_ASSERT(_topology_state_machine._topology.req_param.contains(id));
            auto replaced_id = std::get<replace_param>(_topology_state_machine._topology.req_param[id]).replaced_id;
            auto existing_ip = _address_map.find(locator::host_id{replaced_id.uuid()});
            const auto replaced_host_id = locator::host_id(replaced_id.uuid());
@@ -688,7 +688,7 @@ future<> storage_service::notify_nodes_after_sync(nodes_to_notify_after_sync&& n
 future<> storage_service::topology_state_load(state_change_hint hint) {
 #ifdef SEASTAR_DEBUG
    static bool running = false;
-    scylla_assert(!running); // The function is not re-entrant
+    SCYLLA_ASSERT(!running); // The function is not re-entrant
    auto d = defer([] {
        running = false;
    });
@@ -854,7 +854,7 @@ future<> storage_service::topology_state_load(state_change_hint hint) {
 }

 future<> storage_service::topology_transition(state_change_hint hint) {
-    scylla_assert(this_shard_id() == 0);
+    SCYLLA_ASSERT(this_shard_id() == 0);
    co_await topology_state_load(std::move(hint)); // reload new state

    _topology_state_machine.event.broadcast();
@@ -898,7 +898,7 @@ future<> storage_service::view_building_state_load() {
 }

 future<> storage_service::view_building_transition() {
-    scylla_assert(this_shard_id() == 0);
+    SCYLLA_ASSERT(this_shard_id() == 0);
    co_await view_building_state_load();

    _view_building_state_machine.event.broadcast();
@@ -966,7 +966,7 @@ future<> storage_service::merge_topology_snapshot(raft_snapshot snp) {
 }

 future<> storage_service::update_service_levels_cache(qos::update_both_cache_levels update_only_effective_cache, qos::query_context ctx) {
-    scylla_assert(this_shard_id() == 0);
+    SCYLLA_ASSERT(this_shard_id() == 0);
    if (_sl_controller.local().is_v2()) {
        // Skip cache update unless the topology upgrade is done
        co_await _sl_controller.local().update_cache(update_only_effective_cache, ctx);
@@ -1520,7 +1520,7 @@ future<> storage_service::update_topology_with_local_metadata(raft::server& raft
 }

 future<> storage_service::start_upgrade_to_raft_topology() {
-    scylla_assert(this_shard_id() == 0);
+    SCYLLA_ASSERT(this_shard_id() == 0);

    if (_topology_state_machine._topology.upgrade_state != topology::upgrade_state_type::not_upgraded) {
        co_return;
@@ -1572,7 +1572,7 @@ future<> storage_service::start_upgrade_to_raft_topology() {
 }

 topology::upgrade_state_type storage_service::get_topology_upgrade_state() const {
-    scylla_assert(this_shard_id() == 0);
+    SCYLLA_ASSERT(this_shard_id() == 0);
    return _topology_state_machine._topology.upgrade_state;
 }

@@ -1841,7 +1841,7 @@ future<> storage_service::join_topology(sharded<service::storage_proxy>& proxy,
        slogger.info("Nodes {} are alive", get_sync_nodes());
    }

-    scylla_assert(_group0);
+    SCYLLA_ASSERT(_group0);

    join_node_request_params join_params {
        .host_id = _group0->load_my_id(),
@@ -2083,7 +2083,7 @@ future<> storage_service::join_topology(sharded<service::storage_proxy>& proxy,

    if (!_sys_ks.local().bootstrap_complete()) {
        // If we're not bootstrapping then we shouldn't have chosen a CDC streams timestamp yet.
-        scylla_assert(should_bootstrap() || !cdc_gen_id);
+        SCYLLA_ASSERT(should_bootstrap() || !cdc_gen_id);

        // Don't try rewriting CDC stream description tables.
        // See cdc.md design notes, `Streams description table V1 and rewriting` section, for explanation.
@@ -2167,7 +2167,7 @@ future<> storage_service::join_topology(sharded<service::storage_proxy>& proxy,
        throw std::runtime_error(err);
    }

-    scylla_assert(_group0);
+    SCYLLA_ASSERT(_group0);
    co_await _group0->finish_setup_after_join(*this, _qp, _migration_manager.local(), false);
    co_await _cdc_gens.local().after_join(std::move(cdc_gen_id));

@@ -2192,7 +2192,7 @@ future<> storage_service::join_topology(sharded<service::storage_proxy>& proxy,
 }

 future<> storage_service::track_upgrade_progress_to_topology_coordinator(sharded<service::storage_proxy>& proxy) {
-    scylla_assert(_group0);
+    SCYLLA_ASSERT(_group0);

    while (true) {
        _group0_as.check();
@@ -2316,7 +2316,7 @@ future<> storage_service::bootstrap(std::unordered_set<token>& bootstrap_tokens,

            // After we pick a generation timestamp, we start gossiping it, and we stick with it.
            // We don't do any other generation switches (unless we crash before complecting bootstrap).
-            scylla_assert(!cdc_gen_id);
+            SCYLLA_ASSERT(!cdc_gen_id);

            cdc_gen_id = _cdc_gens.local().legacy_make_new_generation(bootstrap_tokens, !is_first_node()).get();

@@ -2349,9 +2349,9 @@ future<> storage_service::bootstrap(std::unordered_set<token>& bootstrap_tokens,
            slogger.debug("Removing replaced endpoint {} from system.peers", replace_addr);
            _sys_ks.local().remove_endpoint(replace_addr).get();

-            scylla_assert(replaced_host_id);
+            SCYLLA_ASSERT(replaced_host_id);
            auto raft_id = raft::server_id{replaced_host_id.uuid()};
-            scylla_assert(_group0);
+            SCYLLA_ASSERT(_group0);
            bool raft_available = _group0->wait_for_raft().get();
            if (raft_available) {
                slogger.info("Replace: removing {}/{} from group 0...", replace_addr, raft_id);
@@ -3000,7 +3000,7 @@ future<> storage_service::stop_transport() {
 }

 future<> storage_service::drain_on_shutdown() {
-    scylla_assert(this_shard_id() == 0);
+    SCYLLA_ASSERT(this_shard_id() == 0);
    return (_operation_mode == mode::DRAINING || _operation_mode == mode::DRAINED) ?
        _drain_finished.get_future() : do_drain();
 }
@@ -3025,7 +3025,7 @@ bool storage_service::is_topology_coordinator_enabled() const {

 future<> storage_service::join_cluster(sharded<service::storage_proxy>& proxy,
        start_hint_manager start_hm, gms::generation_type new_generation) {
-    scylla_assert(this_shard_id() == 0);
+    SCYLLA_ASSERT(this_shard_id() == 0);

    if (_sys_ks.local().was_decommissioned()) {
        auto msg = sstring("This node was decommissioned and will not rejoin the ring unless "
@@ -3225,7 +3225,7 @@ future<> storage_service::join_cluster(sharded<service::storage_proxy>& proxy,
 }

 future<token_metadata_change> storage_service::prepare_token_metadata_change(mutable_token_metadata_ptr tmptr, const schema_getter& schema_getter) {
-    scylla_assert(this_shard_id() == 0);
+    SCYLLA_ASSERT(this_shard_id() == 0);
    std::exception_ptr ex;
    token_metadata_change change;

@@ -3992,7 +3992,7 @@ future<> storage_service::decommission() {
                slogger.info("DECOMMISSIONING: starts");
                ctl.req.leaving_nodes = std::list<gms::inet_address>{endpoint};

-                scylla_assert(ss._group0);
+                SCYLLA_ASSERT(ss._group0);
                bool raft_available = ss._group0->wait_for_raft().get();

                try {
@@ -4044,7 +4044,7 @@ future<> storage_service::decommission() {

                    if (raft_available && left_token_ring) {
                        slogger.info("decommission[{}]: leaving Raft group 0", uuid);
-                        scylla_assert(ss._group0);
+                        SCYLLA_ASSERT(ss._group0);
                        ss._group0->leave_group0().get();
                        slogger.info("decommission[{}]: left Raft group 0", uuid);
                    }
@@ -4350,7 +4350,7 @@ future<> storage_service::removenode(locator::host_id host_id, locator::host_id_
            auto stop_ctl = deferred_stop(ctl);
            auto uuid = ctl.uuid();
            const auto& tmptr = ctl.tmptr;
-            scylla_assert(ss._group0);
+            SCYLLA_ASSERT(ss._group0);
            auto raft_id = raft::server_id{host_id.uuid()};
            bool raft_available = ss._group0->wait_for_raft().get();
            bool is_group0_member = raft_available && ss._group0->is_member(raft_id, false);
@@ -4470,7 +4470,7 @@ future<> storage_service::removenode(locator::host_id host_id, locator::host_id_
 }

 future<> storage_service::check_and_repair_cdc_streams() {
-    scylla_assert(this_shard_id() == 0);
+    SCYLLA_ASSERT(this_shard_id() == 0);

    if (!_cdc_gens.local_is_initialized()) {
        return make_exception_future<>(std::runtime_error("CDC generation service not initialized yet"));
@@ -5784,7 +5784,7 @@ future<> storage_service::mutate_token_metadata(std::function<future<> (mutable_
 }

 future<> storage_service::update_topology_change_info(mutable_token_metadata_ptr tmptr, sstring reason) {
-    scylla_assert(this_shard_id() == 0);
+    SCYLLA_ASSERT(this_shard_id() == 0);

    try {
        locator::dc_rack_fn get_dc_rack_by_host_id([this, &tm = *tmptr] (locator::host_id host_id) -> std::optional<locator::endpoint_dc_rack> {
@@ -5831,7 +5831,7 @@ future<> storage_service::keyspace_changed(const sstring& ks_name) {
 }

 future<locator::mutable_token_metadata_ptr> storage_service::prepare_tablet_metadata(const locator::tablet_metadata_change_hint& hint, mutable_token_metadata_ptr pending_token_metadata) {
-    scylla_assert(this_shard_id() == 0);
+    SCYLLA_ASSERT(this_shard_id() == 0);
    if (hint) {
        co_await replica::update_tablet_metadata(_db.local(), _qp, pending_token_metadata->tablets(), hint);
    } else {
@@ -5955,7 +5955,7 @@ void storage_service::start_tablet_split_monitor() {
 }

 future<> storage_service::snitch_reconfigured() {
-    scylla_assert(this_shard_id() == 0);
+    SCYLLA_ASSERT(this_shard_id() == 0);
    auto& snitch = _snitch.local();
    co_await mutate_token_metadata([&snitch] (mutable_token_metadata_ptr tmptr) -> future<> {
        // re-read local rack and DC info
@@ -6487,8 +6487,8 @@ future<> storage_service::stream_tablet(locator::global_tablet_id tablet) {
    co_await utils::get_local_injector().inject("block_tablet_streaming", [this, &tablet] (auto& handler) -> future<> {
        const auto keyspace = handler.get("keyspace");
        const auto table = handler.get("table");
-        scylla_assert(keyspace);
-        scylla_assert(table);
+        SCYLLA_ASSERT(keyspace);
+        SCYLLA_ASSERT(table);
        auto s = _db.local().find_column_family(tablet.table).schema();
        bool should_block = s->ks_name() == *keyspace && s->cf_name() == *table;
        while (should_block && !handler.poll_for_message() && !_async_gate.is_closed()) {
@@ -6822,6 +6822,7 @@ future<std::unordered_map<sstring, sstring>> storage_service::add_repair_tablet_
            });
        }

+        auto ts = db_clock::now();
        for (const auto& token : tokens) {
            auto tid = tmap.get_tablet_id(token);
            auto& tinfo = tmap.get_tablet_info(tid);
@@ -6835,6 +6836,20 @@ future<std::unordered_map<sstring, sstring>> storage_service::add_repair_tablet_
                tablet_mutation_builder_for_base_table(guard.write_timestamp(), table)
                    .set_repair_task_info(last_token, repair_task_info, _feature_service)
                    .build());
+            db::system_keyspace::repair_task_entry entry{
+                .task_uuid   = tasks::task_id(repair_task_info.tablet_task_id.uuid()),
+                .operation   = db::system_keyspace::repair_task_operation::requested,
+                .first_token = dht::token::to_int64(tmap.get_first_token(tid)),
+                .last_token  = dht::token::to_int64(tmap.get_last_token(tid)),
+                .timestamp   = ts,
+                .table_uuid  = table,
+            };
+            if (_feature_service.tablet_repair_tasks_table) {
+                auto cmuts = co_await _sys_ks.local().get_update_repair_task_mutations(entry, guard.write_timestamp());
+                for (auto& m : cmuts) {
+                    updates.push_back(std::move(m));
+                }
+            }
        }

        sstring reason = format("Repair tablet by API request tokens={} tablet_task_id={}", tokens, repair_task_info.tablet_task_id);
@@ -7509,7 +7524,7 @@ future<join_node_request_result> storage_service::join_node_request_handler(join
 }

 future<join_node_response_result> storage_service::join_node_response_handler(join_node_response_params params) {
-    scylla_assert(this_shard_id() == 0);
+    SCYLLA_ASSERT(this_shard_id() == 0);

    // Usually this handler will only run once, but there are some cases where we might get more than one RPC,
    // possibly happening at the same time, e.g.:
--- a/service/tablet_allocator.cc
+++ b/service/tablet_allocator.cc
@@ -136,6 +136,17 @@ db::tablet_options combine_tablet_options(R&& opts) {
    return combined_opts;
 }

+static std::unordered_set<locator::tablet_id> split_string_to_tablet_id(std::string_view s, char delimiter) {
+    auto tokens_view = s | std::views::split(delimiter)
+            | std::views::transform([](auto&& range) {
+                return std::string_view(&*range.begin(), std::ranges::distance(range));
+            })
+            | std::views::transform([](std::string_view sv) {
+                return locator::tablet_id(std::stoul(std::string(sv)));
+            });
+    return std::unordered_set<locator::tablet_id>{tokens_view.begin(), tokens_view.end()};
+}
+
 // Used to compare different migration choices in regard to impact on load imbalance.
 // There is a total order on migration_badness such that better migrations are ordered before worse ones.
 struct migration_badness {
@@ -893,6 +904,8 @@ public:
            co_await coroutine::maybe_yield();
            auto& config = tmap.repair_scheduler_config();
            auto now = db_clock::now();
+            auto skip = utils::get_local_injector().inject_parameter<std::string_view>("tablet_repair_skip_sched");
+            auto skip_tablets = skip ? split_string_to_tablet_id(*skip, ',') : std::unordered_set<locator::tablet_id>();
            co_await tmap.for_each_tablet([&] (locator::tablet_id id, const locator::tablet_info& info) -> future<> {
                auto gid = locator::global_tablet_id{table, id};
                // Skip tablet that is in transitions.
@@ -913,6 +926,11 @@ public:
                    co_return;
                }

+                if (skip_tablets.contains(id)) {
+                    lblogger.debug("Skipped tablet repair for tablet={} by error injector", gid);
+                    co_return;
+                }
+
                // Avoid rescheduling a failed tablet repair in a loop
                // TODO: Allow user to config
                const auto min_reschedule_time = std::chrono::seconds(5);
--- a/service/task_manager_module.cc
+++ b/service/task_manager_module.cc
@@ -10,6 +10,7 @@
 #include "replica/database.hh"
 #include "service/migration_manager.hh"
 #include "service/storage_service.hh"
+#include "repair/row_level.hh"
 #include "service/task_manager_module.hh"
 #include "tasks/task_handler.hh"
 #include "tasks/virtual_task_hint.hh"
@@ -109,6 +110,16 @@ future<std::optional<tasks::virtual_task_hint>> tablet_virtual_task::contains(ta
            tid = tmap.next_tablet(*tid);
        }
    }
+
+    // Check if the task id is present in the repair task table
+    auto progress = co_await _ss._repair.local().get_tablet_repair_task_progress(task_id);
+    if (progress && progress->requested > 0) {
+        co_return tasks::virtual_task_hint{
+            .table_id = progress->table_uuid,
+            .task_type = locator::tablet_task_type::user_repair,
+            .tablet_id = std::nullopt,
+        };
+    }
    co_return std::nullopt;
 }

@@ -243,7 +254,20 @@ future<std::optional<status_helper>> tablet_virtual_task::get_status_helper(task
    size_t sched_nr = 0;
    auto tmptr = _ss.get_token_metadata_ptr();
    auto& tmap = tmptr->tablets().get_tablet_map(table);
+    bool repair_task_finished = false;
+    bool repair_task_pending = false;
    if (is_repair_task(task_type)) {
+        auto progress = co_await _ss._repair.local().get_tablet_repair_task_progress(id);
+        if (progress) {
+            res.status.progress.completed = progress->finished;
+            res.status.progress.total = progress->requested;
+            res.status.progress_units = "tablets";
+            if (progress->requested > 0 && progress->requested == progress->finished) {
+                repair_task_finished = true;
+            } else if (progress->requested > 0 && progress->requested > progress->finished) {
+                repair_task_pending = true;
+            }
+        }
        co_await tmap.for_each_tablet([&] (locator::tablet_id tid, const locator::tablet_info& info) {
            auto& task_info = info.repair_task_info;
            if (task_info.tablet_task_id.uuid() == id.uuid()) {
@@ -275,7 +299,17 @@ future<std::optional<status_helper>> tablet_virtual_task::get_status_helper(task
        res.status.state = sched_nr == 0 ? tasks::task_manager::task_state::created : tasks::task_manager::task_state::running;
        co_return res;
    }
-    // FIXME: Show finished tasks.
+
+    if (repair_task_pending) {
+        // When repair_task_pending is true, res.tablets is empty iff the user aborted the request.
+        res.status.state = res.tablets.empty() ? tasks::task_manager::task_state::failed : tasks::task_manager::task_state::running;
+        co_return res;
+    }
+    if (repair_task_finished) {
+        res.status.state = tasks::task_manager::task_state::done;
+        co_return res;
+    }
+
    co_return std::nullopt;
 }

--- a/service/topology_coordinator.cc
+++ b/service/topology_coordinator.cc
@@ -360,7 +360,7 @@ class topology_coordinator : public endpoint_lifecycle_subscriber {
        auto& topo = _topo_sm._topology;

        auto it = topo.find(id);
-        scylla_assert(it);
+        SCYLLA_ASSERT(it);

        std::optional<topology_request> req;
        auto rit = topo.requests.find(id);
@@ -1205,6 +1205,8 @@ class topology_coordinator : public endpoint_lifecycle_subscriber {
        std::unordered_map<locator::tablet_transition_stage, background_action_holder> barriers;
        // Record the repair_time returned by the repair_tablet rpc call
        db_clock::time_point repair_time;
+        // Record the repair task update mutations
+        utils::chunked_vector<canonical_mutation> repair_task_updates;
        service::session_id session_id;
    };

@@ -1737,6 +1739,14 @@ class topology_coordinator : public endpoint_lifecycle_subscriber {
                            }
                            dst = dst_opt.value().host;
                        }
+                        // Update repair task
+                        db::system_keyspace::repair_task_entry entry{
+                            .task_uuid   = tasks::task_id(tinfo.repair_task_info.tablet_task_id.uuid()),
+                            .operation   = db::system_keyspace::repair_task_operation::finished,
+                            .first_token = dht::token::to_int64(tmap.get_first_token(gid.tablet)),
+                            .last_token  = dht::token::to_int64(tmap.get_last_token(gid.tablet)),
+                            .table_uuid  = gid.table,
+                        };
                        rtlogger.info("Initiating tablet repair host={} tablet={}", dst, gid);
                        auto session_id = utils::get_local_injector().enter("handle_tablet_migration_repair_random_session") ?
                            service::session_id::create_random_id() : trinfo->session_id;
@@ -1745,6 +1755,10 @@ class topology_coordinator : public endpoint_lifecycle_subscriber {
                        auto duration = std::chrono::duration<float>(db_clock::now() - sched_time);
                        auto& tablet_state = _tablets[tablet];
                        tablet_state.repair_time = db_clock::from_time_t(gc_clock::to_time_t(res.repair_time));
+                        if (_feature_service.tablet_repair_tasks_table) {
+                            entry.timestamp = db_clock::now();
+                            tablet_state.repair_task_updates = co_await _sys_ks.get_update_repair_task_mutations(entry, api::new_timestamp());
+                        }
                        rtlogger.info("Finished tablet repair host={} tablet={} duration={} repair_time={}",
                                dst, tablet, duration, res.repair_time);
                    })) {
@@ -1763,6 +1777,9 @@ class topology_coordinator : public endpoint_lifecycle_subscriber {
                                        .set_stage(last_token, locator::tablet_transition_stage::end_repair)
                                        .del_repair_task_info(last_token, _feature_service)
                                        .del_session(last_token);
+                        for (auto& m : tablet_state.repair_task_updates) {
+                            updates.push_back(std::move(m));
+                        }
                        // Skip update repair time in case hosts filter or dcs filter is set.
                        if (valid && is_filter_off) {
                            auto sched_time = tinfo.repair_task_info.sched_time;
@@ -2310,7 +2327,7 @@ class topology_coordinator : public endpoint_lifecycle_subscriber {

                switch (node.rs->state) {
                    case node_state::bootstrapping: {
-                        scylla_assert(!node.rs->ring);
+                        SCYLLA_ASSERT(!node.rs->ring);
                        auto num_tokens = std::get<join_param>(node.req_param.value()).num_tokens;
                        auto tokens_string = std::get<join_param>(node.req_param.value()).tokens_string;

@@ -2359,11 +2376,11 @@ class topology_coordinator : public endpoint_lifecycle_subscriber {
                    }
                        break;
                    case node_state::replacing: {
-                        scylla_assert(!node.rs->ring);
+                        SCYLLA_ASSERT(!node.rs->ring);
                        auto replaced_id = std::get<replace_param>(node.req_param.value()).replaced_id;
                        auto it = _topo_sm._topology.normal_nodes.find(replaced_id);
-                        scylla_assert(it != _topo_sm._topology.normal_nodes.end());
-                        scylla_assert(it->second.ring && it->second.state == node_state::normal);
+                        SCYLLA_ASSERT(it != _topo_sm._topology.normal_nodes.end());
+                        SCYLLA_ASSERT(it->second.ring && it->second.state == node_state::normal);

                        topology_mutation_builder builder(node.guard.write_timestamp());

@@ -3022,7 +3039,7 @@ class topology_coordinator : public endpoint_lifecycle_subscriber {
                rtbuilder.set("start_time", db_clock::now());
                switch (node.request.value()) {
                    case topology_request::join: {
-                        scylla_assert(!node.rs->ring);
+                        SCYLLA_ASSERT(!node.rs->ring);
                        // Write chosen tokens through raft.
                        builder.set_transition_state(topology::transition_state::join_group0)
                               .with_node(node.id)
@@ -3033,7 +3050,7 @@ class topology_coordinator : public endpoint_lifecycle_subscriber {
                        break;
                        }
                    case topology_request::leave:
-                        scylla_assert(node.rs->ring);
+                        SCYLLA_ASSERT(node.rs->ring);
                        // start decommission and put tokens of decommissioning nodes into write_both_read_old state
                        // meaning that reads will go to the replica being decommissioned
                        // but writes will go to new owner as well
@@ -3046,7 +3063,7 @@ class topology_coordinator : public endpoint_lifecycle_subscriber {
                                                       "start decommission");
                        break;
                    case topology_request::remove: {
-                        scylla_assert(node.rs->ring);
+                        SCYLLA_ASSERT(node.rs->ring);

                        builder.set_transition_state(topology::transition_state::tablet_draining)
                               .set_version(_topo_sm._topology.version + 1)
@@ -3058,7 +3075,7 @@ class topology_coordinator : public endpoint_lifecycle_subscriber {
                        break;
                        }
                    case topology_request::replace: {
-                        scylla_assert(!node.rs->ring);
+                        SCYLLA_ASSERT(!node.rs->ring);

                        builder.set_transition_state(topology::transition_state::join_group0)
                               .with_node(node.id)
@@ -3163,7 +3180,7 @@ class topology_coordinator : public endpoint_lifecycle_subscriber {

        auto id = node.id;

-        scylla_assert(!_topo_sm._topology.transition_nodes.empty());
+        SCYLLA_ASSERT(!_topo_sm._topology.transition_nodes.empty());

        release_node(std::move(node));

@@ -4013,7 +4030,7 @@ future<> topology_coordinator::stop() {
        // but let's check all of them because we never reset these holders
        // once they are added as barriers
        for (auto& [stage, barrier]: tablet_state.barriers) {
-            scylla_assert(barrier.has_value());
+            SCYLLA_ASSERT(barrier.has_value());
            co_await stop_background_action(barrier, gid, [stage] { return format("at stage {}", tablet_transition_stage_to_string(stage)); });
        }

--- a/sstables/compress.cc
+++ b/sstables/compress.cc
@@ -251,7 +251,7 @@ void compression::discard_hidden_options() {
 }

 compressor& compression::get_compressor() const {
-    scylla_assert(_compressor);
+    SCYLLA_ASSERT(_compressor);
    return *_compressor.get();
 }

--- a/sstables/compress.hh
+++ b/sstables/compress.hh
@@ -170,7 +170,7 @@ struct compression {
            const_iterator(const const_iterator& other) = default;

            const_iterator& operator=(const const_iterator& other) {
-                scylla_assert(&_offsets == &other._offsets);
+                SCYLLA_ASSERT(&_offsets == &other._offsets);
                _index = other._index;
                return *this;
            }
--- a/sstables/compressor.cc
+++ b/sstables/compressor.cc
@@ -24,7 +24,6 @@
 #include "sstables/sstable_compressor_factory.hh"
 #include "compressor.hh"
 #include "exceptions/exceptions.hh"
-#include "utils/assert.hh"
 #include "utils/config_file_impl.hh"
 #include "utils/class_registrator.hh"
 #include "gms/feature_service.hh"
@@ -296,7 +295,7 @@ size_t zstd_processor::uncompress(const char* input, size_t input_len, char* out
        if (_ddict) {
            return ZSTD_decompress_usingDDict(dctx, output, output_len, input, input_len, _ddict->dict());
        } else {
-            scylla_assert(!_cdict && "Write-only compressor used for reading");
+            SCYLLA_ASSERT(!_cdict && "Write-only compressor used for reading");
            return ZSTD_decompressDCtx(dctx, output, output_len, input, input_len);
        }
    });
@@ -311,7 +310,7 @@ size_t zstd_processor::compress(const char* input, size_t input_len, char* outpu
        if (_cdict) {
            return ZSTD_compress_usingCDict(cctx, output, output_len, input, input_len, _cdict->dict());
        } else {
-            scylla_assert(!_ddict && "Read-only compressor used for writing");
+            SCYLLA_ASSERT(!_ddict && "Read-only compressor used for writing");
            return ZSTD_compressCCtx(cctx, output, output_len, input, input_len, _compression_level);
        }
    });
@@ -628,7 +627,7 @@ size_t lz4_processor::uncompress(const char* input, size_t input_len,
    if (_ddict) {
        ret = LZ4_decompress_safe_usingDict(input, output, input_len, output_len, reinterpret_cast<const char*>(_ddict->raw().data()), _ddict->raw().size());
    } else {
-        scylla_assert(!_cdict && "Write-only compressor used for reading");
+        SCYLLA_ASSERT(!_cdict && "Write-only compressor used for reading");
        ret = LZ4_decompress_safe(input, output, input_len, output_len);
    }
    if (ret < 0) {
@@ -658,7 +657,7 @@ size_t lz4_processor::compress(const char* input, size_t input_len,
            LZ4_resetStream_fast(ctx);
        }
    } else {
-        scylla_assert(!_ddict && "Read-only compressor used for writing");
+        SCYLLA_ASSERT(!_ddict && "Read-only compressor used for writing");
        ret = LZ4_compress_default(input, output + 4, input_len, LZ4_compressBound(input_len));
    }
    if (ret == 0) {
@@ -1269,7 +1268,7 @@ lz4_cdict::~lz4_cdict() {
 }

 std::unique_ptr<sstable_compressor_factory> make_sstable_compressor_factory_for_tests_in_thread() {
-    scylla_assert(thread::running_in_thread());
+    SCYLLA_ASSERT(thread::running_in_thread());
    struct wrapper : sstable_compressor_factory {
        using impl = default_sstable_compressor_factory;
        sharded<impl> _impl;
--- a/sstables/downsampling.hh
+++ b/sstables/downsampling.hh
@@ -44,14 +44,14 @@ public:
     * @return A list of `sampling_level` unique indices between 0 and `sampling_level`
     */
    static const std::vector<int>& get_sampling_pattern(int sampling_level) {
-        scylla_assert(sampling_level > 0 && sampling_level <= BASE_SAMPLING_LEVEL);
+        SCYLLA_ASSERT(sampling_level > 0 && sampling_level <= BASE_SAMPLING_LEVEL);
        auto& entry = _sample_pattern_cache[sampling_level-1];
        if (!entry.empty()) {
            return entry;
        }

        if (sampling_level <= 1) {
-            scylla_assert(_sample_pattern_cache[0].empty());
+            SCYLLA_ASSERT(_sample_pattern_cache[0].empty());
            _sample_pattern_cache[0].push_back(0);
            return _sample_pattern_cache[0];
        }
@@ -96,7 +96,7 @@ public:
     * @return a list of original indexes for current summary entries
     */
    static const std::vector<int>& get_original_indexes(int sampling_level) {
-        scylla_assert(sampling_level > 0 && sampling_level <= BASE_SAMPLING_LEVEL);
+        SCYLLA_ASSERT(sampling_level > 0 && sampling_level <= BASE_SAMPLING_LEVEL);
        auto& entry = _original_index_cache[sampling_level-1];
        if (!entry.empty()) {
            return entry;
@@ -128,7 +128,7 @@ public:
     * @return the number of partitions before the next index summary entry, inclusive on one end
     */
    static int get_effective_index_interval_after_index(int index, int sampling_level, int min_index_interval) {
-        scylla_assert(index >= -1);
+        SCYLLA_ASSERT(index >= -1);
        const std::vector<int>& original_indexes = get_original_indexes(sampling_level);
        if (index == -1) {
            return original_indexes[0] * min_index_interval;
--- a/sstables/exceptions.hh
+++ b/sstables/exceptions.hh
@@ -31,7 +31,7 @@ public:
 [[noreturn]] void on_parse_error(sstring message, std::optional<component_name> filename);
 [[noreturn, gnu::noinline]] void on_bti_parse_error(uint64_t pos);

-// Use this instead of scylla_assert() or assert() in code that is used while parsing SSTables.
+// Use this instead of SCYLLA_ASSERT() or assert() in code that is used while parsing SSTables.
 // SSTables can be corrupted either by ScyllaDB itself or by a freak accident like cosmic background
 // radiation hitting the disk the wrong way. Either way a corrupt SSTable should not bring down the
 // whole server. This method will call on_internal_error() if the condition is false.
--- a/sstables/generation_type.hh
+++ b/sstables/generation_type.hh
@@ -129,7 +129,7 @@ public:
    /// way to determine that is overlapping its partition-ranges with the shard's
    /// owned ranges.
    static bool maybe_owned_by_this_shard(const sstables::generation_type& gen) {
-        scylla_assert(bool(gen));
+        SCYLLA_ASSERT(bool(gen));
        int64_t hint = 0;
        if (gen.is_uuid_based()) {
            hint = std::hash<utils::UUID>{}(gen.as_uuid());
--- a/sstables/index_reader.hh
+++ b/sstables/index_reader.hh
@@ -57,7 +57,10 @@ public:
    index_list indexes;

    index_consumer(logalloc::region& r, schema_ptr s)
-        : _s(std::move(s))
+        : _s(s)
+        , _alloc_section(abstract_formatter([s] (fmt::format_context& ctx) {
+            fmt::format_to(ctx.out(), "index_consumer {}.{}", s->ks_name(), s->cf_name());
+        }))
        , _region(r)
    { }

@@ -785,6 +788,9 @@ public:
                                                      _sstable->manager().get_cache_tracker().region(),
                                                      _sstable->manager().get_cache_tracker().get_partition_index_cache_stats()))
        , _index_cache(caching ? *_sstable->_index_cache : *_local_index_cache)
+        , _alloc_section(abstract_formatter([sst = _sstable] (fmt::format_context& ctx) {
+            fmt::format_to(ctx.out(), "index_reader {}", sst->get_filename());
+        }))
        , _region(_sstable->manager().get_cache_tracker().region())
        , _use_caching(caching)
        , _single_page_read(single_partition_read) // all entries for a given partition are within a single page
--- a/sstables/mx/bsearch_clustered_cursor.hh
+++ b/sstables/mx/bsearch_clustered_cursor.hh
@@ -284,6 +284,9 @@ public:
        , _clustering_parser(s, permit, _ctr.clustering_column_value_fix_legths(), true)
        , _block_parser(s, permit, _ctr.clustering_column_value_fix_legths())
        , _permit(std::move(permit))
+        , _as(abstract_formatter([s] (fmt::format_context& ctx) {
+            fmt::format_to(ctx.out(), "cached_promoted_index {}.{}", s.ks_name(), s.cf_name());
+        }))
    { }

    ~cached_promoted_index() {
--- a/sstables/mx/writer.cc
+++ b/sstables/mx/writer.cc
@@ -91,7 +91,7 @@ public:
        {}

        void increment() {
-            scylla_assert(_range);
+            SCYLLA_ASSERT(_range);
            if (!_range->next()) {
                _range = nullptr;
            }
@@ -102,7 +102,7 @@ public:
        }

        const ValueType dereference() const {
-            scylla_assert(_range);
+            SCYLLA_ASSERT(_range);
            return _range->get_value();
        }

@@ -153,7 +153,7 @@ public:
        auto limit = std::min(_serialization_limit_size, _offset + clustering_block::max_block_size);

        _current_block = {};
-        scylla_assert (_offset % clustering_block::max_block_size == 0);
+        SCYLLA_ASSERT(_offset % clustering_block::max_block_size == 0);
        while (_offset < limit) {
            auto shift = _offset % clustering_block::max_block_size;
            if (_offset < _prefix.size(_schema)) {
@@ -280,7 +280,7 @@ public:
                    ++_current_index;
                }
            } else {
-                scylla_assert(_mode == encoding_mode::large_encode_missing);
+                SCYLLA_ASSERT(_mode == encoding_mode::large_encode_missing);
                while (_current_index < total_size) {
                    auto cell = _row.find_cell(_columns[_current_index].get().id);
                    if (!cell) {
@@ -1180,7 +1180,7 @@ void writer::write_cell(bytes_ostream& writer, const clustering_key_prefix* clus

    if (cdef.is_counter()) {
        if (!is_deleted) {
-            scylla_assert(!cell.is_counter_update());
+            SCYLLA_ASSERT(!cell.is_counter_update());
            auto ccv = counter_cell_view(cell);
            write_counter_value(ccv, writer, _sst.get_version(), [] (bytes_ostream& out, uint32_t value) {
                return write_vint(out, value);
@@ -1489,7 +1489,7 @@ template <typename W>
 requires Writer<W>
 static void write_clustering_prefix(sstable_version_types v, W& writer, bound_kind_m kind,
    const schema& s, const clustering_key_prefix& clustering) {
-    scylla_assert(kind != bound_kind_m::static_clustering);
+    SCYLLA_ASSERT(kind != bound_kind_m::static_clustering);
    write(v, writer, kind);
    auto is_ephemerally_full = ephemerally_full_prefix{s.is_compact_table()};
    if (kind != bound_kind_m::clustering) {
--- a/sstables/partition_index_cache.hh
+++ b/sstables/partition_index_cache.hh
@@ -59,7 +59,7 @@ private:
                // Live entry_ptr should keep the entry alive, except when the entry failed on loading.
                // In that case, entry_ptr holders are not supposed to use the pointer, so it's safe
                // to nullify those entry_ptrs.
-                scylla_assert(!ready());
+                SCYLLA_ASSERT(!ready());
            }
        }

--- a/sstables/sstable_directory.cc
+++ b/sstables/sstable_directory.cc
@@ -496,7 +496,7 @@ sstable_directory::move_foreign_sstables(sharded<sstable_directory>& source_dire
            return make_ready_future<>();
        }
        // Should be empty, since an SSTable that belongs to this shard is not remote.
-        scylla_assert(shard_id != this_shard_id());
+        SCYLLA_ASSERT(shard_id != this_shard_id());
        dirlog.debug("Moving {} unshared SSTables of {}.{} to shard {} ", info_vec.size(), _schema->ks_name(), _schema->cf_name(), shard_id);
        return source_directory.invoke_on(shard_id, &sstables::sstable_directory::load_foreign_sstables, std::move(info_vec));
    });
@@ -540,7 +540,7 @@ sstable_directory::collect_output_unshared_sstables(std::vector<sstables::shared
    dirlog.debug("Collecting {} output SSTables (remote={})", resharded_sstables.size(), remote_ok);
    return parallel_for_each(std::move(resharded_sstables), [this, remote_ok] (sstables::shared_sstable sst) {
        auto shards = sst->get_shards_for_this_sstable();
-        scylla_assert(shards.size() == 1);
+        SCYLLA_ASSERT(shards.size() == 1);
        auto shard = shards[0];

        if (shard == this_shard_id()) {
--- a/sstables/sstable_set.cc
+++ b/sstables/sstable_set.cc
@@ -283,7 +283,7 @@ bool partitioned_sstable_set::store_as_unleveled(const shared_sstable& sst) cons
        }
        sstlog.info("SSTable {}, as_unleveled={}, expect_unleveled={}, sst_tr={}, overlap_ratio={}",
            sst->generation(), as_unleveled, expect_unleveled, sst_tr, dht::overlap_ratio(_token_range, sst_tr));
-        scylla_assert(as_unleveled == expect_unleveled);
+        SCYLLA_ASSERT(as_unleveled == expect_unleveled);
    });

    return as_unleveled;
@@ -712,8 +712,8 @@ public:

        // by !empty(bound) and `_it` invariant:
        //      _it != _end, _it->first <= bound, and filter(*_it->second) == true
-        scylla_assert(_cmp(_it->first, bound) <= 0);
-        // we don't scylla_assert(filter(*_it->second)) due to the requirement that `filter` is called at most once for each sstable
+        SCYLLA_ASSERT(_cmp(_it->first, bound) <= 0);
+        // we don't SCYLLA_ASSERT(filter(*_it->second)) due to the requirement that `filter` is called at most once for each sstable

        // Find all sstables with the same position as `_it` (they form a contiguous range in the container).
        auto next = std::find_if(std::next(_it), _end, [this] (const value_t& v) { return _cmp(v.first, _it->first) != 0; });
@@ -1301,7 +1301,7 @@ sstable_set::create_single_key_sstable_reader(
        mutation_reader::forwarding fwd_mr,
        const sstable_predicate& predicate,
        sstables::integrity_check integrity) const {
-    scylla_assert(pr.is_singular() && pr.start()->value().has_key());
+    SCYLLA_ASSERT(pr.is_singular() && pr.start()->value().has_key());
    return _impl->create_single_key_sstable_reader(cf, std::move(schema),
            std::move(permit), sstable_histogram, pr, slice, std::move(trace_state), fwd, fwd_mr, predicate, integrity);
 }
@@ -1408,7 +1408,7 @@ sstable_set::make_local_shard_sstable_reader(
 {
    auto reader_factory_fn = [s, permit, &slice, trace_state, fwd, fwd_mr, &monitor_generator, &predicate, integrity]
            (shared_sstable& sst, const dht::partition_range& pr) mutable {
-        scylla_assert(!sst->is_shared());
+        SCYLLA_ASSERT(!sst->is_shared());
        if (!predicate(*sst)) {
            return make_empty_mutation_reader(s, permit);
        }
--- a/sstables/sstables.cc
+++ b/sstables/sstables.cc
@@ -36,7 +36,6 @@

 #include "utils/error_injection.hh"
 #include "utils/to_string.hh"
-#include "utils/assert.hh"
 #include "data_dictionary/storage_options.hh"
 #include "dht/sharder.hh"
 #include "writer.hh"
@@ -2118,11 +2117,14 @@ sstable::write_scylla_metadata(shard_id shard, struct run_identifier identifier,
    }

    sstable_id sid;
-    if (generation().is_uuid_based()) {
+    // Force a random sstable_id for testing purposes
+    bool random_sstable_identifier = utils::get_local_injector().is_enabled("random_sstable_identifier");
+    if (!random_sstable_identifier && generation().is_uuid_based()) {
        sid = sstable_id(generation().as_uuid());
    } else {
        sid = sstable_id(utils::UUID_gen::get_time_UUID());
-        sstlog.info("SSTable {} has numerical generation. SSTable identifier in scylla_metadata set to {}", get_filename(), sid);
+        auto msg = random_sstable_identifier ? "forced random sstable_id" : "has numerical generation";
+        sstlog.info("SSTable {} {}. SSTable identifier in scylla_metadata set to {}", get_filename(), msg, sid);
    }
    _components->scylla_metadata->data.set<scylla_metadata_type::SSTableIdentifier>(scylla_metadata::sstable_identifier{sid});

@@ -2486,11 +2488,6 @@ void sstable::validate_originating_host_id() const {
        }
        return;
    }
-
-    if (*originating_host_id != local_host_id) {
-        // FIXME refrain from throwing an exception because of #10148
-        sstlog.warn("Host id {} does not match local host id {} while validating SSTable: {}. Load foreign SSTables via the upload dir instead.", *originating_host_id, local_host_id, get_filename());
-    }
 }

 sstring sstable::component_basename(const sstring& ks, const sstring& cf, version_types version, generation_type generation,
@@ -2541,8 +2538,11 @@ std::vector<std::pair<component_type, sstring>> sstable::all_components() const
    return all;
 }

-future<> sstable::snapshot(const sstring& dir) const {
-    return _storage->snapshot(*this, dir, storage::absolute_path::yes);
+future<generation_type> sstable::snapshot(const sstring& dir, bool use_sstable_identifier) const {
+    // Use the sstable identifier UUID if available to enable global de-duplication of sstables in backup.
+    generation_type gen = (use_sstable_identifier && _sstable_identifier) ? generation_type(_sstable_identifier->uuid()) : _generation;
+    co_await _storage->snapshot(*this, dir, storage::absolute_path::yes, gen);
+    co_return gen;
 }

 future<> sstable::change_state(sstable_state to, delayed_commit_changes* delay_commit) {
@@ -4162,7 +4162,7 @@ future<data_sink> file_io_extension::wrap_sink(const sstable& sst, component_typ
 }

 future<data_source> file_io_extension::wrap_source(const sstable& sst, component_type c, data_source) {
-    scylla_assert(0 && "You are not supposed to get here, file_io_extension::wrap_source() is not implemented");
+    SCYLLA_ASSERT(0 && "You are not supposed to get here, file_io_extension::wrap_source() is not implemented");
 }

 namespace trie {
--- a/sstables/sstables.hh
+++ b/sstables/sstables.hh
@@ -397,6 +397,10 @@ public:
        return _version;
    }

+    format_types get_format() const {
+        return _format;
+    }
+
    // Returns the total bytes of all components.
    uint64_t bytes_on_disk() const;
    file_size_stats get_file_size_stats() const;
@@ -438,7 +442,10 @@ public:

    std::vector<std::pair<component_type, sstring>> all_components() const;

-    future<> snapshot(const sstring& dir) const;
+    // When use_sstable_identifier is true and the sstable identifier is available,
+    // use it to name the sstable in the snapshot, rather than the sstable generation.
+    // Returns the generation used for the snapshot.
+    future<generation_type> snapshot(const sstring& dir, bool use_sstable_identifier = false) const;

    // Delete the sstable by unlinking all sstable files
    // Ignores all errors.
--- a/sstables/sstables_manager.cc
+++ b/sstables/sstables_manager.cc
@@ -55,9 +55,9 @@ sstables_manager::sstables_manager(
 }

 sstables_manager::~sstables_manager() {
-    scylla_assert(_closing);
-    scylla_assert(_active.empty());
-    scylla_assert(_undergoing_close.empty());
+    SCYLLA_ASSERT(_closing);
+    SCYLLA_ASSERT(_active.empty());
+    SCYLLA_ASSERT(_undergoing_close.empty());
 }

 void sstables_manager::subscribe(sstables_manager_event_handler& handler) {
@@ -135,13 +135,17 @@ future<> storage_manager::update_config(const db::config& cfg) {
    co_return;
 }

-shared_ptr<sstables::object_storage_client> storage_manager::get_endpoint_client(sstring endpoint) {
+auto storage_manager::get_endpoint(const sstring& endpoint) -> object_storage_endpoint& {
    auto found = _object_storage_endpoints.find(endpoint);
    if (found == _object_storage_endpoints.end()) {
        smlogger.error("unable to find {} in configured object-storage endpoints", endpoint);
        throw std::invalid_argument(format("endpoint {} not found", endpoint));
    }
-    auto& ep = found->second;
+    return found->second;
+}
+
+shared_ptr<sstables::object_storage_client> storage_manager::get_endpoint_client(sstring endpoint) {
+    auto& ep = get_endpoint(endpoint);
    if (ep.client == nullptr) {
        ep.client = make_object_storage_client(ep.cfg, _object_storage_clients_memory, [&ct = container()] (std::string ep) {
            return ct.local().get_endpoint_client(ep);
@@ -150,6 +154,10 @@ shared_ptr<sstables::object_storage_client> storage_manager::get_endpoint_client
    return ep.client;
 }

+sstring storage_manager::get_endpoint_type(sstring endpoint) {
+    return get_endpoint(endpoint).cfg.type();
+}
+
 bool storage_manager::is_known_endpoint(sstring endpoint) const {
    return _object_storage_endpoints.contains(endpoint);
 }
--- a/sstables/sstables_manager.hh
+++ b/sstables/sstables_manager.hh
@@ -70,6 +70,7 @@ class storage_manager : public peering_sharded_service<storage_manager> {
    seastar::metrics::metric_groups metrics;

    future<> update_config(const db::config&);
+    object_storage_endpoint& get_endpoint(const sstring& ep);

 public:
    struct config {
@@ -80,6 +81,7 @@ public:
    storage_manager(const db::config&, config cfg);
    shared_ptr<object_storage_client> get_endpoint_client(sstring endpoint);
    bool is_known_endpoint(sstring endpoint) const;
+    sstring get_endpoint_type(sstring endpoint);
    future<> stop();
    std::vector<sstring> endpoints(sstring type = "") const noexcept;
 };
@@ -185,12 +187,12 @@ public:
            size_t buffer_size = default_sstable_buffer_size);

    shared_ptr<object_storage_client> get_endpoint_client(sstring endpoint) const {
-        scylla_assert(_storage != nullptr);
+        SCYLLA_ASSERT(_storage != nullptr);
        return _storage->get_endpoint_client(std::move(endpoint));
    }

    bool is_known_endpoint(sstring endpoint) const {
-        scylla_assert(_storage != nullptr);
+        SCYLLA_ASSERT(_storage != nullptr);
        return _storage->is_known_endpoint(std::move(endpoint));
    }

@@ -241,7 +243,7 @@ public:

    // Only for sstable::storage usage
    sstables::sstables_registry& sstables_registry() const noexcept {
-        scylla_assert(_sstables_registry && "sstables_registry is not plugged");
+        SCYLLA_ASSERT(_sstables_registry && "sstables_registry is not plugged");
        return *_sstables_registry;
    }

--- a/sstables/storage.cc
+++ b/sstables/storage.cc
@@ -109,7 +109,7 @@ future<data_sink> filesystem_storage::make_data_or_index_sink(sstable& sst, comp
    options.buffer_size = sst.sstable_buffer_size;
    options.write_behind = 10;

-    scylla_assert(
+    SCYLLA_ASSERT(
        type == component_type::Data
        || type == component_type::Index
        || type == component_type::Rows
@@ -129,7 +129,7 @@ future<data_sink> filesystem_storage::make_data_or_index_sink(sstable& sst, comp
 }

 future<data_source> filesystem_storage::make_data_or_index_source(sstable&, component_type type, file f, uint64_t offset, uint64_t len, file_input_stream_options opt) const {
-    scylla_assert(type == component_type::Data || type == component_type::Index);
+    SCYLLA_ASSERT(type == component_type::Data || type == component_type::Index);
    co_return make_file_data_source(std::move(f), offset, len, std::move(opt));
 }

@@ -717,7 +717,7 @@ static future<data_source> maybe_wrap_source(const sstable& sst, component_type
 }

 future<data_sink> object_storage_base::make_data_or_index_sink(sstable& sst, component_type type) {
-    scylla_assert(
+    SCYLLA_ASSERT(
        type == component_type::Data
        || type == component_type::Index
        || type == component_type::Rows
--- a/sstables/storage.hh
+++ b/sstables/storage.hh
@@ -83,13 +83,13 @@ class storage {

    // Internal, but can also be used by tests
    virtual future<> change_dir_for_test(sstring nd) {
-        scylla_assert(false && "Changing directory not implemented");
+        SCYLLA_ASSERT(false && "Changing directory not implemented");
    }
    virtual future<> create_links(const sstable& sst, const std::filesystem::path& dir) const {
-        scylla_assert(false && "Direct links creation not implemented");
+        SCYLLA_ASSERT(false && "Direct links creation not implemented");
    }
    virtual future<> move(const sstable& sst, sstring new_dir, generation_type generation, delayed_commit_changes* delay) {
-        scylla_assert(false && "Direct move not implemented");
+        SCYLLA_ASSERT(false && "Direct move not implemented");
    }

 public:
--- a/sstables/trie/bti_key_translation.cc
+++ b/sstables/trie/bti_key_translation.cc
@@ -8,7 +8,6 @@

 #include "bti_key_translation.hh"
 #include "sstables/mx/types.hh"
-#include "utils/assert.hh"

 namespace sstables::trie {

@@ -57,7 +56,7 @@ void lazy_comparable_bytes_from_ring_position::init_first_fragment(dht::token dh
 }

 void lazy_comparable_bytes_from_ring_position::trim(const size_t n) {
-    scylla_assert(n <= _size);
+    SCYLLA_ASSERT(n <= _size);
    _size = n;
 }

@@ -128,7 +127,7 @@ lazy_comparable_bytes_from_clustering_position::lazy_comparable_bytes_from_clust
 {}

 void lazy_comparable_bytes_from_clustering_position::trim(unsigned n) {
-    scylla_assert(n <= _size);
+    SCYLLA_ASSERT(n <= _size);
    _size = n;
 }

--- a/sstables/trie/bti_node_reader.cc
+++ b/sstables/trie/bti_node_reader.cc
@@ -8,7 +8,6 @@

 #include "bti_node_reader.hh"
 #include "bti_node_type.hh"
-#include "utils/assert.hh"

 namespace sstables::trie {

@@ -449,37 +448,37 @@ seastar::future<> bti_node_reader::load(int64_t pos, const reader_permit& permit
 }

 trie::load_final_node_result bti_node_reader::read_node(int64_t pos) {
-    scylla_assert(cached(pos));
+    SCYLLA_ASSERT(cached(pos));
    auto sp = _cached_page->get_view().subspan(pos % cached_file::page_size);
    return bti_read_node(pos, sp);
 }

 trie::node_traverse_result bti_node_reader::walk_down_along_key(int64_t pos, const_bytes key) {
-    scylla_assert(cached(pos));
+    SCYLLA_ASSERT(cached(pos));
    auto sp = _cached_page->get_view().subspan(pos % cached_file::page_size);
    return bti_walk_down_along_key(pos, sp, key);
 }

 trie::node_traverse_sidemost_result bti_node_reader::walk_down_leftmost_path(int64_t pos) {
-    scylla_assert(cached(pos));
+    SCYLLA_ASSERT(cached(pos));
    auto sp = _cached_page->get_view().subspan(pos % cached_file::page_size);
    return bti_walk_down_leftmost_path(pos, sp);
 }

 trie::node_traverse_sidemost_result bti_node_reader::walk_down_rightmost_path(int64_t pos) {
-    scylla_assert(cached(pos));
+    SCYLLA_ASSERT(cached(pos));
    auto sp = _cached_page->get_view().subspan(pos % cached_file::page_size);
    return bti_walk_down_rightmost_path(pos, sp);
 }

 trie::get_child_result bti_node_reader::get_child(int64_t pos, int child_idx, bool forward) const {
-    scylla_assert(cached(pos));
+    SCYLLA_ASSERT(cached(pos));
    auto sp = _cached_page->get_view().subspan(pos % cached_file::page_size);
    return bti_get_child(pos, sp, child_idx, forward);
 }

 const_bytes bti_node_reader::get_payload(int64_t pos) const {
-    scylla_assert(cached(pos));
+    SCYLLA_ASSERT(cached(pos));
    auto sp = _cached_page->get_view().subspan(pos % cached_file::page_size);
    return bti_get_payload(pos, sp);
 }
--- a/sstables/trie/trie_traversal.hh
+++ b/sstables/trie/trie_traversal.hh
@@ -204,7 +204,7 @@ inline void descend_leftmost_single_page(
            next_pos = -1;
            trail.back().child_idx = -1;
        } else {
-            scylla_assert(traverse_one.n_children >= 1);
+            SCYLLA_ASSERT(traverse_one.n_children >= 1);
            next_pos = traverse_one.body_pos - traverse_one.child_offset;
        }
    }
--- a/sstables/trie/trie_writer.cc
+++ b/sstables/trie/trie_writer.cc
@@ -9,7 +9,6 @@
 #include <seastar/util/log.hh>
 #include "writer_node.hh"
 #include "common.hh"
-#include "utils/assert.hh"

 seastar::logger trie_logger("trie");

@@ -28,7 +27,7 @@ auto writer_node::create(const_bytes b, bump_allocator& alctr) -> ptr<writer_nod
 }

 auto writer_node::add_child(const_bytes b, bump_allocator& alctr) -> ptr<writer_node> {
-    scylla_assert(get_children().empty() || b[0] > get_children().back()->_transition[0]);
+    SCYLLA_ASSERT(get_children().empty() || b[0] > get_children().back()->_transition[0]);
    reserve_children(get_children().size() + 1, alctr);
    auto new_child = create(b, alctr);
    push_child(new_child, alctr);
--- a/sstables/trie/trie_writer.hh
+++ b/sstables/trie/trie_writer.hh
@@ -406,7 +406,7 @@ inline void trie_writer<Output>::complete_until_depth(size_t depth) {

 template <trie_writer_sink Output>
 inline void trie_writer<Output>::add(size_t depth, const_bytes key_tail, const trie_payload& p) {
-    scylla_assert(p._payload_bits);
+    SCYLLA_ASSERT(p._payload_bits);
    add_partial(depth, key_tail);
    _stack.back()->set_payload(p);
 }
@@ -416,10 +416,10 @@ template <trie_writer_sink Output>
 inline void trie_writer<Output>::add_partial(size_t depth, const_bytes key_frag) {
    expensive_log("writer_node::add_partial: end, stack={}, depth={}, _current_depth={} tail={}", _stack.size(), depth, _current_depth, fmt_hex(key_frag));
    expensive_assert(_stack.size() >= 1);
-    scylla_assert(_current_depth >= depth);
+    SCYLLA_ASSERT(_current_depth >= depth);
    // There is only one case where a zero-length tail is legal:
    // when inserting the empty key.
-    scylla_assert(!key_frag.empty() || depth == 0);
+    SCYLLA_ASSERT(!key_frag.empty() || depth == 0);

    complete_until_depth(depth);
    if (key_frag.size()) {
@@ -444,7 +444,7 @@ inline sink_pos trie_writer<Output>::finish() {
    if (!try_write(_stack[0])) {
        _out.pad_to_page_boundary();
        bool ok = try_write(_stack[0]);
-        scylla_assert(ok);
+        SCYLLA_ASSERT(ok);
    }
    auto root_pos = _stack[0]->_pos;

--- a/sstables/trie/writer_node.hh
+++ b/sstables/trie/writer_node.hh
@@ -203,7 +203,7 @@ private:
    [[nodiscard]] ptr<T> alloc_impl(size_t n) {
        using value_type = ptr<T>::value_type;
        expensive_assert(n < _segment_size / sizeof(value_type));
-        scylla_assert(n > 0);
+        SCYLLA_ASSERT(n > 0);
        auto sz = n * sizeof(value_type);
        _remaining -= _remaining % alignof(value_type);
        if (sz > _remaining) [[unlikely]] {
@@ -230,7 +230,7 @@ private:

 public:
    bump_allocator(size_t segment_size) : _segment_size(segment_size) {
-        scylla_assert(_segment_size % alignof(max_align_t) == 0);
+        SCYLLA_ASSERT(_segment_size % alignof(max_align_t) == 0);
    }

    // Total memory usage by this allocator.
--- a/sstables/trie/writer_node.impl.hh
+++ b/sstables/trie/writer_node.impl.hh
@@ -9,7 +9,6 @@
 #pragma once

 #include "writer_node.hh"
-#include "utils/assert.hh"
 #include "utils/small_vector.hh"

 namespace sstables::trie {
@@ -112,9 +111,9 @@ void writer_node::write(ptr<writer_node> self, Output& out, bool guaranteed_fit)
            fmt::ptr(node.get()), out.pos().value, node->get_children().size(), node->_node_size.value, node->_transition_length);

        if (guaranteed_fit) {
-            scylla_assert(out.pos() - startpos == node->_branch_size);
+            SCYLLA_ASSERT(out.pos() - startpos == node->_branch_size);
            node->_pos = sink_pos(out.write(*node, sink_pos(out.pos())));
-            scylla_assert(out.pos() - startpos == node->_branch_size + node->_node_size);
+            SCYLLA_ASSERT(out.pos() - startpos == node->_branch_size + node->_node_size);
        } else {
            if (uint64_t(out.serialized_size(*node, sink_pos(out.pos())).value) > out.bytes_left_in_page()) {
                out.pad_to_page_boundary();
--- a/sstables_loader.cc
+++ b/sstables_loader.cc
@@ -205,6 +205,13 @@ private:
    }

    bool tablet_in_scope(locator::tablet_id) const;
+
+    friend future<std::vector<tablet_sstable_collection>> get_sstables_for_tablets_for_tests(const std::vector<sstables::shared_sstable>& sstables,
+                                                                                             std::vector<dht::token_range>&& tablets_ranges);
+    // Note: the `erm` must be kept alive for as long as tablet ranges retrieved from the tablet map are in use, including inside this method.
+    // `tablet_sstable_streamer` already guarantees this, but take care if you call this method from anywhere else.
+    static future<std::vector<tablet_sstable_collection>> get_sstables_for_tablets(const std::vector<sstables::shared_sstable>& sstables,
+                                                                                   std::vector<dht::token_range>&& tablets_ranges);
 };

 host_id_vector_replica_set sstable_streamer::get_endpoints(const dht::token& token) const {
@@ -343,55 +350,52 @@ public:
    }
 };

+future<std::vector<tablet_sstable_collection>> tablet_sstable_streamer::get_sstables_for_tablets(const std::vector<sstables::shared_sstable>& sstables,
+                                                                                                 std::vector<dht::token_range>&& tablets_ranges) {
+    auto tablets_sstables =
+        tablets_ranges | std::views::transform([](auto range) { return tablet_sstable_collection{.tablet_range = range}; }) | std::ranges::to<std::vector>();
+    if (sstables.empty() || tablets_sstables.empty()) {
+        co_return std::move(tablets_sstables);
+    }
+    // sstables are sorted by first key in reverse order.
+    auto reversed_sstables = sstables | std::views::reverse;
+
+    for (auto& [tablet_range, sstables_fully_contained, sstables_partially_contained] : tablets_sstables) {
+        for (const auto& sst : reversed_sstables) {
+            auto sst_first = sst->get_first_decorated_key().token();
+            auto sst_last = sst->get_last_decorated_key().token();
+
+            // SSTable starts after the tablet range -> neither it nor any later SSTable (larger first keys) can overlap
+            if (tablet_range.after(sst_first, dht::token_comparator{})) {
+                break;
+            }
+            // SSTable ends before the tablet range -> skip it and keep scanning SSTables with larger keys
+            if (tablet_range.before(sst_last, dht::token_comparator{})) {
+                continue;
+            }
+
+            if (tablet_range.contains(dht::token_range{sst_first, sst_last}, dht::token_comparator{})) {
+                sstables_fully_contained.push_back(sst);
+            } else {
+                sstables_partially_contained.push_back(sst);
+            }
+            co_await coroutine::maybe_yield();
+        }
+    }
+    co_return std::move(tablets_sstables);
+}
+
 future<> tablet_sstable_streamer::stream(shared_ptr<stream_progress> progress) {
    if (progress) {
        progress->start(_tablet_map.tablet_count());
    }

-    // sstables are sorted by first key in reverse order.
-    auto sstable_it = _sstables.rbegin();
-
-    for (auto tablet_id : _tablet_map.tablet_ids() | std::views::filter([this] (auto tid) { return tablet_in_scope(tid); })) {
-        auto tablet_range = _tablet_map.get_token_range(tablet_id);
-
-        auto sstable_token_range = [] (const sstables::shared_sstable& sst) {
-            return dht::token_range(sst->get_first_decorated_key().token(),
-                                    sst->get_last_decorated_key().token());
-        };
-
-        std::vector<sstables::shared_sstable> sstables_fully_contained;
-        std::vector<sstables::shared_sstable> sstables_partially_contained;
-
-        // sstable is exhausted if its last key is before the current tablet range
-        auto exhausted = [&tablet_range] (const sstables::shared_sstable& sst) {
-            return tablet_range.before(sst->get_last_decorated_key().token(), dht::token_comparator{});
-        };
-        while (sstable_it != _sstables.rend() && exhausted(*sstable_it)) {
-            sstable_it++;
-        }
-
-        for (auto sst_it = sstable_it; sst_it != _sstables.rend(); sst_it++) {
-            auto sst_token_range = sstable_token_range(*sst_it);
-
-            // sstables are sorted by first key, so should skip this SSTable since it
-            // doesn't overlap with the current tablet range.
-            if (!tablet_range.overlaps(sst_token_range, dht::token_comparator{})) {
-                // If the start of the next SSTable's token range lies beyond the current tablet's token
-                // range, we can safely conclude that no more relevant SSTables remain for this tablet.
-                if (tablet_range.after(sst_token_range.start()->value(), dht::token_comparator{})) {
-                    break;
-                }
-                continue;
-            }
-
-            if (tablet_range.contains(sst_token_range, dht::token_comparator{})) {
-                sstables_fully_contained.push_back(*sst_it);
-            } else {
-                sstables_partially_contained.push_back(*sst_it);
-            }
-            co_await coroutine::maybe_yield();
-        }
+    auto classified_sstables = co_await get_sstables_for_tablets(
+        _sstables, _tablet_map.tablet_ids() | std::views::filter([this](auto tid) { return tablet_in_scope(tid); }) | std::views::transform([this](auto tid) {
+                       return _tablet_map.get_token_range(tid);
+                   }) | std::ranges::to<std::vector>());

+    for (auto& [tablet_range, sstables_fully_contained, sstables_partially_contained] : classified_sstables) {
        auto per_tablet_progress = make_shared<per_tablet_stream_progress>(
            progress,
            sstables_fully_contained.size() + sstables_partially_contained.size());
@@ -751,8 +755,9 @@ future<> sstables_loader::download_task_impl::run() {
    };
    llog.debug("Loading sstables from {}({}/{})", _endpoint, _bucket, _prefix);

+    auto ep_type = _loader.local()._storage_manager.get_endpoint_type(_endpoint);
    std::vector<seastar::abort_source> shard_aborts(smp::count);
-    auto [ table_id, sstables_on_shards ] = co_await replica::distributed_loader::get_sstables_from_object_store(_loader.local()._db, _ks, _cf, _sstables, _endpoint, _bucket, _prefix, cfg, [&] {
+    auto [ table_id, sstables_on_shards ] = co_await replica::distributed_loader::get_sstables_from_object_store(_loader.local()._db, _ks, _cf, _sstables, _endpoint, ep_type, _bucket, _prefix, cfg, [&] {
        return &shard_aborts[this_shard_id()];
    });
    llog.debug("Streaming sstables from {}({}/{})", _endpoint, _bucket, _prefix);
@@ -832,3 +837,7 @@ future<tasks::task_id> sstables_loader::download_new_sstables(sstring ks_name, s
                                                                                       std::move(prefix), std::move(sstables), scope, primary_replica_only(primary_replica));
    co_return task->id();
 }
+future<std::vector<tablet_sstable_collection>> get_sstables_for_tablets_for_tests(const std::vector<sstables::shared_sstable>& sstables,
+                                                                                  std::vector<dht::token_range>&& tablets_ranges) {
+    return tablet_sstable_streamer::get_sstables_for_tablets(sstables, std::move(tablets_ranges));
+}
--- a/sstables_loader.hh
+++ b/sstables_loader.hh
@@ -10,6 +10,8 @@

 #include <vector>
 #include <seastar/core/sharded.hh>
+#include "dht/i_partitioner_fwd.hh"
+#include "dht/token.hh"
 #include "schema/schema_fwd.hh"
 #include "sstables/shared_sstable.hh"
 #include "tasks/task_manager.hh"
@@ -152,3 +154,18 @@ struct fmt::formatter<sstables_loader::stream_scope> : fmt::formatter<string_vie
        }
    }
 };
+
+struct tablet_sstable_collection {
+    dht::token_range tablet_range;
+    std::vector<sstables::shared_sstable> sstables_fully_contained;
+    std::vector<sstables::shared_sstable> sstables_partially_contained;
+};
+
+// This function is intended for test purposes only.
+// It assigns the given sstables to the given tablet ranges based on token containment.
+// It returns a vector of tablet_sstable_collection, each containing the tablet range
+// and the sstables that are fully or partially contained within that range.
+// Prerequisites: the tablet ranges must be non-overlapping and sorted in ascending order,
+// and the sstables must be sorted by the `start` of their token ranges in descending order.
+future<std::vector<tablet_sstable_collection>> get_sstables_for_tablets_for_tests(const std::vector<sstables::shared_sstable>& sstables,
+                                                                                  std::vector<dht::token_range>&& tablets_ranges);
--- a/test/alternator/test_batch.py
+++ b/test/alternator/test_batch.py
@@ -205,7 +205,7 @@ def test_batch_write_invalid_operation(test_table_s):

 # In test_item.py we have a bunch of test_empty_* tests on different ways to
 # create an empty item (which in Scylla requires the special CQL row marker
-# to be supported correctly). BatchWriteItems provides yet another way of
+# to be supported correctly). BatchWriteItem provides yet another way of
 # creating items, so check the empty case here too:
 def test_empty_batch_write(test_table):
    p = random_string()
@@ -214,7 +214,7 @@ def test_empty_batch_write(test_table):
        batch.put_item({'p': p, 'c': c})
    assert test_table.get_item(Key={'p': p, 'c': c}, ConsistentRead=True)['Item'] == {'p': p, 'c': c}

-# Test that BatchWriteItems allows writing to multiple tables in one operation
+# Test that BatchWriteItem allows writing to multiple tables in one operation
 def test_batch_write_multiple_tables(test_table_s, test_table):
    p1 = random_string()
    c1 = random_string()
--- a/test/boost/CMakeLists.txt
+++ b/test/boost/CMakeLists.txt
@@ -370,6 +370,7 @@ add_scylla_test(combined_tests
    sstable_compression_config_test.cc
    sstable_directory_test.cc
    sstable_set_test.cc
+    sstable_tablet_streaming.cc
    statement_restrictions_test.cc
    storage_proxy_test.cc
    tablets_test.cc
--- a/test/boost/database_test.cc
+++ b/test/boost/database_test.cc
@@ -31,6 +31,7 @@
 #include "replica/database.hh"
 #include "utils/assert.hh"
 #include "utils/lister.hh"
+#include "utils/rjson.hh"
 #include "partition_slice_builder.hh"
 #include "mutation/frozen_mutation.hh"
 #include "test/lib/mutation_source_test.hh"
@@ -38,6 +39,7 @@
 #include "service/migration_manager.hh"
 #include "sstables/sstables.hh"
 #include "sstables/generation_type.hh"
+#include "sstables/sstable_version.hh"
 #include "db/config.hh"
 #include "db/commitlog/commitlog_replayer.hh"
 #include "db/commitlog/commitlog.hh"
@@ -51,6 +53,7 @@
 #include "db/system_keyspace.hh"
 #include "db/view/view_builder.hh"
 #include "replica/mutation_dump.hh"
+#include "utils/error_injection.hh"

 using namespace std::chrono_literals;
 using namespace sstables;
@@ -612,13 +615,13 @@ future<> do_with_some_data(std::vector<sstring> cf_names, std::function<future<>
    });
 }

-future<> take_snapshot(cql_test_env& e, sstring ks_name = "ks", sstring cf_name = "cf", sstring snapshot_name = "test", bool skip_flush = false) {
+future<> take_snapshot(cql_test_env& e, sstring ks_name = "ks", sstring cf_name = "cf", sstring snapshot_name = "test", db::snapshot_options opts = {}) {
    try {
        auto uuid = e.db().local().find_uuid(ks_name, cf_name);
-        co_await replica::database::snapshot_table_on_all_shards(e.db(), uuid, snapshot_name, skip_flush);
+        co_await replica::database::snapshot_table_on_all_shards(e.db(), uuid, snapshot_name, opts);
    } catch (...) {
-        testlog.error("Could not take snapshot for {}.{} snapshot_name={} skip_flush={}: {}",
-                ks_name, cf_name, snapshot_name, skip_flush, std::current_exception());
+        testlog.error("Could not take snapshot for {}.{} snapshot_name={} skip_flush={} use_sstable_identifier={}: {}",
+                ks_name, cf_name, snapshot_name, opts.skip_flush, opts.use_sstable_identifier, std::current_exception());
        throw;
    }
 }
@@ -632,6 +635,37 @@ future<std::set<sstring>> collect_files(fs::path path) {
    co_return ret;
 }

+static bool is_component(const sstring& fname, const sstring& suffix) {
+    return fname.ends_with(suffix);
+}
+
+static std::set<sstring> collect_sstables(const std::set<sstring>& all_files, const sstring& suffix) {
+    // Keep only the file names that end with the given component suffix
+    auto pred = [&suffix] (const sstring& fname) {
+        return is_component(fname, suffix);
+    };
+    return std::ranges::filter_view(all_files, pred) | std::ranges::to<std::set<sstring>>();
+}
+
+// Validate that the manifest.json lists exactly the SSTables present in the snapshot directory
+static future<> validate_manifest(const fs::path& snapshot_dir, const std::set<sstring>& in_snapshot_dir) {
+    sstring suffix = "-Data.db";
+    auto sstables_in_snapshot = collect_sstables(in_snapshot_dir, suffix);
+
+    std::set<sstring> sstables_in_manifest;
+    auto manifest_str = co_await util::read_entire_file_contiguous(snapshot_dir / "manifest.json");
+    auto manifest_json = rjson::parse(manifest_str);
+    auto& manifest_files = manifest_json["files"];
+    BOOST_REQUIRE(manifest_files.IsArray());
+    for (auto& f : manifest_files.GetArray()) {
+        if (is_component(f.GetString(), suffix)) {
+            sstables_in_manifest.insert(f.GetString());
+        }
+    }
+    testlog.debug("SSTables in manifest.json: {}", fmt::join(sstables_in_manifest, ", "));
+    BOOST_REQUIRE_EQUAL(sstables_in_snapshot, sstables_in_manifest);
+}
+
 static future<> snapshot_works(const std::string& table_name) {
    return do_with_some_data({"cf"}, [table_name] (cql_test_env& e) {
        take_snapshot(e, "ks", table_name).get();
@@ -651,6 +685,8 @@ static future<> snapshot_works(const std::string& table_name) {
        // all files were copied and manifest was generated
        BOOST_REQUIRE_EQUAL(in_table_dir, in_snapshot_dir);

+        validate_manifest(snapshot_dir, in_snapshot_dir).get();
+
        return make_ready_future<>();
    }, true);
 }
@@ -669,7 +705,8 @@ SEASTAR_TEST_CASE(index_snapshot_works) {

 SEASTAR_TEST_CASE(snapshot_skip_flush_works) {
    return do_with_some_data({"cf"}, [] (cql_test_env& e) {
-        take_snapshot(e, "ks", "cf", "test", true /* skip_flush */).get();
+        db::snapshot_options opts = {.skip_flush = true};
+        take_snapshot(e, "ks", "cf", "test", opts).get();

        auto& cf = e.local_db().find_column_family("ks", "cf");

@@ -682,6 +719,41 @@ SEASTAR_TEST_CASE(snapshot_skip_flush_works) {
    });
 }

+SEASTAR_TEST_CASE(snapshot_use_sstable_identifier_works) {
+#ifndef SCYLLA_ENABLE_ERROR_INJECTION
+        fmt::print("Skipping test as it depends on error injection. Please run in a mode where it is enabled (debug, dev).\n");
+        return make_ready_future<>();
+#endif
+    sstring table_name = "cf";
+    // Force random sstable identifiers, otherwise the initial sstable_id is equal
+    // to the sstable generation and the test can't distinguish between them.
+    utils::get_local_injector().enable("random_sstable_identifier", false);
+    return do_with_some_data({table_name}, [table_name] (cql_test_env& e) -> future<> {
+        sstring tag = "test";
+        db::snapshot_options opts = {.use_sstable_identifier = true};
+        co_await take_snapshot(e, "ks", table_name, tag, opts);
+
+        auto& cf = e.local_db().find_column_family("ks", table_name);
+        auto table_directory = table_dir(cf);
+        auto snapshot_dir = table_directory / sstables::snapshots_dir / tag;
+        auto in_table_dir = co_await collect_files(table_directory);
+        // snapshot triggered a flush and wrote the data down.
+        BOOST_REQUIRE_GE(in_table_dir.size(), 9);
+        testlog.info("Files in table dir: {}", fmt::join(in_table_dir, ", "));
+
+        auto in_snapshot_dir = co_await collect_files(snapshot_dir);
+        testlog.info("Files in snapshot dir: {}", fmt::join(in_snapshot_dir, ", "));
+
+        in_table_dir.insert("manifest.json");
+        in_table_dir.insert("schema.cql");
+        // all files were copied and manifest was generated; the names differ because the snapshot uses sstable identifiers
+        BOOST_REQUIRE_EQUAL(in_table_dir.size(), in_snapshot_dir.size());
+        BOOST_REQUIRE_NE(in_table_dir, in_snapshot_dir);
+
+        co_await validate_manifest(snapshot_dir, in_snapshot_dir);
+    }, true);
+}
+
 SEASTAR_TEST_CASE(snapshot_list_okay) {
    return do_with_some_data({"cf"}, [] (cql_test_env& e) {
        auto& cf = e.local_db().find_column_family("ks", "cf");
@@ -1456,7 +1528,7 @@ SEASTAR_TEST_CASE(snapshot_with_quarantine_works) {
        }
        BOOST_REQUIRE(found);

-        co_await take_snapshot(e, "ks", "cf", "test", true /* skip_flush */);
+        co_await take_snapshot(e, "ks", "cf", "test", db::snapshot_options{.skip_flush = true});

        testlog.debug("Expected: {}", expected);

--- a/test/boost/network_topology_strategy_test.cc
+++ b/test/boost/network_topology_strategy_test.cc
@@ -1450,8 +1450,7 @@ SEASTAR_THREAD_TEST_CASE(tablets_simple_rack_aware_view_pairing_test) {
        std::map<sstring, replication_strategy_config_option> options;
        for (const auto& dc : option_dcs) {
            auto num_racks = node_count_per_rack.at(dc).size();
-            auto max_rf_factor = std::ranges::min(std::ranges::views::transform(node_count_per_rack.at(dc), [] (auto& x) { return x.second; }));
-            auto rf = num_racks * tests::random::get_int(1UL, max_rf_factor);
+            auto rf = num_racks;
            options.emplace(dc, fmt::to_string(rf));
        }
        return options;
@@ -1487,8 +1486,7 @@ SEASTAR_THREAD_TEST_CASE(tablets_simple_rack_aware_view_pairing_test) {
    // Test tablets rack-aware base-view pairing
    auto base_token = dht::token::get_random_token();
    auto view_token = dht::token::get_random_token();
-    bool use_legacy_self_pairing = false;
-    bool use_tablets_basic_rack_aware_view_pairing = true;
+    bool use_tablets = true;
    const auto& base_replicas = base_tmap.get_tablet_info(base_tmap.get_tablet_id(base_token)).replicas;
    replica::cf_stats cf_stats;
    std::unordered_map<locator::host_id, locator::host_id> base_to_view_pairing;
@@ -1502,8 +1500,7 @@ SEASTAR_THREAD_TEST_CASE(tablets_simple_rack_aware_view_pairing_test) {
            *ars_ptr,
            base_token,
            view_token,
-            use_legacy_self_pairing,
-            use_tablets_basic_rack_aware_view_pairing,
+            use_tablets,
            cf_stats).natural_endpoint;

        // view pair must be found
@@ -1525,181 +1522,6 @@ SEASTAR_THREAD_TEST_CASE(tablets_simple_rack_aware_view_pairing_test) {
    }
 }

-// Called in a seastar thread
-void test_complex_rack_aware_view_pairing_test(bool more_or_less) {
-    auto my_address = gms::inet_address("localhost");
-
-    // Create the RackInferringSnitch
-    snitch_config cfg;
-    cfg.listen_address = my_address;
-    cfg.broadcast_address = my_address;
-    cfg.name = "RackInferringSnitch";
-    sharded<snitch_ptr> snitch;
-    snitch.start(cfg).get();
-    auto stop_snitch = defer([&snitch] { snitch.stop().get(); });
-    snitch.invoke_on_all(&snitch_ptr::start).get();
-
-    locator::token_metadata::config tm_cfg;
-    tm_cfg.topo_cfg.this_endpoint = my_address;
-    tm_cfg.topo_cfg.local_dc_rack = { snitch.local()->get_datacenter(), snitch.local()->get_rack() };
-
-    std::map<sstring, size_t> node_count_per_dc;
-    std::map<sstring, std::map<sstring, size_t>> node_count_per_rack;
-    std::vector<ring_point> ring_points;
-
-    auto& random_engine = seastar::testing::local_random_engine;
-    unsigned shard_count = 2;
-    size_t num_dcs = 1 + tests::random::get_int(3);
-
-    // Generate a random cluster
-    double point = 1;
-    for (size_t dc = 0; dc < num_dcs; ++dc) {
-        sstring dc_name = fmt::format("{}", 100 + dc);
-        size_t num_racks = 2 + tests::random::get_int(4);
-        for (size_t rack = 0; rack < num_racks; ++rack) {
-            sstring rack_name = fmt::format("{}", 10 + rack);
-            size_t rack_nodes = 1 + tests::random::get_int(2);
-            for (size_t i = 1; i <= rack_nodes; ++i) {
-                ring_points.emplace_back(point, inet_address(format("192.{}.{}.{}", dc_name, rack_name, i)));
-                node_count_per_dc[dc_name]++;
-                node_count_per_rack[dc_name][rack_name]++;
-                point++;
-            }
-        }
-    }
-
-    testlog.debug("node_count_per_rack={}", node_count_per_rack);
-
-    // Initialize the token_metadata
-    locator::shared_token_metadata stm([] () noexcept { return db::schema_tables::hold_merge_lock(); }, tm_cfg);
-    auto stop_stm = deferred_stop(stm);
-    stm.mutate_token_metadata([&] (token_metadata& tm) -> future<> {
-        auto& topo = tm.get_topology();
-        for (const auto& [ring_point, endpoint, id] : ring_points) {
-            std::unordered_set<token> tokens;
-            tokens.insert(token{tests::d2t(ring_point / ring_points.size())});
-            topo.add_node(id, make_endpoint_dc_rack(endpoint), locator::node::state::normal, shard_count);
-            co_await tm.update_normal_tokens(std::move(tokens), id);
-        }
-    }).get();
-
-    auto base_schema = schema_builder("ks", "base")
-        .with_column("k", utf8_type, column_kind::partition_key)
-        .with_column("v", utf8_type)
-        .build();
-
-    auto view_schema = schema_builder("ks", "view")
-        .with_column("v", utf8_type, column_kind::partition_key)
-        .with_column("k", utf8_type)
-        .build();
-
-    auto tmptr = stm.get();
-
-    // Create the replication strategy
-    auto make_random_options = [&] () {
-        auto option_dcs = node_count_per_dc | std::views::keys | std::ranges::to<std::vector>();
-        std::shuffle(option_dcs.begin(), option_dcs.end(), random_engine);
-        std::map<sstring, replication_strategy_config_option> options;
-        for (const auto& dc : option_dcs) {
-            auto num_racks = node_count_per_rack.at(dc).size();
-            auto rf = more_or_less ?
-                    tests::random::get_int(num_racks, node_count_per_dc[dc]) :
-                    tests::random::get_int(1UL, num_racks);
-            options.emplace(dc, fmt::to_string(rf));
-        }
-        return options;
-    };
-
-    auto options = make_random_options();
-    size_t tablet_count = 1 + tests::random::get_int(99);
-    testlog.debug("tablet_count={} rf_options={}", tablet_count, options);
-    locator::replication_strategy_params params(options, tablet_count, std::nullopt);
-    auto ars_ptr = abstract_replication_strategy::create_replication_strategy(
-            "NetworkTopologyStrategy", params, tmptr->get_topology());
-    auto tab_awr_ptr = ars_ptr->maybe_as_tablet_aware();
-    BOOST_REQUIRE(tab_awr_ptr);
-    auto base_tmap = tab_awr_ptr->allocate_tablets_for_new_table(base_schema, tmptr, 1).get();
-    auto base_table_id = base_schema->id();
-    testlog.debug("base_table_id={}", base_table_id);
-    auto view_table_id = view_schema->id();
-    auto view_tmap = tab_awr_ptr->allocate_tablets_for_new_table(view_schema, tmptr, 1).get();
-    testlog.debug("view_table_id={}", view_table_id);
-
-    stm.mutate_token_metadata([&] (token_metadata& tm) -> future<> {
-        tm.tablets().set_tablet_map(base_table_id, co_await base_tmap.clone_gently());
-        tm.tablets().set_tablet_map(view_table_id, co_await view_tmap.clone_gently());
-    }).get();
-
-    tmptr = stm.get();
-    auto base_erm = tab_awr_ptr->make_replication_map(base_table_id, tmptr);
-    auto view_erm = tab_awr_ptr->make_replication_map(view_table_id, tmptr);
-
-    auto& topology = tmptr->get_topology();
-    testlog.debug("topology: {}", topology.get_datacenter_racks());
-
-    // Test tablets rack-aware base-view pairing
-    auto base_token = dht::token::get_random_token();
-    auto view_token = dht::token::get_random_token();
-    bool use_legacy_self_pairing = false;
-    bool use_tablets_basic_rack_aware_view_pairing = true;
-    const auto& base_replicas = base_tmap.get_tablet_info(base_tmap.get_tablet_id(base_token)).replicas;
-    replica::cf_stats cf_stats;
-    std::unordered_map<locator::host_id, locator::host_id> base_to_view_pairing;
-    std::unordered_map<locator::host_id, locator::host_id> view_to_base_pairing;
-    std::unordered_map<sstring, size_t> same_rack_pairs;
-    std::unordered_map<sstring, size_t> cross_rack_pairs;
-    for (const auto& base_replica : base_replicas) {
-        auto& base_host = base_replica.host;
-        auto view_ep_opt = db::view::get_view_natural_endpoint(
-            base_host,
-            base_erm,
-            view_erm,
-            *ars_ptr,
-            base_token,
-            view_token,
-            use_legacy_self_pairing,
-            use_tablets_basic_rack_aware_view_pairing,
-            cf_stats).natural_endpoint;
-
-        // view pair must be found
-        if (!view_ep_opt) {
-            BOOST_FAIL(format("Could not pair base_host={} base_token={} view_token={}", base_host, base_token, view_token));
-        }
-        BOOST_REQUIRE(view_ep_opt);
-        auto& view_ep = *view_ep_opt;
-
-        // Assert pairing uniqueness
-        auto [base_it, inserted_base_pair] = base_to_view_pairing.emplace(base_host, view_ep);
-        BOOST_REQUIRE(inserted_base_pair);
-        auto [view_it, inserted_view_pair] = view_to_base_pairing.emplace(view_ep, base_host);
-        BOOST_REQUIRE(inserted_view_pair);
-
-        auto& base_location = topology.find_node(base_host)->dc_rack();
-        auto& view_location = topology.find_node(view_ep)->dc_rack();
-
-        // Assert dc- and rack- aware pairing
-        BOOST_REQUIRE_EQUAL(base_location.dc, view_location.dc);
-
-        if (base_location.rack == view_location.rack) {
-            same_rack_pairs[base_location.dc]++;
-        } else {
-            cross_rack_pairs[base_location.dc]++;
-        }
-    }
-    for (const auto& [dc, rf_opt] : options) {
-        auto rf = locator::get_replication_factor(rf_opt);
-        BOOST_REQUIRE_EQUAL(same_rack_pairs[dc] + cross_rack_pairs[dc], rf);
-    }
-}
-
-SEASTAR_THREAD_TEST_CASE(tablets_complex_rack_aware_view_pairing_test_rf_lt_racks) {
-    test_complex_rack_aware_view_pairing_test(false);
-}
-
-SEASTAR_THREAD_TEST_CASE(tablets_complex_rack_aware_view_pairing_test_rf_gt_racks) {
-    test_complex_rack_aware_view_pairing_test(true);
-}
-
 SEASTAR_THREAD_TEST_CASE(test_rack_diff) {
    BOOST_REQUIRE(diff_racks({}, {}).empty());

--- a/test/boost/repair_test.cc
+++ b/test/boost/repair_test.cc
@@ -346,4 +346,60 @@ SEASTAR_TEST_CASE(repair_rows_size_considers_external_memory) {
    });
 }

+SEASTAR_TEST_CASE(test_tablet_token_range_count) {
+    {
+        // Simple case: one large range covers a smaller one
+        utils::chunked_vector<tablet_token_range> r1 = {{10, 20}};
+        utils::chunked_vector<tablet_token_range> r2 = {{0, 100}};
+        BOOST_REQUIRE(co_await count_finished_tablets(r1, r2) == 1);
+    }
+    {
+        // r2 ranges overlap and should merge to cover r1
+        // r2: [0, 50] + [40, 100] -> merges to [0, 100]
+        // r1: [10, 90] should be covered
+        utils::chunked_vector<tablet_token_range> r1 = {{10, 90}};
+        utils::chunked_vector<tablet_token_range> r2 = {{0, 50}, {40, 100}};
+        BOOST_REQUIRE(co_await count_finished_tablets(r1, r2) == 1);
+    }
+    {
+        // r2 ranges are adjacent (contiguous) and should merge
+        // r2: [0, 10] + [11, 20] -> merges to [0, 20]
+        // r1: [5, 15] should be covered
+        utils::chunked_vector<tablet_token_range> r1 = {{5, 15}};
+        utils::chunked_vector<tablet_token_range> r2 = {{0, 10}, {11, 20}};
+        BOOST_REQUIRE(co_await count_finished_tablets(r1, r2) == 1);
+    }
+    {
+        // r1 overlaps r2 but is not FULLY contained
+        // r2: [0, 10]
+        // r1: [5, 15] (ends too late), [-5, 5] (starts too early)
+        utils::chunked_vector<tablet_token_range> r1 = {{5, 15}, {-5, 5}};
+        utils::chunked_vector<tablet_token_range> r2 = {{0, 10}};
+        BOOST_REQUIRE(co_await count_finished_tablets(r1, r2) == 0);
+    }
+    {
+        // A single merged range in r2 covers multiple distinct ranges in r1
+        utils::chunked_vector<tablet_token_range> r1 = {{10, 20}, {30, 40}, {50, 60}};
+        utils::chunked_vector<tablet_token_range> r2 = {{0, 100}};
+        BOOST_REQUIRE(co_await count_finished_tablets(r1, r2) == 3);
+    }
+    {
+        // Inputs are provided out of order to exercise the internal sort
+        utils::chunked_vector<tablet_token_range> r1 = {{50, 60}, {10, 20}};
+        utils::chunked_vector<tablet_token_range> r2 = {{50, 100}, {0, 40}};
+        // r2 merges effectively to [0, 40] and [50, 100]
+        // Both r1 items are covered
+        BOOST_REQUIRE(co_await count_finished_tablets(r1, r2) == 2);
+    }
+    {
+        utils::chunked_vector<tablet_token_range> r1 = {{10, 20}};
+        utils::chunked_vector<tablet_token_range> r2_empty = {};
+        utils::chunked_vector<tablet_token_range> r1_empty = {};
+        utils::chunked_vector<tablet_token_range> r2 = {{0, 100}};
+
+        BOOST_REQUIRE(co_await count_finished_tablets(r1, r2_empty) == 0);
+        BOOST_REQUIRE(co_await count_finished_tablets(r1_empty, r2) == 0);
+    }
+}
+
 BOOST_AUTO_TEST_SUITE_END()
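Review note: the cases in `test_tablet_token_range_count` above imply that `count_finished_tablets` sorts and merges the second argument (treating overlapping and adjacent closed ranges as one) and then counts how many ranges of the first argument are fully contained in a merged range. The implementation itself is not part of this hunk, so the following is only a minimal standalone sketch of that behavior. `token_span`, `merge_spans` and `count_covered` are hypothetical names, and plain `int64_t` bounds stand in for `tablet_token_range`.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for tablet_token_range: a closed interval [first, last].
struct token_span {
    int64_t first;
    int64_t last;
};

// Sort by start, then merge spans that overlap or are adjacent
// (adjacent means next.first == current.last + 1, e.g. [0, 10] and [11, 20]).
std::vector<token_span> merge_spans(std::vector<token_span> spans) {
    std::sort(spans.begin(), spans.end(),
              [](const token_span& a, const token_span& b) { return a.first < b.first; });
    std::vector<token_span> merged;
    for (const auto& s : spans) {
        assert(s.first <= s.last);
        // The second test is only evaluated when s.first > back().last,
        // so s.first - 1 cannot overflow.
        if (!merged.empty() && (s.first <= merged.back().last || s.first - 1 == merged.back().last)) {
            merged.back().last = std::max(merged.back().last, s.last);
        } else {
            merged.push_back(s);
        }
    }
    return merged;
}

// Count the spans in `wanted` that lie entirely inside the union of `done`.
std::size_t count_covered(const std::vector<token_span>& wanted, const std::vector<token_span>& done) {
    const auto merged = merge_spans(done);
    std::size_t covered = 0;
    for (const auto& w : wanted) {
        // Merged spans are disjoint and sorted, so they are also sorted by `last`.
        // Find the first merged span that ends at or after w.last.
        auto it = std::lower_bound(merged.begin(), merged.end(), w.last,
                                   [](const token_span& m, int64_t last) { return m.last < last; });
        if (it != merged.end() && it->first <= w.first) {
            ++covered;
        }
    }
    return covered;
}
```

The sketch takes its inputs in any order, as the "random order" test case does, because only the second argument needs sorting and each lookup is a binary search over the merged spans.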
--- a/test/boost/sstable_tablet_streaming.cc
+++ b/test/boost/sstable_tablet_streaming.cc
@@ -0,0 +1,367 @@
+/*
+ * Copyright (C) 2025-present ScyllaDB
+ */
+
+/*
+ * SPDX-License-Identifier: LicenseRef-ScyllaDB-Source-Available-1.0
+ */
+
+#undef SEASTAR_TESTING_MAIN
+#include <seastar/testing/test_case.hh>
+#include "dht/token.hh"
+#include "sstable_test.hh"
+#include "sstables_loader.hh"
+#include "test/lib/sstable_test_env.hh"
+
+BOOST_AUTO_TEST_SUITE(sstable_tablet_streaming_test)
+
+using namespace sstables;
+
+std::vector<shared_sstable> make_sstables_with_ranges(test_env& env, const std::vector<std::pair<int64_t, int64_t>>& ranges) {
+    std::vector<shared_sstable> ssts;
+    for (const auto& [first, last] : ranges) {
+        auto sst = env.make_sstable(uncompressed_schema(), uncompressed_dir());
+        test(sst).set_first_and_last_keys(dht::decorated_key(dht::token{first}, partition_key(std::vector<bytes>{"1"})),
+                                          dht::decorated_key(dht::token{last}, partition_key(std::vector<bytes>{"1"})));
+        ssts.push_back(std::move(sst));
+    }
+    // By sorting SSTables by their primary key, we enable runs to be
+    // streamed incrementally. Overlapping fragments can be deduplicated,
+    // reducing the amount of data sent over the wire. Elements are
+    // popped from the back of the vector, so we sort in descending
+    // order to begin with the smaller tokens.
+    // See sstable_streamer constructor for more details.
+    std::ranges::sort(ssts, [](const shared_sstable& x, const shared_sstable& y) { return x->compare_by_first_key(*y) > 0; });
+    return ssts;
+}
+
+std::vector<dht::token_range> get_tablet_sstable_collection(auto&&... tablet_ranges) {
+    // tablet ranges are left-non-inclusive, see `tablet_map::get_token_range` for details
+    std::vector<dht::token_range> collections{dht::token_range::make({tablet_ranges.start()->value(), false}, {tablet_ranges.end()->value(), true})...};
+
+    std::sort(collections.begin(), collections.end(), [](auto const& a, auto const& b) { return a.start()->value() < b.start()->value(); });
+
+    return collections;
+}
+
+#define REQUIRE_WITH_CONTEXT(sstables, expected_size)                                                                                                          \
+    BOOST_TEST_CONTEXT("Testing with ranges: " << [&] {                                                                                                        \
+        std::stringstream ss;                                                                                                                                  \
+        for (const auto& sst : (sstables)) {                                                                                                                   \
+            ss << dht::token_range(sst->get_first_decorated_key().token(), sst->get_last_decorated_key().token()) << ", ";                                     \
+        }                                                                                                                                                      \
+        return ss.str();                                                                                                                                       \
+    }())                                                                                                                                                       \
+    BOOST_REQUIRE_EQUAL((sstables).size(), expected_size)
+
+SEASTAR_TEST_CASE(test_streaming_ranges_distribution) {
+    return test_env::do_with_async([](test_env& env) {
+        // 1) Exact boundary equality: SSTable == tablet
+        {
+            auto collection = get_tablet_sstable_collection(dht::token_range{dht::token{5}, dht::token{10}});
+            auto ssts = make_sstables_with_ranges(env,
+                                                  {
+                                                      {5, 10},
+                                                  });
+            auto res = get_sstables_for_tablets_for_tests(ssts, std::move(collection)).get();
+            REQUIRE_WITH_CONTEXT(res[0].sstables_fully_contained, 0);
+            REQUIRE_WITH_CONTEXT(res[0].sstables_partially_contained, 1);
+        }
+
+        // 2) Single-point overlaps at start/end
+        {
+            auto collection = get_tablet_sstable_collection(dht::token_range{dht::token{5}, dht::token{10}});
+            auto ssts = make_sstables_with_ranges(env,
+                                                  {
+                                                      {4, 5},   // touches start, non-inclusive, skip
+                                                      {10, 11}, // touches end
+                                                  });
+            auto res = get_sstables_for_tablets_for_tests(ssts, std::move(collection)).get();
+            REQUIRE_WITH_CONTEXT(res[0].sstables_fully_contained, 0);
+            REQUIRE_WITH_CONTEXT(res[0].sstables_partially_contained, 1);
+        }
+
+        // 3) Tablet fully inside a large SSTable
+        {
+            auto collection = get_tablet_sstable_collection(dht::token_range{dht::token{5}, dht::token{10}});
+            auto ssts = make_sstables_with_ranges(env,
+                                                  {
+                                                      {0, 20},
+                                                  });
+            auto res = get_sstables_for_tablets_for_tests(ssts, std::move(collection)).get();
+            REQUIRE_WITH_CONTEXT(res[0].sstables_fully_contained, 0);
+            REQUIRE_WITH_CONTEXT(res[0].sstables_partially_contained, 1);
+        }
+
+        // 4) Multiple SSTables fully contained in tablet
+        {
+            auto collection = get_tablet_sstable_collection(dht::token_range{dht::token{5}, dht::token{10}});
+            auto ssts = make_sstables_with_ranges(env,
+                                                  {
+                                                      {6, 7},
+                                                      {7, 8},
+                                                      {8, 9},
+                                                  });
+            auto res = get_sstables_for_tablets_for_tests(ssts, std::move(collection)).get();
+            REQUIRE_WITH_CONTEXT(res[0].sstables_fully_contained, 3);
+            REQUIRE_WITH_CONTEXT(res[0].sstables_partially_contained, 0);
+        }
+
+        // 5) Two overlapping but not fully contained SSTables
+        {
+            auto collection = get_tablet_sstable_collection(dht::token_range{dht::token{5}, dht::token{10}});
+            auto ssts = make_sstables_with_ranges(env,
+                                                  {
+                                                      {0, 6},  // overlaps at left
+                                                      {9, 15}, // overlaps at right
+                                                  });
+            auto res = get_sstables_for_tablets_for_tests(ssts, std::move(collection)).get();
+            REQUIRE_WITH_CONTEXT(res[0].sstables_fully_contained, 0);
+            REQUIRE_WITH_CONTEXT(res[0].sstables_partially_contained, 2);
+        }
+
+        // 6) Unsorted input (helper sorts) + mixed overlaps
+        {
+            auto collection = get_tablet_sstable_collection(dht::token_range{dht::token{50}, dht::token{100}});
+            // Intentionally unsorted by first token
+            auto ssts = make_sstables_with_ranges(env,
+                                                  {
+                                                      {120, 130},
+                                                      {0, 10},
+                                                      {60, 70},  // fully contained
+                                                      {40, 55},  // partial
+                                                      {95, 105}, // partial
+                                                      {80, 90},  // fully contained
+                                                  });
+            auto res = get_sstables_for_tablets_for_tests(ssts, std::move(collection)).get();
+            REQUIRE_WITH_CONTEXT(res[0].sstables_fully_contained, 2);
+            REQUIRE_WITH_CONTEXT(res[0].sstables_partially_contained, 2);
+        }
+
+        // 7) Empty SSTable list
+        {
+            auto collection = get_tablet_sstable_collection(dht::token_range{dht::token{5}, dht::token{10}});
+            std::vector<shared_sstable> ssts;
+            auto res = get_sstables_for_tablets_for_tests(ssts, std::move(collection)).get();
+            REQUIRE_WITH_CONTEXT(res[0].sstables_fully_contained, 0);
+            REQUIRE_WITH_CONTEXT(res[0].sstables_partially_contained, 0);
+        }
+
+        // 8) Tablet outside all SSTables
+        {
+            auto collection = get_tablet_sstable_collection(dht::token_range{dht::token{100}, dht::token{200}});
+            auto ssts = make_sstables_with_ranges(env,
+                                                  {
+                                                      {1, 2},
+                                                      {3, 4},
+                                                      {10, 20},
+                                                      {300, 400},
+                                                  });
+            auto res = get_sstables_for_tablets_for_tests(ssts, std::move(collection)).get();
+            REQUIRE_WITH_CONTEXT(res[0].sstables_fully_contained, 0);
+            REQUIRE_WITH_CONTEXT(res[0].sstables_partially_contained, 0);
+        }
+
+        // 9) Boundary adjacency with multiple fragments
+        {
+            auto collection = get_tablet_sstable_collection(dht::token_range{dht::token{100}, dht::token{200}});
+            auto ssts = make_sstables_with_ranges(env,
+                                                  {
+                                                      {50, 100},  // touches start -> non-inclusive, skip
+                                                      {100, 120}, // starts at start -> partially contained
+                                                      {180, 200}, // ends at end   -> fully contained
+                                                      {200, 220}, // touches end   -> partial
+                                                  });
+            auto res = get_sstables_for_tablets_for_tests(ssts, std::move(collection)).get();
+            REQUIRE_WITH_CONTEXT(res[0].sstables_fully_contained, 1);
+            REQUIRE_WITH_CONTEXT(res[0].sstables_partially_contained, 2);
+        }
+
+        // 10) Large SSTable set where early break should occur
+        {
+            auto collection = get_tablet_sstable_collection(dht::token_range{dht::token{1000}, dht::token{2000}});
+            auto ssts = make_sstables_with_ranges(env,
+                                                  {
+                                                      {100, 200},
+                                                      {300, 400},
+                                                      {900, 950},
+                                                      {1001, 1100}, // fully contained
+                                                      {1500, 1600}, // fully contained
+                                                      {2101, 2200}, // entirely after -> should trigger early break in ascending scan
+                                                      {1999, 2100}, // overlap, partially contained
+                                                      {3000, 3100},
+                                                  });
+            auto res = get_sstables_for_tablets_for_tests(ssts, std::move(collection)).get();
+            REQUIRE_WITH_CONTEXT(res[0].sstables_fully_contained, 2);
+            REQUIRE_WITH_CONTEXT(res[0].sstables_partially_contained, 1);
+        }
+
+        // 11) Example from https://github.com/scylladb/scylladb/pull/26980
+        {
+            auto collection = get_tablet_sstable_collection(dht::token_range{dht::token{4}, dht::token{5}});
+            auto ssts = make_sstables_with_ranges(env,
+                                                  {
+                                                      {0, 5},
+                                                      {0, 3},
+                                                      {2, 5},
+                                                  });
+            auto res = get_sstables_for_tablets_for_tests(ssts, std::move(collection)).get();
+            // None fully contained; two partial overlaps ({0, 3} ends before the tablet starts)
+            REQUIRE_WITH_CONTEXT(res[0].sstables_fully_contained, 0);
+            REQUIRE_WITH_CONTEXT(res[0].sstables_partially_contained, 2);
+        }
+    });
+}
+
+SEASTAR_TEST_CASE(test_streaming_ranges_distribution_in_tablets) {
+    return test_env::do_with_async([](test_env& env) {
+        {
+            auto collection = get_tablet_sstable_collection(dht::token_range{dht::token{5}, dht::token{10}}, dht::token_range{dht::token{11}, dht::token{15}});
+            auto ssts = make_sstables_with_ranges(env,
+                                                  {
+                                                      {5, 10},
+                                                  });
+            auto res = get_sstables_for_tablets_for_tests(ssts, std::move(collection)).get();
+            REQUIRE_WITH_CONTEXT(res[0].sstables_fully_contained, 0);
+            REQUIRE_WITH_CONTEXT(res[0].sstables_partially_contained, 1);
+            REQUIRE_WITH_CONTEXT(res[1].sstables_fully_contained, 0);
+            REQUIRE_WITH_CONTEXT(res[1].sstables_partially_contained, 0);
+        }
+
+        {
+            // Multiple tablets with a hole between the second and third (tokens 10 and 11 belong to no tablet)
+            auto collection = get_tablet_sstable_collection(dht::token_range{dht::token{0}, dht::token{4}},
+                                                            dht::token_range{dht::token{5}, dht::token{9}},
+                                                            dht::token_range{dht::token{12}, dht::token{15}});
+            auto ssts = make_sstables_with_ranges(env,
+                                                  {
+                                                      {0, 4},   // T.start==S.start, but non-inclusive -> partial
+                                                      {5, 9},   // same as above
+                                                      {6, 8},   // fully in second tablet
+                                                      {10, 11}, // falls in the hole, should be rejected
+                                                      {8, 13},  // overlaps second and third tablets (partial in both)
+                                                  });
+            auto res = get_sstables_for_tablets_for_tests(ssts, std::move(collection)).get();
+
+            REQUIRE_WITH_CONTEXT(res[0].sstables_fully_contained, 0);
+            REQUIRE_WITH_CONTEXT(res[0].sstables_partially_contained, 1);
+
+            REQUIRE_WITH_CONTEXT(res[1].sstables_fully_contained, 1);
+            REQUIRE_WITH_CONTEXT(res[1].sstables_partially_contained, 2);
+
+            REQUIRE_WITH_CONTEXT(res[2].sstables_fully_contained, 0);
+            REQUIRE_WITH_CONTEXT(res[2].sstables_partially_contained, 1);
+        }
+
+        {
+            // SSTables outside any tablet range
+            auto collection = get_tablet_sstable_collection(dht::token_range{dht::token{20}, dht::token{25}});
+            auto ssts = make_sstables_with_ranges(env,
+                                                  {
+                                                      {0, 5},   // before
+                                                      {30, 35}, // after
+                                                  });
+            auto res = get_sstables_for_tablets_for_tests(ssts, std::move(collection)).get();
+
+            REQUIRE_WITH_CONTEXT(res[0].sstables_fully_contained, 0);
+            REQUIRE_WITH_CONTEXT(res[0].sstables_partially_contained, 0);
+        }
+
+        {
+            // Edge case: SSTable touching tablet boundary
+            auto collection = get_tablet_sstable_collection(dht::token_range{dht::token{5}, dht::token{10}});
+            auto ssts = make_sstables_with_ranges(env,
+                                                  {
+                                                      {4, 5},   // touches start, non-inclusive, skip
+                                                      {10, 11}, // touches end
+                                                  });
+            auto res = get_sstables_for_tablets_for_tests(ssts, std::move(collection)).get();
+
+            REQUIRE_WITH_CONTEXT(res[0].sstables_fully_contained, 0);
+            REQUIRE_WITH_CONTEXT(res[0].sstables_partially_contained, 1);
+        }
+
+        {
+            // No tablets, but some SSTables
+            auto collection = get_tablet_sstable_collection();
+            auto ssts = make_sstables_with_ranges(env,
+                                                  {
+                                                      {0, 5},
+                                                      {10, 15},
+                                                  });
+            auto res = get_sstables_for_tablets_for_tests(ssts, std::move(collection)).get();
+            BOOST_REQUIRE_EQUAL(res.size(), 0); // no tablets → nothing to classify
+        }
+
+        {
+            // No SSTables, but some tablets
+            auto collection = get_tablet_sstable_collection(dht::token_range{dht::token{0}, dht::token{5}}, dht::token_range{dht::token{10}, dht::token{15}});
+            std::vector<shared_sstable> ssts; // empty
+            auto res = get_sstables_for_tablets_for_tests(ssts, std::move(collection)).get();
+
+            REQUIRE_WITH_CONTEXT(res[0].sstables_fully_contained, 0);
+            REQUIRE_WITH_CONTEXT(res[0].sstables_partially_contained, 0);
+            REQUIRE_WITH_CONTEXT(res[1].sstables_fully_contained, 0);
+            REQUIRE_WITH_CONTEXT(res[1].sstables_partially_contained, 0);
+        }
+
+        {
+            // No tablets and no SSTables
+            auto collection = get_tablet_sstable_collection();
+            std::vector<shared_sstable> ssts; // empty
+            auto res = get_sstables_for_tablets_for_tests(ssts, std::move(collection)).get();
+            BOOST_REQUIRE_EQUAL(res.size(), 0);
+        }
+        {
+            // SSTable spanning two tablets
+            auto collection = get_tablet_sstable_collection(dht::token_range{dht::token{0}, dht::token{4}}, dht::token_range{dht::token{5}, dht::token{9}});
+            auto ssts = make_sstables_with_ranges(env,
+                                                  {
+                                                      {2, 7}, // spans both tablets
+                                                  });
+            auto res = get_sstables_for_tablets_for_tests(ssts, std::move(collection)).get();
+
+            // Tablet [0,4] sees partial overlap
+            REQUIRE_WITH_CONTEXT(res[0].sstables_fully_contained, 0);
+            REQUIRE_WITH_CONTEXT(res[0].sstables_partially_contained, 1);
+
+            // Tablet [5,9] sees partial overlap
+            REQUIRE_WITH_CONTEXT(res[1].sstables_fully_contained, 0);
+            REQUIRE_WITH_CONTEXT(res[1].sstables_partially_contained, 1);
+        }
+
+        {
+            // SSTable spanning three tablets with a hole in between
+            auto collection = get_tablet_sstable_collection(dht::token_range{dht::token{0}, dht::token{3}},
+                                                            dht::token_range{dht::token{4}, dht::token{6}},
+                                                            dht::token_range{dht::token{8}, dht::token{10}});
+            auto ssts = make_sstables_with_ranges(env,
+                                                  {
+                                                      {2, 9}, // spans across tablets 1,2,3 and hole [7]
+                                                  });
+            auto res = get_sstables_for_tablets_for_tests(ssts, std::move(collection)).get();
+
+            REQUIRE_WITH_CONTEXT(res[0].sstables_partially_contained, 1);
+            REQUIRE_WITH_CONTEXT(res[1].sstables_partially_contained, 1);
+            REQUIRE_WITH_CONTEXT(res[2].sstables_partially_contained, 1);
+        }
+
+        {
+            // SSTable fully covering one tablet and partially overlapping another
+            auto collection = get_tablet_sstable_collection(dht::token_range{dht::token{0}, dht::token{5}}, dht::token_range{dht::token{6}, dht::token{10}});
+            auto ssts = make_sstables_with_ranges(env,
+                                                  {
+                                                      {0, 7}, // fully covers first tablet, partial in second
+                                                  });
+            auto res = get_sstables_for_tablets_for_tests(ssts, std::move(collection)).get();
+
+            REQUIRE_WITH_CONTEXT(res[0].sstables_fully_contained, 0);
+            REQUIRE_WITH_CONTEXT(res[0].sstables_partially_contained, 1);
+
+            REQUIRE_WITH_CONTEXT(res[1].sstables_fully_contained, 0);
+            REQUIRE_WITH_CONTEXT(res[1].sstables_partially_contained, 1);
+        }
+    });
+}
+
+BOOST_AUTO_TEST_SUITE_END()
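Review note: the expectations in `sstable_tablet_streaming.cc` follow from tablet ranges being left-exclusive, `(start, end]`, while an SSTable covers the closed token interval `[first, last]`. The rule below is inferred from the assertions in the test cases, not taken from `sstables_loader`. `containment` and `classify` are hypothetical names, and raw `int64_t` tokens stand in for `dht::token`.

```cpp
#include <cassert>
#include <cstdint>

enum class containment { none, partial, full };

// Classify a closed SSTable interval [sst_first, sst_last] against a
// left-exclusive tablet range (tablet_start, tablet_end].
constexpr containment classify(int64_t sst_first, int64_t sst_last,
                               int64_t tablet_start, int64_t tablet_end) {
    // Ends at or before the exclusive start, or begins after the inclusive end.
    if (sst_last <= tablet_start || sst_first > tablet_end) {
        return containment::none;
    }
    // Both keys fall strictly after the start and at or before the end.
    if (sst_first > tablet_start && sst_last <= tablet_end) {
        return containment::full;
    }
    return containment::partial;
}
```

Under this rule an SSTable whose first key equals the tablet start is only partially contained, which is why case 1 (`{5, 10}` against tablet 5..10) expects zero fully contained SSTables, and why `{4, 5}` is skipped while `{10, 11}` counts as partial.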