[Backport 6.1] replica/table: check memtable before discarding tombstone during read

On the read path, the compacting reader is applied only to the sstable reader. This can cause an expired tombstone from an sstable to be purged from the request before it has a chance to merge with deleted data in the memtable leading to data resurrection.

Fix this by checking the memtables before deciding to purge tombstones from the request on the read path. A tombstone will not be purged if a key exists in any of the table's memtables with a minimum live timestamp that is lower than the maximum purgeable timestamp.

Fixes #20916

`perf-simple-query` stats before and after this fix :

`build/Dev/scylla perf-simple-query --smp=1 --flush` :
```
// Before this Fix
// ---------------
94941.79 tps ( 71.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 59393 insns/op, 24029 cycles/op, 0 errors)
97551.14 tps ( 71.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 59376 insns/op, 23966 cycles/op, 0 errors)
96599.92 tps ( 71.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 59367 insns/op, 23998 cycles/op, 0 errors)
97774.91 tps ( 71.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 59370 insns/op, 23968 cycles/op, 0 errors)
97796.13 tps ( 71.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 59368 insns/op, 23947 cycles/op, 0 errors)

throughput: mean=96932.78 standard-deviation=1215.71 median=97551.14 median-absolute-deviation=842.13 maximum=97796.13 minimum=94941.79
instructions_per_op: mean=59374.78 standard-deviation=10.78 median=59369.59 median-absolute-deviation=6.36 maximum=59393.12 minimum=59367.02
cpu_cycles_per_op: mean=23981.67 standard-deviation=32.29 median=23967.76 median-absolute-deviation=16.33 maximum=24029.38 minimum=23947.19

// After this Fix
// --------------
95313.53 tps ( 71.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 59392 insns/op, 24058 cycles/op, 0 errors)
97311.48 tps ( 71.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 59375 insns/op, 24005 cycles/op, 0 errors)
98043.10 tps ( 71.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 59381 insns/op, 23941 cycles/op, 0 errors)
96750.31 tps ( 71.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 59396 insns/op, 24025 cycles/op, 0 errors)
93381.21 tps ( 71.1 allocs/op, 0.0 logallocs/op, 14.1 tasks/op, 59390 insns/op, 24097 cycles/op, 0 errors)

throughput: mean=96159.93 standard-deviation=1847.88 median=96750.31 median-absolute-deviation=1151.55 maximum=98043.10 minimum=93381.21
instructions_per_op: mean=59386.60 standard-deviation=8.78 median=59389.55 median-absolute-deviation=6.02 maximum=59396.40 minimum=59374.73
cpu_cycles_per_op: mean=24025.13 standard-deviation=58.39 median=24025.17 median-absolute-deviation=32.67 maximum=24096.66 minimum=23941.22
```

This PR fixes a regression introduced in ce96b472d3 and should be backported to older versions.

Closes scylladb/scylladb#20985

* github.com:scylladb/scylladb:
  topology-custom: add test to verify tombstone gc in read path
  replica/table: check memtable before discarding tombstone during read
  compaction_group: track maximum timestamp across all sstables

(cherry picked from commit 519e167611)

Backported from #20985 to 6.1.

Signed-off-by: Lakshmi Narayanan Sreethar <lakshmi.sreethar@scylladb.com>

Closes scylladb/scylladb#21250
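The purge rule described in the commit message can be illustrated with a minimal, standalone sketch. This is not the code from the patch: `memtable_sketch`, `min_live_timestamp_by_key`, and `may_purge_tombstone` are hypothetical names invented for illustration, and the real implementation works on ScyllaDB's own memtable and key types. It only encodes the stated condition: a tombstone is kept if any memtable holds the key with a minimum live timestamp lower than the maximum purgeable timestamp.

```cpp
#include <cstdint>
#include <map>
#include <optional>
#include <string>
#include <vector>

// Hypothetical stand-in for a write timestamp; not ScyllaDB's api::timestamp_type.
using timestamp_type = std::int64_t;

// Hypothetical, simplified memtable: for each partition key it records the
// smallest timestamp of any live data it holds for that key.
struct memtable_sketch {
    std::map<std::string, timestamp_type> min_live_timestamp_by_key;

    // Returns the minimum live timestamp for `key`, or nullopt if the key
    // is not present in this memtable.
    std::optional<timestamp_type> min_live_timestamp(const std::string& key) const {
        auto it = min_live_timestamp_by_key.find(key);
        if (it == min_live_timestamp_by_key.end()) {
            return std::nullopt;
        }
        return it->second;
    }
};

// Returns false (keep the tombstone) if any memtable contains `key` with a
// minimum live timestamp lower than `max_purgeable`; returns true otherwise.
bool may_purge_tombstone(const std::string& key,
                         timestamp_type max_purgeable,
                         const std::vector<memtable_sketch>& memtables) {
    for (const auto& mt : memtables) {
        if (auto ts = mt.min_live_timestamp(key); ts && *ts < max_purgeable) {
            return false;
        }
    }
    return true;
}
```

In this sketch the check is a per-key lookup in each memtable, so a read of a key that is absent from every memtable takes the same purge decision as before the fix. That is consistent with the benchmark figures above, which show no significant change, though the sketch itself says nothing about the cost of the real lookup.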
SCYLLA-VERSION-GEN: correct the logic for skipping SCYLLA-*-FILE
2024-10-25 11:13:54 +03:00 · 2024-10-25 11:09:51 +03:00 · 2024-10-25 11:06:38 +03:00 · 2024-10-23 11:41:36 +02:00 · 2024-10-23 10:02:13 +03:00 · 2024-10-22 13:17:00 +03:00
238 changed files with 4182 additions and 1658 deletions
--- a/.gitignore
+++ b/.gitignore
@@ -19,7 +19,7 @@ CMakeLists.txt.user
 *.egg-info
 __pycache__CMakeLists.txt.user
 .gdbinit
-resources
+/resources
 .pytest_cache
 /expressions.tokens
 tags
--- a/4
+++ b/4
@@ -78,7 +78,7 @@ fi

 # Default scylla product/version tags
 PRODUCT=scylla
-VERSION=6.1.0-dev
+VERSION=6.1.3

 if test -f version
 then
@@ -104,7 +104,7 @@ else
 fi

 if [ -f "$OUTPUT_DIR/SCYLLA-RELEASE-FILE" ]; then
-	GIT_COMMIT_FILE=$(cat "$OUTPUT_DIR/SCYLLA-RELEASE-FILE" |cut -d . -f 3)
+	GIT_COMMIT_FILE=$(cat "$OUTPUT_DIR/SCYLLA-RELEASE-FILE" | rev | cut -d . -f 1 | rev)
 	if [ "$GIT_COMMIT" = "$GIT_COMMIT_FILE" ]; then
 		exit 0
 	fi
--- a/alternator/auth.cc
+++ b/alternator/auth.cc
@@ -19,6 +19,7 @@
 #include "alternator/executor.hh"
 #include "cql3/selection/selection.hh"
 #include "cql3/result_set.hh"
+#include "types/types.hh"
 #include <seastar/core/coroutine.hh>

 namespace alternator {
@@ -31,11 +32,12 @@ future<std::string> get_key_from_roles(service::storage_proxy& proxy, auth::serv
    dht::partition_range_vector partition_ranges{dht::partition_range(dht::decorate_key(*schema, pk))};
    std::vector<query::clustering_range> bounds{query::clustering_range::make_open_ended_both_sides()};
    const column_definition* salted_hash_col = schema->get_column_definition(bytes("salted_hash"));
-    if (!salted_hash_col) {
+    const column_definition* can_login_col = schema->get_column_definition(bytes("can_login"));
+    if (!salted_hash_col || !can_login_col) {
        co_await coroutine::return_exception(api_error::unrecognized_client(format("Credentials cannot be fetched for: {}", username)));
    }
-    auto selection = cql3::selection::selection::for_columns(schema, {salted_hash_col});
-    auto partition_slice = query::partition_slice(std::move(bounds), {}, query::column_id_vector{salted_hash_col->id}, selection->get_query_options());
+    auto selection = cql3::selection::selection::for_columns(schema, {salted_hash_col, can_login_col});
+    auto partition_slice = query::partition_slice(std::move(bounds), {}, query::column_id_vector{salted_hash_col->id, can_login_col->id}, selection->get_query_options());
    auto command = ::make_lw_shared<query::read_command>(schema->id(), schema->version(), partition_slice,
            proxy.get_max_result_size(partition_slice), query::tombstone_limit(proxy.get_tombstone_limit()));
    auto cl = auth::password_authenticator::consistency_for_user(username);
@@ -51,7 +53,14 @@ future<std::string> get_key_from_roles(service::storage_proxy& proxy, auth::serv
    if (result_set->empty()) {
        co_await coroutine::return_exception(api_error::unrecognized_client(format("User not found: {}", username)));
    }
-    const managed_bytes_opt& salted_hash = result_set->rows().front().front(); // We only asked for 1 row and 1 column
+    const auto& result = result_set->rows().front();
+    bool can_login = result[1] && value_cast<bool>(boolean_type->deserialize(*result[1]));
+    if (!can_login) {
+        // This is a valid role name, but has "login=False" so should not be
+        // usable for authentication (see #19735).
+        co_await coroutine::return_exception(api_error::unrecognized_client(format("Role {} has login=false so cannot be used for login", username)));
+    }
+    const managed_bytes_opt& salted_hash = result.front();
    if (!salted_hash) {
        co_await coroutine::return_exception(api_error::unrecognized_client(format("No password found for user: {}", username)));
    }
--- a/alternator/executor.cc
+++ b/alternator/executor.cc
@@ -9,6 +9,7 @@
 #include <fmt/ranges.h>
 #include <seastar/core/sleep.hh>
 #include "alternator/executor.hh"
+#include "cdc/log.hh"
 #include "db/config.hh"
 #include "log.hh"
 #include "schema/schema_builder.hh"
@@ -4439,8 +4440,10 @@ future<executor::request_return_type> executor::list_tables(client_state& client

    auto tables = _proxy.data_dictionary().get_tables(); // hold on to temporary, table_names isn't a container, it's a view
    auto table_names = tables
-            | boost::adaptors::filtered([] (data_dictionary::table t) {
-                        return t.schema()->ks_name().find(KEYSPACE_NAME_PREFIX) == 0 && !t.schema()->is_view();
+            | boost::adaptors::filtered([this] (data_dictionary::table t) {
+                        return t.schema()->ks_name().find(KEYSPACE_NAME_PREFIX) == 0 &&
+                            !t.schema()->is_view() &&
+                            !cdc::is_log_for_some_table(_proxy.local_db(), t.schema()->ks_name(), t.schema()->cf_name());
                    })
            | boost::adaptors::transformed([] (data_dictionary::table t) {
                        return t.schema()->cf_name();
--- a/alternator/server.cc
+++ b/alternator/server.cc
@@ -211,7 +211,10 @@ protected:
        sstring local_dc = topology.get_datacenter();
        std::unordered_set<gms::inet_address> local_dc_nodes = topology.get_datacenter_endpoints().at(local_dc);
        for (auto& ip : local_dc_nodes) {
-            if (_gossiper.is_alive(ip)) {
+            // Note that it's not enough for the node to be is_alive() - a
+            // node joining the cluster is also "alive" but not responsive to
+            // requests. We need the node to be in normal state. See #19694.
+            if (_gossiper.is_normal(ip)) {
                // Use the gossiped broadcast_rpc_address if available instead
                // of the internal IP address "ip". See discussion in #18711.
                rjson::push_back(results, rjson::from_string(_gossiper.get_rpc_address(ip)));
--- a/alternator/ttl.cc
+++ b/alternator/ttl.cc
@@ -26,6 +26,7 @@
 #include "log.hh"
 #include "gc_clock.hh"
 #include "replica/database.hh"
+#include "service/client_state.hh"
 #include "service_permit.hh"
 #include "timestamp.hh"
 #include "service/storage_proxy.hh"
@@ -312,7 +313,7 @@ static size_t random_offset(size_t min, size_t max) {
 // this range's primary node is down. For this we need to return not just
 // a list of this node's secondary ranges - but also the primary owner of
 // each of those ranges.
-static std::vector<std::pair<dht::token_range, gms::inet_address>> get_secondary_ranges(
+static future<std::vector<std::pair<dht::token_range, gms::inet_address>>> get_secondary_ranges(
        const locator::effective_replication_map_ptr& erm,
        gms::inet_address ep) {
    const auto& tm = *erm->get_token_metadata_ptr();
@@ -323,6 +324,7 @@ static std::vector<std::pair<dht::token_range, gms::inet_address>> get_secondary
    }
    auto prev_tok = sorted_tokens.back();
    for (const auto& tok : sorted_tokens) {
+        co_await coroutine::maybe_yield();
        inet_address_vector_replica_set eps = erm->get_natural_endpoints(tok);
        if (eps.size() <= 1 || eps[1] != ep) {
            prev_tok = tok;
@@ -350,7 +352,7 @@ static std::vector<std::pair<dht::token_range, gms::inet_address>> get_secondary
        }
        prev_tok = tok;
    }
-    return ret;
+    co_return ret;
 }


@@ -386,63 +388,63 @@ static std::vector<std::pair<dht::token_range, gms::inet_address>> get_secondary
 //
 // FIXME: Check if this algorithm is safe with tablet migration.
 // https://github.com/scylladb/scylladb/issues/16567
-enum primary_or_secondary_t {primary, secondary};
-template<primary_or_secondary_t primary_or_secondary>
-class token_ranges_owned_by_this_shard {
-    // ranges_holder_primary holds just the primary ranges themselves
-    class ranges_holder_primary {
-        const dht::token_range_vector _token_ranges;
-     public:
-        ranges_holder_primary(const locator::vnode_effective_replication_map_ptr& erm, gms::gossiper& g, gms::inet_address ep)
-            : _token_ranges(erm->get_primary_ranges(ep)) {}
-        std::size_t size() const { return _token_ranges.size(); }
-        const dht::token_range& operator[](std::size_t i) const {
-            return _token_ranges[i];
-        }
-        bool should_skip(std::size_t i) const {
-            return false;
-        }
-    };
-    // ranges_holder<secondary> holds the secondary token ranges plus each
-    // range's primary owner, needed to implement should_skip().
-    class ranges_holder_secondary {
-        std::vector<std::pair<dht::token_range, gms::inet_address>> _token_ranges;
-        gms::gossiper& _gossiper;
-     public:
-        ranges_holder_secondary(const locator::effective_replication_map_ptr& erm, gms::gossiper& g, gms::inet_address ep)
-            : _token_ranges(get_secondary_ranges(erm, ep))
-            , _gossiper(g) {}
-        std::size_t size() const { return _token_ranges.size(); }
-        const dht::token_range& operator[](std::size_t i) const {
-            return _token_ranges[i].first;
-        }
-        // range i should be skipped if its primary owner is alive.
-        bool should_skip(std::size_t i) const {
-            return _gossiper.is_alive(_token_ranges[i].second);
-        }
-    };

+// ranges_holder_primary holds just the primary ranges themselves
+class ranges_holder_primary {
+    dht::token_range_vector _token_ranges;
+public:
+    explicit ranges_holder_primary(dht::token_range_vector token_ranges) : _token_ranges(std::move(token_ranges)) {}
+    static future<ranges_holder_primary> make(const locator::vnode_effective_replication_map_ptr& erm, gms::inet_address ep) {
+        co_return ranges_holder_primary(co_await erm->get_primary_ranges(ep));
+    }
+    std::size_t size() const { return _token_ranges.size(); }
+    const dht::token_range& operator[](std::size_t i) const {
+        return _token_ranges[i];
+    }
+    bool should_skip(std::size_t i) const {
+        return false;
+    }
+};
+// ranges_holder<secondary> holds the secondary token ranges plus each
+// range's primary owner, needed to implement should_skip().
+class ranges_holder_secondary {
+    std::vector<std::pair<dht::token_range, gms::inet_address>> _token_ranges;
+    const gms::gossiper& _gossiper;
+public:
+    explicit ranges_holder_secondary(std::vector<std::pair<dht::token_range, gms::inet_address>> token_ranges, const gms::gossiper& g)
+        : _token_ranges(std::move(token_ranges))
+        , _gossiper(g) {}
+    static future<ranges_holder_secondary> make(const locator::effective_replication_map_ptr& erm, gms::inet_address ep, const gms::gossiper& g) {
+        co_return ranges_holder_secondary(co_await get_secondary_ranges(erm, ep), g);
+    }
+    std::size_t size() const { return _token_ranges.size(); }
+    const dht::token_range& operator[](std::size_t i) const {
+        return _token_ranges[i].first;
+    }
+    // range i should be skipped if its primary owner is alive.
+    bool should_skip(std::size_t i) const {
+        return _gossiper.is_alive(_token_ranges[i].second);
+    }
+};
+
+template<class primary_or_secondary_t>
+class token_ranges_owned_by_this_shard {
    schema_ptr _s;
    locator::effective_replication_map_ptr _erm;
    // _token_ranges will contain a list of token ranges owned by this node.
    // We'll further need to split each such range to the pieces owned by
    // the current shard, using _intersecter.
-    using ranges_holder = std::conditional_t<
-            primary_or_secondary == primary_or_secondary_t::primary,
-            ranges_holder_primary,
-            ranges_holder_secondary>;
-    const ranges_holder _token_ranges;
+    const primary_or_secondary_t _token_ranges;
    // NOTICE: _range_idx is used modulo _token_ranges size when accessing
    // the data to ensure that it doesn't go out of bounds
    size_t _range_idx;
    size_t _end_idx;
    std::optional<dht::selective_token_range_sharder> _intersecter;
 public:
-    token_ranges_owned_by_this_shard(replica::database& db, gms::gossiper& g, schema_ptr s)
+    token_ranges_owned_by_this_shard(schema_ptr s, primary_or_secondary_t token_ranges)
        :  _s(s)
        , _erm(s->table().get_effective_replication_map())
-        , _token_ranges(db.find_keyspace(s->ks_name()).get_vnode_effective_replication_map(),
-                g, _erm->get_topology().my_address())
+        , _token_ranges(std::move(token_ranges))
        , _range_idx(random_offset(0, _token_ranges.size() - 1))
        , _end_idx(_range_idx + _token_ranges.size())
    {
@@ -498,6 +500,7 @@ struct scan_ranges_context {
    bytes column_name;
    std::optional<std::string> member;

+    service::client_state internal_client_state;
    ::shared_ptr<cql3::selection::selection> selection;
    std::unique_ptr<service::query_state> query_state_ptr;
    std::unique_ptr<cql3::query_options> query_options;
@@ -507,6 +510,7 @@ struct scan_ranges_context {
        : s(s)
        , column_name(column_name)
        , member(member)
+        , internal_client_state(service::client_state::internal_tag())
    {
        // FIXME: don't read the entire items - read only parts of it.
        // We must read the key columns (to be able to delete) and also
@@ -525,10 +529,9 @@ struct scan_ranges_context {
        std::vector<query::clustering_range> ck_bounds{query::clustering_range::make_open_ended_both_sides()};
        auto partition_slice = query::partition_slice(std::move(ck_bounds), {}, std::move(regular_columns), opts);
        command = ::make_lw_shared<query::read_command>(s->id(), s->version(), partition_slice, proxy.get_max_result_size(partition_slice), query::tombstone_limit(proxy.get_tombstone_limit()));
-        executor::client_state client_state{executor::client_state::internal_tag()};
        tracing::trace_state_ptr trace_state;
        // NOTICE: empty_service_permit is used because the TTL service has fixed parallelism
-        query_state_ptr = std::make_unique<service::query_state>(client_state, trace_state, empty_service_permit());
+        query_state_ptr = std::make_unique<service::query_state>(internal_client_state, trace_state, empty_service_permit());
        // FIXME: What should we do on multi-DC? Will we run the expiration on the same ranges on all
        // DCs or only once for each range? If the latter, we need to change the CLs in the
        // scanner and deleter.
@@ -724,7 +727,9 @@ static future<bool> scan_table(
    expiration_stats.scan_table++;
    // FIXME: need to pace the scan, not do it all at once.
    scan_ranges_context scan_ctx{s, proxy, std::move(column_name), std::move(member)};
-    token_ranges_owned_by_this_shard<primary> my_ranges(db.real_database(), gossiper, s);
+    auto erm = db.real_database().find_keyspace(s->ks_name()).get_vnode_effective_replication_map();
+    auto my_address = erm->get_topology().my_address();
+    token_ranges_owned_by_this_shard my_ranges(s, co_await ranges_holder_primary::make(erm, my_address));
    while (std::optional<dht::partition_range> range = my_ranges.next_partition_range()) {
        // Note that because of issue #9167 we need to run a separate
        // query on each partition range, and can't pass several of
@@ -744,7 +749,7 @@ static future<bool> scan_table(
    // by tasking another node to take over scanning of the dead node's primary
    // ranges. What we do here is that this node will also check expiration
    // on its *secondary* ranges - but only those whose primary owner is down.
-    token_ranges_owned_by_this_shard<secondary> my_secondary_ranges(db.real_database(), gossiper, s);
+    token_ranges_owned_by_this_shard my_secondary_ranges(s, co_await ranges_holder_secondary::make(erm, my_address, gossiper));
    while (std::optional<dht::partition_range> range = my_secondary_ranges.next_partition_range()) {
        expiration_stats.secondary_ranges_scanned++;
        dht::partition_range_vector partition_ranges;
--- a/api/api-doc/storage_service.json
+++ b/api/api-doc/storage_service.json
@@ -1891,6 +1891,14 @@
                     "allowMultiple":false,
                     "type":"string",
                     "paramType":"query"
+                  },
+                  {
+                     "name":"force",
+                     "description":"Enforce the source_dc option, even if it unsafe to use for rebuild",
+                     "required":false,
+                     "allowMultiple":false,
+                     "type":"boolean",
+                     "paramType":"query"
                  }
               ]
            }
--- a/api/api-doc/system.json
+++ b/api/api-doc/system.json
@@ -194,6 +194,21 @@
               "parameters":[]
            }
         ]
+      },
+      {
+         "path":"/system/highest_supported_sstable_version",
+         "operations":[
+            {
+               "method":"GET",
+               "summary":"Get highest supported sstable version",
+               "type":"string",
+               "nickname":"get_highest_supported_sstable_version",
+               "produces":[
+                  "application/json"
+               ],
+               "parameters":[]
+            }
+         ]
      }
   ]
 }
--- a/api/storage_service.cc
+++ b/api/storage_service.cc
@@ -54,6 +54,7 @@
 #include "locator/abstract_replication_strategy.hh"
 #include "sstables_loader.hh"
 #include "db/view/view_builder.hh"
+#include "utils/user_provided_param.hh"

 using namespace seastar::httpd;
 using namespace std::chrono_literals;
@@ -1096,7 +1097,16 @@ void set_storage_service(http_context& ctx, routes& r, sharded<service::storage_
    });

    ss::rebuild.set(r, [&ss](std::unique_ptr<http::request> req) {
-        auto source_dc = req->get_query_param("source_dc");
+        utils::optional_param source_dc;
+        if (auto source_dc_str = req->get_query_param("source_dc"); !source_dc_str.empty()) {
+            source_dc.emplace(std::move(source_dc_str)).set_user_provided();
+        }
+        if (auto force_str = req->get_query_param("force"); !force_str.empty() && service::loosen_constraints(validate_bool(force_str))) {
+            if (!source_dc) {
+                throw bad_param_exception("The `source_dc` option must be provided for using the `force` option");
+            }
+            source_dc.set_force();
+        }
        apilog.info("rebuild: source_dc={}", source_dc);
        return ss.local().rebuild(std::move(source_dc)).then([] {
            return make_ready_future<json::json_return_type>(json_void());
--- a/api/system.cc
+++ b/api/system.cc
@@ -10,6 +10,7 @@
 #include "api/api-doc/system.json.hh"
 #include "api/api-doc/metrics.json.hh"
 #include "replica/database.hh"
+#include "sstables/sstables_manager.hh"

 #include <rapidjson/document.h>
 #include <seastar/core/reactor.hh>
@@ -182,6 +183,11 @@ void set_system(http_context& ctx, routes& r) {
        apilog.info("Profile dumped to {}", profile_dest);
        return make_ready_future<json::json_return_type>(json::json_return_type(json::json_void()));
    }) ;
+
+    hs::get_highest_supported_sstable_version.set(r, [&ctx] (const_req req) {
+        auto& table = ctx.db.local().find_column_family("system", "local");
+        return seastar::to_sstring(table.get_sstables_manager().get_highest_supported_format());
+    });
 }

 }
--- a/auth/certificate_authenticator.cc
+++ b/auth/certificate_authenticator.cc
@@ -76,7 +76,7 @@ auth::certificate_authenticator::certificate_authenticator(cql3::query_processor
                    continue;
                } catch (std::out_of_range&) {
                    // just fallthrough
-                } catch (std::regex_error&) {
+                } catch (boost::regex_error&) {
                    std::throw_with_nested(std::invalid_argument(fmt::format("Invalid query expression: {}", map.at(cfg_query_attr))));
                }
            }
--- a/auth/common.cc
+++ b/auth/common.cc
@@ -71,7 +71,7 @@ static future<> create_legacy_metadata_table_if_missing_impl(
    assert(this_shard_id() == 0); // once_among_shards makes sure a function is executed on shard 0 only

    auto db = qp.db();
-    auto parsed_statement = cql3::query_processor::parse_statement(cql);
+    auto parsed_statement = cql3::query_processor::parse_statement(cql, cql3::dialect{});
    auto& parsed_cf_statement = static_cast<cql3::statements::raw::cf_statement&>(*parsed_statement);

    parsed_cf_statement.prepare_keyspace(meta::legacy::AUTH_KS);
@@ -121,7 +121,7 @@ static future<> announce_mutations_with_guard(
        ::service::raft_group0_client& group0_client,
        std::vector<canonical_mutation> muts,
        ::service::group0_guard group0_guard,
-        seastar::abort_source* as,
+        seastar::abort_source& as,
        std::optional<::service::raft_timeout> timeout) {
    auto group0_cmd = group0_client.prepare_command(
        ::service::write_mutations{
@@ -137,7 +137,7 @@ future<> announce_mutations_with_batching(
        ::service::raft_group0_client& group0_client,
        start_operation_func_t start_operation_func,
        std::function<::service::mutations_generator(api::timestamp_type t)> gen,
-        seastar::abort_source* as,
+        seastar::abort_source& as,
        std::optional<::service::raft_timeout> timeout) {
    // account for command's overhead, it's better to use smaller threshold than constantly bounce off the limit
    size_t memory_threshold = group0_client.max_command_size() * 0.75;
@@ -188,7 +188,7 @@ future<> announce_mutations(
        ::service::raft_group0_client& group0_client,
        const sstring query_string,
        std::vector<data_value_or_unset> values,
-        seastar::abort_source* as,
+        seastar::abort_source& as,
        std::optional<::service::raft_timeout> timeout) {
    auto group0_guard = co_await group0_client.start_operation(as, timeout);
    auto timestamp = group0_guard.write_timestamp();
--- a/auth/common.hh
+++ b/auth/common.hh
@@ -80,7 +80,7 @@ future<> create_legacy_metadata_table_if_missing(
 // Execute update query via group0 mechanism, mutations will be applied on all nodes.
 // Use this function when need to perform read before write on a single guard or if
 // you have more than one mutation and potentially exceed single command size limit.
-using start_operation_func_t = std::function<future<::service::group0_guard>(abort_source*)>;
+using start_operation_func_t = std::function<future<::service::group0_guard>(abort_source&)>;
 future<> announce_mutations_with_batching(
        ::service::raft_group0_client& group0_client,
        // since we can operate also in topology coordinator context where we need stronger
@@ -88,7 +88,7 @@ future<> announce_mutations_with_batching(
        // function here
        start_operation_func_t start_operation_func,
        std::function<::service::mutations_generator(api::timestamp_type t)> gen,
-        seastar::abort_source* as,
+        seastar::abort_source& as,
        std::optional<::service::raft_timeout> timeout);

 // Execute update query via group0 mechanism, mutations will be applied on all nodes.
@@ -97,7 +97,7 @@ future<> announce_mutations(
        ::service::raft_group0_client& group0_client,
        const sstring query_string,
        std::vector<data_value_or_unset> values,
-        seastar::abort_source* as,
+        seastar::abort_source& as,
        std::optional<::service::raft_timeout> timeout);

 // Appends mutations to a collector, they will be applied later on all nodes via group0 mechanism.
--- a/auth/password_authenticator.cc
+++ b/auth/password_authenticator.cc
@@ -136,7 +136,7 @@ future<> password_authenticator::create_default_if_missing() {
        plogger.info("Created default superuser authentication record.");
    } else {
        co_await announce_mutations(_qp, _group0_client, query,
-            {salted_pwd, _superuser}, &_as, ::service::raft_timeout{});
+            {salted_pwd, _superuser}, _as, ::service::raft_timeout{});
        plogger.info("Created default superuser authentication record.");
    }
 }
--- a/auth/service.cc
+++ b/auth/service.cc
@@ -681,7 +681,7 @@ future<> migrate_to_auth_v2(db::system_keyspace& sys_ks, ::service::raft_group0_
    co_await announce_mutations_with_batching(g0,
            start_operation_func,
            std::move(gen),
-            &as,
+            as,
            std::nullopt);
 }

--- a/auth/standard_role_manager.cc
+++ b/auth/standard_role_manager.cc
@@ -192,7 +192,7 @@ future<> standard_role_manager::create_default_role_if_missing() {
                    {_superuser},
                    cql3::query_processor::cache_internal::no).discard_result();
        } else {
-            co_await announce_mutations(_qp, _group0_client, query, {_superuser}, &_as, ::service::raft_timeout{});
+            co_await announce_mutations(_qp, _group0_client, query, {_superuser}, _as, ::service::raft_timeout{});
        }
        log.info("Created default superuser role '{}'.", _superuser);
    } catch(const exceptions::unavailable_exception& e) {
--- a/compaction/compaction.cc
+++ b/compaction/compaction.cc
@@ -172,7 +172,8 @@ static api::timestamp_type get_max_purgeable_timestamp(const table_state& table_
 }

 static std::vector<shared_sstable> get_uncompacting_sstables(const table_state& table_s, std::vector<shared_sstable> sstables) {
-    auto all_sstables = boost::copy_range<std::vector<shared_sstable>>(*table_s.main_sstable_set().all());
+    auto sstable_set = table_s.sstable_set_for_tombstone_gc();
+    auto all_sstables = boost::copy_range<std::vector<shared_sstable>>(*sstable_set->all());
    auto& compacted_undeleted = table_s.compacted_undeleted_sstables();
    all_sstables.insert(all_sstables.end(), compacted_undeleted.begin(), compacted_undeleted.end());
    boost::sort(all_sstables, [] (const shared_sstable& x, const shared_sstable& y) {
--- a/compaction/compaction_manager.cc
+++ b/compaction/compaction_manager.cc
@@ -387,11 +387,26 @@ future<sstables::compaction_result> compaction_task_executor::compact_sstables_a

    co_return res;
 }
+
+future<sstables::sstable_set> compaction_task_executor::sstable_set_for_tombstone_gc(table_state& t) {
+    auto compound_set = t.sstable_set_for_tombstone_gc();
+    // Compound set will be linearized into a single set, since compaction might add or remove sstables
+    // to it for incremental compaction to work.
+    auto new_set = sstables::make_partitioned_sstable_set(t.schema(), false);
+    co_await compound_set->for_each_sstable_gently([&] (const sstables::shared_sstable& sst) {
+        auto inserted = new_set.insert(sst);
+        if (!inserted) {
+            on_internal_error(cmlog, format("Unable to insert SSTable {} into set used for tombstone GC", sst->get_filename()));
+        }
+    });
+    co_return std::move(new_set);
+}
+
 future<sstables::compaction_result> compaction_task_executor::compact_sstables(sstables::compaction_descriptor descriptor, sstables::compaction_data& cdata, on_replacement& on_replace, compaction_manager::can_purge_tombstones can_purge,
                                                                               sstables::offstrategy offstrategy) {
    table_state& t = *_compacting_table;
    if (can_purge) {
-        descriptor.enable_garbage_collection(t.main_sstable_set());
+        descriptor.enable_garbage_collection(co_await sstable_set_for_tombstone_gc(t));
    }
    descriptor.creator = [&t] (shard_id dummy) {
        auto sst = t.make_sstable();
@@ -1553,11 +1568,16 @@ protected:
        co_return stats;
    }

-    virtual sstables::compaction_descriptor make_descriptor(const sstables::shared_sstable& sst) const {
+    static sstables::compaction_descriptor
+    make_descriptor(const sstables::shared_sstable& sst, const sstables::compaction_type_options& opt, owned_ranges_ptr owned_ranges = {}) {
        auto sstable_level = sst->get_sstable_level();
        auto run_identifier = sst->run_identifier();
        return sstables::compaction_descriptor({ sst },
-            sstable_level, sstables::compaction_descriptor::default_max_sstable_bytes, run_identifier, _options, _owned_ranges_ptr);
+            sstable_level, sstables::compaction_descriptor::default_max_sstable_bytes, run_identifier, opt, owned_ranges);
+    }
+
+    virtual sstables::compaction_descriptor make_descriptor(const sstables::shared_sstable& sst) const {
+        return make_descriptor(sst, _options, _owned_ranges_ptr);
    }

    virtual future<sstables::compaction_result> rewrite_sstable(const sstables::shared_sstable sst) {
@@ -1610,19 +1630,30 @@ public:
                std::move(sstables), std::move(compacting), compaction_manager::can_purge_tombstones::yes)
            , _opt(options.as<sstables::compaction_type_options::split>())
    {
+        if (utils::get_local_injector().is_enabled("split_sstable_rewrite")) {
+            _do_throw_if_stopping = throw_if_stopping::yes;
+        }
+    }
+
+    static bool sstable_needs_split(const sstables::shared_sstable& sst, const sstables::compaction_type_options::split& opt) {
+        return opt.classifier(sst->get_first_decorated_key().token()) != opt.classifier(sst->get_last_decorated_key().token());
+    }
+
+    static sstables::compaction_descriptor
+    make_descriptor(const sstables::shared_sstable& sst, const sstables::compaction_type_options::split& split_opt) {
+        auto opt = sstables::compaction_type_options::make_split(split_opt.classifier);
+        return rewrite_sstables_compaction_task_executor::make_descriptor(sst, std::move(opt));
    }
 private:
    bool sstable_needs_split(const sstables::shared_sstable& sst) const {
-        return _opt.classifier(sst->get_first_decorated_key().token()) != _opt.classifier(sst->get_last_decorated_key().token());
+        return sstable_needs_split(sst, _opt);
    }
 protected:
    sstables::compaction_descriptor make_descriptor(const sstables::shared_sstable& sst) const override {
-        auto desc = rewrite_sstables_compaction_task_executor::make_descriptor(sst);
-        desc.options = sstables::compaction_type_options::make_split(_opt.classifier);
-        return desc;
+        return make_descriptor(sst, _opt);
    }

-    future<sstables::compaction_result> rewrite_sstable(const sstables::shared_sstable sst) override {
+    future<sstables::compaction_result> do_rewrite_sstable(const sstables::shared_sstable sst) {
        if (sstable_needs_split(sst)) {
            return rewrite_sstables_compaction_task_executor::rewrite_sstable(std::move(sst));
        }
@@ -1635,6 +1666,20 @@ protected:
            return sstables::compaction_result{};
        });
    }
+
+    future<sstables::compaction_result> rewrite_sstable(const sstables::shared_sstable sst) override {
+        co_await utils::get_local_injector().inject("split_sstable_rewrite", [this] (auto& handler) -> future<> {
+            cmlog.info("split_sstable_rewrite: waiting");
+            while (!handler.poll_for_message() && !_compaction_data.is_stop_requested()) {
+                co_await sleep(std::chrono::milliseconds(5));
+            }
+            cmlog.info("split_sstable_rewrite: released");
+            if (_compaction_data.is_stop_requested()) {
+                throw make_compaction_stopped_exception();
+            }
+        }, false);
+        co_return co_await do_rewrite_sstable(std::move(sst));
+    }
 };

 }
@@ -2056,6 +2101,31 @@ future<compaction_manager::compaction_stats_opt> compaction_manager::perform_spl
    return perform_task_on_all_files<split_compaction_task_executor>(info, t, std::move(options), std::move(owned_ranges_ptr), std::move(get_sstables));
 }

+future<std::vector<sstables::shared_sstable>>
+compaction_manager::maybe_split_sstable(sstables::shared_sstable sst, table_state& t, sstables::compaction_type_options::split opt) {
+    if (!split_compaction_task_executor::sstable_needs_split(sst, opt)) {
+        co_return std::vector<sstables::shared_sstable>{sst};
+    }
+    std::vector<sstables::shared_sstable> ret;
+
+        // FIXME: indentation.
+        auto gate = get_compaction_state(&t).gate.hold();
+        sstables::compaction_progress_monitor monitor;
+        sstables::compaction_data info = create_compaction_data();
+        sstables::compaction_descriptor desc = split_compaction_task_executor::make_descriptor(sst, opt);
+        desc.creator = [&t] (shard_id _) {
+            return t.make_sstable();
+        };
+        desc.replacer = [&] (sstables::compaction_completion_desc d) {
+            std::move(d.new_sstables.begin(), d.new_sstables.end(), std::back_inserter(ret));
+        };
+
+        co_await sstables::compact_sstables(std::move(desc), info, t, monitor);
+        co_await sst->unlink();
+
+    co_return ret;
+}
+
 // Submit a table to be scrubbed and wait for its termination.
 future<compaction_manager::compaction_stats_opt> compaction_manager::perform_sstable_scrub(table_state& t, sstables::compaction_type_options::scrub opts, std::optional<tasks::task_info> info) {
    auto scrub_mode = opts.operation_mode;
--- a/compaction/compaction_manager.hh
+++ b/compaction/compaction_manager.hh
@@ -347,6 +347,11 @@ public:
    // or user aborted splitting using stop API.
    future<compaction_stats_opt> perform_split_compaction(compaction::table_state& t, sstables::compaction_type_options::split opt, std::optional<tasks::task_info> info = std::nullopt);

+    // Splits a single SSTable by segregating all its data according to the classifier.
+    // If the SSTable doesn't need splitting, the same input SSTable is returned as output.
+    // If the SSTable needs splitting, the output SSTables are returned and the input SSTable is deleted.
+    future<std::vector<sstables::shared_sstable>> maybe_split_sstable(sstables::shared_sstable sst, table_state& t, sstables::compaction_type_options::split opt);
+
    // Run a custom job for a given table, defined by a function
    // it completes when future returned by job is ready or returns immediately
    // if manager was asked to stop.
@@ -586,6 +591,8 @@ private:
    future<compaction_manager::compaction_stats_opt> compaction_done() noexcept {
        return _compaction_done.get_future();
    }
+
+    future<sstables::sstable_set> sstable_set_for_tombstone_gc(::compaction::table_state& t);
 public:
    bool stopping() const noexcept {
        return _compaction_data.abort.abort_requested();
--- a/compaction/table_state.hh
+++ b/compaction/table_state.hh
@@ -39,6 +39,7 @@ public:
    virtual bool compaction_enforce_min_threshold() const noexcept = 0;
    virtual const sstables::sstable_set& main_sstable_set() const = 0;
    virtual const sstables::sstable_set& maintenance_sstable_set() const = 0;
+    virtual lw_shared_ptr<const sstables::sstable_set> sstable_set_for_tombstone_gc() const = 0;
    virtual std::unordered_set<sstables::shared_sstable> fully_expired_sstables(const std::vector<sstables::shared_sstable>& sstables, gc_clock::time_point compaction_time) const = 0;
    virtual const std::vector<sstables::shared_sstable>& compacted_undeleted_sstables() const noexcept = 0;
    virtual sstables::compaction_strategy& get_compaction_strategy() const noexcept = 0;
--- a/compaction/task_manager_module.cc
+++ b/compaction/task_manager_module.cc
@@ -467,7 +467,16 @@ future<> shard_cleanup_keyspace_compaction_task_impl::run() {

 future<> table_cleanup_keyspace_compaction_task_impl::run() {
    co_await wait_for_your_turn(_cv, _current_task, _status.id);
-    auto owned_ranges_ptr = compaction::make_owned_ranges_ptr(_db.get_keyspace_local_ranges(_status.keyspace));
+    // Note that we do not hold an effective_replication_map_ptr throughout
+    // the cleanup operation, so the topology might change.
+    // Since cleanup is an admin operation required for vnodes,
+    // it is the responsibility of the system operator to not
+    // perform additional incompatible range movements during cleanup.
+    auto get_owned_ranges = [&] (std::string_view ks_name) -> future<owned_ranges_ptr> {
+        const auto& erm = _db.find_keyspace(ks_name).get_vnode_effective_replication_map();
+        co_return compaction::make_owned_ranges_ptr(co_await _db.get_keyspace_local_ranges(erm));
+    };
+    auto owned_ranges_ptr = co_await get_owned_ranges(_status.keyspace);
    co_await run_on_table("force_keyspace_cleanup", _db, _status.keyspace, _ti, [&] (replica::table& t) {
        // skip the flush, as cleanup_keyspace_compaction_task_impl::run should have done this.
        return t.perform_cleanup_compaction(owned_ranges_ptr, tasks::task_info{_status.id, _status.shard}, replica::table::do_flush::no);
@@ -531,8 +540,15 @@ future<> shard_upgrade_sstables_compaction_task_impl::run() {

 future<> table_upgrade_sstables_compaction_task_impl::run() {
    co_await wait_for_your_turn(_cv, _current_task, _status.id);
-    auto owned_ranges = _db.maybe_get_keyspace_local_ranges(_status.keyspace);
-    auto owned_ranges_ptr = owned_ranges ? compaction::make_owned_ranges_ptr(std::move(owned_ranges.value())) : nullptr;
+    auto get_owned_ranges = [&] (std::string_view keyspace_name) -> future<owned_ranges_ptr> {
+        const auto& ks = _db.find_keyspace(keyspace_name);
+        if (ks.get_replication_strategy().is_per_table()) {
+            co_return nullptr;
+        }
+        const auto& erm = ks.get_vnode_effective_replication_map();
+        co_return compaction::make_owned_ranges_ptr(co_await _db.get_keyspace_local_ranges(erm));
+    };
+    auto owned_ranges_ptr = co_await get_owned_ranges(_status.keyspace);
    tasks::task_info info{_status.id, _status.shard};
    co_await run_on_table("upgrade_sstables", _db, _status.keyspace, _ti, [&] (replica::table& t) -> future<> {
        return t.parallel_foreach_table_state([&] (compaction::table_state& ts) -> future<> {
--- a/compaction/time_window_compaction_strategy.cc
+++ b/compaction/time_window_compaction_strategy.cc
@@ -295,7 +295,8 @@ time_window_compaction_strategy::get_reshaping_job(std::vector<shared_sstable> i
            // When trimming, let's keep sstables with overlapping time window, so as to reduce write amplification.
            // For example, if there are N sstables spanning window W, where N <= 32, then we can produce all data for W
            // in a single compaction round, removing the need to later compact W to reduce its number of files.
-            boost::partial_sort(multi_window, multi_window.begin() + max_sstables, [](const shared_sstable &a, const shared_sstable &b) {
+            auto sort_size = std::min(max_sstables, multi_window.size());
+            boost::partial_sort(multi_window, multi_window.begin() + sort_size, [](const shared_sstable &a, const shared_sstable &b) {
                return a->get_stats_metadata().max_timestamp < b->get_stats_metadata().max_timestamp;
            });
            maybe_trim_job(multi_window, job_size, disjoint);
--- a/configure.py
+++ b/configure.py
@@ -431,8 +431,6 @@ modes = {

 scylla_tests = set([
    'test/boost/UUID_test',
-    'test/boost/pretty_printers_test',
-    'test/boost/cdc_generation_test',
    'test/boost/aggregate_fcts_test',
    'test/boost/allocation_strategy_test',
    'test/boost/alternator_unit_test',
@@ -443,7 +441,9 @@ scylla_tests = set([
    'test/boost/batchlog_manager_test',
    'test/boost/big_decimal_test',
    'test/boost/bloom_filter_test',
+    'test/boost/bptree_test',
    'test/boost/broken_sstable_test',
+    'test/boost/btree_test',
    'test/boost/bytes_ostream_test',
    'test/boost/cache_algorithm_test',
    'test/boost/cache_mutation_reader_test',
@@ -452,13 +452,15 @@ scylla_tests = set([
    'test/boost/canonical_mutation_test',
    'test/boost/cartesian_product_test',
    'test/boost/castas_fcts_test',
+    'test/boost/cdc_generation_test',
    'test/boost/cdc_test',
    'test/boost/cell_locker_test',
    'test/boost/checksum_utils_test',
-    'test/boost/chunked_vector_test',
    'test/boost/chunked_managed_vector_test',
+    'test/boost/chunked_vector_test',
    'test/boost/clustering_ranges_walker_test',
    'test/boost/column_mapping_test',
+    'test/boost/commitlog_cleanup_test',
    'test/boost/commitlog_test',
    'test/boost/compaction_group_test',
    'test/boost/compound_test',
@@ -468,102 +470,124 @@ scylla_tests = set([
    'test/boost/counter_test',
    'test/boost/cql_auth_query_test',
    'test/boost/cql_auth_syntax_test',
-    'test/boost/cql_query_test',
+    'test/boost/cql_functions_test',
+    'test/boost/cql_query_group_test',
    'test/boost/cql_query_large_test',
    'test/boost/cql_query_like_test',
-    'test/boost/cql_query_group_test',
-    'test/boost/cql_functions_test',
+    'test/boost/cql_query_test',
    'test/boost/crc_test',
    'test/boost/data_listeners_test',
    'test/boost/database_test',
-    'test/boost/commitlog_cleanup_test',
    'test/boost/dirty_memory_manager_test',
+    'test/boost/double_decker_test',
    'test/boost/duration_test',
    'test/boost/dynamic_bitset_test',
    'test/boost/enum_option_test',
    'test/boost/enum_set_test',
-    'test/boost/extensions_test',
    'test/boost/error_injection_test',
+    'test/boost/estimated_histogram_test',
+    'test/boost/exception_container_test',
+    'test/boost/exceptions_fallback_test',
+    'test/boost/exceptions_optimized_test',
+    'test/boost/expr_test',
+    'test/boost/extensions_test',
    'test/boost/filtering_test',
-    'test/boost/mutation_reader_another_test',
    'test/boost/flush_queue_test',
    'test/boost/fragmented_temporary_buffer_test',
    'test/boost/frozen_mutation_test',
+    'test/boost/generic_server_test',
    'test/boost/gossiping_property_file_snitch_test',
+    'test/boost/group0_cmd_merge_test',
+    'test/boost/group0_test',
    'test/boost/hash_test',
    'test/boost/hashers_test',
    'test/boost/hint_test',
    'test/boost/idl_test',
+    'test/boost/index_with_paging_test',
    'test/boost/input_stream_test',
+    'test/boost/intrusive_array_test',
    'test/boost/json_cql_query_test',
    'test/boost/json_test',
    'test/boost/keys_test',
    'test/boost/large_paging_state_test',
-    'test/boost/recent_entries_map_test',
    'test/boost/like_matcher_test',
    'test/boost/limiting_data_source_test',
    'test/boost/linearizing_input_stream_test',
+    'test/boost/lister_test',
    'test/boost/loading_cache_test',
+    'test/boost/locator_topology_test',
    'test/boost/log_heap_test',
-    'test/boost/estimated_histogram_test',
-    'test/boost/summary_test',
-    'test/boost/logalloc_test',
    'test/boost/logalloc_standard_allocator_segment_pool_backend_test',
-    'test/boost/managed_vector_test',
+    'test/boost/logalloc_test',
    'test/boost/managed_bytes_test',
-    'test/boost/intrusive_array_test',
+    'test/boost/managed_vector_test',
    'test/boost/map_difference_test',
    'test/boost/memtable_test',
+    'test/boost/multishard_combining_reader_as_mutation_source_test',
    'test/boost/multishard_mutation_query_test',
    'test/boost/murmur_hash_test',
    'test/boost/mutation_fragment_test',
    'test/boost/mutation_query_test',
+    'test/boost/mutation_reader_another_test',
    'test/boost/mutation_reader_test',
-    'test/boost/multishard_combining_reader_as_mutation_source_test',
    'test/boost/mutation_test',
    'test/boost/mutation_writer_test',
    'test/boost/mvcc_test',
    'test/boost/network_topology_strategy_test',
-    'test/boost/token_metadata_test',
-    'test/boost/tablets_test',
-    'test/boost/sessions_test',
    'test/boost/nonwrapping_interval_test',
    'test/boost/observable_test',
    'test/boost/partitioner_test',
+    'test/boost/per_partition_rate_limit_test',
+    'test/boost/pretty_printers_test',
    'test/boost/querier_cache_test',
    'test/boost/query_processor_test',
-    'test/boost/wrapping_interval_test',
+    'test/boost/radix_tree_test',
    'test/boost/range_tombstone_list_test',
-    'test/boost/reusable_buffer_test',
-    'test/boost/restrictions_test',
+    'test/boost/rate_limiter_test',
+    'test/boost/reader_concurrency_semaphore_test',
+    'test/boost/recent_entries_map_test',
    'test/boost/repair_test',
+    'test/boost/restrictions_test',
+    'test/boost/result_utils_test',
+    'test/boost/reusable_buffer_test',
    'test/boost/role_manager_test',
    'test/boost/row_cache_test',
    'test/boost/rust_test',
+    'test/boost/s3_test',
    'test/boost/schema_change_test',
+    'test/boost/schema_changes_test',
+    'test/boost/schema_loader_test',
    'test/boost/schema_registry_test',
    'test/boost/secondary_index_test',
-    'test/boost/tracing_test',
-    'test/boost/index_with_paging_test',
    'test/boost/serialization_test',
    'test/boost/serialized_action_test',
+    'test/boost/service_level_controller_test',
+    'test/boost/sessions_test',
    'test/boost/small_vector_test',
    'test/boost/snitch_reset_test',
+    'test/boost/sorting_test',
    'test/boost/sstable_3_x_test',
+    'test/boost/sstable_compaction_test',
+    'test/boost/sstable_conforms_to_mutation_source_test',
    'test/boost/sstable_datafile_test',
+    'test/boost/sstable_directory_test',
    'test/boost/sstable_generation_test',
+    'test/boost/sstable_move_test',
    'test/boost/sstable_mutation_test',
    'test/boost/sstable_partition_index_cache_test',
-    'test/boost/schema_changes_test',
-    'test/boost/sstable_conforms_to_mutation_source_test',
-    'test/boost/sstable_compaction_test',
    'test/boost/sstable_resharding_test',
-    'test/boost/sstable_directory_test',
+    'test/boost/sstable_set_test',
    'test/boost/sstable_test',
-    'test/boost/sstable_move_test',
+    'test/boost/stall_free_test',
    'test/boost/statement_restrictions_test',
    'test/boost/storage_proxy_test',
+    'test/boost/string_format_test',
+    'test/boost/summary_test',
+    'test/boost/tablets_test',
+    'test/boost/tagged_integer_test',
+    'test/boost/token_metadata_test',
    'test/boost/top_k_test',
+    'test/boost/tracing_test',
    'test/boost/transport_test',
    'test/boost/types_test',
    'test/boost/user_function_test',
@@ -571,39 +595,16 @@ scylla_tests = set([
    'test/boost/utf8_test',
    'test/boost/view_build_test',
    'test/boost/view_complex_test',
-    'test/boost/view_schema_test',
-    'test/boost/view_schema_pkey_test',
    'test/boost/view_schema_ckey_test',
+    'test/boost/view_schema_pkey_test',
+    'test/boost/view_schema_test',
    'test/boost/vint_serialization_test',
    'test/boost/virtual_reader_test',
    'test/boost/virtual_table_mutation_source_test',
    'test/boost/virtual_table_test',
-    'test/boost/wasm_test',
    'test/boost/wasm_alloc_test',
-    'test/boost/bptree_test',
-    'test/boost/btree_test',
-    'test/boost/radix_tree_test',
-    'test/boost/double_decker_test',
-    'test/boost/stall_free_test',
-    'test/boost/sstable_set_test',
-    'test/boost/reader_concurrency_semaphore_test',
-    'test/boost/service_level_controller_test',
-    'test/boost/schema_loader_test',
-    'test/boost/lister_test',
-    'test/boost/group0_test',
-    'test/boost/exception_container_test',
-    'test/boost/result_utils_test',
-    'test/boost/rate_limiter_test',
-    'test/boost/per_partition_rate_limit_test',
-    'test/boost/expr_test',
-    'test/boost/exceptions_optimized_test',
-    'test/boost/exceptions_fallback_test',
-    'test/boost/s3_test',
-    'test/boost/locator_topology_test',
-    'test/boost/string_format_test',
-    'test/boost/tagged_integer_test',
-    'test/boost/group0_cmd_merge_test',
-    'test/boost/sorting_test',
+    'test/boost/wasm_test',
+    'test/boost/wrapping_interval_test',
    'test/manual/ec2_snitch_test',
    'test/manual/enormous_table_scan_test',
    'test/manual/gce_snitch_test',
--- a/cql3/Cql.g
+++ b/cql3/Cql.g
@@ -68,6 +68,7 @@ options {
 #include "cql3/statements/ks_prop_defs.hh"
 #include "cql3/selection/raw_selector.hh"
 #include "cql3/selection/selectable-expr.hh"
+#include "cql3/dialect.hh"
 #include "cql3/keyspace_element_name.hh"
 #include "cql3/constants.hh"
 #include "cql3/operation_impl.hh"
@@ -148,6 +149,8 @@ using uexpression = uninitialized<expression>;

    listener_type* listener;

+    dialect _dialect;
+
    // Keeps the names of all bind variables. For bind variables without a name ('?'), the name is nullptr.
    // Maps bind_index -> name.
    std::vector<::shared_ptr<cql3::column_identifier>> _bind_variable_names;
@@ -171,9 +174,14 @@ using uexpression = uninitialized<expression>;
        return s;
    }

+    void set_dialect(dialect d) {
+        _dialect = d;
+    }
+
    bind_variable new_bind_variables(shared_ptr<cql3::column_identifier> name)
    {
-        if (name && _named_bind_variables_indexes.contains(*name)) {
+        if (_dialect.duplicate_bind_variable_names_refer_to_same_variable
+                && name && _named_bind_variables_indexes.contains(*name)) {
            return bind_variable{_named_bind_variables_indexes[*name]};
        }
        auto marker = bind_variable{_bind_variable_names.size()};
--- a/cql3/cql3_type.cc
+++ b/cql3/cql3_type.cc
@@ -449,7 +449,8 @@ sstring maybe_quote(const sstring& identifier) {
        // many keywords but allow keywords listed as "unreserved keywords".
        // So we can use any of them, for example cident.
        try {
-            cql3::util::do_with_parser(identifier, std::mem_fn(&cql3_parser::CqlParser::cident));
+            // In general it's not a good idea to use the default dialect, but for parsing an identifier, it's okay.
+            cql3::util::do_with_parser(identifier, dialect{}, std::mem_fn(&cql3_parser::CqlParser::cident));
            return identifier;
        } catch(exceptions::syntax_exception&) {
            // This alphanumeric string is not a valid identifier, so fall
--- a/cql3/dialect.hh
+++ b/cql3/dialect.hh
@@ -0,0 +1,34 @@
+// Copyright (C) 2024-present ScyllaDB
+// SPDX-License-Identifier: AGPL-3.0-or-later
+
+#pragma once
+
+#include <fmt/core.h>
+
+namespace cql3 {
+
+struct dialect {
+    bool duplicate_bind_variable_names_refer_to_same_variable = true;  // if :a is found twice in a query, the two references are to the same variable (see #15559)
+    bool operator==(const dialect&) const = default;
+};
+
+inline
+dialect
+internal_dialect() {
+    return dialect{
+        .duplicate_bind_variable_names_refer_to_same_variable = true,
+    };
+}
+
+}
+
+template <>
+struct fmt::formatter<cql3::dialect> {
+    constexpr auto parse(format_parse_context& ctx) { return ctx.begin(); }
+
+    template <typename FormatContext>
+    auto format(const cql3::dialect& d, FormatContext& ctx) const {
+        return fmt::format_to(ctx.out(), "cql3::dialect{{duplicate_bind_variable_names_refer_to_same_variable={}}}",
+                d.duplicate_bind_variable_names_refer_to_same_variable);
+    }
+};
--- a/cql3/prepared_statements_cache.hh
+++ b/cql3/prepared_statements_cache.hh
@@ -14,6 +14,7 @@
 #include "utils/hash.hh"
 #include "cql3/statements/prepared_statement.hh"
 #include "cql3/column_specification.hh"
+#include "cql3/dialect.hh"

 namespace cql3 {

@@ -37,14 +38,17 @@ class prepared_cache_key_type {
 public:
    // derive from cql_prepared_id_type so we can customize the formatter of
    // cache_key_type
-    struct cache_key_type : public cql_prepared_id_type {};
+    struct cache_key_type : public cql_prepared_id_type {
+        cache_key_type(cql_prepared_id_type&& id, cql3::dialect d) : cql_prepared_id_type(std::move(id)), dialect(d) {}
+        cql3::dialect dialect; // Not part of hash, but we don't expect collisions because of that
+        bool operator==(const cache_key_type& other) const = default;
+    };

 private:
    cache_key_type _key;

 public:
-    prepared_cache_key_type() = default;
-    explicit prepared_cache_key_type(cql_prepared_id_type cql_id) : _key(std::move(cql_id)) {}
+    explicit prepared_cache_key_type(cql_prepared_id_type cql_id, dialect d) : _key(std::move(cql_id), d) {}

    cache_key_type& key() { return _key; }
    const cache_key_type& key() const { return _key; }
@@ -176,7 +180,7 @@ struct hash<cql3::prepared_cache_key_type> final {
 template <> struct fmt::formatter<cql3::prepared_cache_key_type::cache_key_type> {
    constexpr auto parse(format_parse_context& ctx) { return ctx.begin(); }
    auto format(const cql3::prepared_cache_key_type::cache_key_type& p, fmt::format_context& ctx) const {
-        return fmt::format_to(ctx.out(), "{{cql_id: {}}}", static_cast<const cql3::cql_prepared_id_type&>(p));
+        return fmt::format_to(ctx.out(), "{{cql_id: {}, dialect: {}}}", static_cast<const cql3::cql_prepared_id_type&>(p), p.dialect);
    }
 };

--- a/cql3/query_processor.cc
+++ b/cql3/query_processor.cc
@@ -566,10 +566,10 @@ query_processor::execute_maybe_with_guard(service::query_state& query_state, ::s
 }

 future<::shared_ptr<result_message>>
-query_processor::execute_direct_without_checking_exception_message(const sstring_view& query_string, service::query_state& query_state, query_options& options) {
+query_processor::execute_direct_without_checking_exception_message(const sstring_view& query_string, service::query_state& query_state, dialect d, query_options& options) {
    log.trace("execute_direct: \"{}\"", query_string);
    tracing::trace(query_state.get_trace_state(), "Parsing a statement");
-    auto p = get_statement(query_string, query_state.get_client_state());
+    auto p = get_statement(query_string, query_state.get_client_state(), d);
    auto statement = p->statement;
    const auto warnings = std::move(p->warnings);
    if (statement->get_bound_terms() != options.get_values_count()) {
@@ -653,18 +653,21 @@ query_processor::process_authorized_statement(const ::shared_ptr<cql_statement>
 }

 future<::shared_ptr<cql_transport::messages::result_message::prepared>>
-query_processor::prepare(sstring query_string, service::query_state& query_state) {
+query_processor::prepare(sstring query_string, service::query_state& query_state, cql3::dialect d) {
    auto& client_state = query_state.get_client_state();
-    return prepare(std::move(query_string), client_state);
+    return prepare(std::move(query_string), client_state, d);
 }

 future<::shared_ptr<cql_transport::messages::result_message::prepared>>
-query_processor::prepare(sstring query_string, const service::client_state& client_state) {
+query_processor::prepare(sstring query_string, const service::client_state& client_state, cql3::dialect d) {
    using namespace cql_transport::messages;
    return prepare_one<result_message::prepared::cql>(
            std::move(query_string),
            client_state,
-            compute_id,
+            d,
+            [d] (std::string_view query_string, std::string_view keyspace) {
+                return compute_id(query_string, keyspace, d);
+            },
            prepared_cache_key_type::cql_id);
 }

@@ -676,13 +679,14 @@ static std::string hash_target(std::string_view query_string, std::string_view k

 prepared_cache_key_type query_processor::compute_id(
        std::string_view query_string,
-        std::string_view keyspace) {
-    return prepared_cache_key_type(md5_hasher::calculate(hash_target(query_string, keyspace)));
+        std::string_view keyspace,
+        dialect d) {
+    return prepared_cache_key_type(md5_hasher::calculate(hash_target(query_string, keyspace)), d);
 }

 std::unique_ptr<prepared_statement>
-query_processor::get_statement(const sstring_view& query, const service::client_state& client_state) {
-    std::unique_ptr<raw::parsed_statement> statement = parse_statement(query);
+query_processor::get_statement(const sstring_view& query, const service::client_state& client_state, dialect d) {
+    std::unique_ptr<raw::parsed_statement> statement = parse_statement(query, d);

    // Set keyspace for statement that require login
    auto cf_stmt = dynamic_cast<raw::cf_statement*>(statement.get());
@@ -696,7 +700,7 @@ query_processor::get_statement(const sstring_view& query, const service::client_
 }

 std::unique_ptr<raw::parsed_statement>
-query_processor::parse_statement(const sstring_view& query) {
+query_processor::parse_statement(const sstring_view& query, dialect d) {
    try {
        {
            const char* error_injection_key = "query_processor-parse_statement-test_failure";
@@ -706,7 +710,7 @@ query_processor::parse_statement(const sstring_view& query) {
                }
            });
        }
-        auto statement = util::do_with_parser(query,  std::mem_fn(&cql3_parser::CqlParser::query));
+        auto statement = util::do_with_parser(query, d, std::mem_fn(&cql3_parser::CqlParser::query));
        if (!statement) {
            throw exceptions::syntax_exception("Parsing failed");
        }
@@ -722,9 +726,9 @@ query_processor::parse_statement(const sstring_view& query) {
 }

 std::vector<std::unique_ptr<raw::parsed_statement>>
-query_processor::parse_statements(std::string_view queries) {
+query_processor::parse_statements(std::string_view queries, dialect d) {
    try {
-        auto statements = util::do_with_parser(queries, std::mem_fn(&cql3_parser::CqlParser::queries));
+        auto statements = util::do_with_parser(queries, d, std::mem_fn(&cql3_parser::CqlParser::queries));
        if (statements.empty()) {
            throw exceptions::syntax_exception("Parsing failed");
        }
@@ -797,7 +801,7 @@ query_options query_processor::make_internal_options(
 statements::prepared_statement::checked_weak_ptr query_processor::prepare_internal(const sstring& query_string) {
    auto& p = _internal_statements[query_string];
    if (p == nullptr) {
-        auto np = parse_statement(query_string)->prepare(_db, _cql_stats);
+        auto np = parse_statement(query_string, internal_dialect())->prepare(_db, _cql_stats);
        np->statement->raw_cql_statement = query_string;
        p = std::move(np); // inserts it into map
    }
@@ -903,7 +907,8 @@ query_processor::execute_internal(
        auto p = prepare_internal(query_string);
        return execute_with_params(std::move(p), cl, query_state, values);
    } else {
-        auto p = parse_statement(query_string)->prepare(_db, _cql_stats);
+        // For internal queries, we want the default dialect, not the user-provided one.
+        auto p = parse_statement(query_string, dialect{})->prepare(_db, _cql_stats);
        p->statement->raw_cql_statement = query_string;
        auto checked_weak_ptr = p->checked_weak_from_this();
        return execute_with_params(std::move(checked_weak_ptr), cl, query_state, values).finally([p = std::move(p)] {});
--- a/cql3/query_processor.hh
+++ b/cql3/query_processor.hh
@@ -21,6 +21,7 @@
 #include "cql3/authorized_prepared_statements_cache.hh"
 #include "cql3/statements/prepared_statement.hh"
 #include "cql3/cql_statement.hh"
+#include "cql3/dialect.hh"
 #include "exceptions/exceptions.hh"
 #include "service/migration_listener.hh"
 #include "timestamp.hh"
@@ -137,10 +138,11 @@ public:

    static prepared_cache_key_type compute_id(
            std::string_view query_string,
-            std::string_view keyspace);
+            std::string_view keyspace,
+            dialect d);

-    static std::unique_ptr<statements::raw::parsed_statement> parse_statement(const std::string_view& query);
-    static std::vector<std::unique_ptr<statements::raw::parsed_statement>> parse_statements(std::string_view queries);
+    static std::unique_ptr<statements::raw::parsed_statement> parse_statement(const std::string_view& query, dialect d);
+    static std::vector<std::unique_ptr<statements::raw::parsed_statement>> parse_statements(std::string_view queries, dialect d);

    query_processor(service::storage_proxy& proxy, data_dictionary::database db, service::migration_notifier& mn, memory_config mcfg, cql_config& cql_cfg, utils::loading_cache_config auth_prep_cache_cfg, lang::manager& langm);

@@ -249,10 +251,12 @@ public:
    execute_direct(
            const std::string_view& query_string,
            service::query_state& query_state,
+            dialect d,
            query_options& options) {
        return execute_direct_without_checking_exception_message(
                query_string,
                query_state,
+                d,
                options)
                .then(cql_transport::messages::propagate_exception_as_future<::shared_ptr<cql_transport::messages::result_message>>);
    }
@@ -263,6 +267,7 @@ public:
    execute_direct_without_checking_exception_message(
            const std::string_view& query_string,
            service::query_state& query_state,
+            dialect d,
            query_options& options);

    future<::shared_ptr<cql_transport::messages::result_message>>
@@ -397,10 +402,10 @@ public:


    future<::shared_ptr<cql_transport::messages::result_message::prepared>>
-    prepare(sstring query_string, service::query_state& query_state);
+    prepare(sstring query_string, service::query_state& query_state, dialect d);

    future<::shared_ptr<cql_transport::messages::result_message::prepared>>
-    prepare(sstring query_string, const service::client_state& client_state);
+    prepare(sstring query_string, const service::client_state& client_state, dialect d);

    future<> stop();

@@ -443,7 +448,8 @@ public:

    std::unique_ptr<statements::prepared_statement> get_statement(
            const std::string_view& query,
-            const service::client_state& client_state);
+            const service::client_state& client_state,
+            dialect d);

    friend class migration_subscriber;

@@ -527,14 +533,15 @@ private:
    prepare_one(
            sstring query_string,
            const service::client_state& client_state,
+            dialect d,
            PreparedKeyGenerator&& id_gen,
            IdGetter&& id_getter) {
        return do_with(
                id_gen(query_string, client_state.get_raw_keyspace()),
                std::move(query_string),
-                [this, &client_state, &id_getter](const prepared_cache_key_type& key, const sstring& query_string) {
-            return _prepared_cache.get(key, [this, &query_string, &client_state] {
-                auto prepared = get_statement(query_string, client_state);
+                [this, &client_state, &id_getter, d](const prepared_cache_key_type& key, const sstring& query_string) {
+            return _prepared_cache.get(key, [this, &query_string, &client_state, d] {
+                auto prepared = get_statement(query_string, client_state, d);
                auto bound_terms = prepared->statement->get_bound_terms();
                if (bound_terms > std::numeric_limits<uint16_t>::max()) {
                    throw exceptions::invalid_request_exception(
--- a/cql3/selection/selection.cc
+++ b/cql3/selection/selection.cc
@@ -503,10 +503,12 @@ selection::collect_metadata(const schema& schema, const std::vector<prepared_sel
 }

 result_set_builder::result_set_builder(const selection& s, gc_clock::time_point now,
-                                       std::vector<size_t> group_by_cell_indices)
+                                       std::vector<size_t> group_by_cell_indices,
+                                       uint64_t limit)
    : _result_set(std::make_unique<result_set>(::make_shared<metadata>(*(s.get_result_metadata()))))
    , _selectors(s.new_selectors())
    , _group_by_cell_indices(std::move(group_by_cell_indices))
+    , _limit(limit)
    , _last_group(_group_by_cell_indices.size())
    , _group_began(false)
    , _now(now)
@@ -577,8 +579,10 @@ void result_set_builder::flush_selectors() {
        // handled by process_current_row
        return;
    }
-    _result_set->add_row(_selectors->get_output_row());
-    _selectors->reset();
+    if (_result_set->size() < _limit) {
+        _result_set->add_row(_selectors->get_output_row());
+        _selectors->reset();
+    }
 }

 void result_set_builder::complete_row() {
@@ -790,6 +794,10 @@ int32_t result_set_builder::ttl_of(size_t idx) {
    return _ttls[idx];
 }

+size_t result_set_builder::result_set_size() const {
+    return _result_set->size();
+}
+
 bytes_opt result_set_builder::get_value(data_type t, query::result_atomic_cell_view c) {
    return {c.value().linearize()};
 }
--- a/cql3/selection/selection.hh
+++ b/cql3/selection/selection.hh
@@ -172,6 +172,7 @@ private:
    std::unique_ptr<result_set> _result_set;
    std::unique_ptr<selectors> _selectors;
    const std::vector<size_t> _group_by_cell_indices; ///< Indices in \c current of cells holding GROUP BY values.
+    const uint64_t _limit; ///< Maximum number of rows to return.
    std::vector<managed_bytes_opt> _last_group; ///< Previous row's group: all of GROUP BY column values.
    bool _group_began; ///< Whether a group began being formed.
 public:
@@ -236,7 +237,8 @@ public:
    };

    result_set_builder(const selection& s, gc_clock::time_point now,
-                       std::vector<size_t> group_by_cell_indices = {});
+                       std::vector<size_t> group_by_cell_indices = {},
+                       uint64_t limit = std::numeric_limits<uint64_t>::max());
    void add_empty();
    void add(bytes_opt value);
    void add(const column_definition& def, const query::result_atomic_cell_view& c);
@@ -246,6 +248,7 @@ public:
    std::unique_ptr<result_set> build();
    api::timestamp_type timestamp_of(size_t idx);
    int32_t ttl_of(size_t idx);
+    size_t result_set_size() const;

    // Implements ResultVisitor concept from query.hh
    template<typename Filter = nop_filter>
--- a/cql3/statements/alter_keyspace_statement.cc
+++ b/cql3/statements/alter_keyspace_statement.cc
@@ -11,6 +11,7 @@
 #include <boost/range/algorithm.hpp>
 #include <fmt/format.h>
 #include <seastar/core/coroutine.hh>
+#include <seastar/core/on_internal_error.hh>
 #include <stdexcept>
 #include "alter_keyspace_statement.hh"
 #include "prepared_statement.hh"
@@ -43,18 +44,16 @@ future<> cql3::statements::alter_keyspace_statement::check_access(query_processo
    return state.has_keyspace_access(_name, auth::permission::ALTER);
 }

-static bool validate_rf_difference(const std::string_view curr_rf, const std::string_view new_rf) {
-    auto to_number = [] (const std::string_view rf) {
-        int result;
-        // We assume the passed string view represents a valid decimal number,
-        // so we don't need the error code.
-        (void) std::from_chars(rf.begin(), rf.end(), result);
-        return result;
-    };
-
-    // We want to ensure that each DC's RF is going to change by at most 1
-    // because in that case the old and new quorums must overlap.
-    return std::abs(to_number(curr_rf) - to_number(new_rf)) <= 1;
+static unsigned get_abs_rf_diff(const std::string& curr_rf, const std::string& new_rf) {
+    try {
+        return std::abs(std::stoi(curr_rf) - std::stoi(new_rf));
+    } catch (std::invalid_argument const& ex) {
+        on_internal_error(mylogger, fmt::format("get_abs_rf_diff expects integer arguments, "
+                                                "but got curr_rf:{} and new_rf:{}", curr_rf, new_rf));
+    } catch (std::out_of_range const& ex) {
+        on_internal_error(mylogger, fmt::format("get_abs_rf_diff expects integer arguments to fit into `int` type, "
+                                                "but got curr_rf:{} and new_rf:{}", curr_rf, new_rf));
+    }
 }

 void cql3::statements::alter_keyspace_statement::validate(query_processor& qp, const service::client_state& state) const {
@@ -84,11 +83,24 @@ void cql3::statements::alter_keyspace_statement::validate(query_processor& qp, c
            auto new_ks = _attrs->as_ks_metadata_update(ks.metadata(), *qp.proxy().get_token_metadata_ptr(), qp.proxy().features());

            if (ks.get_replication_strategy().uses_tablets()) {
-                const std::map<sstring, sstring>& current_rfs = ks.metadata()->strategy_options();
-                for (const auto& [new_dc, new_rf] : _attrs->get_replication_options()) {
-                    auto it = current_rfs.find(new_dc);
-                    if (it != current_rfs.end() && !validate_rf_difference(it->second, new_rf)) {
-                        throw exceptions::invalid_request_exception("Cannot modify replication factor of any DC by more than 1 at a time.");
+                const std::map<sstring, sstring>& current_rf_per_dc = ks.metadata()->strategy_options();
+                auto new_rf_per_dc = _attrs->get_replication_options();
+                new_rf_per_dc.erase(ks_prop_defs::REPLICATION_STRATEGY_CLASS_KEY);
+                unsigned total_abs_rfs_diff = 0;
+                for (const auto& [new_dc, new_rf] : new_rf_per_dc) {
+                    sstring old_rf = "0";
+                    if (auto new_dc_in_current_mapping = current_rf_per_dc.find(new_dc);
+                             new_dc_in_current_mapping != current_rf_per_dc.end()) {
+                        old_rf = new_dc_in_current_mapping->second;
+                    } else if (!qp.proxy().get_token_metadata_ptr()->get_topology().get_datacenters().contains(new_dc)) {
+                        // This means that the DC listed in ALTER doesn't exist. This error will be reported later,
+                        // during validation in abstract_replication_strategy::validate_replication_strategy.
+                        // We can't report it now, because that would change the order in which errors are reported:
+                        // non-existent DCs must be reported first, and only then RFs that change by too much.
+                        continue;
+                    }
+                    if (total_abs_rfs_diff += get_abs_rf_diff(old_rf, new_rf); total_abs_rfs_diff >= 2) {
+                        throw exceptions::invalid_request_exception("Only one DC's RF can be changed at a time and not by more than 1");
                    }
                }
            }
@@ -118,6 +130,63 @@ bool cql3::statements::alter_keyspace_statement::changes_tablets(query_processor
    return ks.get_replication_strategy().uses_tablets() && !_attrs->get_replication_options().empty();
 }

+namespace {
+// These functions are used to flatten all the options in the keyspace definition into a single-level map<string, string>.
+// (Currently options are stored in a nested structure that looks more like a map<string, map<string, string>>).
+// Flattening is simply joining the keys of maps from both levels with a colon ':' character,
+// or in other words: prefixing the keys in the output map with the option type, e.g. 'replication', 'storage', etc.,
+// so that the output map contains entries like: "replication:dc1" -> "3".
+// This is done to avoid key conflicts and to be able to de-flatten the map back into the original structure.
+
+void add_prefixed_key(const sstring& prefix, const std::map<sstring, sstring>& in, std::map<sstring, sstring>& out) {
+    for (const auto& [in_key, in_value]: in) {
+        out[prefix + ":" + in_key] = in_value;
+    }
+}
+
+std::map<sstring, sstring> get_current_options_flattened(const shared_ptr<cql3::statements::ks_prop_defs>& ks,
+                                                         bool include_tablet_options,
+                                                         const gms::feature_service& feat) {
+    std::map<sstring, sstring> all_options;
+
+    add_prefixed_key(ks->KW_REPLICATION, ks->get_replication_options(), all_options);
+    add_prefixed_key(ks->KW_STORAGE, ks->get_storage_options().to_map(), all_options);
+    // if no tablet options are specified in the ALTER KS statement,
+    // we want to preserve the old ones and hence cannot overwrite them with defaults
+    if (include_tablet_options) {
+        auto initial_tablets = ks->get_initial_tablets(std::nullopt);
+        add_prefixed_key(ks->KW_TABLETS,
+                         {{"enabled", initial_tablets ? "true" : "false"},
+                         {"initial", std::to_string(initial_tablets.value_or(0))}},
+                         all_options);
+    }
+    add_prefixed_key(ks->KW_DURABLE_WRITES,
+                     {{sstring(ks->KW_DURABLE_WRITES), to_sstring(ks->get_boolean(ks->KW_DURABLE_WRITES, true))}},
+                     all_options);
+
+    return all_options;
+}
+
+std::map<sstring, sstring> get_old_options_flattened(const data_dictionary::keyspace& ks, bool include_tablet_options) {
+    std::map<sstring, sstring> all_options;
+
+    using namespace cql3::statements;
+    add_prefixed_key(ks_prop_defs::KW_REPLICATION, ks.get_replication_strategy().get_config_options(), all_options);
+    add_prefixed_key(ks_prop_defs::KW_STORAGE, ks.metadata()->get_storage_options().to_map(), all_options);
+    if (include_tablet_options) {
+        add_prefixed_key(ks_prop_defs::KW_TABLETS,
+                         {{"enabled", ks.metadata()->initial_tablets() ? "true" : "false"},
+                          {"initial", std::to_string(ks.metadata()->initial_tablets().value_or(0))}},
+                         all_options);
+    }
+    add_prefixed_key(ks_prop_defs::KW_DURABLE_WRITES,
+                     {{sstring(ks_prop_defs::KW_DURABLE_WRITES), to_sstring(ks.metadata()->durable_writes())}},
+                     all_options);
+
+    return all_options;
+}
+} // <anonymous> namespace
+
 future<std::tuple<::shared_ptr<cql_transport::event::schema_change>, cql3::cql_warnings_vec>>
 cql3::statements::alter_keyspace_statement::prepare_schema_mutations(query_processor& qp, service::query_state& state, const query_options& options, service::group0_batch& mc) const {
    using namespace cql_transport;
@@ -130,11 +199,18 @@ cql3::statements::alter_keyspace_statement::prepare_schema_mutations(query_proce
        auto ks_md_update = _attrs->as_ks_metadata_update(ks_md, tm, feat);
        std::vector<mutation> muts;
        std::vector<sstring> warnings;
-        auto ks_options = _attrs->get_all_options_flattened(feat);
+        bool include_tablet_options = _attrs->get_map(_attrs->KW_TABLETS).has_value();
+        auto old_ks_options = get_old_options_flattened(ks, include_tablet_options);
+        auto ks_options = get_current_options_flattened(_attrs, include_tablet_options, feat);
+        ks_options.merge(old_ks_options);
+
        auto ts = mc.write_timestamp();
        auto global_request_id = mc.new_group0_state_id();

        // we only want to run the tablets path if there are actually any tablets changes, not only schema changes
+        // TODO: the current `if (changes_tablets(qp))` is insufficient: someone may set the same RFs as before,
+        //       and we'll unnecessarily trigger the processing path for ALTER tablets KS,
+        //       when in reality nothing or only schema is being changed
        if (changes_tablets(qp)) {
            if (!qp.topology_global_queue_empty()) {
                return make_exception_future<std::tuple<::shared_ptr<::cql_transport::event::schema_change>, cql3::cql_warnings_vec>>(
--- a/cql3/statements/alter_table_statement.cc
+++ b/cql3/statements/alter_table_statement.cc
@@ -384,7 +384,8 @@ std::pair<schema_builder, std::vector<view_ptr>> alter_table_statement::prepare_
                    auto new_where = util::rename_column_in_where_clause(
                            view->view_info()->where_clause(),
                            column_identifier::raw(view_from->text(), true),
-                            column_identifier::raw(view_to->text(), true));
+                            column_identifier::raw(view_to->text(), true),
+                            cql3::dialect{});
                    builder.with_view_info(view->view_info()->base_id(), view->view_info()->base_name(),
                            view->view_info()->include_all_columns(), std::move(new_where));

--- a/cql3/statements/create_service_level_statement.cc
+++ b/cql3/statements/create_service_level_statement.cc
@@ -7,6 +7,7 @@
 */

 #include "auth/service.hh"
+#include "exceptions/exceptions.hh"
 #include "seastarx.hh"
 #include "cql3/statements/create_service_level_statement.hh"
 #include "service/qos/service_level_controller.hh"
@@ -38,6 +39,10 @@ create_service_level_statement::execute(query_processor& qp,
        service::query_state &state,
        const query_options &,
        std::optional<service::group0_guard> guard) const {
+    if (_service_level.starts_with('$')) {
+        throw exceptions::invalid_request_exception("Names starting with '$' are reserved for internal tenants. Use a different name.");
+    }
+
    service::group0_batch mc{std::move(guard)};
    qos::service_level_options slo = _slo.replace_defaults(qos::service_level_options{});
    auto& sl = state.get_service_level_controller();
--- a/cql3/statements/create_table_statement.cc
+++ b/cql3/statements/create_table_statement.cc
@@ -192,6 +192,13 @@ std::unique_ptr<prepared_statement> create_table_statement::raw_statement::prepa

    auto stmt = ::make_shared<create_table_statement>(*_cf_name, _properties.properties(), _if_not_exists, _static_columns, _properties.properties()->get_id());

+    bool ks_uses_tablets;
+    try {
+        ks_uses_tablets = db.find_keyspace(keyspace()).get_replication_strategy().uses_tablets();
+    } catch (const data_dictionary::no_such_keyspace& e) {
+        throw exceptions::invalid_request_exception("Cannot create a table in a non-existent keyspace: " + keyspace());
+    }
+
    std::optional<std::map<bytes, data_type>> defined_multi_cell_columns;
    for (auto&& entry : _definitions) {
        ::shared_ptr<column_identifier> id = entry.first;
@@ -201,7 +208,7 @@ std::unique_ptr<prepared_statement> create_table_statement::raw_statement::prepa
            throw exceptions::invalid_request_exception("Cannot set default_time_to_live on a table with counters");
        }

-        if (db.find_keyspace(keyspace()).get_replication_strategy().uses_tablets() && pt.is_counter()) {
+        if (ks_uses_tablets && pt.is_counter()) {
            throw exceptions::invalid_request_exception(format("Cannot use the 'counter' type for table {}.{}: Counters are not yet supported with tablets", keyspace(), cf_name));
        }

--- a/cql3/statements/ks_prop_defs.cc
+++ b/cql3/statements/ks_prop_defs.cc
@@ -138,28 +138,22 @@ data_dictionary::storage_options ks_prop_defs::get_storage_options() const {
    return opts;
 }

-ks_prop_defs::init_tablets_options ks_prop_defs::get_initial_tablets(const sstring& strategy_class, bool enabled_by_default) const {
-    // FIXME -- this should be ignored somehow else
-    init_tablets_options ret{ .enabled = false, .specified_count = std::nullopt };
-    if (locator::abstract_replication_strategy::to_qualified_class_name(strategy_class) != "org.apache.cassandra.locator.NetworkTopologyStrategy") {
-        return ret;
-    }
-
+std::optional<unsigned> ks_prop_defs::get_initial_tablets(std::optional<unsigned> default_value) const {
    auto tablets_options = get_map(KW_TABLETS);
    if (!tablets_options) {
-        return enabled_by_default ? init_tablets_options{ .enabled = true } : ret;
+        return default_value;
    }

+    unsigned initial_count = 0;
    auto it = tablets_options->find("enabled");
    if (it != tablets_options->end()) {
        auto enabled = it->second;
        tablets_options->erase(it);

        if (enabled == "true") {
-            ret = init_tablets_options{ .enabled = true, .specified_count = 0 }; // even if 'initial' is not set, it'll start with auto-detection
+            // nothing
        } else if (enabled == "false") {
-            assert(!ret.enabled);
-            return ret;
+            return std::nullopt;
        } else {
            throw exceptions::configuration_exception(sstring("Tablets enabled value must be true or false; found: ") + enabled);
        }
@@ -168,7 +162,7 @@ ks_prop_defs::init_tablets_options ks_prop_defs::get_initial_tablets(const sstri
    it = tablets_options->find("initial");
    if (it != tablets_options->end()) {
        try {
-            ret = init_tablets_options{ .enabled = true, .specified_count = std::stol(it->second)};
+            initial_count = std::stol(it->second);
        } catch (...) {
            throw exceptions::configuration_exception(sstring("Initial tablets value should be numeric; found ") + it->second);
        }
@@ -179,7 +173,7 @@ ks_prop_defs::init_tablets_options ks_prop_defs::get_initial_tablets(const sstri
        throw exceptions::configuration_exception(sstring("Unrecognized tablets option ") + tablets_options->begin()->first);
    }

-    return ret;
+    return initial_count;
 }

 std::optional<sstring> ks_prop_defs::get_replication_strategy_class() const {
@@ -190,32 +184,13 @@ bool ks_prop_defs::get_durable_writes() const {
    return get_boolean(KW_DURABLE_WRITES, true);
 }

-std::map<sstring, sstring> ks_prop_defs::get_all_options_flattened(const gms::feature_service& feat) const {
-    std::map<sstring, sstring> all_options;
-
-    auto ingest_flattened_options = [&all_options](const std::map<sstring, sstring>& options, const sstring& prefix) {
-        for (auto& option: options) {
-            all_options[prefix + ":" + option.first] = option.second;
-        }
-    };
-    ingest_flattened_options(get_replication_options(), KW_REPLICATION);
-    ingest_flattened_options(get_storage_options().to_map(), KW_STORAGE);
-    ingest_flattened_options(get_map(KW_TABLETS).value_or(std::map<sstring, sstring>{}), KW_TABLETS);
-    ingest_flattened_options({{sstring(KW_DURABLE_WRITES), to_sstring(get_boolean(KW_DURABLE_WRITES, true))}}, KW_DURABLE_WRITES);
-
-    return all_options;
-}
-
 lw_shared_ptr<data_dictionary::keyspace_metadata> ks_prop_defs::as_ks_metadata(sstring ks_name, const locator::token_metadata& tm, const gms::feature_service& feat) {
    auto sc = get_replication_strategy_class().value();
-    auto initial_tablets = get_initial_tablets(sc, feat.tablets);
-    // if tablets options have not been specified, but tablets are globally enabled, set the value to 0
-    if (initial_tablets.enabled && !initial_tablets.specified_count) {
-        initial_tablets.specified_count = 0;
-    }
+    // if tablets options have not been specified, but tablets are globally enabled, set the value to 0 (NetworkTopologyStrategy only)
+    auto initial_tablets = get_initial_tablets(feat.tablets && locator::abstract_replication_strategy::to_qualified_class_name(sc) == "org.apache.cassandra.locator.NetworkTopologyStrategy" ? std::optional<unsigned>(0) : std::nullopt);
    auto options = prepare_options(sc, tm, get_replication_options());
    return data_dictionary::keyspace_metadata::new_keyspace(ks_name, sc,
-            std::move(options), initial_tablets.specified_count, get_boolean(KW_DURABLE_WRITES, true), get_storage_options());
+            std::move(options), initial_tablets, get_boolean(KW_DURABLE_WRITES, true), get_storage_options());
 }

 lw_shared_ptr<data_dictionary::keyspace_metadata> ks_prop_defs::as_ks_metadata_update(lw_shared_ptr<data_dictionary::keyspace_metadata> old, const locator::token_metadata& tm, const gms::feature_service& feat) {
@@ -228,13 +203,9 @@ lw_shared_ptr<data_dictionary::keyspace_metadata> ks_prop_defs::as_ks_metadata_u
        sc = old->strategy_name();
        options = old_options;
    }
-    auto initial_tablets = get_initial_tablets(*sc, old->initial_tablets().has_value());
    // if tablets options have not been specified, inherit them if it's tablets-enabled KS
-    if (initial_tablets.enabled && !initial_tablets.specified_count) {
-        initial_tablets.specified_count = old->initial_tablets();
-    }
-
-    return data_dictionary::keyspace_metadata::new_keyspace(old->name(), *sc, options, initial_tablets.specified_count, get_boolean(KW_DURABLE_WRITES, true), get_storage_options());
+    auto initial_tablets = get_initial_tablets(old->initial_tablets());
+    return data_dictionary::keyspace_metadata::new_keyspace(old->name(), *sc, options, initial_tablets, get_boolean(KW_DURABLE_WRITES, true), get_storage_options());
 }


--- a/cql3/statements/ks_prop_defs.hh
+++ b/cql3/statements/ks_prop_defs.hh
@@ -49,21 +49,15 @@ public:
 private:
    std::optional<sstring> _strategy_class;
 public:
-    struct init_tablets_options {
-        bool enabled;
-        std::optional<unsigned> specified_count;
-    };
-
    ks_prop_defs() = default;
    explicit ks_prop_defs(std::map<sstring, sstring> options);

    void validate();
    std::map<sstring, sstring> get_replication_options() const;
    std::optional<sstring> get_replication_strategy_class() const;
-    init_tablets_options get_initial_tablets(const sstring& strategy_class, bool enabled_by_default) const;
+    std::optional<unsigned> get_initial_tablets(std::optional<unsigned> default_value) const;
    data_dictionary::storage_options get_storage_options() const;
    bool get_durable_writes() const;
-    std::map<sstring, sstring> get_all_options_flattened(const gms::feature_service& feat) const;
    lw_shared_ptr<data_dictionary::keyspace_metadata> as_ks_metadata(sstring ks_name, const locator::token_metadata&, const gms::feature_service&);
    lw_shared_ptr<data_dictionary::keyspace_metadata> as_ks_metadata_update(lw_shared_ptr<data_dictionary::keyspace_metadata> old, const locator::token_metadata&, const gms::feature_service&);
 };
--- a/cql3/statements/property_definitions.hh
+++ b/cql3/statements/property_definitions.hh
@@ -46,14 +46,14 @@ public:
 protected:
    std::optional<sstring> get_simple(const sstring& name) const;

-    std::optional<std::map<sstring, sstring>> get_map(const sstring& name) const;
-
    void remove_from_map_if_exists(const sstring& name, const sstring& key) const;
 public:
    bool has_property(const sstring& name) const;

    std::optional<value_type> get(const sstring& name) const;

+    std::optional<std::map<sstring, sstring>> get_map(const sstring& name) const;
+
    sstring get_string(sstring key, sstring default_value) const;

    // Return a property value, typed as a Boolean
--- a/cql3/statements/select_statement.cc
+++ b/cql3/statements/select_statement.cc
@@ -283,33 +283,44 @@ select_statement::make_partition_slice(const query_options& options) const
        std::reverse(bounds.begin(), bounds.end());
        ++_stats.reverse_queries;
    }
+
+    const uint64_t per_partition_limit = get_inner_loop_limit(get_limit(options, _per_partition_limit),
+        _selection->is_aggregate());
    return query::partition_slice(std::move(bounds),
-        std::move(static_columns), std::move(regular_columns), _opts, nullptr, get_per_partition_limit(options));
+        std::move(static_columns), std::move(regular_columns), _opts, nullptr, per_partition_limit);
 }

-uint64_t select_statement::do_get_limit(const query_options& options,
-                                        const std::optional<expr::expression>& limit,
-                                        const expr::unset_bind_variable_guard& limit_unset_guard,
-                                        uint64_t default_limit) const {
-    if (!limit.has_value() || limit_unset_guard.is_unset(options) || _selection->is_aggregate()) {
-        return default_limit;
-    }
-
-    auto val = expr::evaluate(*limit, options);
-    if (val.is_null()) {
-        throw exceptions::invalid_request_exception("Invalid null value of limit");
+select_statement::get_limit_result select_statement::get_limit(
+    const query_options& options, const std::optional<expr::expression>& limit) const
+{
+    if (!limit.has_value()) {
+        return bo::success(query::max_rows);
    }
    try {
+        auto val = expr::evaluate(*limit, options);
+        if (val.is_null()) {
+            return bo::failure(exceptions::invalid_request_exception("Invalid null value of limit"));
+        }
        auto l = val.view().validate_and_deserialize<int32_t>(*int32_type);
        if (l <= 0) {
-            throw exceptions::invalid_request_exception("LIMIT must be strictly positive");
+            return bo::failure(exceptions::invalid_request_exception("LIMIT must be strictly positive"));
        }
-        return l;
+        return bo::success(l);
    } catch (const marshal_exception& e) {
-        throw exceptions::invalid_request_exception("Invalid limit value");
+        return bo::failure(exceptions::invalid_request_exception("Invalid limit value"));
+    } catch (const exceptions::invalid_request_exception& e) {
+        return bo::failure(e);
    }
 }

+uint64_t select_statement::get_inner_loop_limit(const select_statement::get_limit_result& limit, bool is_aggregate)
+{
+    if (!limit.has_value() || is_aggregate) {
+        return query::max_rows;
+    }
+    return limit.value();
+}
+
 bool select_statement::needs_post_query_ordering() const {
    // We need post-query ordering only for queries with IN on the partition key and an ORDER BY.
    return _restrictions->key_is_in_relation() && !_parameters->orderings().empty();
@@ -358,7 +369,8 @@ select_statement::do_execute(query_processor& qp,

    validate_for_read(cl);

-    uint64_t limit = get_limit(options);
+    const auto parsed_limit = get_limit(options, _limit);
+    const uint64_t inner_loop_limit = get_inner_loop_limit(parsed_limit, _selection->is_aggregate());
    auto now = gc_clock::now();

    _stats.filtered_reads += _restrictions_need_filtering;
@@ -380,7 +392,7 @@ select_statement::do_execute(query_processor& qp,
            std::move(slice),
            max_result_size,
            query::tombstone_limit(qp.proxy().get_tombstone_limit()),
-            query::row_limit(limit),
+            query::row_limit(inner_loop_limit),
            query::partition_limit(query::max_partitions),
            now,
            tracing::make_trace_info(state.get_trace_state()),
@@ -393,14 +405,13 @@ select_statement::do_execute(query_processor& qp,

    _stats.unpaged_select_queries(_ks_sel) += page_size <= 0;

-    // An aggregation query will never be paged for the user, but we always page it internally to avoid OOM.
-    // If we user provided a page_size we'll use that to page internally (because why not), otherwise we use our default
-    // Note that if there are some nodes in the cluster with a version less than 2.0, we can't use paging (CASSANDRA-6707).
+    // An aggregation query may not be paged for the user, but we always page it internally to avoid OOM.
+    // If the user provided a page_size we'll use that to page internally (because why not), otherwise we use our default
    // Also note: all GROUP BY queries are considered aggregation.
    const bool aggregate = _selection->is_aggregate() || has_group_by();
    const bool nonpaged_filtering = _restrictions_need_filtering && page_size <= 0;
    if (aggregate || nonpaged_filtering) {
-        page_size = internal_paging_size;
+        page_size = page_size <= 0 ? internal_paging_size : std::min(page_size, internal_paging_size);
    }

    auto key_ranges = _restrictions->get_partition_key_ranges(options);
@@ -438,7 +449,9 @@ select_statement::do_execute(query_processor& qp,
                    *command, key_ranges))) {
        f = execute_without_checking_exception_message_non_aggregate_unpaged(qp, command, std::move(key_ranges), state, options, now);
    } else {
-        f = execute_without_checking_exception_message_aggregate_or_paged(qp, command, std::move(key_ranges), state, options, now, page_size, aggregate, nonpaged_filtering);
+        f = execute_without_checking_exception_message_aggregate_or_paged(qp, command,
+            std::move(key_ranges), state, options, now, page_size, aggregate,
+            nonpaged_filtering, parsed_limit.has_value() ? parsed_limit.value() : query::max_rows);
    }

    if (!tablet_info.has_value()) {
@@ -454,7 +467,8 @@ select_statement::do_execute(query_processor& qp,
 future<::shared_ptr<cql_transport::messages::result_message>>
 select_statement::execute_without_checking_exception_message_aggregate_or_paged(query_processor& qp,
        lw_shared_ptr<query::read_command> command, dht::partition_range_vector&& key_ranges, service::query_state& state,
-        const query_options& options, gc_clock::time_point now, int32_t page_size, bool aggregate, bool nonpaged_filtering) const {
+        const query_options& options, gc_clock::time_point now, int32_t page_size, bool aggregate, bool nonpaged_filtering,
+        uint64_t limit) const {
    command->slice.options.set<query::partition_slice::option::allow_short_read>();
    auto timeout_duration = get_timeout(state.get_client_state(), options);
    auto timeout = db::timeout_clock::now() + timeout_duration;
@@ -462,8 +476,11 @@ select_statement::execute_without_checking_exception_message_aggregate_or_paged(
            state, options, command, std::move(key_ranges), _restrictions_need_filtering ? _restrictions : nullptr);

    if (aggregate || nonpaged_filtering) {
-        auto builder = cql3::selection::result_set_builder(*_selection, now, *_group_by_cell_indices);
-        coordinator_result<void> result_void = co_await utils::result_do_until([&p] {return p->is_exhausted();},
+        auto builder = cql3::selection::result_set_builder(*_selection, now, *_group_by_cell_indices, limit);
+        coordinator_result<void> result_void = co_await utils::result_do_until(
+                [&p, &builder, limit] {
+                    return p->is_exhausted() || (limit < builder.result_set_size());
+                },
                [&p, &builder, page_size, now, timeout] {
                    return p->fetch_page_result(builder, page_size, now, timeout);
                }
@@ -586,7 +603,7 @@ indexed_table_select_statement::prepare_command_for_base_query(query_processor&
            std::move(slice),
            qp.proxy().get_max_result_size(slice),
            query::tombstone_limit(qp.proxy().get_tombstone_limit()),
-            query::row_limit(get_limit(options)),
+            query::row_limit(get_inner_loop_limit(get_limit(options, _limit), _selection->is_aggregate())),
            query::partition_limit(query::max_partitions),
            now,
            tracing::make_trace_info(state.get_trace_state()),
@@ -1368,7 +1385,8 @@ indexed_table_select_statement::find_index_partition_ranges(query_processor& qp,
    using value_type = std::tuple<dht::partition_range_vector, lw_shared_ptr<const service::pager::paging_state>>;
    auto now = gc_clock::now();
    auto timeout = db::timeout_clock::now() + get_timeout(state.get_client_state(), options);
-    return read_posting_list(qp, options, get_limit(options), state, now, timeout, false).then(utils::result_wrap(
+    const uint64_t limit = get_inner_loop_limit(get_limit(options, _limit), _selection->is_aggregate());
+    return read_posting_list(qp, options, limit, state, now, timeout, false).then(utils::result_wrap(
            [this, &options] (::shared_ptr<cql_transport::messages::result_message::rows> rows) {
        auto rs = cql3::untyped_result_set(rows);
        dht::partition_range_vector partition_ranges;
@@ -1417,7 +1435,8 @@ indexed_table_select_statement::find_index_clustering_rows(query_processor& qp,
    using value_type = std::tuple<std::vector<indexed_table_select_statement::primary_key>, lw_shared_ptr<const service::pager::paging_state>>;
    auto now = gc_clock::now();
    auto timeout = db::timeout_clock::now() + get_timeout(state.get_client_state(), options);
-    return read_posting_list(qp, options, get_limit(options), state, now, timeout, true).then(utils::result_wrap(
+    const uint64_t limit = get_inner_loop_limit(get_limit(options, _limit), _selection->is_aggregate());
+    return read_posting_list(qp, options, limit, state, now, timeout, true).then(utils::result_wrap(
            [this, &options] (::shared_ptr<cql_transport::messages::result_message::rows> rows) {

        auto rs = cql3::untyped_result_set(rows);
@@ -1683,6 +1702,7 @@ schema_ptr mutation_fragments_select_statement::generate_output_schema(schema_pt

 future<exceptions::coordinator_result<service::storage_proxy_coordinator_query_result>>
 mutation_fragments_select_statement::do_query(
+        locator::effective_replication_map_ptr erm_keepalive,
        locator::host_id this_node,
        service::storage_proxy& sp,
        schema_ptr schema,
@@ -1690,7 +1710,7 @@ mutation_fragments_select_statement::do_query(
        dht::partition_range_vector partition_ranges,
        db::consistency_level cl,
        service::storage_proxy_coordinator_query_options optional_params) const {
-    auto res = co_await replica::mutation_dump::dump_mutations(sp.get_db(), schema, _underlying_schema, partition_ranges, *cmd, optional_params.timeout(sp));
+    auto res = co_await replica::mutation_dump::dump_mutations(sp.get_db(), std::move(erm_keepalive), schema, _underlying_schema, partition_ranges, *cmd, optional_params.timeout(sp));
    service::replicas_per_token_range last_replicas;
    if (this_node) {
        last_replicas.emplace(dht::token_range::make_open_ended_both_sides(), std::vector<locator::host_id>{this_node});
@@ -1704,7 +1724,7 @@ mutation_fragments_select_statement::do_execute(query_processor& qp, service::qu

    auto cl = options.get_consistency();

-    uint64_t limit = get_limit(options);
+    const uint64_t limit = get_inner_loop_limit(get_limit(options, _limit), _selection->is_aggregate());
    auto now = gc_clock::now();

    _stats.filtered_reads += _restrictions_need_filtering;
@@ -1762,7 +1782,7 @@ mutation_fragments_select_statement::do_execute(query_processor& qp, service::qu
    if (!aggregate && !_restrictions_need_filtering && (page_size <= 0
            || !service::pager::query_pagers::may_need_paging(*_schema, page_size,
                    *command, key_ranges))) {
-        return do_query({}, qp.proxy(), _schema, command, std::move(key_ranges), cl,
+        return do_query(erm_keepalive, {}, qp.proxy(), _schema, command, std::move(key_ranges), cl,
                {timeout, state.get_permit(), state.get_client_state(), state.get_trace_state(), {}, {}})
        .then(wrap_result_to_error_message([this, erm_keepalive, now, slice = command->slice] (service::storage_proxy_coordinator_query_result&& qr) mutable {
            cql3::selection::result_set_builder builder(*_selection, now);
@@ -1801,8 +1821,8 @@ mutation_fragments_select_statement::do_execute(query_processor& qp, service::qu
            std::move(key_ranges),
            _restrictions_need_filtering ? _restrictions : nullptr,
            [this, erm_keepalive, this_node] (service::storage_proxy& sp, schema_ptr schema, lw_shared_ptr<query::read_command> cmd, dht::partition_range_vector partition_ranges,
-                    db::consistency_level cl, service::storage_proxy_coordinator_query_options optional_params) {
-                return do_query(this_node, sp, std::move(schema), std::move(cmd), std::move(partition_ranges), cl, std::move(optional_params));
+                    db::consistency_level cl, service::storage_proxy_coordinator_query_options optional_params) mutable {
+                return do_query(std::move(erm_keepalive), this_node, sp, std::move(schema), std::move(cmd), std::move(partition_ranges), cl, std::move(optional_params));
            });

    if (_selection->is_trivial() && !_restrictions_need_filtering && !_per_partition_limit) {
@@ -2561,7 +2581,9 @@ std::unique_ptr<cql3::statements::raw::select_statement> build_select_statement(
    if (!where_clause.empty()) {
        out << " WHERE " << where_clause << " ALLOW FILTERING";
    }
-    return do_with_parser(out.str(), std::mem_fn(&cql3_parser::CqlParser::selectStatement));
+    // In general it's not a good idea to use the default dialect, but here the database is talking to
+    // itself, so we can hope the dialects are mutually compatible here.
+    return do_with_parser(out.str(), dialect{}, std::mem_fn(&cql3_parser::CqlParser::selectStatement));
 }

 }
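
Note on the new stop condition in `execute_without_checking_exception_message_aggregate_or_paged`: the loop now ends either when the pager is exhausted or when the builder already holds more rows than `limit`. Below is a minimal standalone sketch of that predicate. `fake_pager`, `fetch_stats` and `fetch_until_limit` are hypothetical names made up for illustration and are not part of this patch; the comparison `limit < rows` is copied from the hunk as-is.

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical stand-in for a pager: hands out rows page by page.
struct fake_pager {
    uint64_t remaining;
    bool is_exhausted() const { return remaining == 0; }
    uint64_t fetch_page(uint64_t page_size) {
        const uint64_t n = std::min(page_size, remaining);
        remaining -= n;
        return n;
    }
};

struct fetch_stats {
    uint64_t pages;
    uint64_t rows;
};

// Mirrors the do-until shape: the stop condition is checked before each fetch.
fetch_stats fetch_until_limit(fake_pager p, uint64_t page_size, uint64_t limit) {
    fetch_stats s{0, 0};
    while (!(p.is_exhausted() || limit < s.rows)) {
        s.rows += p.fetch_page(page_size);
        ++s.pages;
    }
    return s;
}
```

In this sketch, a source of 1000 rows read in pages of 100 with a limit of 150 stops after two pages instead of reading all ten.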
--- a/cql3/statements/select_statement.hh
+++ b/cql3/statements/select_statement.hh
@@ -128,7 +128,7 @@ public:

    future<::shared_ptr<cql_transport::messages::result_message>> execute_without_checking_exception_message_aggregate_or_paged(query_processor& qp,
        lw_shared_ptr<query::read_command> cmd, dht::partition_range_vector&& partition_ranges, service::query_state& state,
-         const query_options& options, gc_clock::time_point now, int32_t page_size, bool aggregate, bool nonpaged_filtering) const;
+         const query_options& options, gc_clock::time_point now, int32_t page_size, bool aggregate, bool nonpaged_filtering, uint64_t limit) const;


    struct primary_key {
@@ -152,13 +152,10 @@ public:
    db::timeout_clock::duration get_timeout(const service::client_state& state, const query_options& options) const;

 protected:
-    uint64_t do_get_limit(const query_options& options, const std::optional<expr::expression>& limit, const expr::unset_bind_variable_guard& unset_guard, uint64_t default_limit) const;
-    uint64_t get_limit(const query_options& options) const {
-        return do_get_limit(options, _limit, _limit_unset_guard, query::max_rows);
-    }
-    uint64_t get_per_partition_limit(const query_options& options) const {
-        return do_get_limit(options, _per_partition_limit, _per_partition_limit_unset_guard, query::partition_max_rows);
-    }
+    using get_limit_result = bo::result<uint64_t, exceptions::invalid_request_exception>;
+    get_limit_result get_limit(const query_options& options, const std::optional<expr::expression>& limit) const;
+    static uint64_t get_inner_loop_limit(const select_statement::get_limit_result& limit, bool is_aggregate);
+
    bool needs_post_query_ordering() const;
    virtual void update_stats_rows_read(int64_t rows_read) const {
        _stats.rows_read += rows_read;
@@ -338,6 +335,7 @@ public:
 private:
    future<exceptions::coordinator_result<service::storage_proxy_coordinator_query_result>>
    do_query(
+            locator::effective_replication_map_ptr erm_keepalive,
            locator::host_id this_node,
            service::storage_proxy& sp,
            schema_ptr schema,
--- a/cql3/util.cc
+++ b/cql3/util.cc
@@ -20,7 +20,7 @@ void __sanitizer_finish_switch_fiber(void* fake_stack_save, const void** stack_b

 namespace cql3::util {

-static void do_with_parser_impl_impl(const sstring_view& cql, noncopyable_function<void (cql3_parser::CqlParser& parser)> f) {
+static void do_with_parser_impl_impl(const sstring_view& cql, dialect d, noncopyable_function<void (cql3_parser::CqlParser& parser)> f) {
    cql3_parser::CqlLexer::collector_type lexer_error_collector(cql);
    cql3_parser::CqlParser::collector_type parser_error_collector(cql);
    cql3_parser::CqlLexer::InputStreamType input{reinterpret_cast<const ANTLR_UINT8*>(cql.begin()), ANTLR_ENC_UTF8, static_cast<ANTLR_UINT32>(cql.size()), nullptr};
@@ -29,13 +29,14 @@ static void do_with_parser_impl_impl(const sstring_view& cql, noncopyable_functi
    cql3_parser::CqlParser::TokenStreamType tstream(ANTLR_SIZE_HINT, lexer.get_tokSource());
    cql3_parser::CqlParser parser{&tstream};
    parser.set_error_listener(parser_error_collector);
+    parser.set_dialect(d);
    f(parser);
 }

 #ifndef DEBUG

-void do_with_parser_impl(const sstring_view& cql, noncopyable_function<void (cql3_parser::CqlParser& parser)> f) {
-    return do_with_parser_impl_impl(cql, std::move(f));
+void do_with_parser_impl(const sstring_view& cql, dialect d, noncopyable_function<void (cql3_parser::CqlParser& parser)> f) {
+    return do_with_parser_impl_impl(cql, d, std::move(f));
 }

 #else
@@ -47,6 +48,7 @@ void do_with_parser_impl(const sstring_view& cql, noncopyable_function<void (cql
 struct thunk_args {
    // arguments to do_with_parser_impl_impl
    const sstring_view& cql;
+    dialect d;
    noncopyable_function<void (cql3_parser::CqlParser&)>&& func;
    // Exceptions can't be returned from another stack, so store
    // any thrown exception here
@@ -70,7 +72,7 @@ static void thunk(int p1, int p2) {
    // Complete stack switch started in do_with_parser_impl()
    __sanitizer_finish_switch_fiber(nullptr, &san.stack_bottom, &san.stack_size);
    try {
-        do_with_parser_impl_impl(args->cql, std::move(args->func));
+        do_with_parser_impl_impl(args->cql, args->d, std::move(args->func));
    } catch (...) {
        args->ex = std::current_exception();
    }
@@ -79,11 +81,12 @@ static void thunk(int p1, int p2) {
    setcontext(&args->caller_stack);
 };

-void do_with_parser_impl(const sstring_view& cql, noncopyable_function<void (cql3_parser::CqlParser& parser)> f) {
+void do_with_parser_impl(const sstring_view& cql, dialect d, noncopyable_function<void (cql3_parser::CqlParser& parser)> f) {
    static constexpr size_t stack_size = 1 << 20;
    static thread_local std::unique_ptr<char[]> stack = std::make_unique<char[]>(stack_size);
    thunk_args args{
        .cql = cql,
+        .d = d,
        .func = std::move(f),
    };
    ucontext_t uc;
@@ -92,7 +95,7 @@ void do_with_parser_impl(const sstring_view& cql, noncopyable_function<void (cql
    if (stack.get() <= (char*)&uc && (char*)&uc < stack.get() + stack_size) {
        // We are already running on the large stack, so just call the
        // parser directly.
-        return do_with_parser_impl_impl(cql, std::move(f));
+        return do_with_parser_impl_impl(cql, d, std::move(f));
    }
    uc.uc_stack.ss_sp = stack.get();
    uc.uc_stack.ss_size = stack_size;
@@ -136,12 +139,12 @@ sstring relations_to_where_clause(const expr::expression& e) {
    return boost::algorithm::join(expressions, " AND ");
 }

-expr::expression where_clause_to_relations(const sstring_view& where_clause) {
-    return do_with_parser(where_clause, std::mem_fn(&cql3_parser::CqlParser::whereClause));
+expr::expression where_clause_to_relations(const sstring_view& where_clause, dialect d) {
+    return do_with_parser(where_clause, d, std::mem_fn(&cql3_parser::CqlParser::whereClause));
 }

-sstring rename_column_in_where_clause(const sstring_view& where_clause, column_identifier::raw from, column_identifier::raw to) {
-    std::vector<expr::expression> relations = boolean_factors(where_clause_to_relations(where_clause));
+sstring rename_column_in_where_clause(const sstring_view& where_clause, column_identifier::raw from, column_identifier::raw to, dialect d) {
+    std::vector<expr::expression> relations = boolean_factors(where_clause_to_relations(where_clause, d));
    std::vector<expr::expression> new_relations;
    new_relations.reserve(relations.size());

--- a/cql3/util.hh
+++ b/cql3/util.hh
@@ -21,18 +21,19 @@
 #include "cql3/CqlParser.hpp"
 #include "cql3/error_collector.hh"
 #include "cql3/statements/raw/select_statement.hh"
+#include "cql3/dialect.hh"

 namespace cql3 {

 namespace util {


-void do_with_parser_impl(const sstring_view& cql, noncopyable_function<void (cql3_parser::CqlParser& p)> func);
+void do_with_parser_impl(const sstring_view& cql, dialect d, noncopyable_function<void (cql3_parser::CqlParser& p)> func);

 template <typename Func, typename Result = cql3_parser::unwrap_uninitialized_t<std::invoke_result_t<Func, cql3_parser::CqlParser&>>>
-Result do_with_parser(const sstring_view& cql, Func&& f) {
+Result do_with_parser(const sstring_view& cql, dialect d, Func&& f) {
    std::optional<Result> ret;
-    do_with_parser_impl(cql, [&] (cql3_parser::CqlParser& parser) {
+    do_with_parser_impl(cql, d, [&] (cql3_parser::CqlParser& parser) {
        ret.emplace(f(parser));
    });
    return std::move(*ret);
@@ -40,9 +41,9 @@ Result do_with_parser(const sstring_view& cql, Func&& f) {

 sstring relations_to_where_clause(const expr::expression& e);

-expr::expression where_clause_to_relations(const sstring_view& where_clause);
+expr::expression where_clause_to_relations(const sstring_view& where_clause, dialect d);

-sstring rename_column_in_where_clause(const sstring_view& where_clause, column_identifier::raw from, column_identifier::raw to);
+sstring rename_column_in_where_clause(const sstring_view& where_clause, column_identifier::raw from, column_identifier::raw to, dialect d);

 /// build a CQL "select" statement with the desired parameters.
 /// If select_all_columns==true, all columns are selected and the value of
--- a/db/commitlog/commitlog.cc
+++ b/db/commitlog/commitlog.cc
@@ -1100,7 +1100,12 @@ public:
            write(out, uint64_t(0));
        }

-        buf.remove_suffix(buf.size_bytes() - size);
+        auto to_remove = buf.size_bytes() - size;
+        // #20862 - the usage counter is decremented based on buf.size() below.
+        // Since we are shrinking the buffer here, we also need to decrement
+        // the counter by the removed amount now.
+        buf.remove_suffix(to_remove);
+        _segment_manager->totals.buffer_list_bytes -= to_remove;

        // Build sector checksums.
        auto id = net::hton(_desc.id);
@@ -3221,6 +3226,10 @@ uint64_t db::commitlog::get_total_size() const {
        ;
 }

+uint64_t db::commitlog::get_buffer_size() const {
+    return _segment_manager->totals.buffer_list_bytes;
+}
+
 uint64_t db::commitlog::get_completed_tasks() const {
    return _segment_manager->totals.allocation_count;
 }
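
Note on the `buffer_list_bytes` adjustment (#20862): the comment in the hunk says the counter is later decremented by the current buffer size, so shrinking the buffer without touching the counter leaves the trimmed bytes accounted forever. The sketch below reproduces only that arithmetic. `bytes_after_cycle` is a hypothetical helper, and the assumption that the counter was incremented by the full allocated size when the buffer was acquired is mine; this diff does not show that part.

```cpp
#include <cstdint>

// Returns what the usage counter reads after one acquire/shrink/release cycle.
// Assumption: the counter grows by the full allocated size on acquire.
uint64_t bytes_after_cycle(uint64_t allocated, uint64_t used, bool adjust_on_shrink) {
    uint64_t counter = 0;
    uint64_t buf_size = allocated;

    counter += buf_size;                      // buffer acquired

    const uint64_t to_remove = buf_size - used;
    buf_size -= to_remove;                    // analogous to remove_suffix(to_remove)
    if (adjust_on_shrink) {
        counter -= to_remove;                 // the adjustment added by the fix
    }

    counter -= buf_size;                      // release, based on the current size
    return counter;
}
```

With the adjustment the counter returns to zero; without it, `allocated - used` bytes stay on the books after every cycle.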
--- a/db/commitlog/commitlog.hh
+++ b/db/commitlog/commitlog.hh
@@ -297,6 +297,7 @@ public:
    future<> delete_segments(std::vector<sstring>) const;

    uint64_t get_total_size() const;
+    uint64_t get_buffer_size() const;
    uint64_t get_completed_tasks() const;
    uint64_t get_flush_count() const;
    uint64_t get_pending_tasks() const;
--- a/db/config.cc
+++ b/db/config.cc
@@ -99,6 +99,21 @@ error_injection_list_to_json(const std::vector<db::config::error_injection_at_st
    return value_to_json("error_injection_list");
 }

+template <>
+bool
+config_from_string(std::string_view value) {
+    // boost::lexical_cast doesn't accept true/false, which are our output representations
+    // for bools. We want round-tripping, so we need to accept true/false. For backward
+    // compatibility, we also accept 1/0. #19791.
+    if (value == "true" || value == "1") {
+        return true;
+    } else if (value == "false" || value == "0") {
+        return false;
+    } else {
+        throw boost::bad_lexical_cast(typeid(std::string_view), typeid(bool));
+    }
+}
+
 template <>
 const config_type config_type_for<bool> = config_type("bool", value_to_json<bool>);

@@ -177,7 +192,7 @@ struct convert<seastar::log_level> {
        if (!convert<std::string>::decode(node, tmp)) {
            return false;
        }
-        rhs = boost::lexical_cast<seastar::log_level>(tmp);
+        rhs = utils::config_from_string<seastar::log_level>(tmp);
        return true;
    }
 };
@@ -1057,6 +1072,8 @@ db::config::config(std::shared_ptr<db::extensions> exts)
            "Make the system.config table UPDATEable.")
    , enable_parallelized_aggregation(this, "enable_parallelized_aggregation", liveness::LiveUpdate, value_status::Used, true,
            "Use on a new, parallel algorithm for performing aggregate queries.")
+    , cql_duplicate_bind_variable_names_refer_to_same_variable(this, "cql_duplicate_bind_variable_names_refer_to_same_variable", liveness::LiveUpdate, value_status::Used, true,
+            "A bind variable that appears twice in a CQL query refers to a single variable (if false, no name matching is performed).")
    , alternator_port(this, "alternator_port", value_status::Used, 0, "Alternator API port.")
    , alternator_https_port(this, "alternator_https_port", value_status::Used, 0, "Alternator API HTTPS port.")
    , alternator_address(this, "alternator_address", value_status::Used, "0.0.0.0", "Alternator API listening address.")
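
Note on the `config_from_string<bool>` specialization above (#19791): the point is that the textual form produced for a bool (`true`/`false`) must parse back, while `1`/`0` remain accepted for backward compatibility. A standalone sketch of the same round-trip follows. `parse_bool` and `format_bool` are hypothetical names, and `std::invalid_argument` is used here only to keep the example free of Boost; the patch itself throws `boost::bad_lexical_cast`.

```cpp
#include <stdexcept>
#include <string>
#include <string_view>

// Accepts the output representation (true/false) and the legacy one (1/0).
bool parse_bool(std::string_view value) {
    if (value == "true" || value == "1") {
        return true;
    }
    if (value == "false" || value == "0") {
        return false;
    }
    throw std::invalid_argument("not a bool: " + std::string(value));
}

// The output representation that parse_bool() must be able to read back.
std::string format_bool(bool b) {
    return b ? "true" : "false";
}
```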
--- a/db/config.hh
+++ b/db/config.hh
@@ -399,6 +399,7 @@ public:
    named_value<bool> enable_optimized_reversed_reads;
    named_value<bool> enable_cql_config_updates;
    named_value<bool> enable_parallelized_aggregation;
+    named_value<bool> cql_duplicate_bind_variable_names_refer_to_same_variable;

    named_value<uint16_t> alternator_port;
    named_value<uint16_t> alternator_https_port;
--- a/db/consistency_level.cc
+++ b/db/consistency_level.cc
@@ -334,7 +334,13 @@ filter_for_query(consistency_level cl,
        if (!old_node && ht_max - ht_min > 0.01) { // if there is old node or hit rates are close skip calculations
            // local node is always first if present (see storage_proxy::get_endpoints_for_reading)
            unsigned local_idx = erm.get_topology().is_me(epi[0].first) ? 0 : epi.size() + 1;
-            live_endpoints = boost::copy_range<inet_address_vector_replica_set>(miss_equalizing_combination(epi, local_idx, remaining_bf, bool(extra)));
+            auto weighted = boost::copy_range<inet_address_vector_replica_set>(miss_equalizing_combination(epi, local_idx, remaining_bf, bool(extra)));
+            // Workaround for https://github.com/scylladb/scylladb/issues/9285
+            auto last = std::adjacent_find(weighted.begin(), weighted.end());
+            if (last == weighted.end()) {
+                // No duplicates, so use the result based on hit rates
+                live_endpoints = std::move(weighted);
+            }
        }
    }

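
Note on the #9285 workaround in `filter_for_query`: the hit-rate based result is used only when `std::adjacent_find` reports no duplicates; otherwise the previously computed `live_endpoints` is kept. The sketch below shows that selection with plain strings. `has_adjacent_duplicates` and `choose_endpoints` are hypothetical names. Keep in mind that `std::adjacent_find` only compares neighbouring elements; whether non-adjacent duplicates can occur in the real result is not something this diff states.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// True if any two neighbouring elements compare equal.
bool has_adjacent_duplicates(const std::vector<std::string>& v) {
    return std::adjacent_find(v.begin(), v.end()) != v.end();
}

// Prefer the weighted selection unless it contains an adjacent duplicate.
std::vector<std::string> choose_endpoints(std::vector<std::string> fallback,
                                          std::vector<std::string> weighted) {
    if (!has_adjacent_duplicates(weighted)) {
        return weighted;
    }
    return fallback;
}
```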
--- a/db/cql_type_parser.cc
+++ b/db/cql_type_parser.cc
@@ -20,7 +20,9 @@
 #include "utils/sorting.hh"

 static ::shared_ptr<cql3::cql3_type::raw> parse_raw(const sstring& str) {
-    return cql3::util::do_with_parser(str,
+    // In general it's a bad idea to use the default dialect, but type parsing
+    // should be dialect-agnostic.
+    return cql3::util::do_with_parser(str, cql3::dialect{},
        [] (cql3_parser::CqlParser& parser) {
            return parser.comparator_type(true);
        });
--- a/db/hints/internal/hint_endpoint_manager.cc
+++ b/db/hints/internal/hint_endpoint_manager.cc
@@ -167,6 +167,7 @@ future<db::commitlog> hint_endpoint_manager::add_store() noexcept {
        return io_check([name = _hints_dir.c_str()] { return recursive_touch_directory(name); }).then([this] () {
            commitlog::config cfg;

+            cfg.sched_group = _shard_manager.local_db().commitlog()->active_config().sched_group;
            cfg.commit_log_location = _hints_dir.c_str();
            cfg.commitlog_segment_size_in_mb = resource_manager::hint_segment_size_in_mb;
            cfg.commitlog_total_space_in_mb = resource_manager::max_hints_per_ep_size_mb;
--- a/db/hints/internal/hint_sender.cc
+++ b/db/hints/internal/hint_sender.cc
@@ -76,23 +76,6 @@ future<timespec> hint_sender::get_last_file_modification(const sstring& fname) {
    });
 }

-future<> hint_sender::do_send_one_mutation(frozen_mutation_and_schema m, locator::effective_replication_map_ptr ermp, const inet_address_vector_replica_set& natural_endpoints) {
-    return futurize_invoke([this, m = std::move(m), ermp = std::move(ermp), &natural_endpoints] () mutable -> future<> {
-        // The fact that we send with CL::ALL in both cases below ensures that new hints are not going
-        // to be generated as a result of hints sending.
-        const auto& tm = ermp->get_token_metadata();
-        const auto maybe_addr = tm.get_endpoint_for_host_id_if_known(end_point_key());
-
-        if (maybe_addr && boost::range::find(natural_endpoints, *maybe_addr) != natural_endpoints.end()) {
-            manager_logger.trace("Sending directly to {}", end_point_key());
-            return _proxy.send_hint_to_endpoint(std::move(m), std::move(ermp), *maybe_addr);
-        } else {
-            manager_logger.trace("Endpoints set has changed and {} is no longer a replica. Mutating from scratch...", end_point_key());
-            return _proxy.send_hint_to_all_replicas(std::move(m));
-        }
-    });
-}
-
 bool hint_sender::can_send() noexcept {
    if (stopping() && !draining()) {
        return false;
@@ -274,11 +257,30 @@ void hint_sender::start() {
 }

 future<> hint_sender::send_one_mutation(frozen_mutation_and_schema m) {
-    auto erm = _db.find_column_family(m.s).get_effective_replication_map();
+    auto ermp = _db.find_column_family(m.s).get_effective_replication_map();
    auto token = dht::get_token(*m.s, m.fm.key());
-    inet_address_vector_replica_set natural_endpoints = erm->get_natural_endpoints(std::move(token));
+    inet_address_vector_replica_set natural_endpoints = ermp->get_natural_endpoints(std::move(token));

-    return do_send_one_mutation(std::move(m), std::move(erm), std::move(natural_endpoints));
+    return futurize_invoke([this, m = std::move(m), ermp = std::move(ermp), &natural_endpoints] () mutable -> future<> {
+        // The fact that we send with CL::ALL in both cases below ensures that new hints are not going
+        // to be generated as a result of hints sending.
+        const auto& tm = ermp->get_token_metadata();
+        const auto maybe_addr = tm.get_endpoint_for_host_id_if_known(end_point_key());
+
+        if (maybe_addr && boost::range::find(natural_endpoints, *maybe_addr) != natural_endpoints.end() && !tm.is_leaving(end_point_key())) {
+            manager_logger.trace("Sending directly to {}", end_point_key());
+            return _proxy.send_hint_to_endpoint(std::move(m), std::move(ermp), *maybe_addr);
+        } else {
+            if (manager_logger.is_enabled(log_level::trace)) {
+                if (tm.is_leaving(end_point_key())) {
+                    manager_logger.trace("The original target endpoint {} is leaving. Mutating from scratch...", end_point_key());
+                } else {
+                    manager_logger.trace("Endpoints set has changed and {} is no longer a replica. Mutating from scratch...", end_point_key());
+                }
+            }
+            return _proxy.send_hint_to_all_replicas(std::move(m));
+        }
+    });
 }

 future<> hint_sender::send_one_hint(lw_shared_ptr<send_one_file_ctx> ctx_ptr, fragmented_temporary_buffer buf, db::replay_position rp, gc_clock::duration secs_since_file_mod, const sstring& fname) {
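
Note on the routing change in `hint_sender::send_one_mutation`: a hint is sent directly only when the original target has a known address, is still among the natural endpoints, and is not leaving; in every other case the mutation is replayed to all replicas. A minimal sketch of that decision, with endpoints reduced to strings, is shown below. `hint_route` and `choose_route` are hypothetical names used for illustration only.

```cpp
#include <algorithm>
#include <optional>
#include <string>
#include <vector>

enum class hint_route { direct, all_replicas };

// addr:    address of the original target, if it is known
// natural: current replicas for the mutation's token
// leaving: whether the original target is leaving the cluster
hint_route choose_route(const std::optional<std::string>& addr,
                        const std::vector<std::string>& natural,
                        bool leaving) {
    const bool is_replica = addr.has_value()
            && std::find(natural.begin(), natural.end(), *addr) != natural.end();
    return (is_replica && !leaving) ? hint_route::direct : hint_route::all_replicas;
}
```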
--- a/db/hints/internal/hint_sender.hh
+++ b/db/hints/internal/hint_sender.hh
@@ -233,18 +233,14 @@ private:
    /// \return
    const column_mapping& get_column_mapping(lw_shared_ptr<send_one_file_ctx> ctx_ptr, const frozen_mutation& fm, const hint_entry_reader& hr);

-    /// \brief Perform a single mutation send attempt.
+    /// \brief Send one mutation out.
    ///
    /// If the original destination end point is still a replica for the given mutation - send the mutation directly
    /// to it, otherwise execute the mutation "from scratch" with CL=ALL.
    ///
-    /// \param m mutation to send
-    /// \param ermp points to the effective_replication_map used to obtain \c natural_endpoints
-    /// \param natural_endpoints current replicas for the given mutation
-    /// \return future that resolves when the operation is complete
-    future<> do_send_one_mutation(frozen_mutation_and_schema m, locator::effective_replication_map_ptr ermp, const inet_address_vector_replica_set& natural_endpoints);
-
-    /// \brief Send one mutation out.
+    /// The mutation is also sent with CL=ALL semantics to all current replicas if the original destination
+    /// is leaving the cluster; otherwise the hint might be applied only on the leaving node and streaming might
+    /// miss it.
    ///
    /// \param m mutation to send
    /// \return future that resolves when the mutation sending processing is complete.
--- a/db/schema_tables.cc
+++ b/db/schema_tables.cc
@@ -779,40 +779,35 @@ redact_columns_for_missing_features(mutation&& m, schema_features features) {
 */
 future<table_schema_version> calculate_schema_digest(distributed<service::storage_proxy>& proxy, schema_features features, noncopyable_function<bool(std::string_view)> accept_keyspace)
 {
-    auto map = [&proxy, features, accept_keyspace = std::move(accept_keyspace)] (sstring table) mutable -> future<std::vector<mutation>> {
+    using mutations_generator = coroutine::experimental::generator<mutation>;
+
+    auto map = [&proxy, features, accept_keyspace = std::move(accept_keyspace)] (sstring table) mutable -> mutations_generator {
        auto& db = proxy.local().get_db();
        auto rs = co_await db::system_keyspace::query_mutations(db, NAME, table);
        auto s = db.local().find_schema(NAME, table);
-        std::vector<mutation> mutations;
        for (auto&& p : rs->partitions()) {
-            auto mut = co_await unfreeze_gently(p.mut(), s);
-            auto partition_key = value_cast<sstring>(utf8_type->deserialize(mut.key().get_component(*s, 0)));
+            auto partition_key = value_cast<sstring>(utf8_type->deserialize(::partition_key(p.mut().key()).get_component(*s, 0)));
            if (!accept_keyspace(partition_key)) {
                continue;
            }
-            mut = redact_columns_for_missing_features(std::move(mut), features);
-            mutations.emplace_back(std::move(mut));
-        }
-        co_return mutations;
-    };
-    auto reduce = [features] (auto& hash, auto&& mutations) {
-        for (const mutation& m : mutations) {
-            feed_hash_for_schema_digest(hash, m, features);
+            auto mut = co_await unfreeze_gently(p.mut(), s);
+            co_yield redact_columns_for_missing_features(std::move(mut), features);
        }
    };
    auto hash = md5_hasher();
    auto tables = all_table_names(features);
    {
        for (auto& table: tables) {
-            auto mutations = co_await map(table);
-            if (diff_logger.is_enabled(logging::log_level::trace)) {
-                for (const mutation& m : mutations) {
+            auto gen_mutations = map(table);
+            while (auto mut_opt = co_await gen_mutations()) {
+                auto& m = *mut_opt;
+                feed_hash_for_schema_digest(hash, m, features);
+                if (diff_logger.is_enabled(logging::log_level::trace)) {
                    md5_hasher h;
                    feed_hash_for_schema_digest(h, m, features);
                    diff_logger.trace("Digest {} for {}, compacted={}", h.finalize(), m, compact_for_schema_digest(m));
                }
            }
-            reduce(hash, mutations);
        }
        co_return utils::UUID_gen::get_name_UUID(hash.finalize());
    }
@@ -1948,7 +1943,9 @@ static shared_ptr<cql3::functions::user_aggregate> create_aggregate(replica::dat

    bytes_opt initcond = std::nullopt;
    if (initcond_str) {
-        auto expr = cql3::util::do_with_parser(*initcond_str, std::mem_fn(&cql3_parser::CqlParser::term));
+        // In general using the default dialect is wrong, but here the database is communicating with itself,
+        // not the user, so any dialect should work.
+        auto expr = cql3::util::do_with_parser(*initcond_str, cql3::dialect{}, std::mem_fn(&cql3_parser::CqlParser::term));
        auto dummy_ident = ::make_shared<cql3::column_identifier>("", true);
        auto column_spec = make_lw_shared<cql3::column_specification>("", "", dummy_ident, state_type);
        auto raw = cql3::expr::evaluate(prepare_expression(expr, db.as_data_dictionary(), "", nullptr, {column_spec}), cql3::query_options::DEFAULT);
--- a/db/view/view.cc
+++ b/db/view/view.cc
@@ -1673,7 +1673,22 @@ get_view_natural_endpoint(
        return {};
    }
    auto replica = view_endpoints[base_it - base_endpoints.begin()];
-    return view_topology.get_node(replica).endpoint();
+
+    // https://github.com/scylladb/scylladb/issues/19439
+    // With tablets, a node being replaced might transition to "left" state
+    // but still be kept as a replica. In such case, the IP of the replaced
+    // node will be lost and `endpoint()` will return an empty IP here.
+    // As of writing this, storage proxy was not migrated to host IDs yet
+    // (#6403) and hints are not prepared to handle nodes that are left
+    // but are still replicas. Therefore, there is no other sensible option
+    // right now but to give up the attempt to send the update or write a hint
+    // to the paired, permanently down replica.
+    const auto ep = view_topology.get_node(replica).endpoint();
+    if (ep != gms::inet_address{}) {
+        return ep;
+    } else {
+        return std::nullopt;
+    }
 }

 static future<> apply_to_remote_endpoints(service::storage_proxy& proxy, locator::effective_replication_map_ptr ermp,
@@ -2705,16 +2720,16 @@ future<> view_builder::register_staging_sstable(sstables::shared_sstable sst, lw
    return _vug.register_staging_sstable(std::move(sst), std::move(table));
 }

-future<bool> check_needs_view_update_path(view_builder& vb, const locator::token_metadata& tm, const replica::table& t, streaming::stream_reason reason) {
+future<bool> check_needs_view_update_path(view_builder& vb, locator::token_metadata_ptr tmptr, const replica::table& t, streaming::stream_reason reason) {
    if (is_internal_keyspace(t.schema()->ks_name())) {
        return make_ready_future<bool>(false);
    }
    if (reason == streaming::stream_reason::repair && !t.views().empty()) {
        return make_ready_future<bool>(true);
    }
-    return do_with(t.views(), [&vb, &tm] (auto& views) {
+    return do_with(std::move(tmptr), t.views(), [&vb] (locator::token_metadata_ptr& tmptr, auto& views) {
        return map_reduce(views,
-                [&vb, &tm] (const view_ptr& view) { return vb.check_view_build_ongoing(tm, view->ks_name(), view->cf_name()); },
+                [&] (const view_ptr& view) { return vb.check_view_build_ongoing(*tmptr, view->ks_name(), view->cf_name()); },
                false,
                std::logical_or<bool>());
    });
--- a/db/view/view_update_checks.hh
+++ b/db/view/view_update_checks.hh
@@ -10,20 +10,17 @@

 #include <seastar/core/future.hh>
 #include "streaming/stream_reason.hh"
+#include "locator/token_metadata_fwd.hh"
 #include "seastarx.hh"

 namespace replica {
 class table;
 }

-namespace locator {
-class token_metadata;
-}
-
 namespace db::view {
 class view_builder;

-future<bool> check_needs_view_update_path(view_builder& vb, const locator::token_metadata& tm, const replica::table& t,
+future<bool> check_needs_view_update_path(view_builder& vb, locator::token_metadata_ptr tmptr, const replica::table& t,
        streaming::stream_reason reason);

 }
--- a/dist/common/scripts/scylla_coredump_setup
+++ b/dist/common/scripts/scylla_coredump_setup
@@ -40,6 +40,25 @@ if __name__ == '__main__':
                        help='enable compress on systemd-coredump')
    args = parser.parse_args()

+    # It seems that a specific version of the systemd package on RHEL9 has a bug
+    # in its SELinux configuration: it introduced the "systemd-container-coredump"
+    # module to provide rules for systemd-coredump, but does not enable it by default.
+    # We have to load it manually, otherwise it causes a permission error.
+    # (#19325)
+    if is_redhat_variant() and distro.major_version() == '9':
+        if not shutil.which('getenforce'):
+            pkg_install('libselinux-utils')
+        if not shutil.which('semodule'):
+            pkg_install('policycoreutils')
+        enforce = out('getenforce')
+        if enforce != "Disabled":
+            if os.path.exists('/usr/share/selinux/packages/targeted/systemd-container-coredump.pp.bz2'):
+                modules = out('semodule -l')
+                match = re.search(r'^systemd-container-coredump$', modules, re.MULTILINE)
+                if not match:
+                    run('semodule -v -i /usr/share/selinux/packages/targeted/systemd-container-coredump.pp.bz2', shell=True, check=True)
+                    run('semodule -v -e systemd-container-coredump', shell=True, check=True)
+
    # abrt-ccpp.service needs to stop before enabling systemd-coredump,
    # since both will try to install kernel coredump handler
    # (This will only requires for abrt < 2.14)
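Note on the module lookup in this hunk: `semodule -l` prints many lines, and Python's `re.match` anchors only at the start of the whole string even with `re.MULTILINE`, whereas `re.search` tries the `^...$` pattern against every line. The following is a minimal sketch, assuming `semodule -l` prints one module name per line; the helper `module_listed` and the sample output are hypothetical and exist only for illustration.

```python
import re


def module_listed(semodule_output: str, name: str) -> bool:
    """Return True if `name` appears as a complete line in the output."""
    # With re.MULTILINE, ^ and $ match at every line boundary,
    # and re.search scans the whole string rather than only its start.
    pattern = rf'^{re.escape(name)}$'
    return re.search(pattern, semodule_output, re.MULTILINE) is not None


# Hypothetical listing: the module of interest is not on the first line.
sample = "abrt\nsystemd-container-coredump\nzebra"

found_with_search = module_listed(sample, 'systemd-container-coredump')
# re.match only tries position 0, so it misses the second line.
found_with_match = re.match(r'^systemd-container-coredump$', sample, re.MULTILINE) is not None
```

With `re.match`, the check succeeds only when the module happens to be first in the listing, so the install and enable commands would typically be re-run on every invocation.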
--- a/dist/common/scripts/scylla_raid_setup
+++ b/dist/common/scripts/scylla_raid_setup
@@ -325,9 +325,27 @@ WantedBy=local-fs.target
        os.chown(dpath, uid, gid)

    if is_debian_variant():
+        if not shutil.which('update-initramfs'):
+            pkg_install('initramfs-tools')
        run('update-initramfs -u', shell=True, check=True)

    if not udev_info.uuid_link:
        LOGGER.error(f'Error detected, dumping udev env parameters on {fsdev}')
        udev_info.verify()
        udev_info.dump_variables()
+
+    if is_redhat_variant():
+        if not shutil.which('getenforce'):
+            pkg_install('libselinux-utils')
+        if not shutil.which('restorecon'):
+            pkg_install('policycoreutils')
+        if not shutil.which('semanage'):
+            pkg_install('policycoreutils-python-utils')
+        selinux_status = out('getenforce')
+        selinux_context = out('matchpathcon -n /var/lib/systemd/coredump')
+        selinux_type = selinux_context.split(':')[2]
+        run(f'semanage fcontext -a -t {selinux_type} "{root}/coredump(/.*)?"', shell=True, check=True)
+        if selinux_status != 'Disabled':
+            run(f'restorecon -F -v -R {root}', shell=True, check=True)
+        else:
+            Path('/.autorelabel').touch(exist_ok=True)
--- a/dist/redhat/scylla.spec
+++ b/dist/redhat/scylla.spec
@@ -158,33 +158,6 @@ Obsoletes:      scylla-server < 1.1
 %description conf
 This package contains the main scylla configuration file.

-# we need to refuse upgrade if current scylla < 1.7.3 && commitlog remains
-%pretrans conf
-ver=$(rpm -qi scylla-server | grep Version | awk '{print $3}')
-if [ -n "$ver" ]; then
-    ver_fmt=$(echo $ver | awk -F. '{printf "%d%02d%02d", $1,$2,$3}')
-    if [ $ver_fmt -lt 10703 ]; then
-        # for <scylla-1.2
-        if [ ! -f /opt/scylladb/lib/scylla/scylla_config_get.py ]; then
-            echo
-            echo "Error: Upgrading from scylla-$ver to scylla-%{version} is not supported."
-            echo "Please upgrade to scylla-1.7.3 or later, before upgrade to %{version}."
-            echo
-            exit 1
-        fi
-        commitlog_directory=$(/opt/scylladb/lib/scylla/scylla_config_get.py -g commitlog_directory)
-        commitlog_files=$(ls $commitlog_directory | wc -l)
-        if [ $commitlog_files -ne 0 ]; then
-            echo
-            echo "Error: Upgrading from scylla-$ver to scylla-%{version} is not supported when commitlog is not clean."
-            echo "Please upgrade to scylla-1.7.3 or later, before upgrade to %{version}."
-            echo "Also make sure $commitlog_directory is empty."
-            echo
-            exit 1
-        fi
-    fi
-fi
-
 %files conf
 %defattr(-,root,root)
 %attr(0755,root,root) %dir %{_sysconfdir}/scylla
--- a/docs/_ext/scylladb_include_flag.py
+++ b/docs/_ext/scylladb_include_flag.py
@@ -1,6 +1,10 @@
+import os
 from sphinx.directives.other import Include
+from sphinx.util import logging
 from docutils.parsers.rst import directives

+LOGGER = logging.getLogger(__name__)
+
 class IncludeFlagDirective(Include):
    option_spec = Include.option_spec.copy()
    option_spec['base_path'] = directives.unchanged
@@ -8,11 +12,18 @@ class IncludeFlagDirective(Include):
    def run(self):
        env = self.state.document.settings.env
        base_path = self.options.get('base_path', '_common')
+        file_path = self.arguments[0]

        if env.app.tags.has('enterprise'):
-            self.arguments[0] = base_path + "_enterprise/" + self.arguments[0]
+            enterprise_path = os.path.join(base_path + "_enterprise", file_path)
+            _, enterprise_abs_path = env.relfn2path(enterprise_path)
+            if os.path.exists(enterprise_abs_path):
+                self.arguments[0] = enterprise_path
+            else:
+                LOGGER.info(f"Enterprise content not found: Skipping inclusion of {file_path}")
+                return []
        else:
-            self.arguments[0] = base_path + "/" + self.arguments[0]
+            self.arguments[0] = os.path.join(base_path, file_path)
        return super().run()

 def setup(app):
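The path selection in this directive can be sketched outside Sphinx as follows. This is an illustration under stated assumptions, not the extension's code: `resolve_include` and `docs_dir` are hypothetical names, and the sketch resolves paths against a plain directory instead of calling `env.relfn2path`, which resolves them relative to the current document.

```python
import os
import tempfile


def resolve_include(docs_dir, base_path, file_path, enterprise):
    """Return the relative path to include, or None to skip the include."""
    if not enterprise:
        return os.path.join(base_path, file_path)
    candidate = os.path.join(base_path + "_enterprise", file_path)
    # Enterprise builds fall back to skipping when no override file exists.
    if os.path.exists(os.path.join(docs_dir, candidate)):
        return candidate
    return None


with tempfile.TemporaryDirectory() as docs_dir:
    # Create a single enterprise override file for the demonstration.
    os.makedirs(os.path.join(docs_dir, "_common_enterprise"))
    open(os.path.join(docs_dir, "_common_enterprise", "note.rst"), "w").close()

    found = resolve_include(docs_dir, "_common", "note.rst", enterprise=True)
    missing = resolve_include(docs_dir, "_common", "other.rst", enterprise=True)
    oss = resolve_include(docs_dir, "_common", "note.rst", enterprise=False)
```

Returning `None` here corresponds to the directive returning an empty node list, so a missing enterprise file yields no content instead of a build error.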
--- a/docs/alternator/compatibility.md
+++ b/docs/alternator/compatibility.md
@@ -123,10 +123,6 @@ the secret key is the `salted_hash`, i.e., the secret key can be found by

 <!--- REMOVE IN FUTURE VERSIONS - Remove the note below in version 6.1 -->

-(Note: If you upgraded from version 5.4 to version 6.0 without 
-[enabling consistent topology updates](../upgrade/upgrade-opensource/upgrade-guide-from-5.4-to-6.0/enable-consistent-topology.rst), 
-the table name is `system_auth.roles`.)
-
 By default, authorization is not enforced at all. It can be turned on
 by providing an entry in Scylla configuration:
    `alternator_enforce_authorization: true`
--- a/docs/architecture/_common/consistent-topology-with-raft-upgrade-info.rst
+++ b/docs/architecture/_common/consistent-topology-with-raft-upgrade-info.rst
@@ -1,3 +0,0 @@
-If you upgraded from 5.4, you must perform a manual action in order to enable
-consistent topology changes.
-See :doc:`the guide for enabling consistent topology changes</upgrade/upgrade-opensource/upgrade-guide-from-5.4-to-6.0/enable-consistent-topology>` for more details.
--- a/docs/architecture/raft.rst
+++ b/docs/architecture/raft.rst
@@ -60,9 +60,8 @@ In summary, Raft makes schema changes safe, but it requires that a quorum of nod
 Verifying that the Raft upgrade procedure finished successfully
 ========================================================================

-You may need to perform the following procedure on upgrade if you explicitly
-disabled the Raft-based schema changes feature in the previous ScyllaDB
-version. Please consult the upgrade guide.
+You may need to perform the following procedure as part of
+the :ref:`manual recovery procedure <recovery-procedure>`.

 The Raft upgrade procedure requires **full cluster availability** to correctly setup the Raft algorithm; after the setup finishes, Raft can proceed with only a majority of nodes, but this initial setup is an exception.
 An unlucky event, such as a hardware failure, may cause one of your nodes to fail. If this happens before the Raft upgrade procedure finishes, the procedure will get stuck and your intervention will be required.
@@ -173,8 +172,6 @@ gossip-based topology.

 The feature is automatically enabled in new clusters.

-.. scylladb_include_flag:: consistent-topology-with-raft-upgrade-info.rst
-
 Verifying that Raft is Enabled
 ----------------------------------

--- a/docs/cql/_common/tablets-default.rst
+++ b/docs/cql/_common/tablets-default.rst
@@ -0,0 +1,3 @@
+By default, a keyspace is created with tablets enabled. The ``tablets`` option
+is used to opt a keyspace out of tablets-based distribution; see :ref:`Enabling Tablets <tablets-enable-tablets>`
+for details.
--- a/docs/cql/compaction.rst
+++ b/docs/cql/compaction.rst
@@ -62,7 +62,7 @@ The following options are available for all compaction strategies.
 =====

 ``tombstone_compaction_interval`` (default: 86400s (1 day))
-   An SSTable that is suitable for single SSTable compaction, according to tombstone_threshold will not be compacted if it is newer than tombstone_compaction_interval. 
+  *tombstone_compaction_interval* is a lower bound on when a new tombstone compaction can start. If an SSTable was compacted at a time *X*, the earliest time it will be considered for tombstone compaction again is *X + tombstone_compaction_interval*. This does not guarantee that SSTables will be considered for compaction immediately after tombstone_compaction_interval has elapsed since the last compaction.

 =====

--- a/docs/cql/cql-extensions.md
+++ b/docs/cql/cql-extensions.md
@@ -377,6 +377,20 @@ FINALFUNC final_fct
 INITCOND (0, 0);
 ```

+### Behavior of bind variable references with the same name
+
+If a bind variable is referred to twice (example: `WHERE aa = :var AND bb = :var`; `:var`
+is referenced twice), ScyllaDB and Cassandra treat it differently:
+
+ - Cassandra ignores the double reference and treats the two as two separate variables. They
+   can have different types, and occupy two slots in the bind variable metadata (used by
+   drivers when the user provides a bind variable tuple rather than a map)
+ - ScyllaDB treats the two references as referring to the same variable. The two references
+   must have the same type, and occupy one slot in the bind variable metadata.
+
+ScyllaDB can revert to the Cassandra treatment by setting the configuration item
+`cql_duplicate_bind_variable_names_refer_to_same_variable` to `false`.
+
 ### Lists elements for filtering

 Subscripting a list in a WHERE clause is supported as are maps.
--- a/docs/cql/ddl.rst
+++ b/docs/cql/ddl.rst
@@ -116,7 +116,7 @@ name                 kind       mandatory   default   description
                                                      details below).
 ``durable_writes``   *simple*   no          true      Whether to use the commit log for updates on this keyspace
                                                      (disable this option at your own risk!).
-``tablets``          *map*      no                    Enables or disables tablets for the keyspace (see :ref:`tablets<tablets>`)
+``tablets``          *map*      no                    Enables or disables tablets for the keyspace (see :ref:`tablets <tablets>`)
 =================== ========== =========== ========= ===================================================================

 The ``replication`` property is mandatory and must at least contains the ``'class'`` sub-option, which defines the
@@ -232,9 +232,7 @@ sub-option                             type  description
 ``'initial'``                          int   The number of tablets to start with
 ===================================== ====== =============================================

-By default, a keyspace is created with tablets enabled. The ``tablets`` option 
-is used to opt out a keyspace from tablets-based distribution; see :ref:`Enabling Tablets <tablets-enable-tablets>`
-for details.
+.. scylladb_include_flag:: tablets-default.rst

 A good rule of thumb to calculate initial tablets is to divide the expected total storage used
 by tables in this keyspace by (``replication_factor`` * 5GB). For example, if you expect a 30TB
@@ -759,10 +757,8 @@ available:
 ========================= =============== =============================================================================
 Option                    Default         Description
 ========================= =============== =============================================================================
- ``sstable_compression``   LZ4Compressor   The compression algorithm to use. Default compressors are
-                                           LZ4Compressor, SnappyCompressor, and DeflateCompressor.
-                                           A custom compressor can be provided by specifying the full class
-                                           name as a “string constant”:#constants.
+ ``sstable_compression``   LZ4Compressor   The compression algorithm to use. Available compressors are
+                                           LZ4Compressor, SnappyCompressor, DeflateCompressor, and ZstdCompressor.
 ``chunk_length_in_kb``    4               On disk SSTables are compressed by block (to allow random reads). This
                                           defines the size (in KB) of the block. Bigger values may improve the
                                           compression rate, but increases the minimum size of data to be read from disk
--- a/docs/dev/docker-hub.md
+++ b/docs/dev/docker-hub.md
@@ -48,6 +48,13 @@ to calculate the proper value is:
 $ docker run --name some-scylla --hostname some-scylla -d scylladb/scylla
 ```

+If you're on macOS and plan to start a multi-node cluster (3 nodes or more), start ScyllaDB with
+`--reactor-backend=epoll` to override the default `linux-aio` reactor backend:
+
+```console
+$ docker run --name some-scylla --hostname some-scylla -d scylladb/scylla --reactor-backend=epoll
+```
+
 ### Run `nodetool` utility

 ```console
@@ -75,6 +82,11 @@ cqlsh>
 ```console
 $ docker run --name some-scylla2  --hostname some-scylla2 -d scylladb/scylla --seeds="$(docker inspect --format='{{ .NetworkSettings.IPAddress }}' some-scylla)"
 ```
+If you're on macOS, make sure to add the `--reactor-backend=epoll` option when adding new nodes:
+
+```console
+$ docker run --name some-scylla2  --hostname some-scylla2 -d scylladb/scylla --reactor-backend=epoll --seeds="$(docker inspect --format='{{ .NetworkSettings.IPAddress }}' some-scylla)"
+```

 #### Make a cluster with Docker Compose

--- a/docs/getting-started/_common/os-support-info.rst
+++ b/docs/getting-started/_common/os-support-info.rst
@@ -6,9 +6,9 @@ You can `build ScyllaDB from source <https://github.com/scylladb/scylladb#build-
 +----------------------------+------+------+------+-------+-------+-------+
 | ScyllaDB Version / Version |20.04 |22.04 |24.04 |  11   |   8   |   9   |
 +============================+======+======+======+=======+=======+=======+
-|   6.0                      | |v|  | |v|  | |v|  | |v|   | |v|   | |v|   |
+|   6.1                      | |v|  | |v|  | |v|  | |v|   | |v|   | |v|   |
 +----------------------------+------+------+------+-------+-------+-------+
-|   5.4                      | |v|  | |v|  | |x|  | |v|   | |v|   | |v|   |
+|   6.0                      | |v|  | |v|  | |v|  | |v|   | |v|   | |v|   |
 +----------------------------+------+------+------+-------+-------+-------+

 * The recommended OS for ScyllaDB Open Source is Ubuntu 22.04.
--- a/docs/getting-started/_common/setup-after-install.rst
+++ b/docs/getting-started/_common/setup-after-install.rst
@@ -0,0 +1,54 @@
+Configure and Run ScyllaDB
+-------------------------------
+
+#. Configure the following parameters in the ``/etc/scylla/scylla.yaml`` configuration file.
+
+   * ``cluster_name`` - The name of the cluster. All the nodes in the cluster must have the same 
+     cluster name configured.
+   * ``seeds`` - The IP address of the first node. Other nodes will use it as the first contact 
+     point to discover the cluster topology when joining the cluster.
+   * ``listen_address`` - The IP address that ScyllaDB uses to connect to other nodes in the cluster.
+   * ``rpc_address`` - The IP address of the interface for CQL client connections.
+
+#. Run the ``scylla_setup`` script to tune the system settings and determine the optimal configuration.
+
+   .. code-block:: console
+    
+      sudo scylla_setup
+
+   * The script invokes a set of :ref:`scripts <system-configuration-scripts>` to configure several operating system settings; for example, it sets 
+     RAID0 and XFS filesystem. 
+   * The script runs a short (up to a few minutes) benchmark on your storage and generates the ``/etc/scylla.d/io.conf`` 
+     configuration file. When the file is ready, you can start ScyllaDB. ScyllaDB will not run without XFS 
+     or ``io.conf`` file.
+   * You can bypass this check by running ScyllaDB in :doc:`developer mode </getting-started/installation-common/dev-mod>`. 
+     We recommend against enabling developer mode in production environments to ensure ScyllaDB's maximum performance.
+
+#. Run ScyllaDB as a service (if not already running).
+
+   .. code-block:: console
+    
+      sudo systemctl start scylla-server
+
+
+Now you can start using ScyllaDB. Here are some tools you may find useful.
+
+Run nodetool:
+   
+.. code-block:: console
+     
+     nodetool status
+
+Run cqlsh:
+
+.. code-block:: console
+     
+     cqlsh
+
+Run cassandra-stress:
+
+.. code-block:: console
+     
+     cassandra-stress write -mode cql3 native 
+
+
--- a/docs/getting-started/install-scylla/install-on-linux.rst
+++ b/docs/getting-started/install-scylla/install-on-linux.rst
@@ -154,59 +154,7 @@ Install ScyllaDB
               sudo yum install scylla-5.2.3


-Configure and Run ScyllaDB
-------------------------------
-
-#. Configure the following parameters in the ``/etc/scylla/scylla.yaml`` configuration file.
-
-   * ``cluster_name`` - The name of the cluster. All the nodes in the cluster must have the same 
-     cluster name configured.
-   * ``seeds`` - The IP address of the first node. Other nodes will use it as the first contact 
-     point to discover the cluster topology when joining the cluster.
-   * ``listen_address`` - The IP address that ScyllaDB uses to connect to other nodes in the cluster.
-   * ``rpc_address`` - The IP address of the interface for CQL client connections.
-
-#. Run the ``scylla_setup`` script to tune the system settings and determine the optimal configuration.
-
-   .. code-block:: console
-    
-      sudo scylla_setup
-
-   * The script invokes a set of :ref:`scripts <system-configuration-scripts>` to configure several operating system settings; for example, it sets 
-     RAID0 and XFS filesystem. 
-   * The script runs a short (up to a few minutes) benchmark on your storage and generates the ``/etc/scylla.d/io.conf`` 
-     configuration file. When the file is ready, you can start ScyllaDB. ScyllaDB will not run without XFS 
-     or ``io.conf`` file.
-   * You can bypass this check by running ScyllaDB in :doc:`developer mode </getting-started/installation-common/dev-mod>`. 
-     We recommend against enabling developer mode in production environments to ensure ScyllaDB's maximum performance.
-
-#. Run ScyllaDB as a service (if not already running).
-
-   .. code-block:: console
-    
-      sudo systemctl start scylla-server
-
-
-Now you can start using ScyllaDB. Here are some tools you may find useful.
-
-Run nodetool:
-   
-.. code-block:: console
-     
-     nodetool status
-
-Run cqlsh:
-
-.. code-block:: console
-     
-     cqlsh
-
-Run cassandra-stress:
-
-.. code-block:: console
-     
-     cassandra-stress write -mode cql3 native 
-
+.. include:: /getting-started/_common/setup-after-install.rst

 Next Steps
 ------------
--- a/docs/getting-started/installation-common/scylla-web-installer.rst
+++ b/docs/getting-started/installation-common/scylla-web-installer.rst
@@ -12,7 +12,7 @@ Prerequisites
 Ensure that your platform is supported by the ScyllaDB version you want to install. 
 See :doc:`OS Support by Platform and Version </getting-started/os-support/>`.

-Installing ScyllaDB with Web Installer
+Install ScyllaDB with Web Installer
 ---------------------------------------
 To install ScyllaDB with Web Installer, run:

@@ -40,22 +40,24 @@ options to install a different version or ScyllaDB Enterprise:
 You can run the command with the ``-h`` or ``--help`` flag to print information about the script.

 Examples
---------
+===========

-Installing ScyllaDB Open Source 4.6.1:
+Installing ScyllaDB Open Source 6.0.1:

 .. code:: console

-    curl -sSf get.scylladb.com/server | sudo bash -s -- --scylla-version 4.6.1
+    curl -sSf get.scylladb.com/server | sudo bash -s -- --scylla-version 6.0.1

-Installing the latest patch release for ScyllaDB Open Source 4.6:
+Installing the latest patch release for ScyllaDB Open Source 6.0:

 .. code:: console

-    curl -sSf get.scylladb.com/server | sudo bash -s -- --scylla-version 4.6
+    curl -sSf get.scylladb.com/server | sudo bash -s -- --scylla-version 6.0

-Installing ScyllaDB Enterprise 2021.1:
+Installing ScyllaDB Enterprise 2024.1:

 .. code:: console

-    curl -sSf get.scylladb.com/server | sudo bash -s -- --scylla-product scylla-enterprise --scylla-version 2021.1
+    curl -sSf get.scylladb.com/server | sudo bash -s -- --scylla-product scylla-enterprise --scylla-version 2024.1
+
+.. include:: /getting-started/_common/setup-after-install.rst
--- a/docs/getting-started/installation-common/unified-installer.rst
+++ b/docs/getting-started/installation-common/unified-installer.rst
@@ -1,8 +1,3 @@
-.. |SCYLLADB_VERSION| replace:: 5.2
-
-.. update the version folder URL below (variables won't work):
-    https://downloads.scylladb.com/downloads/scylla/relocatable/scylladb-5.2/
-
 ====================================================
 Install ScyllaDB Without root Privileges
 ====================================================
@@ -24,14 +19,17 @@ Note that if you're on CentOS 7, only root offline installation is supported.
 Download and Install
 -----------------------

-#. Download the latest tar.gz file for ScyllaDB |SCYLLADB_VERSION| (x86 or ARM) from https://downloads.scylladb.com/downloads/scylla/relocatable/scylladb-5.2/.
+#. Download the latest tar.gz file for your ScyllaDB version (x86 or ARM) from ``https://downloads.scylladb.com/downloads/scylla/relocatable/scylladb-<version>/``.
+
+   Example for version 6.1: https://downloads.scylladb.com/downloads/scylla/relocatable/scylladb-6.1/
+
 #. Uncompress the downloaded package.

-   The following example shows the package for ScyllaDB 5.2.4 (x86):
+   The following example shows the package for ScyllaDB 6.1.1 (x86):

   .. code:: console

-    tar xvfz scylla-unified-5.2.4-0.20230623.cebbf6c5df2b.x86_64.tar.gz
+    tar xvfz scylla-unified-6.1.1-0.20240814.8d90b817660a.x86_64.tar.gz

 #. Install OpenJDK 8 or 11.

--- a/docs/operating-scylla/nodetool-commands/rebuild.rst
+++ b/docs/operating-scylla/nodetool-commands/rebuild.rst
@@ -1,8 +1,17 @@
 Nodetool rebuild
 ================

-**rebuild** ``[<src-dc-name>]`` - This command rebuilds a node's data by streaming data from other nodes in the cluster (similarly to bootstrap).
-Rebuild operates on multiple nodes in a ScyllaDB cluster. It streams data from a single source replica when rebuilding a token range. When executing the command, ScyllaDB first figures out which ranges the local node (the one we want to rebuild) is responsible for. Then which node in the cluster contains the same ranges. Finally, ScyllaDB streams the data to the local node.
+**rebuild** ``[[--force] <source-dc-name>]`` - This command rebuilds a node's data by streaming data from other nodes in the cluster (similarly to bootstrap).
+
+When executing the command, ScyllaDB first figures out which ranges the local node (the one we want to rebuild) is responsible for,
+and then which nodes in the cluster contain the same ranges.
+If ``source-dc-name`` is provided, ScyllaDB will stream data only from nodes in that datacenter, when safe to do so.
+Otherwise, an alternative datacenter that lost no nodes will be considered, and if none exists, all datacenters will be considered.
+Use the ``--force`` option to enforce rebuild using the source datacenter, even if it is unsafe to do so.
+
+When ``rebuild`` is enabled in :doc:`Repair Based Node Operations (RBNO) </operating-scylla/procedures/cluster-management/repair-based-node-operation>`,
+data is rebuilt using repair-based-rebuild by reading all source replicas in each token range and repairing any discrepancies between them.
+Otherwise, data is streamed from a single source replica when rebuilding each token range.
 
 When :doc:`adding a new data-center into an existing ScyllaDB cluster </operating-scylla/procedures/cluster-management/add-dc-to-existing-dc/>` use the rebuild command.

@@ -14,6 +23,6 @@ For Example:

 .. code-block:: shell

-   nodetool rebuild <src-dc-name>
+   nodetool rebuild <source-dc-name>

 .. include:: nodetool-index.rst
--- a/docs/operating-scylla/procedures/cluster-management/_common/membership-change-failures-note.rst
+++ b/docs/operating-scylla/procedures/cluster-management/_common/membership-change-failures-note.rst
@@ -1,7 +1,10 @@
 .. note::

-    This page only applies to clusters where consistent topology updates are not enabled. 
+    This page only applies to clusters where consistent topology updates are not enabled.
+    Consistent topology updates are mandatory, so **this page serves troubleshooting purposes**.
+
    The page does NOT apply if you:

-    * Created a cluster with ScyllaDB 6.0 (consistent topology updates are automatically enabled).
-    * Upgraded from ScyllaDB 5.4 and :doc:`manually enabled consistent topology updates </upgrade/upgrade-opensource/upgrade-guide-from-5.4-to-6.0/enable-consistent-topology>`.
+    * Created a cluster with ScyllaDB 6.0 or later (consistent topology updates are automatically enabled).
+    * `Manually enabled consistent topology updates <https://opensource.docs.scylladb.com/branch-6.0/upgrade/upgrade-opensource/upgrade-guide-from-5.4-to-6.0/enable-consistent-topology.html>`_
+      after upgrading to 6.0 or before upgrading to 6.1 (required).
--- a/docs/operating-scylla/procedures/cluster-management/_common/system-auth-alter-info.rst
+++ b/docs/operating-scylla/procedures/cluster-management/_common/system-auth-alter-info.rst
@@ -1,3 +0,0 @@
-(Note: If you upgraded from version 5.4 without 
-:doc:`enabling consistent topology updates </upgrade/upgrade-opensource/upgrade-guide-from-5.4-to-6.0/enable-consistent-topology>`, 
-you must additionally alter the ``system_auth`` keyspace.)
--- a/docs/operating-scylla/procedures/cluster-management/_common/upgrade-note-add-new-dc.rst
+++ b/docs/operating-scylla/procedures/cluster-management/_common/upgrade-note-add-new-dc.rst
@@ -1,3 +0,0 @@
-.. note::
-
-   If you upgraded your cluster from version 5.4, see :ref:`After Upgrading from 5.4 <add-dc-upgrade-info>`.
--- a/docs/operating-scylla/procedures/cluster-management/_common/upgrade-note-add-new-node.rst
+++ b/docs/operating-scylla/procedures/cluster-management/_common/upgrade-note-add-new-node.rst
@@ -1,3 +0,0 @@
-.. note::
-
-   If you upgraded your cluster from version 5.4, see :ref:`After Upgrading from 5.4 <add-new-node-upgrade-info>`.
--- a/docs/operating-scylla/procedures/cluster-management/_common/upgrade-note-remove-node.rst
+++ b/docs/operating-scylla/procedures/cluster-management/_common/upgrade-note-remove-node.rst
@@ -1,3 +0,0 @@
-.. note::
-
-   If you upgraded your cluster from version 5.4, see :ref:`After Upgrading from 5.4 <remove-node-upgrade-info>`.
--- a/docs/operating-scylla/procedures/cluster-management/_common/upgrade-note-replace-node.rst
+++ b/docs/operating-scylla/procedures/cluster-management/_common/upgrade-note-replace-node.rst
@@ -1,3 +0,0 @@
-.. note::
-
-   If you upgraded your cluster from version 5.4, see :ref:`After Upgrading from 5.4 <replace-node-upgrade-info>`.
--- a/docs/operating-scylla/procedures/cluster-management/_common/upgrade-warning-add-new-node-or-dc.rst
+++ b/docs/operating-scylla/procedures/cluster-management/_common/upgrade-warning-add-new-node-or-dc.rst
@@ -1,24 +0,0 @@
-
-After Upgrading from 5.4
----------------------------
-
-The procedure described above applies to clusters where consistent topology updates 
-are enabled. The feature is automatically enabled in new clusters.
-
-If you've upgraded an existing cluster from version 5.4, ensure that you 
-:doc:`manually enabled consistent topology updates </upgrade/upgrade-opensource/upgrade-guide-from-5.4-to-6.0/enable-consistent-topology>`.
-Without consistent topology updates enabled, you must consider the following
-limitations while applying the procedure:
-
-* You can only bootstrap one node at a time. You need to wait until the status 
-  of one new node becomes UN (Up Normal) before adding another new node.
-* If the node starts bootstrapping but fails in the middle, for example, due to 
-  a power loss, you can retry bootstrap by restarting the node. If you don't want to
-  retry, or the node refuses to boot on subsequent attempts, consult the 
-  :doc:`Handling Membership Change Failures </operating-scylla/procedures/cluster-management/handling-membership-change-failures>`
-  document. 
-* The ``system_auth`` keyspace has not been upgraded to ``system``.
-  As a result, if ``authenticator`` is set to ``PasswordAuthenticator``, you must 
-  increase the replication factor of the ``system_auth`` keyspace. It is 
-  recommended to set ``system_auth`` replication factor to the number of nodes 
-  in each DC.
--- a/docs/operating-scylla/procedures/cluster-management/_common/upgrade-warning-remove-node.rst
+++ b/docs/operating-scylla/procedures/cluster-management/_common/upgrade-warning-remove-node.rst
@@ -1,21 +0,0 @@
-
-After Upgrading from 5.4
----------------------------
-
-The procedure described above applies to clusters where consistent topology updates 
-are enabled. The feature is automatically enabled in new clusters.
-
-If you've upgraded an existing cluster from version 5.4, ensure that you 
-:doc:`manually enabled consistent topology updates </upgrade/upgrade-opensource/upgrade-guide-from-5.4-to-6.0/enable-consistent-topology>`.
-Without consistent topology updates enabled, you must consider the following
-limitations while applying the procedure:
-    
-* It’s essential to ensure the removed node will **never** come back to the cluster, 
-  which might adversely affect your data (data resurrection/loss). To prevent the removed 
-  node from rejoining the cluster, remove that node from the cluster network or VPC.
-* You can only remove one node at a time. You need to verify that the node has 
-  been removed before removing another one.
-* If ``nodetool decommission`` starts executing but fails in the middle, for example, 
-  due to a power loss, consult the 
-  :doc:`Handling Membership Change Failures </operating-scylla/procedures/cluster-management/handling-membership-change-failures>`
-  document. 
--- a/docs/operating-scylla/procedures/cluster-management/_common/upgrade-warning-replace-node.rst
+++ b/docs/operating-scylla/procedures/cluster-management/_common/upgrade-warning-replace-node.rst
@@ -1,23 +0,0 @@
-
----------------------------
-After Upgrading from 5.4
----------------------------
-
-The procedure described above applies to clusters where consistent topology updates 
-are enabled. The feature is automatically enabled in new clusters.
-
-If you've upgraded an existing cluster from version 5.4, ensure that you 
-:doc:`manually enabled consistent topology updates </upgrade/upgrade-opensource/upgrade-guide-from-5.4-to-6.0/enable-consistent-topology>`.
-Without consistent topology updates enabled, you must consider the following
-limitations while applying the procedure:
-    
-* It’s essential to ensure the replaced (dead) node will never come back to the cluster, 
-  which might lead to a split-brain situation. Remove the replaced (dead) node from 
-  the cluster network or VPC.
-* You can only replace one node at a time. You need to wait until the status 
-  of the new node becomes UN (Up Normal) before replacing another new node.
-* If the new node starts and begins the replace operation but then fails in the middle, 
-  for example, due to a power loss, you can retry the replace by restarting the node. 
-  If you don’t want to retry, or the node refuses to boot on subsequent attempts, consult the 
-  :doc:`Handling Membership Change Failures </operating-scylla/procedures/cluster-management/handling-membership-change-failures>`
-  document. 
--- a/docs/operating-scylla/procedures/cluster-management/add-dc-to-existing-dc.rst
+++ b/docs/operating-scylla/procedures/cluster-management/add-dc-to-existing-dc.rst
@@ -1,8 +1,6 @@
 Adding a New Data Center Into an Existing ScyllaDB Cluster
 ***********************************************************

-.. scylladb_include_flag:: upgrade-note-add-new-dc.rst
-
 The following procedure specifies how to add a Data Center (DC) to a live ScyllaDB Cluster, in a single data center, :ref:`multi-availability zone <faq-best-scenario-node-multi-availability-zone>`, or multi-datacenter. Adding a DC out-scales the cluster and provides higher availability (HA).

 The procedure includes:
@@ -164,8 +162,6 @@ Add New DC
   * Keyspace created by the user (which needed to replicate to the new DC).
   * System: ``system_distributed``, ``system_traces``, for example, replicate the data to three nodes in the new DC.

-   .. scylladb_include_flag:: system-auth-alter-info.rst
-
   For example:

   Before
@@ -234,7 +230,3 @@ Additional Resources for Java Clients
 * `DCAwareRoundRobinPolicy.Builder <https://java-driver.docs.scylladb.com/scylla-3.10.2.x/api/com/datastax/driver/core/policies/DCAwareRoundRobinPolicy.Builder.html>`_
 * `DCAwareRoundRobinPolicy <https://java-driver.docs.scylladb.com/scylla-3.10.2.x/api/com/datastax/driver/core/policies/DCAwareRoundRobinPolicy.html>`_

-
-.. _add-dc-upgrade-info:
-
-.. scylladb_include_flag:: upgrade-warning-add-new-node-or-dc.rst
--- a/docs/operating-scylla/procedures/cluster-management/add-node-to-cluster.rst
+++ b/docs/operating-scylla/procedures/cluster-management/add-node-to-cluster.rst
@@ -2,8 +2,6 @@
 Adding a New Node Into an Existing ScyllaDB Cluster (Out Scale)
 =================================================================

-.. scylladb_include_flag:: upgrade-note-add-new-node.rst
-
 When you add a new node, other nodes in the cluster stream data to the new node. This operation is called bootstrapping and may
 be time-consuming, depending on the data size and network bandwidth. If using a :ref:`multi-availability-zone <faq-best-scenario-node-multi-availability-zone>`, make sure they are balanced.

@@ -100,7 +98,3 @@ Procedure

 #. If you are using ScyllaDB Monitoring, update the `monitoring stack <https://monitoring.docs.scylladb.com/stable/install/monitoring_stack.html#configure-scylla-nodes-from-files>`_ to monitor it. If you are using ScyllaDB Manager, make sure you install the `Manager Agent <https://manager.docs.scylladb.com/stable/install-scylla-manager-agent.html>`_, and Manager can access it.

-
-.. _add-new-node-upgrade-info:
-
-.. scylladb_include_flag:: upgrade-warning-add-new-node-or-dc.rst
--- a/docs/operating-scylla/procedures/cluster-management/remove-node.rst
+++ b/docs/operating-scylla/procedures/cluster-management/remove-node.rst
@@ -2,8 +2,6 @@
 Remove a Node from a ScyllaDB Cluster (Down Scale)
 ***************************************************

-.. scylladb_include_flag:: upgrade-note-remove-node.rst
-
 You can remove nodes from your cluster to reduce its size.

 -----------------------
@@ -83,10 +81,6 @@ the ``nodetool removenode`` operation will fail. To ensure successful operation
  ``nodetool removenode`` (not required when :doc:`Repair Based Node Operations (RBNO) <repair-based-node-operation>` for ``removenode`` 
  is enabled).

-.. _remove-node-upgrade-info:
-
-.. scylladb_include_flag:: upgrade-warning-remove-node.rst
-
 Additional Information
 ----------------------
 * :doc:`Nodetool Reference </operating-scylla/nodetool>`
--- a/docs/operating-scylla/procedures/cluster-management/replace-dead-node.rst
+++ b/docs/operating-scylla/procedures/cluster-management/replace-dead-node.rst
@@ -1,8 +1,6 @@
 Replace a Dead Node in a ScyllaDB Cluster 
 ******************************************

-.. scylladb_include_flag:: upgrade-note-replace-node.rst
-
 Replace dead node operation will cause the other nodes in the cluster to stream data to the node that was replaced. This operation can take some time (depending on the data size and network bandwidth).

 This procedure is for replacing one dead node. You can replace more than one dead node in parallel.
@@ -194,7 +192,3 @@ In this case, the node's data will be cleaned after restart. To remedy this, you

 Sometimes the public/ private IP of instance is changed after restart. If so refer to the Replace Procedure_ above.

-
-.. _replace-node-upgrade-info:
-
-.. scylladb_include_flag:: upgrade-warning-replace-node.rst
--- a/docs/operating-scylla/procedures/cluster-management/update-topology-strategy-from-simple-to-network.rst
+++ b/docs/operating-scylla/procedures/cluster-management/update-topology-strategy-from-simple-to-network.rst
@@ -23,8 +23,6 @@ Alter the following:
 * Keyspace created by the user.
 * System: ``system_distributed``, ``system_traces``.

-.. scylladb_include_flag:: system-auth-alter-info.rst
-
 For example:

 Before
--- a/docs/operating-scylla/procedures/tips/benchmark-tips.rst
+++ b/docs/operating-scylla/procedures/tips/benchmark-tips.rst
@@ -41,7 +41,7 @@ With the recent addition of the `ScyllaDB Advisor <http://monitoring.docs.scylla
 Install ScyllaDB Manager
 ------------------------

-Install and use `ScyllaDB Manager <https://manager.docs.scylladb.com>` together with the `ScyllaDB Monitoring Stack <http://monitoring.docs.scylladb.com/>`_.
+Install and use `ScyllaDB Manager <https://manager.docs.scylladb.com>`_ together with the `ScyllaDB Monitoring Stack <http://monitoring.docs.scylladb.com/>`_.
 ScyllaDB Manager provides automated backups and repairs of your database.
 ScyllaDB Manager can manage multiple ScyllaDB clusters and run cluster-wide tasks in a controlled and predictable way.
 For example, with ScyllaDB Manager you can control the intensity of a repair, increasing it to speed up the process, or lower the intensity to ensure it minimizes impact on ongoing operations.
--- a/docs/operating-scylla/procedures/tips/best-practices-scylla-on-docker.rst
+++ b/docs/operating-scylla/procedures/tips/best-practices-scylla-on-docker.rst
@@ -22,6 +22,13 @@ To start a single ScyllaDB node instance in a Docker container, run:

 docker run --name some-scylla -d scylladb/scylla

+If you're on macOS and plan to start a multi-node cluster (3 nodes or more), start ScyllaDB with
+``--reactor-backend=epoll`` to override the default ``linux-aio`` reactor backend:
+
+.. code-block:: console
+
+ docker run --name some-scylla -d scylladb/scylla --reactor-backend=epoll
+
 The ``docker run`` command starts a new Docker instance in the background named some-scylla that runs the ScyllaDB server:

 .. code-block:: console
@@ -95,6 +102,12 @@ With a single ``some-scylla`` instance running,  joining new nodes to form a clu

 docker run --name some-scylla2 -d scylladb/scylla --seeds="$(docker inspect --format='{{ .NetworkSettings.IPAddress }}' some-scylla)"

+If you're on macOS, make sure to add the ``--reactor-backend=epoll`` option when adding new nodes:
+
+.. code-block:: console
+
+ docker run --name some-scylla2 -d scylladb/scylla --reactor-backend=epoll --seeds="$(docker inspect --format='{{ .NetworkSettings.IPAddress }}' some-scylla)"
+
 To query when the node is up and running (and view the status of the entire cluster) use the ``nodetool status`` command:

 .. code-block:: console
--- a/docs/operating-scylla/security/_common/upgrade-note-authentication.rst
+++ b/docs/operating-scylla/security/_common/upgrade-note-authentication.rst
@@ -1,3 +0,0 @@
-.. note::
-
-   If you upgraded your cluster from version 5.4, see :ref:`After Upgrading from 5.4 <authentication-upgrade-info>`.
--- a/docs/operating-scylla/security/_common/upgrade-note-runtime-authentication.rst
+++ b/docs/operating-scylla/security/_common/upgrade-note-runtime-authentication.rst
@@ -1,3 +0,0 @@
-.. note::
-
-   If you upgraded your cluster from version 5.4, see :ref:`After Upgrading from 5.4 <runtime-authentication-upgrade-info>`.
--- a/docs/operating-scylla/security/_common/upgrade-warning-authentication.rst
+++ b/docs/operating-scylla/security/_common/upgrade-warning-authentication.rst
@@ -1,20 +0,0 @@
-
-After Upgrading from 5.4
----------------------------
-
-The procedure described above applies to clusters where consistent topology updates 
-are enabled. The feature is automatically enabled in new clusters.
-
-If you've upgraded an existing cluster from version 5.4, ensure that you 
-:doc:`manually enabled consistent topology updates </upgrade/upgrade-opensource/upgrade-guide-from-5.4-to-6.0/enable-consistent-topology>`.
-Without consistent topology updates enabled, you must take additional steps
-to enable authentication: 
-    
-* Before you start the procedure, set the ``system_auth`` keyspace replication factor 
-  to the number of nodes in the datacenter via cqlsh. It allows you to ensure that
-  the user's information is kept highly available for the cluster. If ``system_auth`` 
-  is not equal to the number of nodes and a node fails, the user whose information 
-  is on that node will be denied access.
-* After you start cqlsh with the default superuser username and password, run 
-  a repair on the ``system_auth`` keyspace on all the nodes in the cluster, for example: 
-  ``nodetool repair -pr system_auth``
--- a/docs/operating-scylla/security/_common/upgrade-warning-runtime-authentication.rst
+++ b/docs/operating-scylla/security/_common/upgrade-warning-runtime-authentication.rst
@@ -1,20 +0,0 @@
-
-After Upgrading from 5.4
-^^^^^^^^^^^^^^^^^^^^^^^^^^^^
-
-The procedures described above apply to clusters where consistent topology updates 
-are enabled. The feature is automatically enabled in new clusters.
-
-If you've upgraded an existing cluster from version 5.4, ensure that you 
-:doc:`manually enabled consistent topology updates </upgrade/upgrade-opensource/upgrade-guide-from-5.4-to-6.0/enable-consistent-topology>`.
-Without consistent topology updates enabled, you must take additional steps
-to enable or disable authentication without downtime: 
-    
-* Before you enable authentication without downtime, set the ``system_auth`` 
-  keyspace replication factor to the number of nodes in the datacenter via cqlsh. 
-  It allows you to ensure that the user's information is kept highly available 
-  for the cluster. If ``system_auth`` is not equal to the number of nodes and 
-  a node fails, the user whose information is on that node will be denied access.
-* After you restart the nodes when you enable or disable authentication without
-  downtime, run repair on the ``system_auth`` keyspace, one node at a time on 
-  all the nodes in the cluster.
--- a/docs/operating-scylla/security/authentication.rst
+++ b/docs/operating-scylla/security/authentication.rst
@@ -1,8 +1,6 @@
 Enable Authentication
 =====================

-.. scylladb_include_flag:: upgrade-note-authentication.rst
-
 Authentication is the process where login accounts and their passwords are verified, and the user is allowed access to the database. Authentication is done internally within ScyllaDB and is not done with a third party. Users and passwords are created with roles using a ``CREATE ROLE`` statement. Refer to :doc:`Grant Authorization CQL Reference </operating-scylla/security/authorization>` for details.  

 The procedure described below enables Authentication on the ScyllaDB servers. It is intended to be used when you do **not** have applications running with ScyllaDB/Cassandra drivers.
@@ -39,10 +37,6 @@ Procedure

 #. If you want to create users and roles, continue to :doc:`Enable Authorization </operating-scylla/security/enable-authorization>`.

-.. _authentication-upgrade-info:
-
-.. scylladb_include_flag:: upgrade-warning-authentication.rst
-
 Additional Resources
 --------------------

--- a/docs/operating-scylla/security/certificate-authentication.rst
+++ b/docs/operating-scylla/security/certificate-authentication.rst
@@ -11,7 +11,7 @@ Procedure

 #. Enable authentication

-   Enable authentication and define authorized roles in the cluster as described in the `Enable Authentication </operating-scylla/security/authentication/>`_ document. 
+   Enable authentication and define authorized roles in the cluster as described in the :doc:`Enable Authentication </operating-scylla/security/authentication/>` document. 

 #. Enable CQL transport TLS using client certificate verification
   
--- a/docs/operating-scylla/security/client-node-encryption.rst
+++ b/docs/operating-scylla/security/client-node-encryption.rst
@@ -3,7 +3,7 @@ Encryption: Data in Transit Client to Node

 Follow the procedures below to enable a client to node encryption.
 Once enabled, all communication between the client and the node is transmitted over TLS/SSL.
-The libraries used by ScyllaDB for OpenSSL are FIPS 140-2 certified.
+The libraries used by ScyllaDB for OpenSSL are FIPS 140-2 enabled.

 Workflow
 ^^^^^^^^
--- a/docs/operating-scylla/security/rbac-usecase.rst
+++ b/docs/operating-scylla/security/rbac-usecase.rst
@@ -22,7 +22,7 @@ In the same manner, should someone leave the organization, all you would have to
 Should someone change positions at the company, just assign the new employee to the new role and revoke roles no longer required for the new position.
   
 To build an RBAC environment, you need to create the roles and their associated permissions and then assign or grant the roles to the individual users. Roles inherit the permissions of any other roles that they are granted. The hierarchy of roles can be either simple or extremely complex. This gives great flexibility to database administrators, where they can  create specific permission conditions without incurring a huge administrative burden.
-In addition to standard roles, `ScyllaDB Enterprise <https://enterprise.docs.scylladb.com/>`_ users can implement `Workload Prioritization <https://enterprise.docs.scylladb.com/stable/using-scylla/workload-prioritization.html>`, which allows you to attach roles to Service Levels, thus granting resources to roles as the role demands.
+In addition to standard roles, `ScyllaDB Enterprise <https://enterprise.docs.scylladb.com/>`_ users can implement `Workload Prioritization <https://enterprise.docs.scylladb.com/stable/using-scylla/workload-prioritization.html>`_, which allows you to attach roles to Service Levels, thus granting resources to roles as the role demands.

 .. _rbac-usecase-grant-roles-and-permissions:

--- a/docs/operating-scylla/security/runtime-authentication.rst
+++ b/docs/operating-scylla/security/runtime-authentication.rst
@@ -1,8 +1,6 @@
 Enable and Disable Authentication Without Downtime
 ==================================================

-.. scylladb_include_flag:: upgrade-note-runtime-authentication.rst
-
 Authentication is the process where login accounts and their passwords are verified, and the user is allowed access into the database. Authentication is done internally within ScyllaDB and is not done with a third party. Users and passwords are created with :doc:`roles </operating-scylla/security/authorization>` using a ``CREATE ROLE`` statement. This procedure enables Authentication on the ScyllaDB servers using a transit state, allowing clients to work with or without Authentication at the same time. In this state, you can update the clients (application using ScyllaDB/Apache Cassandra drivers) one at the time. Once all the clients are using Authentication, you can enforce Authentication on all ScyllaDB nodes as well. If you would rather perform a faster authentication procedure where all clients (application using ScyllaDB/Apache Cassandra drivers) will stop working until they are updated to work with Authentication, refer to :doc:`Enable Authentication </operating-scylla/security/runtime-authentication>`.


@@ -108,6 +106,3 @@ Procedure

 #. Verify that all the client applications are working correctly with authentication disabled.

-.. _runtime-authentication-upgrade-info:
-
-.. scylladb_include_flag:: upgrade-warning-runtime-authentication.rst
--- a/docs/reference/configuration-parameters.rst
+++ b/docs/reference/configuration-parameters.rst
@@ -3,7 +3,23 @@ Configuration Parameters
 ========================

 This section contains a list of properties that can be configured in ``scylla.yaml`` - the main configuration file for ScyllaDB.
-In addition, properties that support live updates (liveness) can be updated via the ``system.config`` virtual table or the REST API.
+In addition, properties that support live updates (liveness) can be updated via the ``system.config`` virtual table or the :doc:`REST API </operating-scylla/rest>`.
+
+Live update means that parameters can be modified dynamically while the server
+is running. If ``liveness`` of a parameter is set to ``true``, sending the ``SIGHUP``
+signal to the server processes will trigger ScyllaDB to re-read its configuration
+and override the current configuration with the new value.
+
+**Configuration Precedence**
+
+As the parameters can be configured in more than one place, ScyllaDB applies them
+in the following order with ``scylla.yaml`` parameters updated via ``SIGHUP``
+having the highest priority:
+
+#. Live update via ``scylla.yaml`` (with ``SIGHUP``) or REST API
+#. ``system.config`` table
+#. Command-line options
+#. ``scylla.yaml``

 .. scylladb_config_list:: ../../db/config.hh ../../db/config.cc
  :template: db_config.tmpl
--- a/docs/troubleshooting/_common/enable-consistent-topology.rst
+++ b/docs/troubleshooting/_common/enable-consistent-topology.rst
@@ -1 +1 @@
-Perform :doc:`the procedure for enabling consistent topology changes </upgrade/upgrade-opensource/upgrade-guide-from-5.4-to-6.0/enable-consistent-topology>`.
+Perform `the procedure for enabling consistent topology changes <https://opensource.docs.scylladb.com/branch-6.0/upgrade/upgrade-opensource/upgrade-guide-from-5.4-to-6.0/enable-consistent-topology.html>`_.
--- a/docs/troubleshooting/_common/enabling-consistent-topology-failure.rst
+++ b/docs/troubleshooting/_common/enabling-consistent-topology-failure.rst
@@ -1,3 +1,3 @@
 :ref:`The Raft upgrade procedure <verify-raft-procedure>`
-or :doc:`the procedure for enabling consistent topology changes</upgrade/upgrade-opensource/upgrade-guide-from-5.4-to-6.0/enable-consistent-topology>`
+or `the procedure for enabling consistent topology changes <https://opensource.docs.scylladb.com/branch-6.0/upgrade/upgrade-opensource/upgrade-guide-from-5.4-to-6.0/enable-consistent-topology.html>`_
 got stuck because one of the nodes failed in the middle of the procedure and is irrecoverable.