Enable prometheus_allow_protobuf by default

Change the prometheus_allow_protobuf configuration option to true by default. This allows the ScyllaDB server to serve metrics in the Prometheus protobuf format, which enables native histogram support, when the monitoring server requests it.

Update the config help text and docs to reflect protobuf support (drop the “experimental” wording).

Add cluster tests to validate that the default is enabled, that it can be overridden, and that /metrics returns protobuf when requested via the Accept header (and falls back to text when disabled).

Fixes #27817

co-Author: mykaul <mykaul@scylladb.com>
Signed-off-by: Amnon Heiman <amnon@scylladb.com>
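The Accept-header negotiation described in the commit message could be exercised from a client roughly as follows. This is a minimal sketch, not the test added by the commit: the helper names are hypothetical, port 9180 is assumed to be the node's Prometheus port, and the Accept value is the content type Prometheus uses for its protobuf exposition format.

```python
import urllib.request

# Content type Prometheus uses when asking for the protobuf exposition format.
PROTOBUF_ACCEPT = (
    "application/vnd.google.protobuf; "
    "proto=io.prometheus.client.MetricFamily; encoding=delimited"
)


def build_metrics_request(host: str, port: int = 9180, protobuf: bool = True) -> urllib.request.Request:
    """Build a GET request for /metrics (hypothetical helper; port 9180 is an assumption)."""
    headers = {"Accept": PROTOBUF_ACCEPT if protobuf else "text/plain"}
    return urllib.request.Request(f"http://{host}:{port}/metrics", headers=headers)


def is_protobuf(content_type: str) -> bool:
    """Return True if a Content-Type header value denotes the protobuf format."""
    return content_type.split(";")[0].strip().lower() == "application/vnd.google.protobuf"


def fetch_format(host: str, port: int = 9180, timeout: float = 5.0) -> str:
    """Request protobuf and report which format the server actually returned."""
    with urllib.request.urlopen(build_metrics_request(host, port), timeout=timeout) as resp:
        return "protobuf" if is_protobuf(resp.headers.get("Content-Type", "")) else "text"
```

Against a running node, `fetch_format("127.0.0.1")` would be expected to report `"protobuf"` with the new default and `"text"` when prometheus_allow_protobuf is set to false.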
Merge 'topology_coordinator: Refresh load stats after table is created or altered' from Tomasz Grabiec
2026-01-19 09:40:49 +02:00 · 2026-01-16 11:34:57 +01:00 · 2026-01-16 11:19:01 +02:00 · 2026-01-15 10:25:45 +01:00 · 2026-01-15 05:13:03 +02:00 · 2026-01-15 04:33:43 +02:00
971 changed files with 42139 additions and 12868 deletions
--- a/.github/CODEOWNERS
+++ b/.github/CODEOWNERS
@@ -1,5 +1,5 @@
 # AUTH
-auth/* @nuivall @ptrsmrn
+auth/* @nuivall

 # CACHE
 row_cache* @tgrabiec
@@ -25,11 +25,11 @@ compaction/* @raphaelsc
 transport/*

 # CQL QUERY LANGUAGE
-cql3/* @tgrabiec @nuivall @ptrsmrn
+cql3/* @tgrabiec @nuivall

 # COUNTERS
-counters* @nuivall @ptrsmrn
-tests/counter_test* @nuivall @ptrsmrn
+counters* @nuivall
+tests/counter_test* @nuivall

 # DOCS
 docs/* @annastuchlik @tzach
@@ -57,7 +57,6 @@ repair/* @tgrabiec @asias

 # SCHEMA MANAGEMENT
 db/schema_tables* @tgrabiec
-db/legacy_schema_migrator* @tgrabiec
 service/migration* @tgrabiec
 schema* @tgrabiec

--- a/.github/copilot-instructions.md
+++ b/.github/copilot-instructions.md
@@ -0,0 +1,97 @@
+# ScyllaDB Development Instructions
+
+## Project Context
+High-performance distributed NoSQL database. Core values: performance, correctness, readability.
+
+## Build System
+
+### Modern Build (configure.py + ninja)
+```bash
+# Configure (run once per mode, or when switching modes)
+./configure.py --mode=<mode>  # mode: dev, debug, release, sanitize
+
+# Build everything
+ninja <mode>-build  # e.g., ninja dev-build
+
+# Build Scylla binary only (sufficient for Python integration tests)
+ninja build/<mode>/scylla
+
+# Build specific test
+ninja build/<mode>/test/boost/<test_name>
+```
+
+## Running Tests
+
+### C++ Unit Tests
+```bash
+# Run all tests in a file
+./test.py --mode=<mode> test/<suite>/<test_name>.cc
+
+# Run a single test case from a file
+./test.py --mode=<mode> test/<suite>/<test_name>.cc::<test_case_name>
+
+# Examples
+./test.py --mode=dev test/boost/memtable_test.cc
+./test.py --mode=dev test/raft/raft_server_test.cc::test_check_abort_on_client_api
+```
+
+**Important:** 
+- Use full path with `.cc` extension (e.g., `test/boost/test_name.cc`, not `boost/test_name`)
+- To run a single test case, append `::<test_case_name>` to the file path
+- If you encounter permission issues with cgroup metric gathering, add `--no-gather-metrics` flag
+
+**Rebuilding Tests:**
+- test.py does NOT automatically rebuild when test source files are modified
+- Many tests are part of composite binaries (e.g., `combined_tests` in test/boost contains multiple test files)
+- To find which binary contains a test, check `configure.py` in the repository root (primary source) or `test/<suite>/CMakeLists.txt`
+- To rebuild a specific test binary: `ninja build/<mode>/test/<suite>/<binary_name>`
+- Examples: 
+  - `ninja build/dev/test/boost/combined_tests` (contains group0_voter_calculator_test.cc and others)
+  - `ninja build/dev/test/raft/replication_test` (standalone Raft test)
+
+### Python Integration Tests
+```bash
+# Only requires Scylla binary (full build usually not needed)
+ninja build/<mode>/scylla
+
+# Run all tests in a file
+./test.py --mode=<mode> <test_path>
+
+# Run a single test case from a file
+./test.py --mode=<mode> <test_path>::<test_function_name>
+
+# Examples
+./test.py --mode=dev alternator/
+./test.py --mode=dev cluster/test_raft_voters::test_raft_limited_voters_retain_coordinator
+
+# Optional flags
+./test.py --mode=dev cluster/test_raft_no_quorum -v  # Verbose output
+./test.py --mode=dev cluster/test_raft_no_quorum --repeat 5  # Repeat test 5 times
+```
+
+**Important:**
+- Use path without `.py` extension (e.g., `cluster/test_raft_no_quorum`, not `cluster/test_raft_no_quorum.py`)
+- To run a single test case, append `::<test_function_name>` to the file path
+- Add `-v` for verbose output
+- Add `--repeat <num>` to repeat a test multiple times
+- After modifying C++ source files, only rebuild the Scylla binary for Python tests - building the entire repository is unnecessary
+
+## Code Philosophy
+- Performance matters in hot paths (data read/write, inner loops)
+- Self-documenting code through clear naming
+- Comments explain "why", not "what"
+- Prefer standard library over custom implementations
+- Strive for simplicity and clarity, add complexity only when clearly justified
+- Question requests: don't blindly implement requests - evaluate trade-offs, identify issues, and suggest better alternatives when appropriate
+- Consider different approaches, weigh pros and cons, and recommend the best fit for the specific context
+
+## Test Philosophy
+- Performance matters. Tests should run as quickly as possible. Sleeps in the code are highly discouraged and should be avoided, to reduce run time and flakiness.
+- Stability matters. Tests should be stable. Run each new test at least 100 times and make sure it passes every time (use `--repeat 100 --max-failures 1`).
+- Unit tests should ideally test one thing and one thing only.
+- Tests for bug fixes should be run before the fix, to show the failure, and after the fix, to show that they now pass.
+- Tests for bug fixes should state in their comments which issue (GitHub or JIRA) they cover.
+- Tests in debug are always slower, so if needed, reduce number of iterations, rows, data used, cycles, etc. in debug mode.
+- Tests should strive to be repeatable, and not use random input that will make their results unpredictable.
+- Tests should consume as little resources as possible. Prefer running tests on a single node if it is sufficient, for example.
+
--- a/.github/instructions/cpp.instructions.md
+++ b/.github/instructions/cpp.instructions.md
@@ -0,0 +1,115 @@
+---
+applyTo: "**/*.{cc,hh}"
+---
+
+# C++ Guidelines
+
+**Important:** Always match the style and conventions of existing code in the file and directory.
+
+## Memory Management
+- Prefer stack allocation whenever possible
+- Use `std::unique_ptr` by default for dynamic allocations
+- `new`/`delete` are forbidden (use RAII)
+- Use `seastar::lw_shared_ptr` or `seastar::shared_ptr` for shared ownership within same shard
+- Use `seastar::foreign_ptr` for cross-shard sharing
+- Avoid `std::shared_ptr` except when interfacing with external C++ APIs
+- Avoid raw pointers except for non-owning references or C API interop
+
+## Seastar Asynchronous Programming
+- Use `seastar::future<T>` for all async operations
+- Prefer coroutines (`co_await`, `co_return`) over `.then()` chains for readability
+- Coroutines are preferred over `seastar::do_with()` for managing temporary state
+- In hot paths where futures are ready, continuations may be more efficient than coroutines
+- Chain futures with `.then()`, don't block with `.get()` (unless in `seastar::thread` context)
+- All I/O must be asynchronous (no blocking calls)
+- Use `seastar::gate` for shutdown coordination
+- Use `seastar::semaphore` for resource limiting (not `std::mutex`)
+- Break long loops with `maybe_yield()` to avoid reactor stalls
+
+## Coroutines
+```cpp
+seastar::future<T> func() {
+    auto result = co_await async_operation();
+    co_return result;
+}
+```
+
+## Error Handling
+- Throw exceptions for errors (futures propagate them automatically)
+- In data path: avoid exceptions, use `std::expected` (or `boost::outcome`) instead
+- Use standard exceptions (`std::runtime_error`, `std::invalid_argument`)
+- Database-specific: throw appropriate schema/query exceptions
+
+## Performance
+- Pass large objects by `const&` or `&&` (move semantics)
+- Use `std::string_view` for non-owning string references
+- Avoid copies: prefer move semantics
+- Use `utils::chunked_vector` instead of `std::vector` for large allocations (>128KB)
+- Minimize dynamic allocations in hot paths
+
+## Database-Specific Types
+- Use `schema_ptr` for schema references
+- Use `mutation` and `mutation_partition` for data modifications
+- Use `partition_key` and `clustering_key` for keys
+- Use `api::timestamp_type` for database timestamps
+- Use `gc_clock` for garbage collection timing
+
+## Style
+- C++23 standard (prefer modern features, especially coroutines)
+- Use `auto` when type is obvious from RHS
+- Avoid `auto` when it obscures the type
+- Use range-based for loops: `for (const auto& item : container)`
+- Use standard algorithms when they clearly simplify code (e.g., replacing 10-line loops)
+- Avoid chaining multiple algorithms if a straightforward loop is clearer
+- Mark functions and variables `const` whenever possible
+- Use scoped enums: `enum class` (not unscoped `enum`)
+
+## Headers
+- Use `#pragma once`
+- Include order: own header, C++ std, Seastar, Boost, project headers
+- Forward declare when possible
+- Never `using namespace` in headers (exception: `using namespace seastar` is globally available via `seastarx.hh`)
+
+## Documentation
+- Public APIs require clear documentation
+- Implementation details should be self-evident from code
+- Use `///` or Doxygen `/** */` for public documentation, `//` for implementation notes - follow the existing style
+
+## Naming
+- `snake_case` for most identifiers (classes, functions, variables, namespaces)
+- Template parameters: `CamelCase` (e.g., `template<typename ValueType>`)
+- Member variables: prefix with `_` (e.g., `int _count;`)
+- Structs (value-only): no `_` prefix on members
+- Constants and `constexpr`: `snake_case` (e.g., `static constexpr int max_size = 100;`)
+- Files: `.hh` for headers, `.cc` for source
+
+## Formatting
+- 4 spaces indentation, never tabs
+- Opening braces on same line as control structure (except namespaces)
+- Space after keywords: `if (`, `while (`, `return `
+- Whitespace around operators matches precedence: `*a + *b` not `* a+* b`
+- Line length: keep reasonable (<160 chars), use continuation lines with double indent if needed
+- Brace all nested scopes, even single statements
+- Minimal patches: only format code you modify, never reformat entire files
+
+## Logging
+- Use structured logging with appropriate levels: DEBUG, INFO, WARN, ERROR
+- Include context in log messages (e.g., request IDs)
+- Never log sensitive data (credentials, PII)
+
+## Forbidden
+- `malloc`/`free`
+- `printf` family (use logging or fmt)
+- Raw pointers for ownership
+- `using namespace` in headers
+- Blocking operations: `std::this_thread::sleep_for`, blocking `read()`, `std::mutex` (use Seastar equivalents)
+- `std::atomic` (reserved for very special circumstances only)
+- Macros (use `inline`, `constexpr`, or templates instead)
+
+## Testing
+When modifying existing code, follow TDD: create/update test first, then implement.
+- Examine existing tests for style and structure
+- Use Boost.Test framework
+- Use `SEASTAR_THREAD_TEST_CASE` for Seastar asynchronous tests
+- Aim for high code coverage, especially for new features and bug fixes
+- Maintain bisectability: all tests must pass in every commit. Mark failing tests with `BOOST_FAIL()` or similar, then fix in subsequent commit
--- a/.github/instructions/python.instructions.md
+++ b/.github/instructions/python.instructions.md
@@ -0,0 +1,51 @@
+---
+applyTo: "**/*.py"
+---
+
+# Python Guidelines
+
+**Important:** Match existing code style. Some directories (like `test/cqlpy` and `test/alternator`) prefer simplicity over type hints and docstrings.
+
+## Style
+- Follow PEP 8
+- Use type hints for function signatures (unless directory style omits them)
+- Use f-strings for formatting
+- Line length: 160 characters max
+- 4 spaces for indentation
+
+## Imports
+Order: standard library, third-party, local imports
+```python
+import os
+import sys
+
+import pytest
+from cassandra.cluster import Cluster
+
+from test.utils import setup_keyspace
+```
+
+Never use `from module import *`
+
+## Documentation
+All public functions/classes need docstrings (unless the current directory conventions omit them):
+```python
+def my_function(arg1: str, arg2: int) -> bool:
+    """
+    Brief summary of function purpose.
+
+    Args:
+        arg1: Description of first argument.
+        arg2: Description of second argument.
+
+    Returns:
+        Description of return value.
+    """
+    pass
+```
+
+## Testing Best Practices
+- Maintain bisectability: all tests must pass in every commit
+- Mark currently-failing tests with `@pytest.mark.xfail`, unmark when fixed
+- Use descriptive names that convey intent
+- Docstrings/comments should explain what the test verifies and why, and note whether it reproduces a specific issue or how it fits into the larger test suite
--- a/.github/scripts/auto-backport.py
+++ b/.github/scripts/auto-backport.py
@@ -62,7 +62,7 @@ def create_pull_request(repo, new_branch_name, base_branch_name, pr, backport_pr
        if is_draft:
            labels_to_add.append("conflicts")
            pr_comment = f"@{pr.user.login} - This PR was marked as draft because it has conflicts\n"
-            pr_comment += "Please resolve them and mark this PR as ready for review"
+            pr_comment += "Please resolve them and remove the 'conflicts' label. The PR will be made ready for review automatically."
            backport_pr.create_issue_comment(pr_comment)
        
        # Apply all labels at once if we have any
@@ -142,20 +142,31 @@ def backport(repo, pr, version, commits, backport_base_branch, is_collaborator):


 def with_github_keyword_prefix(repo, pr):
-    pattern = rf"(?:fix(?:|es|ed))\s*:?\s*(?:(?:(?:{repo.full_name})?#)|https://github\.com/{repo.full_name}/issues/)(\d+)"
-    match = re.findall(pattern, pr.body, re.IGNORECASE)
-    if not match:
-        for commit in pr.get_commits():
-            match = re.findall(pattern, commit.commit.message, re.IGNORECASE)
-            if match:
-                print(f'{pr.number} has a valid close reference in commit message {commit.sha}')
-                break
-    if not match:
-        print(f'No valid close reference for {pr.number}')
-        return False
-    else:
+    # GitHub issue pattern: #123, scylladb/scylladb#123, or full GitHub URLs
+    github_pattern = rf"(?:fix(?:|es|ed))\s*:?\s*(?:(?:(?:{repo.full_name})?#)|https://github\.com/{repo.full_name}/issues/)(\d+)"
+    
+    # JIRA issue pattern: PKG-92 or https://scylladb.atlassian.net/browse/PKG-92
+    jira_pattern = r"(?:fix(?:|es|ed))\s*:?\s*(?:(?:https://scylladb\.atlassian\.net/browse/)?([A-Z]+-\d+))"
+    
+    # Check PR body for GitHub issues
+    github_match = re.findall(github_pattern, pr.body, re.IGNORECASE)
+    # Check PR body for JIRA issues
+    jira_match = re.findall(jira_pattern, pr.body, re.IGNORECASE)
+    
+    match = github_match or jira_match
+
+    if match:
        return True

+    for commit in pr.get_commits():
+        github_match = re.findall(github_pattern, commit.commit.message, re.IGNORECASE)
+        jira_match = re.findall(jira_pattern, commit.commit.message, re.IGNORECASE)
+        if github_match or jira_match:
+            print(f'{pr.number} has a valid close reference in commit message {commit.sha}')
+            return True
+
+    print(f'No valid close reference for {pr.number}')
+    return False

 def main():
    args = parse_args()
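The close-reference check in `with_github_keyword_prefix` can be tried in isolation. The sketch below reuses the two patterns from the hunk above; the standalone function `has_close_reference`, the default repository name, and the use of `re.escape` are assumptions made for illustration, not part of the script.

```python
import re


def has_close_reference(text: str, repo_full_name: str = "scylladb/scylladb") -> bool:
    """Return True if text contains a 'Fixes' reference to a GitHub or JIRA issue.

    Hypothetical standalone helper; the real script checks the PR body first
    and then each commit message.
    """
    repo = re.escape(repo_full_name)  # assumption: escape the name before embedding it
    # GitHub issue: #123, <repo>#123, or a full issue URL
    github_pattern = rf"(?:fix(?:|es|ed))\s*:?\s*(?:(?:(?:{repo})?#)|https://github\.com/{repo}/issues/)(\d+)"
    # JIRA issue: PKG-92 or https://scylladb.atlassian.net/browse/PKG-92
    jira_pattern = r"(?:fix(?:|es|ed))\s*:?\s*(?:(?:https://scylladb\.atlassian\.net/browse/)?([A-Z]+-\d+))"
    github_match = re.findall(github_pattern, text, re.IGNORECASE)
    jira_match = re.findall(jira_pattern, text, re.IGNORECASE)
    return bool(github_match or jira_match)
```

Because both patterns are applied with `re.IGNORECASE`, the JIRA key also matches in lower case (for example `fixes pkg-92`).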
--- a/.github/workflows/backport-pr-fixes-validation.yaml
+++ b/.github/workflows/backport-pr-fixes-validation.yaml
@@ -18,7 +18,7 @@ jobs:
            
            // Regular expression pattern to check for "Fixes" prefix
            // Adjusted to dynamically insert the repository full name
-            const pattern = `Fixes:? (?:#|${repo.replace('/', '\\/')}#|https://github\\.com/${repo.replace('/', '\\/')}/issues/)(\\d+)`;
+            const pattern = `Fixes:? ((?:#|${repo.replace('/', '\\/')}#|https://github\\.com/${repo.replace('/', '\\/')}/issues/)(\\d+)|([A-Z]+-\\d+))`;
            const regex = new RegExp(pattern);
            
            if (!regex.test(body)) {
--- a/.github/workflows/call_jira_status_in_progress.yml
+++ b/.github/workflows/call_jira_status_in_progress.yml
@@ -1,12 +0,0 @@
-name: Call Jira Status In Progress
-
-on:
-  pull_request_target:
-    types: [opened]
-
-jobs:
-  call-jira-status-in-progress:
-    uses: scylladb/github-automation/.github/workflows/main_update_jira_status_to_in_progress.yml@main
-    secrets:
-      caller_jira_auth: ${{ secrets.USER_AND_KEY_FOR_JIRA_AUTOMATION }}
-
--- a/.github/workflows/call_jira_status_in_review.yml
+++ b/.github/workflows/call_jira_status_in_review.yml
@@ -1,12 +0,0 @@
-name: Call Jira Status In Review
-
-on:
-  pull_request_target:
-    types: [ready_for_review, review_requested]
-
-jobs:
-  call-jira-status-in-review:
-    uses: scylladb/github-automation/.github/workflows/main_update_jira_status_to_in_review.yml@main
-    secrets:
-      caller_jira_auth: ${{ secrets.USER_AND_KEY_FOR_JIRA_AUTOMATION }}
-
--- a/.github/workflows/call_jira_status_ready_for_merge.yml
+++ b/.github/workflows/call_jira_status_ready_for_merge.yml
@@ -1,12 +0,0 @@
-name: Call Jira Status Ready For Merge
-
-on:
-  pull_request_target:
-    types: [labeled]
-
-jobs:
-  call-jira-status-update:
-    uses: scylladb/github-automation/.github/workflows/main_update_jira_status_to_ready_for_merge.yml@main
-    secrets:
-      caller_jira_auth: ${{ secrets.USER_AND_KEY_FOR_JIRA_AUTOMATION }}
-
--- a/.github/workflows/call_jira_sync.yml
+++ b/.github/workflows/call_jira_sync.yml
@@ -0,0 +1,41 @@
+name: Sync Jira Based on PR Events
+
+on:
+  pull_request_target:
+    types: [opened, ready_for_review, review_requested, labeled, unlabeled, closed]
+
+permissions:
+  contents: read
+  pull-requests: write
+  issues: write
+
+jobs:
+  jira-sync-pr-opened:
+    if: github.event.action == 'opened'
+    uses: scylladb/github-automation/.github/workflows/main_jira_sync_pr_opened.yml@main
+    secrets:
+      caller_jira_auth: ${{ secrets.USER_AND_KEY_FOR_JIRA_AUTOMATION }}
+
+  jira-sync-in-review:
+    if: github.event.action == 'ready_for_review' || github.event.action == 'review_requested'
+    uses: scylladb/github-automation/.github/workflows/main_jira_sync_in_review.yml@main
+    secrets:
+      caller_jira_auth: ${{ secrets.USER_AND_KEY_FOR_JIRA_AUTOMATION }}
+
+  jira-sync-add-label:
+    if: github.event.action == 'labeled'
+    uses: scylladb/github-automation/.github/workflows/main_jira_sync_add_label.yml@main
+    secrets:
+      caller_jira_auth: ${{ secrets.USER_AND_KEY_FOR_JIRA_AUTOMATION }}
+
+  jira-status-remove-label:
+    if: github.event.action == 'unlabeled'
+    uses: scylladb/github-automation/.github/workflows/main_jira_sync_remove_label.yml@main
+    secrets:
+      caller_jira_auth: ${{ secrets.USER_AND_KEY_FOR_JIRA_AUTOMATION }}
+
+  jira-status-pr-closed:
+    if: github.event.action == 'closed' 
+    uses: scylladb/github-automation/.github/workflows/main_jira_sync_pr_closed.yml@main
+    secrets:
+      caller_jira_auth: ${{ secrets.USER_AND_KEY_FOR_JIRA_AUTOMATION }}
--- a/.github/workflows/call_sync_milestone_to_jira.yml
+++ b/.github/workflows/call_sync_milestone_to_jira.yml
@@ -0,0 +1,14 @@
+name: Call Jira release creation for new milestone
+
+on:
+  milestone:
+    types: [created]
+
+jobs:
+  sync-milestone-to-jira:
+    uses: scylladb/github-automation/.github/workflows/main_sync_milestone_to_jira_release.yml@main
+    with:
+      # Comma-separated list of Jira project keys
+      jira_project_keys: "SCYLLADB,CUSTOMER"
+    secrets:
+      caller_jira_auth: ${{ secrets.USER_AND_KEY_FOR_JIRA_AUTOMATION }}
--- a/.github/workflows/call_validate_pr_author_email.yml
+++ b/.github/workflows/call_validate_pr_author_email.yml
@@ -0,0 +1,13 @@
+name: validate_pr_author_email
+
+on:
+  pull_request_target:
+    types:
+      - opened
+      - synchronize
+      - reopened
+
+jobs:
+  validate_pr_author_email:
+    uses: scylladb/github-automation/.github/workflows/validate_pr_author_email.yml@main
+
--- a/.github/workflows/codespell.yaml
+++ b/.github/workflows/codespell.yaml
@@ -13,5 +13,5 @@ jobs:
      - uses: codespell-project/actions-codespell@master
        with:
          only_warn: 1
-          ignore_words_list: "ans,datas,fo,ser,ue,crate,nd,reenable,strat,stap,te,raison"
+          ignore_words_list: "ans,datas,fo,ser,ue,crate,nd,reenable,strat,stap,te,raison,iif,tread"
          skip: "./.git,./build,./tools,*.js,*.lock,./test,./licenses,./redis/lolwut.cc,*.svg"
--- a/.github/workflows/docs-pages.yaml
+++ b/.github/workflows/docs-pages.yaml
@@ -18,6 +18,8 @@ on:

 jobs:
  release:
+    permissions:
+      contents: write
    runs-on: ubuntu-latest
    steps:
      - name: Checkout
--- a/.github/workflows/docs-pr.yaml
+++ b/.github/workflows/docs-pr.yaml
@@ -2,6 +2,9 @@ name: "Docs / Build PR"
 # For more information,
 # see https://sphinx-theme.scylladb.com/stable/deployment/production.html#available-workflows

+permissions:
+  contents: read
+
 env:
  FLAG: ${{ github.repository == 'scylladb/scylla-enterprise' && 'enterprise' || 'opensource' }}

--- a/.github/workflows/docs-validate-metrics.yml
+++ b/.github/workflows/docs-validate-metrics.yml
@@ -0,0 +1,37 @@
+name: Docs / Validate metrics
+
+permissions:
+  contents: read
+
+on:
+  pull_request:
+    branches:
+      - master
+      - enterprise
+    paths:
+      - '**/*.cc'
+      - 'scripts/metrics-config.yml'
+      - 'scripts/get_description.py'
+      - 'docs/_ext/scylladb_metrics.py'
+
+jobs:
+  validate-metrics:
+    runs-on: ubuntu-latest
+    name: Check metrics documentation coverage
+
+    steps:
+    - name: Checkout code
+      uses: actions/checkout@v4
+      with:
+        submodules: true
+
+    - name: Set up Python
+      uses: actions/setup-python@v6
+      with:
+        python-version: '3.10'
+
+    - name: Install dependencies
+      run: pip install PyYAML
+
+    - name: Validate metrics
+      run: python3 scripts/get_description.py --validate -c scripts/metrics-config.yml
--- a/.github/workflows/read-toolchain.yaml
+++ b/.github/workflows/read-toolchain.yaml
@@ -10,6 +10,8 @@ on:
 jobs:
  read-toolchain:
    runs-on: ubuntu-latest
+    permissions:
+      contents: read
    outputs:
      image: ${{ steps.read.outputs.image }}
    steps:
--- a/.github/workflows/trigger-scylla-ci.yaml
+++ b/.github/workflows/trigger-scylla-ci.yaml
@@ -3,10 +3,13 @@ name: Trigger Scylla CI Route
 on:
  issue_comment:
    types: [created]
+  pull_request_target:
+    types:
+      - unlabeled

 jobs:
  trigger-jenkins:
-    if: github.event.comment.user.login != 'scylladbbot' && contains(github.event.comment.body, '@scylladbbot') && contains(github.event.comment.body, 'trigger-ci')
+    if: (github.event.comment.user.login != 'scylladbbot' && contains(github.event.comment.body, '@scylladbbot') && contains(github.event.comment.body, 'trigger-ci')) || github.event.label.name == 'conflicts'
    runs-on: ubuntu-latest
    steps:
      - name: Trigger Scylla-CI-Route Jenkins Job
--- a/.github/workflows/trigger_ci.yaml
+++ b/.github/workflows/trigger_ci.yaml
@@ -0,0 +1,242 @@
+name: Trigger next gating
+
+on:
+  pull_request_target:
+    types: [opened, reopened, synchronize]
+  issue_comment:
+    types: [created]
+    
+jobs:
+  trigger-ci:
+    runs-on: ubuntu-latest
+    steps:
+      - name: Dump GitHub context
+        env:
+          GITHUB_CONTEXT: ${{ toJson(github) }}
+        run: echo "$GITHUB_CONTEXT"
+      - name: Checkout PR code
+        uses: actions/checkout@v3
+        with:
+          fetch-depth: 0  # Needed to access full history
+          ref: ${{ github.event.pull_request.head.ref }}
+
+      - name: Fetch before commit if needed
+        run: |
+          if ! git cat-file -e ${{ github.event.before }} 2>/dev/null; then
+            echo "Fetching before commit ${{ github.event.before }}"
+            git fetch --depth=1 origin ${{ github.event.before }}
+          fi
+
+      - name: Compare commits for file changes
+        if: github.event.action == 'synchronize'
+        env:
+          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
+        run: |
+          echo "Base: ${{ github.event.before }}"
+          echo "Head: ${{ github.event.after }}"
+
+          TREE_BEFORE=$(git show -s --format=%T ${{ github.event.before }})
+          TREE_AFTER=$(git show -s --format=%T ${{ github.event.after }})
+          
+          echo "TREE_BEFORE=$TREE_BEFORE" >> $GITHUB_ENV
+          echo "TREE_AFTER=$TREE_AFTER" >> $GITHUB_ENV
+
+      - name: Check if last push has file changes
+        run: |
+          if [[ "${{ env.TREE_BEFORE }}" == "${{ env.TREE_AFTER }}" ]]; then
+            echo "No file changes detected in the last push, only commit message edit."
+            echo "has_file_changes=false" >> $GITHUB_ENV
+          else
+            echo "File changes detected in the last push."
+            echo "has_file_changes=true" >> $GITHUB_ENV
+          fi
+
+      - name: Rule 1 - Check PR draft or conflict status
+        run: |
+          # Check if PR is in draft mode
+          IS_DRAFT="${{ github.event.pull_request.draft }}"
+          
+          # Check if PR has 'conflicts' label
+          HAS_CONFLICT_LABEL="false"
+          LABELS='${{ toJson(github.event.pull_request.labels) }}'
+          if echo "$LABELS" | jq -r '.[].name' | grep -q "^conflicts$"; then
+            HAS_CONFLICT_LABEL="true"
+          fi
+          
+          # Set draft_or_conflict variable
+          if [[ "$IS_DRAFT" == "true" || "$HAS_CONFLICT_LABEL" == "true" ]]; then
+            draft_or_conflict="true"
+            echo "✅ Rule 1: PR is in draft mode or has conflict label - setting draft_or_conflict=true"
+          else
+            draft_or_conflict="false"
+            echo "✅ Rule 1: PR is ready and has no conflict label - setting draft_or_conflict=false"
+          fi
+          echo "draft_or_conflict=$draft_or_conflict" >> $GITHUB_ENV
+          echo "Draft status: $IS_DRAFT"
+          echo "Has conflict label: $HAS_CONFLICT_LABEL"
+          echo "Result: draft_or_conflict = $draft_or_conflict"
+
+      - name: Rule 2 - Check labels
+        run: |
+          # Check if PR has P0 or P1 labels
+          HAS_P0_P1_LABEL="false"
+          LABELS='${{ toJson(github.event.pull_request.labels) }}'
+          if echo "$LABELS" | jq -r '.[].name' | grep -E "^(P0|P1)$" > /dev/null; then
+            HAS_P0_P1_LABEL="true"
+          fi
+          
+          # Check if PR already has force_on_cloud label
+          HAS_FORCE_ON_CLOUD_LABEL="false"
+          if echo "$LABELS" | jq -r '.[].name' | grep -q "^force_on_cloud$"; then
+            HAS_FORCE_ON_CLOUD_LABEL="true"
+          fi
+          echo "HAS_FORCE_ON_CLOUD_LABEL=$HAS_FORCE_ON_CLOUD_LABEL" >> $GITHUB_ENV
+          
+          echo "Has P0/P1 label: $HAS_P0_P1_LABEL"
+          echo "Has force_on_cloud label: $HAS_FORCE_ON_CLOUD_LABEL"
+          
+          # Add force_on_cloud label if PR has P0/P1 and doesn't already have force_on_cloud
+          if [[ "$HAS_P0_P1_LABEL" == "true" && "$HAS_FORCE_ON_CLOUD_LABEL" == "false" ]]; then
+            echo "✅ Rule 2: PR has P0 or P1 label - adding force_on_cloud label"
+            curl -X POST \
+              -H "Authorization: token ${{ secrets.GITHUB_TOKEN }}" \
+              -H "Accept: application/vnd.github.v3+json" \
+              "https://api.github.com/repos/${{ github.repository }}/issues/${{ github.event.pull_request.number }}/labels" \
+              -d '{"labels":["force_on_cloud"]}'
+          elif [[ "$HAS_P0_P1_LABEL" == "true" && "$HAS_FORCE_ON_CLOUD_LABEL" == "true" ]]; then
+            echo "✅ Rule 2: PR has P0 or P1 label and already has force_on_cloud label - no action needed"
+          else
+            echo "✅ Rule 2: PR does not have P0 or P1 label - no force_on_cloud label needed"
+          fi
+
+          SKIP_UNIT_TEST_CUSTOM="false"
+          if echo "$LABELS" | jq -r '.[].name' | grep -q "^ci/skip_unit-tests_custom$"; then
+            SKIP_UNIT_TEST_CUSTOM="true"
+          fi
+          echo "SKIP_UNIT_TEST_CUSTOM=$SKIP_UNIT_TEST_CUSTOM" >> $GITHUB_ENV
+
+      - name: Rule 3 - Analyze changed files and set build requirements
+        run: |
+          # Get list of changed files
+          CHANGED_FILES=$(git diff --name-only ${{ github.event.pull_request.base.sha }} ${{ github.event.pull_request.head.sha }})
+          echo "Changed files:"
+          echo "$CHANGED_FILES"
+          echo ""
+          
+          # Initialize all requirements to false
+          REQUIRE_BUILD="false"
+          REQUIRE_DTEST="false"
+          REQUIRE_UNITTEST="false"
+          REQUIRE_ARTIFACTS="false"
+          REQUIRE_SCYLLA_GDB="false"
+          
+          # Check each file against patterns
+          while IFS= read -r file; do
+            if [[ -n "$file" ]]; then
+              echo "Checking file: $file"
+              
+              # Build pattern: ^(?!scripts\/pull_github_pr.sh).*$
+              # Everything except scripts/pull_github_pr.sh
+              if [[ "$file" != "scripts/pull_github_pr.sh" ]]; then
+                REQUIRE_BUILD="true"
+                echo "  ✓ Matches build pattern"
+              fi
+              
+              # Dtest pattern: ^(?!test(.py|\/)|dist\/docker\/|dist\/common\/scripts\/).*$
+              # Everything except test files, dist/docker/, dist/common/scripts/
+              if [[ ! "$file" =~ ^test(\.py|/).*$ ]] && [[ ! "$file" =~ ^dist/docker/.*$ ]] && [[ ! "$file" =~ ^dist/common/scripts/.*$ ]]; then
+                REQUIRE_DTEST="true"
+                echo "  ✓ Matches dtest pattern"
+              fi
+              
+              # Unittest pattern: ^(?!dist\/docker\/|dist\/common\/scripts).*$
+              # Everything except dist/docker/, dist/common/scripts/
+              if [[ ! "$file" =~ ^dist/docker/.*$ ]] && [[ ! "$file" =~ ^dist/common/scripts.*$ ]]; then
+                REQUIRE_UNITTEST="true"
+                echo "  ✓ Matches unittest pattern"
+              fi
+              
+              # Artifacts pattern: ^(?:dist|tools\/toolchain).*$
+              # Files starting with dist or tools/toolchain
+              if [[ "$file" =~ ^dist.*$ ]] || [[ "$file" =~ ^tools/toolchain.*$ ]]; then
+                REQUIRE_ARTIFACTS="true"
+                echo "  ✓ Matches artifacts pattern"
+              fi
+              
+              # Scylla GDB pattern: ^(scylla-gdb.py).*$
+              # Files starting with scylla-gdb.py
+              if [[ "$file" =~ ^scylla-gdb\.py.*$ ]]; then
+                REQUIRE_SCYLLA_GDB="true"
+                echo "  ✓ Matches scylla_gdb pattern"
+              fi
+            fi
+          done <<< "$CHANGED_FILES"
+          
+          # Set environment variables
+          echo "requireBuild=$REQUIRE_BUILD" >> $GITHUB_ENV
+          echo "requireDtest=$REQUIRE_DTEST" >> $GITHUB_ENV
+          echo "requireUnittest=$REQUIRE_UNITTEST" >> $GITHUB_ENV
+          echo "requireArtifacts=$REQUIRE_ARTIFACTS" >> $GITHUB_ENV
+          echo "requireScyllaGdb=$REQUIRE_SCYLLA_GDB" >> $GITHUB_ENV
+          
+          echo ""
+          echo "✅ Rule 3: File analysis complete"
+          echo "Build required: $REQUIRE_BUILD"
+          echo "Dtest required: $REQUIRE_DTEST"
+          echo "Unittest required: $REQUIRE_UNITTEST"
+          echo "Artifacts required: $REQUIRE_ARTIFACTS"
+          echo "Scylla GDB required: $REQUIRE_SCYLLA_GDB"
+
+      - name: Determine Jenkins Job Name
+        run: |
+          if [[ "${{ github.ref_name }}" == "next" ]]; then
+            FOLDER_NAME="scylla-master"
+          elif [[ "${{ github.ref_name }}" == "next-enterprise" ]]; then
+            FOLDER_NAME="scylla-enterprise"
+          else
+            VERSION=$(echo "${{ github.ref_name }}" | awk -F'-' '{print $2}')
+            if [[ "$VERSION" =~ ^202[0-4]\.[0-9]+$ ]]; then
+              FOLDER_NAME="enterprise-$VERSION"
+            elif [[ "$VERSION" =~ ^[0-9]+\.[0-9]+$ ]]; then
+              FOLDER_NAME="scylla-$VERSION"
+            fi
+          fi
+          echo "JOB_NAME=${FOLDER_NAME}/job/scylla-ci" >> $GITHUB_ENV
+
+      - name: Trigger Jenkins Job
+        if: env.draft_or_conflict == 'false' && env.has_file_changes == 'true' && (github.event.action == 'opened' || github.event.action == 'reopened')
+        env:
+          JENKINS_USER: ${{ secrets.JENKINS_USERNAME }}
+          JENKINS_API_TOKEN: ${{ secrets.JENKINS_TOKEN }}
+          JENKINS_URL: "https://jenkins.scylladb.com"
+          SLACK_BOT_TOKEN: ${{ secrets.SLACK_BOT_TOKEN }}
+        run: |
+          PR_NUMBER=${{ github.event.pull_request.number || github.event.issue.number }}
+          PR_REPO_NAME=${{ github.event.repository.full_name }}
+          echo "Triggering Jenkins Job: $JOB_NAME"
+          curl -X POST \
+            "$JENKINS_URL/job/$JOB_NAME/buildWithParameters" \
+            --data-urlencode "PR_NUMBER=$PR_NUMBER" \
+            --data-urlencode "RUN_DTEST=$requireDtest" \
+            --data-urlencode "RUN_ONLY_SCYLLA_GDB=$requireScyllaGdb" \
+            --data-urlencode "RUN_UNIT_TEST=$requireUnittest" \
+            --data-urlencode "FORCE_ON_CLOUD=$HAS_FORCE_ON_CLOUD_LABEL" \
+            --data-urlencode "SKIP_UNIT_TEST_CUSTOM=$SKIP_UNIT_TEST_CUSTOM" \
+            --data-urlencode "RUN_ARTIFACT_TESTS=$requireArtifacts" \
+            --fail \
+            --user "$JENKINS_USER:$JENKINS_API_TOKEN" \
+            -i -v
+  trigger-ci-via-comment:
+    if: github.event.comment.user.login != 'scylladbbot' && contains(github.event.comment.body, '@scylladbbot') && contains(github.event.comment.body, 'trigger-ci')
+    runs-on: ubuntu-latest
+    steps:
+      - name: Trigger Scylla-CI Jenkins Job
+        env:
+          JENKINS_USER: ${{ secrets.JENKINS_USERNAME }}
+          JENKINS_API_TOKEN: ${{ secrets.JENKINS_TOKEN }}
+          JENKINS_URL: "https://jenkins.scylladb.com"
+        run: |
+          PR_NUMBER=${{ github.event.issue.number }}
+          : "${JOB_NAME:?JOB_NAME is not set in this job; it is only exported by the job that determines the Jenkins job name}"
+          curl -X POST "$JENKINS_URL/job/$JOB_NAME/buildWithParameters?PR_NUMBER=$PR_NUMBER" \
+          --user "$JENKINS_USER:$JENKINS_API_TOKEN" --fail -i -v
--- a/CMakeLists.txt
+++ b/CMakeLists.txt
@@ -116,6 +116,7 @@ list(APPEND absl_cxx_flags
 if(CMAKE_CXX_COMPILER_ID STREQUAL "GNU")
    list(APPEND ABSL_GCC_FLAGS ${absl_cxx_flags})
 elseif(CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
+    list(APPEND absl_cxx_flags "-Wno-deprecated-builtins")
    list(APPEND ABSL_LLVM_FLAGS ${absl_cxx_flags})
 endif()
 set(ABSL_DEFAULT_LINKOPTS
@@ -163,7 +164,45 @@ file(MAKE_DIRECTORY "${scylla_gen_build_dir}")
 include(add_version_library)
 generate_scylla_version()

+option(Scylla_USE_PRECOMPILED_HEADER "Use precompiled header for Scylla" ON)
+add_library(scylla-precompiled-header STATIC exported_templates.cc)
+target_link_libraries(scylla-precompiled-header PRIVATE
+    absl::headers
+    absl::btree
+    absl::hash
+    absl::raw_hash_set
+    Seastar::seastar
+    Snappy::snappy
+    systemd
+    ZLIB::ZLIB
+    lz4::lz4_static
+    zstd::zstd_static)
+if (Scylla_USE_PRECOMPILED_HEADER)
+  set(Scylla_USE_PRECOMPILED_HEADER_USE ON)
+  find_program(DISTCC_EXEC NAMES distcc OPTIONAL)
+  if (DISTCC_EXEC)
+    if(DEFINED ENV{DISTCC_HOSTS})
+      set(Scylla_USE_PRECOMPILED_HEADER_USE OFF)
+      message(STATUS "Disabling precompiled header usage because distcc exists and DISTCC_HOSTS is set, assuming you're using distributed compilation.")
+    else()
+      file(REAL_PATH "~/.distcc/hosts" DIST_CC_HOSTS_PATH EXPAND_TILDE)
+      if (EXISTS ${DIST_CC_HOSTS_PATH})
+        set(Scylla_USE_PRECOMPILED_HEADER_USE OFF)
+        message(STATUS "Disabling precompiled header usage because distcc and ~/.distcc/hosts exist, assuming you're using distributed compilation.")
+      endif()
+    endif()
+  endif()
+  if (Scylla_USE_PRECOMPILED_HEADER_USE)
+    message(STATUS "Using precompiled header for Scylla - remember to add `sloppiness = pch_defines,time_macros` to ccache.conf, if you're using ccache.")
+    target_precompile_headers(scylla-precompiled-header PRIVATE "stdafx.hh")
+    target_compile_definitions(scylla-precompiled-header PRIVATE SCYLLA_USE_PRECOMPILED_HEADER)
+  endif()
+else()
+  set(Scylla_USE_PRECOMPILED_HEADER_USE OFF)
+endif()
+
 add_library(scylla-main STATIC)
+
 target_sources(scylla-main
  PRIVATE
    absl-flat_hash_map.cc
@@ -208,6 +247,7 @@ target_link_libraries(scylla-main
    ZLIB::ZLIB
    lz4::lz4_static
    zstd::zstd_static
+    scylla-precompiled-header
 )

 option(Scylla_CHECK_HEADERS
--- a/alternator/CMakeLists.txt
+++ b/alternator/CMakeLists.txt
@@ -18,6 +18,7 @@ target_sources(alternator
    consumed_capacity.cc
    ttl.cc
    parsed_expression_cache.cc
+    http_compression.cc
    ${cql_grammar_srcs})
 target_include_directories(alternator
  PUBLIC
@@ -34,5 +35,8 @@ target_link_libraries(alternator
    idl
    absl::headers)

+if (Scylla_USE_PRECOMPILED_HEADER_USE)
+  target_precompile_headers(alternator REUSE_FROM scylla-precompiled-header)
+endif()
 check_headers(check-headers alternator
  GLOB_RECURSE ${CMAKE_CURRENT_SOURCE_DIR}/*.hh)
--- a/alternator/conditions.cc
+++ b/alternator/conditions.cc
@@ -42,7 +42,7 @@ comparison_operator_type get_comparison_operator(const rjson::value& comparison_
    if (!comparison_operator.IsString()) {
        throw api_error::validation(fmt::format("Invalid comparison operator definition {}", rjson::print(comparison_operator)));
    }
-    std::string op = comparison_operator.GetString();
+    std::string op = rjson::to_string(comparison_operator);
    auto it = ops.find(op);
    if (it == ops.end()) {
        throw api_error::validation(fmt::format("Unsupported comparison operator {}", op));
@@ -377,8 +377,8 @@ bool check_compare(const rjson::value* v1, const rjson::value& v2, const Compara
        return cmp(unwrap_number(*v1, cmp.diagnostic), unwrap_number(v2, cmp.diagnostic));
    }
    if (kv1.name == "S") {
-        return cmp(std::string_view(kv1.value.GetString(), kv1.value.GetStringLength()),
-                   std::string_view(kv2.value.GetString(), kv2.value.GetStringLength()));
+        return cmp(rjson::to_string_view(kv1.value),
+                   rjson::to_string_view(kv2.value));
    }
    if (kv1.name == "B") {
        auto d_kv1 = unwrap_bytes(kv1.value, v1_from_query);
@@ -470,9 +470,9 @@ static bool check_BETWEEN(const rjson::value* v, const rjson::value& lb, const r
        return check_BETWEEN(unwrap_number(*v, diag), unwrap_number(lb, diag), unwrap_number(ub, diag), bounds_from_query);
    }
    if (kv_v.name == "S") {
-        return check_BETWEEN(std::string_view(kv_v.value.GetString(), kv_v.value.GetStringLength()),
-                             std::string_view(kv_lb.value.GetString(), kv_lb.value.GetStringLength()),
-                             std::string_view(kv_ub.value.GetString(), kv_ub.value.GetStringLength()),
+        return check_BETWEEN(rjson::to_string_view(kv_v.value),
+                             rjson::to_string_view(kv_lb.value),
+                             rjson::to_string_view(kv_ub.value),
                             bounds_from_query);
    }
    if (kv_v.name == "B") {
--- a/alternator/consumed_capacity.cc
+++ b/alternator/consumed_capacity.cc
@@ -8,6 +8,8 @@

 #include "consumed_capacity.hh"
 #include "error.hh"
+#include "utils/rjson.hh"
+#include <fmt/format.h>

 namespace alternator {

@@ -32,12 +34,12 @@ bool consumed_capacity_counter::should_add_capacity(const rjson::value& request)
    if (!return_consumed->IsString()) {
        throw api_error::validation("Non-string ReturnConsumedCapacity field in request");
    }
-    std::string consumed = return_consumed->GetString();
+    std::string_view consumed = rjson::to_string_view(*return_consumed);
    if (consumed == "INDEXES") {
        throw api_error::validation("INDEXES consumed capacity is not supported");
    }
    if (consumed != "TOTAL") {
-        throw api_error::validation("Unknown consumed capacity "+ consumed);
+        throw api_error::validation(fmt::format("Unknown consumed capacity {}", consumed));
    }
    return true;
 }
--- a/alternator/controller.cc
+++ b/alternator/controller.cc
@@ -28,6 +28,7 @@ static logging::logger logger("alternator_controller");
 controller::controller(
        sharded<gms::gossiper>& gossiper,
        sharded<service::storage_proxy>& proxy,
+        sharded<service::storage_service>& ss,
        sharded<service::migration_manager>& mm,
        sharded<db::system_distributed_keyspace>& sys_dist_ks,
        sharded<cdc::generation_service>& cdc_gen_svc,
@@ -39,6 +40,7 @@ controller::controller(
    : protocol_server(sg)
    , _gossiper(gossiper)
    , _proxy(proxy)
+    , _ss(ss)
    , _mm(mm)
    , _sys_dist_ks(sys_dist_ks)
    , _cdc_gen_svc(cdc_gen_svc)
@@ -89,7 +91,7 @@ future<> controller::start_server() {
        auto get_timeout_in_ms = [] (const db::config& cfg) -> utils::updateable_value<uint32_t> {
            return cfg.alternator_timeout_in_ms;
        };
-        _executor.start(std::ref(_gossiper), std::ref(_proxy), std::ref(_mm), std::ref(_sys_dist_ks),
+        _executor.start(std::ref(_gossiper), std::ref(_proxy), std::ref(_ss), std::ref(_mm), std::ref(_sys_dist_ks),
                        sharded_parameter(get_cdc_metadata, std::ref(_cdc_gen_svc)), _ssg.value(),
                        sharded_parameter(get_timeout_in_ms, std::ref(_config))).get();
        _server.start(std::ref(_executor), std::ref(_proxy), std::ref(_gossiper), std::ref(_auth_service), std::ref(_sl_controller)).get();
@@ -103,11 +105,23 @@ future<> controller::start_server() {
            alternator_port = _config.alternator_port();
            _listen_addresses.push_back({addr, *alternator_port});
        }
+        std::optional<uint16_t> alternator_port_proxy_protocol;
+        if (_config.alternator_port_proxy_protocol()) {
+            alternator_port_proxy_protocol = _config.alternator_port_proxy_protocol();
+            _listen_addresses.push_back({addr, *alternator_port_proxy_protocol});
+        }
        std::optional<uint16_t> alternator_https_port;
+        std::optional<uint16_t> alternator_https_port_proxy_protocol;
        std::optional<tls::credentials_builder> creds;
-        if (_config.alternator_https_port()) {
-            alternator_https_port = _config.alternator_https_port();
-            _listen_addresses.push_back({addr, *alternator_https_port});
+        if (_config.alternator_https_port() || _config.alternator_https_port_proxy_protocol()) {
+            if (_config.alternator_https_port()) {
+                alternator_https_port = _config.alternator_https_port();
+                _listen_addresses.push_back({addr, *alternator_https_port});
+            }
+            if (_config.alternator_https_port_proxy_protocol()) {
+                alternator_https_port_proxy_protocol = _config.alternator_https_port_proxy_protocol();
+                _listen_addresses.push_back({addr, *alternator_https_port_proxy_protocol});
+            }
            creds.emplace();
            auto opts = _config.alternator_encryption_options();
            if (opts.empty()) {
@@ -133,19 +147,29 @@ future<> controller::start_server() {
            }
        }
        _server.invoke_on_all(
-                [this, addr, alternator_port, alternator_https_port, creds = std::move(creds)] (server& server) mutable {
-            return server.init(addr, alternator_port, alternator_https_port, creds,
+                [this, addr, alternator_port, alternator_https_port, alternator_port_proxy_protocol, alternator_https_port_proxy_protocol, creds = std::move(creds)] (server& server) mutable {
+            return server.init(addr, alternator_port, alternator_https_port, alternator_port_proxy_protocol, alternator_https_port_proxy_protocol, creds,
                    _config.alternator_enforce_authorization,
+                    _config.alternator_warn_authorization,
                    _config.alternator_max_users_query_size_in_trace_output,
                    &_memory_limiter.local().get_semaphore(),
                    _config.max_concurrent_requests_per_shard);
-        }).handle_exception([this, addr, alternator_port, alternator_https_port] (std::exception_ptr ep) {
-            logger.error("Failed to set up Alternator HTTP server on {} port {}, TLS port {}: {}",
-                    addr, alternator_port ? std::to_string(*alternator_port) : "OFF", alternator_https_port ? std::to_string(*alternator_https_port) : "OFF", ep);
+        }).handle_exception([this, addr, alternator_port, alternator_https_port, alternator_port_proxy_protocol, alternator_https_port_proxy_protocol] (std::exception_ptr ep) {
+            logger.error("Failed to set up Alternator HTTP server on {} port {}, TLS port {}, proxy-protocol port {}, TLS proxy-protocol port {}: {}",
+                    addr,
+                    alternator_port ? std::to_string(*alternator_port) : "OFF",
+                    alternator_https_port ? std::to_string(*alternator_https_port) : "OFF",
+                    alternator_port_proxy_protocol ? std::to_string(*alternator_port_proxy_protocol) : "OFF",
+                    alternator_https_port_proxy_protocol ? std::to_string(*alternator_https_port_proxy_protocol) : "OFF",
+                    ep);
            return stop_server().then([ep = std::move(ep)] { return make_exception_future<>(ep); });
-        }).then([addr, alternator_port, alternator_https_port] {
-            logger.info("Alternator server listening on {}, HTTP port {}, HTTPS port {}",
-                    addr, alternator_port ? std::to_string(*alternator_port) : "OFF", alternator_https_port ? std::to_string(*alternator_https_port) : "OFF");
+        }).then([addr, alternator_port, alternator_https_port, alternator_port_proxy_protocol, alternator_https_port_proxy_protocol] {
+            logger.info("Alternator server listening on {}, HTTP port {}, HTTPS port {}, proxy-protocol port {}, TLS proxy-protocol port {}",
+                    addr,
+                    alternator_port ? std::to_string(*alternator_port) : "OFF",
+                    alternator_https_port ? std::to_string(*alternator_https_port) : "OFF",
+                    alternator_port_proxy_protocol ? std::to_string(*alternator_port_proxy_protocol) : "OFF",
+                    alternator_https_port_proxy_protocol ? std::to_string(*alternator_https_port_proxy_protocol) : "OFF");
        }).get();
    });
 }
@@ -168,7 +192,7 @@ future<> controller::request_stop_server() {
    });
 }

-future<utils::chunked_vector<client_data>> controller::get_client_data() {
+future<utils::chunked_vector<foreign_ptr<std::unique_ptr<client_data>>>> controller::get_client_data() {
    return _server.local().get_client_data();
 }

--- a/alternator/controller.hh
+++ b/alternator/controller.hh
@@ -15,6 +15,7 @@

 namespace service {
 class storage_proxy;
+class storage_service;
 class migration_manager;
 class memory_limiter;
 }
@@ -57,6 +58,7 @@ class server;
 class controller : public protocol_server {
    sharded<gms::gossiper>& _gossiper;
    sharded<service::storage_proxy>& _proxy;
+    sharded<service::storage_service>& _ss;
    sharded<service::migration_manager>& _mm;
    sharded<db::system_distributed_keyspace>& _sys_dist_ks;
    sharded<cdc::generation_service>& _cdc_gen_svc;
@@ -74,6 +76,7 @@ public:
    controller(
        sharded<gms::gossiper>& gossiper,
        sharded<service::storage_proxy>& proxy,
+        sharded<service::storage_service>& ss,
        sharded<service::migration_manager>& mm,
        sharded<db::system_distributed_keyspace>& sys_dist_ks,
        sharded<cdc::generation_service>& cdc_gen_svc,
@@ -93,7 +96,7 @@ public:
    // This virtual function is called (on each shard separately) when the
    // virtual table "system.clients" is read. It is expected to generate a
    // list of clients connected to this server (on this shard).
-    virtual future<utils::chunked_vector<client_data>> get_client_data() override;
+    virtual future<utils::chunked_vector<foreign_ptr<std::unique_ptr<client_data>>>> get_client_data() override;
 };

 }
--- a/alternator/executor.cc
+++ b/alternator/executor.cc
--- a/alternator/executor.hh
+++ b/alternator/executor.hh
@@ -17,11 +17,13 @@
 #include "service/client_state.hh"
 #include "service_permit.hh"
 #include "db/timeout_clock.hh"
+#include "db/config.hh"

 #include "alternator/error.hh"
 #include "stats.hh"
 #include "utils/rjson.hh"
 #include "utils/updateable_value.hh"
+#include "utils/simple_value_with_expiry.hh"

 #include "tracing/trace_state.hh"

@@ -40,6 +42,8 @@ namespace cql3::selection {

 namespace service {
    class storage_proxy;
+    class cas_shard;
+    class storage_service;
 }

 namespace cdc {
@@ -56,7 +60,9 @@ class schema_builder;

 namespace alternator {

+enum class table_status;
 class rmw_operation;
+class put_or_delete_item;

 schema_ptr get_table(service::storage_proxy& proxy, const rjson::value& request);
 bool is_alternator_keyspace(const sstring& ks_name);
@@ -134,17 +140,24 @@ class expression_cache;

 class executor : public peering_sharded_service<executor> {
    gms::gossiper& _gossiper;
+    service::storage_service& _ss;
    service::storage_proxy& _proxy;
    service::migration_manager& _mm;
    db::system_distributed_keyspace& _sdks;
    cdc::metadata& _cdc_metadata;
    utils::updateable_value<bool> _enforce_authorization;
+    utils::updateable_value<bool> _warn_authorization;
    // An smp_service_group to be used for limiting the concurrency when
    // forwarding Alternator request between shards - if necessary for LWT.
    smp_service_group _ssg;

    std::unique_ptr<parsed::expression_cache> _parsed_expression_cache;

+    struct describe_table_info_manager;
+    std::unique_ptr<describe_table_info_manager> _describe_table_info_manager;
+
+    future<> cache_newly_calculated_size_on_all_shards(schema_ptr schema, std::uint64_t size_in_bytes, std::chrono::nanoseconds ttl);
+    future<> fill_table_size(rjson::value &table_description, schema_ptr schema, bool deleting);
 public:
    using client_state = service::client_state;
    // request_return_type is the return type of the executor methods, which
@@ -170,6 +183,7 @@ public:

    executor(gms::gossiper& gossiper,
             service::storage_proxy& proxy,
+             service::storage_service& ss,
             service::migration_manager& mm,
             db::system_distributed_keyspace& sdks,
             cdc::metadata& cdc_metadata,
@@ -217,6 +231,18 @@ private:
    friend class rmw_operation;

    static void describe_key_schema(rjson::value& parent, const schema&, std::unordered_map<std::string,std::string> * = nullptr, const std::map<sstring, sstring> *tags = nullptr);
+    future<rjson::value> fill_table_description(schema_ptr schema, table_status tbl_status, service::client_state& client_state, tracing::trace_state_ptr trace_state, service_permit permit);
+    future<executor::request_return_type> create_table_on_shard0(service::client_state&& client_state, tracing::trace_state_ptr trace_state, rjson::value request, bool enforce_authorization, bool warn_authorization, const db::tablets_mode_t::mode tablets_mode);
+
+    future<> do_batch_write(
+        std::vector<std::pair<schema_ptr, put_or_delete_item>> mutation_builders,
+        service::client_state& client_state,
+        tracing::trace_state_ptr trace_state,
+        service_permit permit);
+
+    future<> cas_write(schema_ptr schema, service::cas_shard cas_shard, const dht::decorated_key& dk,
+        const std::vector<put_or_delete_item>& mutation_builders, service::client_state& client_state,
+        tracing::trace_state_ptr trace_state, service_permit permit);

 public:
    static void describe_key_schema(rjson::value& parent, const schema& schema, std::unordered_map<std::string,std::string>&, const std::map<sstring, sstring> *tags = nullptr);
@@ -264,7 +290,7 @@ bool is_big(const rjson::value& val, int big_size = 100'000);
 // Check CQL's Role-Based Access Control (RBAC) permission (MODIFY,
 // SELECT, DROP, etc.) on the given table. When permission is denied an
 // appropriate user-readable api_error::access_denied is thrown.
-future<> verify_permission(bool enforce_authorization, const service::client_state&, const schema_ptr&, auth::permission);
+future<> verify_permission(bool enforce_authorization, bool warn_authorization, const service::client_state&, const schema_ptr&, auth::permission, alternator::stats& stats);

 /**
 * Make return type for serializing the object "streamed",
--- a/alternator/http_compression.cc
+++ b/alternator/http_compression.cc
@@ -0,0 +1,301 @@
+/*
+ * Copyright 2025-present ScyllaDB
+ */
+
+/*
+ * SPDX-License-Identifier: LicenseRef-ScyllaDB-Source-Available-1.0
+ */
+
+#include "alternator/http_compression.hh"
+#include "alternator/server.hh"
+#include <seastar/coroutine/maybe_yield.hh>
+#include <zlib.h>
+
+static logging::logger slogger("alternator-http-compression");
+
+namespace alternator {
+
+
+static constexpr size_t compressed_buffer_size = 1024;
+class zlib_compressor {
+    z_stream _zs;
+    temporary_buffer<char> _output_buf;
+    noncopyable_function<future<>(temporary_buffer<char>&&)> _write_func;
+public:
+    zlib_compressor(bool gzip, int compression_level, noncopyable_function<future<>(temporary_buffer<char>&&)> write_func)
+     : _write_func(std::move(write_func)) {
+        memset(&_zs, 0, sizeof(_zs));
+        if (deflateInit2(&_zs, std::clamp(compression_level, Z_NO_COMPRESSION, Z_BEST_COMPRESSION), Z_DEFLATED,
+                (gzip ? 16 : 0) + MAX_WBITS, 8, Z_DEFAULT_STRATEGY) != Z_OK) {
+            // Should only happen if memory allocation fails
+            throw std::bad_alloc();
+        }
+    }
+    ~zlib_compressor() {
+        deflateEnd(&_zs);
+    }
+    future<> close() {
+        return compress(nullptr, 0, true);
+    }
+
+    future<> compress(const char* buf, size_t len, bool is_last_chunk = false) {
+        _zs.next_in = reinterpret_cast<unsigned char*>(const_cast<char*>(buf));
+        _zs.avail_in = (uInt) len;
+        int mode = is_last_chunk ? Z_FINISH : Z_NO_FLUSH;
+        while(_zs.avail_in > 0 || is_last_chunk) {
+            co_await coroutine::maybe_yield();
+            if (_output_buf.empty()) {
+                if (is_last_chunk) {
+                    uint32_t max_buffer_size = 0;
+                    deflatePending(&_zs, &max_buffer_size, nullptr);
+                    max_buffer_size += deflateBound(&_zs, _zs.avail_in) + 1;
+                    _output_buf = temporary_buffer<char>(std::min(compressed_buffer_size, (size_t) max_buffer_size));
+                } else {
+                    _output_buf = temporary_buffer<char>(compressed_buffer_size);
+                }
+                _zs.next_out = reinterpret_cast<unsigned char*>(_output_buf.get_write());
+                _zs.avail_out = static_cast<uInt>(_output_buf.size());
+            }
+            int e = deflate(&_zs, mode);
+            if (e < Z_OK) {
+                throw api_error::internal("Error during compression of response body");
+            }
+            if (e == Z_STREAM_END || _zs.avail_out < compressed_buffer_size / 4) {
+                _output_buf.trim(_output_buf.size() - _zs.avail_out);
+                co_await _write_func(std::move(_output_buf));
+                if (e == Z_STREAM_END) {
+                    break;
+                }
+            }
+        }
+    }
+};
+
+// Helper string_view functions for parsing Accept-Encoding header
+struct case_insensitive_cmp_sv {
+    bool operator()(std::string_view s1, std::string_view s2) const {
+        return std::equal(s1.begin(), s1.end(), s2.begin(), s2.end(),
+            [](char a, char b) { return ::tolower(static_cast<unsigned char>(a)) == ::tolower(static_cast<unsigned char>(b)); });
+    }
+};
+static inline std::string_view trim_left(std::string_view sv) {
+    while (!sv.empty() && std::isspace(static_cast<unsigned char>(sv.front())))
+        sv.remove_prefix(1);
+    return sv;
+}
+static inline std::string_view trim_right(std::string_view sv) {
+    while (!sv.empty() && std::isspace(static_cast<unsigned char>(sv.back())))
+        sv.remove_suffix(1);
+    return sv;
+}
+static inline std::string_view trim(std::string_view sv) {
+    return trim_left(trim_right(sv));
+}
+
+inline std::vector<std::string_view> split(std::string_view text, char separator) {
+    std::vector<std::string_view> tokens;
+    if (text.empty()) {
+        return tokens;
+    }
+
+    while (true) {
+        auto pos = text.find_first_of(separator);
+        if (pos != std::string_view::npos) {
+            tokens.emplace_back(text.data(), pos);
+            text.remove_prefix(pos + 1);
+        } else {
+            tokens.emplace_back(text);
+            break;
+        }
+    }
+    return tokens;
+}
+
+constexpr response_compressor::compression_type response_compressor::get_compression_type(std::string_view encoding) {
+    for (size_t i = 0; i < static_cast<size_t>(compression_type::count); ++i) {
+        if (case_insensitive_cmp_sv{}(encoding, compression_names[i])) {
+            return static_cast<compression_type>(i);
+        }
+    }
+    return compression_type::unknown;
+}
+
+response_compressor::compression_type response_compressor::find_compression(std::string_view accept_encoding, size_t response_size) {
+    std::optional<float> ct_q[static_cast<size_t>(compression_type::count)];
+    ct_q[static_cast<size_t>(compression_type::none)] = std::numeric_limits<float>::min(); // enabled, but lowest priority
+    compression_type selected_ct = compression_type::none;
+
+    std::vector<std::string_view> entries = split(accept_encoding, ',');
+    for (auto& e : entries) {
+        std::vector<std::string_view> params = split(e, ';');
+        if (params.size() == 0) {
+            continue;
+        }
+        compression_type ct = get_compression_type(trim(params[0]));
+        if (ct == compression_type::unknown) {
+            continue; // ignore unknown encoding types
+        }
+        if (ct_q[static_cast<size_t>(ct)].has_value() && ct_q[static_cast<size_t>(ct)] != 0.0f) {
+            continue; // already processed this encoding
+        }
+        if (response_size < _threshold[static_cast<size_t>(ct)]) {
+            continue; // response is below this encoding's size threshold: treat the encoding as unknown
+        }
+        for (size_t i = 1; i < params.size(); ++i) { // find "q=" parameter
+            auto pos = params[i].find("q=");
+            if (pos == std::string_view::npos) {
+                continue;
+            }
+            std::string_view param = params[i].substr(pos + 2);
+            param = trim(param);
+            // parse quality value
+            float q_value = 1.0f;
+            auto [ptr, ec] = std::from_chars(param.data(), param.data() + param.size(), q_value);
+            if (ec != std::errc() || ptr != param.data() + param.size()) {
+                continue;
+            }
+            if (q_value < 0.0) {
+                q_value = 0.0;
+            } else if (q_value > 1.0) {
+                q_value = 1.0;
+            }
+            ct_q[static_cast<size_t>(ct)] = q_value;
+            break; // we parsed quality value
+        }
+        if (!ct_q[static_cast<size_t>(ct)].has_value()) {
+            ct_q[static_cast<size_t>(ct)] = 1.0f; // default quality value
+        }
+        // keep the encoding with the highest quality value; ties go to the earlier entry, except that 'any' yields to a later entry of equal quality
+        if (selected_ct == compression_type::any) {
+            if (ct_q[static_cast<size_t>(ct)] >= ct_q[static_cast<size_t>(selected_ct)]) {
+                selected_ct = ct;
+            }
+        } else {
+            if (ct_q[static_cast<size_t>(ct)] > ct_q[static_cast<size_t>(selected_ct)]) {
+                selected_ct = ct;
+            }
+        }
+    }
+    if (selected_ct == compression_type::any) {
+        // 'any' won: pick the first compression the client did not mention, otherwise the one with the highest quality value
+        selected_ct = compression_type::none;
+        for (size_t i = 0; i < static_cast<size_t>(compression_type::compressions_count); ++i) {
+            if (!ct_q[i].has_value()) {
+                return static_cast<compression_type>(i);
+            }
+            if (ct_q[i] > ct_q[static_cast<size_t>(selected_ct)]) {
+                selected_ct = static_cast<compression_type>(i);
+            }
+        }
+    }
+    return selected_ct;
+}
+
+static future<chunked_content> compress(response_compressor::compression_type ct, const db::config& cfg, std::string str) {
+    chunked_content compressed;
+    auto write = [&compressed](temporary_buffer<char>&& buf) -> future<> {
+        compressed.push_back(std::move(buf));
+        return make_ready_future<>();
+    };
+    zlib_compressor compressor(ct != response_compressor::compression_type::deflate,
+        cfg.alternator_response_gzip_compression_level(), std::move(write));
+    co_await compressor.compress(str.data(), str.size(), true);
+    co_return compressed;
+}
+
+static sstring flatten(chunked_content&& cc) {
+    size_t total_size = 0;
+    for (const auto& chunk : cc) {
+        total_size += chunk.size();
+    }
+    sstring result = sstring{ sstring::initialized_later{}, total_size };
+    size_t offset = 0;
+    for (const auto& chunk : cc) {
+        std::copy(chunk.begin(), chunk.end(), result.begin() + offset);
+        offset += chunk.size();
+    }
+    return result;
+}
+
+future<std::unique_ptr<http::reply>> response_compressor::generate_reply(std::unique_ptr<http::reply> rep, sstring accept_encoding, const char* content_type, std::string&& response_body) {
+    response_compressor::compression_type ct = find_compression(accept_encoding, response_body.size());
+    if (ct != response_compressor::compression_type::none) {
+        rep->add_header("Content-Encoding", get_encoding_name(ct));
+        rep->set_content_type(content_type);
+        return compress(ct, cfg, std::move(response_body)).then([rep = std::move(rep)] (chunked_content compressed) mutable {
+            rep->_content = flatten(std::move(compressed));
+            return make_ready_future<std::unique_ptr<http::reply>>(std::move(rep));
+        });
+    } else {
+        // Note that despite the move, there is a copy here -
+        // as response_body is std::string and rep->_content is sstring.
+        rep->_content = std::move(response_body);
+        rep->set_content_type(content_type);
+    }
+    return make_ready_future<std::unique_ptr<http::reply>>(std::move(rep));
+}
+
+template<typename Compressor>
+class compressed_data_sink_impl : public data_sink_impl {
+    output_stream<char> _out;
+    Compressor _compressor;
+public:
+    template<typename... Args>
+    compressed_data_sink_impl(output_stream<char>&& out, Args&&... args)
+     : _out(std::move(out)), _compressor(std::forward<Args>(args)..., [this](temporary_buffer<char>&& buf) {
+        return _out.write(std::move(buf));
+    }) { }
+
+    future<> put(std::span<temporary_buffer<char>> data) override {
+        return data_sink_impl::fallback_put(data, [this] (temporary_buffer<char>&& buf) {
+            return do_put(std::move(buf));
+        });
+    }
+
+private:
+    future<> do_put(temporary_buffer<char> buf) {
+        co_return co_await _compressor.compress(buf.get(), buf.size());
+
+    }
+    future<> close() override {
+        return _compressor.close().then([this] {
+            return _out.close();
+        });
+    }
+};
+
+executor::body_writer compress(response_compressor::compression_type ct, const db::config& cfg, executor::body_writer&& bw) {
+    return [bw = std::move(bw), ct, level = cfg.alternator_response_gzip_compression_level()](output_stream<char>&& out) mutable -> future<> {
+        output_stream_options opts;
+        opts.trim_to_size = true;
+        std::unique_ptr<data_sink_impl> data_sink_impl;
+        switch (ct) {
+            case response_compressor::compression_type::gzip:
+                data_sink_impl = std::make_unique<compressed_data_sink_impl<zlib_compressor>>(std::move(out), true, level);
+                break;
+            case response_compressor::compression_type::deflate:
+                data_sink_impl = std::make_unique<compressed_data_sink_impl<zlib_compressor>>(std::move(out), false, level);
+                break;
+            case response_compressor::compression_type::none:
+            case response_compressor::compression_type::any:
+            case response_compressor::compression_type::unknown:
+                on_internal_error(slogger, "Compression not selected");
+            default:
+                on_internal_error(slogger, "Unsupported compression type for data sink");
+        }
+        return bw(output_stream<char>(data_sink(std::move(data_sink_impl)), compressed_buffer_size, opts));
+    };
+}
+
+future<std::unique_ptr<http::reply>> response_compressor::generate_reply(std::unique_ptr<http::reply> rep, sstring accept_encoding, const char* content_type, executor::body_writer&& body_writer) {
+    response_compressor::compression_type ct = find_compression(accept_encoding, std::numeric_limits<size_t>::max());
+    if (ct != response_compressor::compression_type::none) {
+        rep->add_header("Content-Encoding", get_encoding_name(ct));
+        rep->write_body(content_type, compress(ct, cfg, std::move(body_writer)));
+    } else {
+        rep->write_body(content_type, std::move(body_writer));
+    }
+    return make_ready_future<std::unique_ptr<http::reply>>(std::move(rep));
+}
+
+} // namespace alternator
--- a/alternator/http_compression.hh
+++ b/alternator/http_compression.hh
@@ -0,0 +1,91 @@
+/*
+ * Copyright 2025-present ScyllaDB
+ */
+
+/*
+ * SPDX-License-Identifier: LicenseRef-ScyllaDB-Source-Available-1.0
+ */
+
+#pragma once
+
+#include "alternator/executor.hh"
+#include <seastar/http/httpd.hh>
+#include "db/config.hh"
+
+namespace alternator {
+
+class response_compressor {
+public:
+    enum class compression_type {
+        gzip,
+        deflate,
+        compressions_count,
+        any = compressions_count,
+        none,
+        count,
+        unknown = count
+    };
+    static constexpr std::string_view compression_names[] = {
+        "gzip",
+        "deflate",
+        "*",
+        "identity"
+    };
+
+    static sstring get_encoding_name(compression_type ct) {
+        return sstring(compression_names[static_cast<size_t>(ct)]);
+    }
+    static constexpr compression_type get_compression_type(std::string_view encoding);
+
+    sstring get_accepted_encoding(const http::request& req) {
+        if (get_threshold() == 0) {
+            return "";
+        }
+        return req.get_header("Accept-Encoding");
+    }
+    compression_type find_compression(std::string_view accept_encoding, size_t response_size);
+
+    response_compressor(const db::config& cfg)
+        : cfg(cfg)
+        ,_gzip_level_observer(
+            cfg.alternator_response_gzip_compression_level.observe([this](int v) {
+                    update_threshold();
+                }))
+        ,_gzip_threshold_observer(
+            cfg.alternator_response_compression_threshold_in_bytes.observe([this](uint32_t v) {
+                    update_threshold();
+                }))
+    {
+        update_threshold();
+    }
+    response_compressor(const response_compressor& rhs) : response_compressor(rhs.cfg) {}
+
+private:
+    const db::config& cfg;
+    utils::observable<int>::observer _gzip_level_observer;
+    utils::observable<uint32_t>::observer _gzip_threshold_observer;
+    uint32_t _threshold[static_cast<size_t>(compression_type::count)];
+
+    size_t get_threshold() { return _threshold[static_cast<size_t>(compression_type::any)]; }
+    void update_threshold() {
+        _threshold[static_cast<size_t>(compression_type::none)] = std::numeric_limits<uint32_t>::max();
+        _threshold[static_cast<size_t>(compression_type::any)] = std::numeric_limits<uint32_t>::max();
+        uint32_t gzip = cfg.alternator_response_gzip_compression_level() <= 0 ? std::numeric_limits<uint32_t>::max()
+            : cfg.alternator_response_compression_threshold_in_bytes();
+        _threshold[static_cast<size_t>(compression_type::gzip)] = gzip;
+        _threshold[static_cast<size_t>(compression_type::deflate)] = gzip;
+        for (size_t i = 0; i < static_cast<size_t>(compression_type::compressions_count); ++i) {
+            if (_threshold[i] < _threshold[static_cast<size_t>(compression_type::any)]) {
+                _threshold[static_cast<size_t>(compression_type::any)] = _threshold[i];
+            }
+        }
+    }
+
+public:
+    future<std::unique_ptr<http::reply>> generate_reply(std::unique_ptr<http::reply> rep,
+         sstring accept_encoding, const char* content_type, std::string&& response_body);
+    future<std::unique_ptr<http::reply>> generate_reply(std::unique_ptr<http::reply> rep,
+         sstring accept_encoding, const char* content_type, executor::body_writer&& body_writer);
+};
+
+}
--- a/alternator/serialization.cc
+++ b/alternator/serialization.cc
@@ -282,15 +282,23 @@ std::string type_to_string(data_type type) {
    return it->second;
 }

-bytes get_key_column_value(const rjson::value& item, const column_definition& column) {
+std::optional<bytes> try_get_key_column_value(const rjson::value& item, const column_definition& column) {
    std::string column_name = column.name_as_text();
    const rjson::value* key_typed_value = rjson::find(item, column_name);
    if (!key_typed_value) {
-        throw api_error::validation(fmt::format("Key column {} not found", column_name));
+        return std::nullopt;
    }
    return get_key_from_typed_value(*key_typed_value, column);
 }

+bytes get_key_column_value(const rjson::value& item, const column_definition& column) {
+    auto value = try_get_key_column_value(item, column);
+    if (!value) {
+        throw api_error::validation(fmt::format("Key column {} not found", column.name_as_text()));
+    }
+    return std::move(*value);
+}
+
 // Parses the JSON encoding for a key value, which is a map with a single
 // entry whose key is the type and the value is the encoded value.
 // If this type does not match the desired "type_str", an api_error::validation
@@ -380,20 +388,38 @@ clustering_key ck_from_json(const rjson::value& item, schema_ptr schema) {
        return clustering_key::make_empty();
    }
    std::vector<bytes> raw_ck;
-    // FIXME: this is a loop, but we really allow only one clustering key column.
+    // Note: it's possible to get more than one clustering column here, as
+    // Alternator can be used to read scylla internal tables.
    for (const column_definition& cdef : schema->clustering_key_columns()) {
-        bytes raw_value = get_key_column_value(item,  cdef);
+        auto raw_value = get_key_column_value(item, cdef);
        raw_ck.push_back(std::move(raw_value));
    }

    return clustering_key::from_exploded(raw_ck);
 }

-position_in_partition pos_from_json(const rjson::value& item, schema_ptr schema) {
-    auto ck = ck_from_json(item, schema);
-    if (is_alternator_keyspace(schema->ks_name())) {
-        return position_in_partition::for_key(std::move(ck));
+clustering_key_prefix ck_prefix_from_json(const rjson::value& item, schema_ptr schema) {
+    if (schema->clustering_key_size() == 0) {
+        return clustering_key_prefix::make_empty();
    }
+    std::vector<bytes> raw_ck;
+    for (const column_definition& cdef : schema->clustering_key_columns()) {
+        auto raw_value = try_get_key_column_value(item, cdef);
+        if (!raw_value) {
+            break;
+        }
+        raw_ck.push_back(std::move(*raw_value));
+    }
+
+    return clustering_key_prefix::from_exploded(raw_ck);
+}
+
+position_in_partition pos_from_json(const rjson::value& item, schema_ptr schema) {
+    const bool is_alternator_ks = is_alternator_keyspace(schema->ks_name());
+    if (is_alternator_ks) {
+        return position_in_partition::for_key(ck_from_json(item, schema));
+    }
+
    const auto region_item = rjson::find(item, scylla_paging_region);
    const auto weight_item = rjson::find(item, scylla_paging_weight);
    if (bool(region_item) != bool(weight_item)) {
@@ -413,8 +439,9 @@ position_in_partition pos_from_json(const rjson::value& item, schema_ptr schema)
        } else {
            throw std::runtime_error(fmt::format("Invalid value for weight: {}", weight_view));
        }
-        return position_in_partition(region, weight, region == partition_region::clustered ? std::optional(std::move(ck)) : std::nullopt);
+        return position_in_partition(region, weight, region == partition_region::clustered ? std::optional(ck_prefix_from_json(item, schema)) : std::nullopt);
    }
+    auto ck = ck_from_json(item, schema);
    if (ck.is_empty()) {
        return position_in_partition::for_partition_start();
    }
@@ -469,7 +496,7 @@ const std::pair<std::string, const rjson::value*> unwrap_set(const rjson::value&
        return {"", nullptr};
    }
    auto it = v.MemberBegin();
-    const std::string it_key = it->name.GetString();
+    const std::string it_key = rjson::to_string(it->name);
    if (it_key != "SS" && it_key != "BS" && it_key != "NS") {
        return {std::move(it_key), nullptr};
    }
--- a/alternator/server.cc
+++ b/alternator/server.cc
@@ -13,6 +13,7 @@
 #include <seastar/http/function_handlers.hh>
 #include <seastar/http/short_streams.hh>
 #include <seastar/core/coroutine.hh>
+#include <seastar/coroutine/maybe_yield.hh>
 #include <seastar/util/defer.hh>
 #include <seastar/util/short_streams.hh>
 #include "seastarx.hh"
@@ -31,6 +32,9 @@
 #include "utils/overloaded_functor.hh"
 #include "utils/aws_sigv4.hh"
 #include "client_data.hh"
+#include "utils/updateable_value.hh"
+#include <zlib.h>
+#include "alternator/http_compression.hh"

 static logging::logger slogger("alternator-server");

@@ -108,9 +112,12 @@ class api_handler : public handler_base {
    // type applies to all replies, both success and error.
    static constexpr const char* REPLY_CONTENT_TYPE = "application/x-amz-json-1.0";
 public:
-    api_handler(const std::function<future<executor::request_return_type>(std::unique_ptr<request> req)>& _handle) : _f_handle(
+    api_handler(const std::function<future<executor::request_return_type>(std::unique_ptr<request> req)>& _handle,
+                const db::config& config) : _response_compressor(config), _f_handle(
         [this, _handle](std::unique_ptr<request> req, std::unique_ptr<reply> rep) {
-         return seastar::futurize_invoke(_handle, std::move(req)).then_wrapped([this, rep = std::move(rep)](future<executor::request_return_type> resf) mutable {
+         sstring accept_encoding = _response_compressor.get_accepted_encoding(*req);
+         return seastar::futurize_invoke(_handle, std::move(req)).then_wrapped(
+            [this, rep = std::move(rep), accept_encoding=std::move(accept_encoding)](future<executor::request_return_type> resf) mutable {
             if (resf.failed()) {
                 // Exceptions of type api_error are wrapped as JSON and
                 // returned to the client as expected. Other types of
@@ -130,22 +137,20 @@ public:
                 return make_ready_future<std::unique_ptr<reply>>(std::move(rep));
             }
             auto res = resf.get();
-             std::visit(overloaded_functor {
+             return std::visit(overloaded_functor {
                [&] (std::string&& str) {
-                    // Note that despite the move, there is a copy here -
-                    // as str is std::string and rep->_content is sstring.
-                    rep->_content = std::move(str);
-                    rep->set_content_type(REPLY_CONTENT_TYPE);
+                    return _response_compressor.generate_reply(std::move(rep), std::move(accept_encoding),
+                                                               REPLY_CONTENT_TYPE, std::move(str));
                },
                [&] (executor::body_writer&& body_writer) {
-                    rep->write_body(REPLY_CONTENT_TYPE, std::move(body_writer));
+                    return _response_compressor.generate_reply(std::move(rep), std::move(accept_encoding),
+                                                               REPLY_CONTENT_TYPE, std::move(body_writer));
                },
                [&] (const api_error& err) {
                    generate_error_reply(*rep, err);
+                    return make_ready_future<std::unique_ptr<reply>>(std::move(rep));
                }
             }, std::move(res));
-
-             return make_ready_future<std::unique_ptr<reply>>(std::move(rep));
         });
    }) { }

@@ -174,6 +179,7 @@ protected:
        slogger.trace("api_handler error case: {}", rep._content);
    }

+    response_compressor _response_compressor;
    future_handler_function _f_handle;
 };

@@ -270,24 +276,57 @@ protected:
    }
 };

+// This function increments the authentication_failures counter, and may also
+// log a warn-level message and/or throw an exception, depending on what
+// enforce_authorization and warn_authorization are set to.
+// The username and client address are only used for logging purposes -
+// they are not included in the error message returned to the client, since
+// the client knows who it is.
+// Note that if enforce_authorization is false, this function will return
+// without throwing. So a caller that doesn't want to continue after an
+// authentication_error must explicitly return after calling this function.
+template<typename Exception>
+static void authentication_error(alternator::stats& stats, bool enforce_authorization, bool warn_authorization, Exception&& e, std::string_view user, gms::inet_address client_address) {
+    stats.authentication_failures++;
+    if (enforce_authorization) {
+        if (warn_authorization) {
+            slogger.warn("alternator_warn_authorization=true: {} for user {}, client address {}", e.what(), user, client_address);
+        }
+        throw std::move(e);
+    } else {
+        if (warn_authorization) {
+            slogger.warn("If you set alternator_enforce_authorization=true the following will be enforced: {} for user {}, client address {}", e.what(), user, client_address);
+        }
+    }
+}
+
 future<std::string> server::verify_signature(const request& req, const chunked_content& content) {
-    if (!_enforce_authorization) {
+    if (!_enforce_authorization.get() && !_warn_authorization.get()) {
        slogger.debug("Skipping authorization");
        return make_ready_future<std::string>();
    }
    auto host_it = req._headers.find("Host");
    if (host_it == req._headers.end()) {
-        throw api_error::invalid_signature("Host header is mandatory for signature verification");
+        authentication_error(_executor._stats, _enforce_authorization.get(), _warn_authorization.get(),
+            api_error::invalid_signature("Host header is mandatory for signature verification"),
+            "", req.get_client_address());
+        return make_ready_future<std::string>();
    }
    auto authorization_it = req._headers.find("Authorization");
    if (authorization_it == req._headers.end()) {
-        throw api_error::missing_authentication_token("Authorization header is mandatory for signature verification");
+        authentication_error(_executor._stats, _enforce_authorization.get(), _warn_authorization.get(),
+            api_error::missing_authentication_token("Authorization header is mandatory for signature verification"),
+            "", req.get_client_address());
+        return make_ready_future<std::string>();
    }
    std::string host = host_it->second;
    std::string_view authorization_header = authorization_it->second;
    auto pos = authorization_header.find_first_of(' ');
    if (pos == std::string_view::npos || authorization_header.substr(0, pos) != "AWS4-HMAC-SHA256") {
-        throw api_error::invalid_signature(fmt::format("Authorization header must use AWS4-HMAC-SHA256 algorithm: {}", authorization_header));
+        authentication_error(_executor._stats, _enforce_authorization.get(), _warn_authorization.get(),
+            api_error::invalid_signature(fmt::format("Authorization header must use AWS4-HMAC-SHA256 algorithm: {}", authorization_header)),
+            "", req.get_client_address());
+        return make_ready_future<std::string>();
    }
    authorization_header.remove_prefix(pos+1);
    std::string credential;
@@ -322,7 +361,9 @@ future<std::string> server::verify_signature(const request& req, const chunked_c

    std::vector<std::string_view> credential_split = split(credential, '/');
    if (credential_split.size() != 5) {
-        throw api_error::validation(fmt::format("Incorrect credential information format: {}", credential));
+        authentication_error(_executor._stats, _enforce_authorization.get(), _warn_authorization.get(),
+            api_error::validation(fmt::format("Incorrect credential information format: {}", credential)), "", req.get_client_address());
+        return make_ready_future<std::string>();
    }
    std::string user(credential_split[0]);
    std::string datestamp(credential_split[1]);
@@ -333,39 +374,81 @@ future<std::string> server::verify_signature(const request& req, const chunked_c
    for (const auto& header : signed_headers) {
        signed_headers_map.emplace(header, std::string_view());
    }
+    std::vector<std::string> modified_values;
    for (auto& header : req._headers) {
        std::string header_str;
        header_str.resize(header.first.size());
        std::transform(header.first.begin(), header.first.end(), header_str.begin(), ::tolower);
        auto it = signed_headers_map.find(header_str);
        if (it != signed_headers_map.end()) {
-            it->second = std::string_view(header.second);
+            // replace multiple spaces in the header value header.second with
+            // a single space, as required by AWS SigV4 header canonicalization.
+            // If we modify the value, we need to save it in modified_values
+            // to keep it alive.
+            std::string value;
+            value.reserve(header.second.size());
+            bool prev_space = false;
+            bool modified = false;
+            for (char ch : header.second) {
+                if (ch == ' ') {
+                    if (!prev_space) {
+                        value += ch;
+                        prev_space = true;
+                    } else {
+                        modified = true; // skip a space
+                    }
+                } else {
+                    value += ch;
+                    prev_space = false;
+                }
+            }
+            if (modified) {
+                modified_values.emplace_back(std::move(value));
+                it->second = std::string_view(modified_values.back());
+            } else {
+                it->second = std::string_view(header.second);
+            }
        }
    }

    auto cache_getter = [&proxy = _proxy, &as = _auth_service] (std::string username) {
        return get_key_from_roles(proxy, as, std::move(username));
    };
-    return _key_cache.get_ptr(user, cache_getter).then([this, &req, &content,
+    return _key_cache.get_ptr(user, cache_getter).then_wrapped([this, &req, &content,
                                                    user = std::move(user),
                                                    host = std::move(host),
                                                    datestamp = std::move(datestamp),
                                                    signed_headers_str = std::move(signed_headers_str),
                                                    signed_headers_map = std::move(signed_headers_map),
+                                                    modified_values = std::move(modified_values),
                                                    region = std::move(region),
                                                    service = std::move(service),
-                                                    user_signature = std::move(user_signature)] (key_cache::value_ptr key_ptr) {
+                                                    user_signature = std::move(user_signature)] (future<key_cache::value_ptr> key_ptr_fut) {
+        key_cache::value_ptr key_ptr(nullptr);
+        try {
+            key_ptr = key_ptr_fut.get();
+        } catch (const api_error& e) {
+            authentication_error(_executor._stats, _enforce_authorization.get(), _warn_authorization.get(),
+                e, user, req.get_client_address());
+            return std::string();
+        }
        std::string signature;
        try {
            signature = utils::aws::get_signature(user, *key_ptr, std::string_view(host), "/", req._method,
                datestamp, signed_headers_str, signed_headers_map, &content, region, service, "");
        } catch (const std::exception& e) {
-            throw api_error::invalid_signature(e.what());
+            authentication_error(_executor._stats, _enforce_authorization.get(), _warn_authorization.get(),
+                api_error::invalid_signature(fmt::format("invalid signature: {}", e.what())),
+                user, req.get_client_address());
+            return std::string();
        }

        if (signature != std::string_view(user_signature)) {
            _key_cache.remove(user);
-            throw api_error::unrecognized_client("The security token included in the request is invalid.");
+            authentication_error(_executor._stats, _enforce_authorization.get(), _warn_authorization.get(),
+                api_error::unrecognized_client("wrong signature"),
+                user, req.get_client_address());
+            return std::string();
        }
        return user;
    });
@@ -501,6 +584,108 @@ read_entire_stream(input_stream<char>& inp, size_t length_limit) {
    co_return ret;
 }

+// safe_gzip_zstream is an exception-safe wrapper for zlib's z_stream.
+// The "z_stream" struct is used by zlib to hold state while decompressing a
+// stream of data. It allocates memory which must be freed with inflateEnd(),
+// which the destructor of this class does.
+class safe_gzip_zstream {
+    z_stream _zs;
+public:
+    // If gzip is true, decode a gzip header (for "Content-Encoding: gzip").
+    // Otherwise, a zlib header (for "Content-Encoding: deflate").
+    safe_gzip_zstream(bool gzip = true) {
+        memset(&_zs, 0, sizeof(_zs));
+        if (inflateInit2(&_zs, gzip ? 16 + MAX_WBITS : MAX_WBITS) != Z_OK) {
+            // Should only happen if memory allocation fails
+            throw std::bad_alloc();
+        }
+    }
+    ~safe_gzip_zstream() {
+        inflateEnd(&_zs);
+    }
+    z_stream* operator->() {
+        return &_zs;
+    }
+    z_stream* get() {
+        return &_zs;
+    }
+    void reset() {
+        inflateReset(&_zs);
+    }
+};
+
+// ungzip() takes a chunked_content of a compressed request body, and returns
+// the uncompressed content as a chunked_content. If gzip is true, we expect a
+// gzip header (for "Content-Encoding: gzip"); if gzip is false, we expect a
+// zlib header (for "Content-Encoding: deflate").
+// If the uncompressed content exceeds length_limit, an error is thrown.
+static future<chunked_content>
+ungzip(chunked_content&& compressed_body, size_t length_limit, bool gzip = true) {
+    chunked_content ret;
+    // output_buf can be any size - when uncompressing input_buf, it doesn't
+    // need to fit in a single output_buf, we'll use multiple output_buf for
+    // a single input_buf if needed.
+    constexpr size_t OUTPUT_BUF_SIZE = 4096;
+    temporary_buffer<char> output_buf;
+    safe_gzip_zstream strm(gzip);
+    bool complete_stream = false; // empty input is not a valid gzip/deflate
+    size_t total_out_bytes = 0;
+    for (const temporary_buffer<char>& input_buf : compressed_body) {
+        if (input_buf.empty()) {
+            continue;
+        }
+        complete_stream = false;
+        strm->next_in = (Bytef*) input_buf.get();
+        strm->avail_in = (uInt) input_buf.size();
+        do {
+            co_await coroutine::maybe_yield();
+            if (output_buf.empty()) {
+                output_buf = temporary_buffer<char>(OUTPUT_BUF_SIZE);
+            }
+            strm->next_out = (Bytef*) output_buf.get();
+            strm->avail_out = OUTPUT_BUF_SIZE;
+            int e = inflate(strm.get(), Z_NO_FLUSH);
+            size_t out_bytes = OUTPUT_BUF_SIZE - strm->avail_out;
+            if (out_bytes > 0) {
+                // If output_buf is nearly full, we save it as-is in ret. But
+                // if it only has little data, better copy to a small buffer.
+                if (out_bytes > OUTPUT_BUF_SIZE/2) {
+                    ret.push_back(std::move(output_buf).prefix(out_bytes));
+                    // output_buf is now empty. If this loop finds more input,
+                    // we'll allocate a new output buffer.
+                } else {
+                    ret.push_back(temporary_buffer<char>(output_buf.get(), out_bytes));
+                }
+                total_out_bytes += out_bytes;
+                if (total_out_bytes > length_limit) {
+                    throw api_error::payload_too_large(fmt::format("Request content length limit of {} bytes exceeded", length_limit));
+                }
+            }
+            if (e == Z_STREAM_END) {
+                // There may be more input after the first gzip stream - in
+                // either this input_buf or the next one. The additional input
+                // should be a second concatenated gzip. We need to allow that
+                // by resetting the gzip stream and continuing the input loop
+                // until there's no more input.
+                strm.reset();
+                if (strm->avail_in == 0) {
+                    complete_stream = true;
+                    break;
+                }
+            } else if (e != Z_OK && e != Z_BUF_ERROR) {
+                // DynamoDB returns an InternalServerError when given a bad
+                // gzip request body. See test test_broken_gzip_content
+                throw api_error::internal("Error during gzip decompression of request body");
+            }
+        } while (strm->avail_in > 0 || strm->avail_out == 0);
+    }
+    if (!complete_stream) {
+        // The gzip stream was not properly finished with Z_STREAM_END
+        throw api_error::internal("Truncated gzip in request body");
+    }
+    co_return ret;
+}
+
 future<executor::request_return_type> server::handle_api_request(std::unique_ptr<request> req) {
    _executor._stats.total_operations++;
    sstring target = req->get_header("X-Amz-Target");
@@ -538,11 +723,32 @@ future<executor::request_return_type> server::handle_api_request(std::unique_ptr
        units.return_units(mem_estimate - new_mem_estimate);
    }
    auto username = co_await verify_signature(*req, content);
+    // If the request is compressed, uncompress it now, after we checked
+    // the signature (the signature is computed on the compressed content).
+    // We apply the request_content_length_limit again to the uncompressed
+    // content - we don't want to allow a tiny compressed request to
+    // expand to a huge uncompressed request.
+    sstring content_encoding = req->get_header("Content-Encoding");
+    if (content_encoding == "gzip") {
+        content = co_await ungzip(std::move(content), request_content_length_limit);
+    } else if (content_encoding == "deflate") {
+        content = co_await ungzip(std::move(content), request_content_length_limit, false);
+    } else if (!content_encoding.empty()) {
+        // DynamoDB returns a 500 error for unsupported Content-Encoding.
+        // I'm not sure if this is the best error code, but let's do it too.
+        // See the test test_garbage_content_encoding confirming this case.
+        co_return api_error::internal("Unsupported Content-Encoding");
+    }
+
    // As long as the system_clients_entry object is alive, this request will
    // be visible in the "system.clients" virtual table. When requested, this
    // entry will be formatted by server::ongoing_request::make_client_data().
+    auto user_agent_header = co_await _connection_options_keys_and_values.get_or_load(req->get_header("User-Agent"), [] (const client_options_cache_key_type&) {
+        return make_ready_future<options_cache_value_type>(options_cache_value_type{});
+    });
+
    auto system_clients_entry = _ongoing_requests.emplace(
-        req->get_client_address(), req->get_header("User-Agent"),
+        req->get_client_address(), std::move(user_agent_header),
        username, current_scheduling_group(),
        req->get_protocol_name() == "https");

@@ -587,7 +793,7 @@ future<executor::request_return_type> server::handle_api_request(std::unique_ptr
 void server::set_routes(routes& r) {
    api_handler* req_handler = new api_handler([this] (std::unique_ptr<request> req) mutable {
        return handle_api_request(std::move(req));
-    });
+    }, _proxy.data_dictionary().get_config());

    r.put(operation_type::POST, "/", req_handler);
    r.put(operation_type::GET, "/", new health_handler(_pending_requests));
@@ -618,7 +824,6 @@ server::server(executor& exec, service::storage_proxy& proxy, gms::gossiper& gos
        , _auth_service(auth_service)
        , _sl_controller(sl_controller)
        , _key_cache(1024, 1min, slogger)
-        , _enforce_authorization(false)
        , _max_users_query_size_in_trace_output(1024)
        , _enabled_servers{}
        , _pending_requests("alternator::server::pending_requests")
@@ -699,27 +904,38 @@ server::server(executor& exec, service::storage_proxy& proxy, gms::gossiper& gos
    } {
 }

-future<> server::init(net::inet_address addr, std::optional<uint16_t> port, std::optional<uint16_t> https_port, std::optional<tls::credentials_builder> creds,
-        utils::updateable_value<bool> enforce_authorization, utils::updateable_value<uint64_t> max_users_query_size_in_trace_output,
+future<> server::init(net::inet_address addr, std::optional<uint16_t> port, std::optional<uint16_t> https_port,
+        std::optional<uint16_t> port_proxy_protocol, std::optional<uint16_t> https_port_proxy_protocol,
+        std::optional<tls::credentials_builder> creds,
+        utils::updateable_value<bool> enforce_authorization, utils::updateable_value<bool> warn_authorization, utils::updateable_value<uint64_t> max_users_query_size_in_trace_output,
        semaphore* memory_limiter, utils::updateable_value<uint32_t> max_concurrent_requests) {
    _memory_limiter = memory_limiter;
    _enforce_authorization = std::move(enforce_authorization);
+    _warn_authorization = std::move(warn_authorization);
    _max_concurrent_requests = std::move(max_concurrent_requests);
    _max_users_query_size_in_trace_output = std::move(max_users_query_size_in_trace_output);
-    if (!port && !https_port) {
+    if (!port && !https_port && !port_proxy_protocol && !https_port_proxy_protocol) {
        return make_exception_future<>(std::runtime_error("Either regular port or TLS port"
                " must be specified in order to init an alternator HTTP server instance"));
    }
-    return seastar::async([this, addr, port, https_port, creds] {
+    return seastar::async([this, addr, port, https_port, port_proxy_protocol, https_port_proxy_protocol, creds] {
        _executor.start().get();

-        if (port) {
+        if (port || port_proxy_protocol) {
            set_routes(_http_server._routes);
            _http_server.set_content_streaming(true);
-            _http_server.listen(socket_address{addr, *port}).get();
+            if (port) {
+                _http_server.listen(socket_address{addr, *port}).get();
+            }
+            if (port_proxy_protocol) {
+                listen_options lo;
+                lo.reuse_address = true;
+                lo.proxy_protocol = true;
+                _http_server.listen(socket_address{addr, *port_proxy_protocol}, lo).get();
+            }
            _enabled_servers.push_back(std::ref(_http_server));
        }
-        if (https_port) {
+        if (https_port || https_port_proxy_protocol) {
            set_routes(_https_server._routes);
            _https_server.set_content_streaming(true);

@@ -739,7 +955,15 @@ future<> server::init(net::inet_address addr, std::optional<uint16_t> port, std:
            } else {
                _credentials = creds->build_server_credentials();
            }
-            _https_server.listen(socket_address{addr, *https_port}, _credentials).get();
+            if (https_port) {
+                _https_server.listen(socket_address{addr, *https_port}, _credentials).get();
+            }
+            if (https_port_proxy_protocol) {
+                listen_options lo;
+                lo.reuse_address = true;
+                lo.proxy_protocol = true;
+                _https_server.listen(socket_address{addr, *https_port_proxy_protocol}, lo, _credentials).get();
+            }
            _enabled_servers.push_back(std::ref(_https_server));
        }
    });
@@ -812,16 +1036,15 @@ client_data server::ongoing_request::make_client_data() const {
    // and keep "driver_version" unset.
    cd.driver_name = _user_agent;
    // Leave "protocol_version" unset, it has no meaning in Alternator.
-    // Leave "hostname", "ssl_protocol" and "ssl_cipher_suite" unset.
-    // As reported in issue #9216, we never set these fields in CQL
-    // either (see cql_server::connection::make_client_data()).
+    // Leave "hostname", "ssl_protocol" and "ssl_cipher_suite" unset for Alternator.
+    // Note: CQL sets ssl_protocol and ssl_cipher_suite via generic_server::connection base class.
    return cd;
 }

-future<utils::chunked_vector<client_data>> server::get_client_data() {
-    utils::chunked_vector<client_data> ret;
+future<utils::chunked_vector<foreign_ptr<std::unique_ptr<client_data>>>> server::get_client_data() {
+    utils::chunked_vector<foreign_ptr<std::unique_ptr<client_data>>> ret;
    co_await _ongoing_requests.for_each_gently([&ret] (const ongoing_request& r) {
-        ret.emplace_back(r.make_client_data());
+        ret.emplace_back(make_foreign(std::make_unique<client_data>(r.make_client_data())));
    });
    co_return ret;
 }
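
Reviewer note: the listener selection that `server::init()` now performs can be sketched without Seastar as below. This is a minimal illustration, not the actual implementation; `listener_plan` and `plan_listeners` are hypothetical names, and the sketch only models which sockets would be opened and with which flags.

```cpp
#include <cassert>   // for the usage checks described below
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <vector>

// Hypothetical description of one listening socket; not a Seastar type.
struct listener_plan {
    uint16_t port;
    bool tls;
    bool proxy_protocol;
};

// Mirrors the checks in the diff: at least one of the four ports must be
// set, and every configured port yields exactly one listener.
std::vector<listener_plan> plan_listeners(std::optional<uint16_t> port,
        std::optional<uint16_t> https_port,
        std::optional<uint16_t> port_proxy_protocol,
        std::optional<uint16_t> https_port_proxy_protocol) {
    if (!port && !https_port && !port_proxy_protocol && !https_port_proxy_protocol) {
        throw std::runtime_error("at least one port must be specified");
    }
    std::vector<listener_plan> plans;
    if (port) {
        plans.push_back({*port, false, false});
    }
    if (port_proxy_protocol) {
        plans.push_back({*port_proxy_protocol, false, true});
    }
    if (https_port) {
        plans.push_back({*https_port, true, false});
    }
    if (https_port_proxy_protocol) {
        plans.push_back({*https_port_proxy_protocol, true, true});
    }
    return plans;
}
```

Calling `plan_listeners(8000, std::nullopt, 8001, std::nullopt)` yields two plain-HTTP listeners, the second with the proxy-protocol flag set; passing four empty optionals throws.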
--- a/alternator/server.hh
+++ b/alternator/server.hh
@@ -47,6 +47,7 @@ class server : public peering_sharded_service<server> {

    key_cache _key_cache;
    utils::updateable_value<bool> _enforce_authorization;
+    utils::updateable_value<bool> _warn_authorization;
    utils::updateable_value<uint64_t> _max_users_query_size_in_trace_output;
    utils::small_vector<std::reference_wrapper<seastar::httpd::http_server>, 2> _enabled_servers;
    named_gate _pending_requests;
@@ -54,6 +55,7 @@ class server : public peering_sharded_service<server> {
    // though it isn't really relevant for Alternator which defines its own
    // timeouts separately. We can create this object only once.
    updateable_timeout_config _timeout_config;
+    client_options_cache_type _connection_options_keys_and_values;

    alternator_callbacks_map _callbacks;

@@ -87,7 +89,7 @@ class server : public peering_sharded_service<server> {
    // is called when reading the "system.clients" virtual table.
    struct ongoing_request {
        socket_address _client_address;
-        sstring _user_agent;
+        client_options_cache_entry_type _user_agent;
        sstring _username;
        scheduling_group _scheduling_group;
        bool _is_https;
@@ -98,15 +100,17 @@ class server : public peering_sharded_service<server> {
 public:
    server(executor& executor, service::storage_proxy& proxy, gms::gossiper& gossiper, auth::service& service, qos::service_level_controller& sl_controller);

-    future<> init(net::inet_address addr, std::optional<uint16_t> port, std::optional<uint16_t> https_port, std::optional<tls::credentials_builder> creds,
-            utils::updateable_value<bool> enforce_authorization, utils::updateable_value<uint64_t> max_users_query_size_in_trace_output,
+    future<> init(net::inet_address addr, std::optional<uint16_t> port, std::optional<uint16_t> https_port,
+            std::optional<uint16_t> port_proxy_protocol, std::optional<uint16_t> https_port_proxy_protocol,
+            std::optional<tls::credentials_builder> creds,
+            utils::updateable_value<bool> enforce_authorization, utils::updateable_value<bool> warn_authorization, utils::updateable_value<uint64_t> max_users_query_size_in_trace_output,
            semaphore* memory_limiter, utils::updateable_value<uint32_t> max_concurrent_requests);
    future<> stop();
    // get_client_data() is called (on each shard separately) when the virtual
    // table "system.clients" is read. It is expected to generate a list of
    // clients connected to this server (on this shard). This function is
    // called by alternator::controller::get_client_data().
-    future<utils::chunked_vector<client_data>> get_client_data();
+    future<utils::chunked_vector<foreign_ptr<std::unique_ptr<client_data>>>> get_client_data();
 private:
    void set_routes(seastar::httpd::routes& r);
    // If verification succeeds, returns the authenticated user's username
--- a/alternator/stats.cc
+++ b/alternator/stats.cc
@@ -188,6 +188,16 @@ static void register_metrics_with_optional_table(seastar::metrics::metric_groups
            seastar::metrics::make_total_operations("expression_cache_misses", stats.expression_cache.requests[stats::expression_types::PROJECTION_EXPRESSION].misses,
                    seastar::metrics::description("Counts number of misses of cached expressions"), labels)(expression_label("ProjectionExpression")).aggregate(aggregate_labels).set_skip_when_empty()
    });
+
+    // Only register the following metrics for the global metrics, not per-table
+    if (!has_table) {
+        metrics.add_group("alternator", {
+            seastar::metrics::make_counter("authentication_failures", stats.authentication_failures,
+                seastar::metrics::description("total number of authentication failures"), labels).aggregate({seastar::metrics::shard_label}).set_skip_when_empty(),
+            seastar::metrics::make_counter("authorization_failures", stats.authorization_failures,
+                seastar::metrics::description("total number of authorization failures"), labels).aggregate({seastar::metrics::shard_label}).set_skip_when_empty(),
+        });
+    }
 }

 void register_metrics(seastar::metrics::metric_groups& metrics, const stats& stats) {
--- a/alternator/stats.hh
+++ b/alternator/stats.hh
@@ -105,6 +105,17 @@ public:
        // The sizes are the the written items' sizes grouped per table.
        utils::estimated_histogram batch_write_item_op_size_kb{30};
    } operation_sizes;
+    // Count of authentication and authorization failures, counted if either
+    // alternator_enforce_authorization or alternator_warn_authorization is
+    // set to true. If both are false, no authentication or authorization
+    // checks are performed, so failures are not recognized or counted.
+    // An "authentication" failure means the request was not signed with a
+    // valid user and key combination. An "authorization" failure means the
+    // request was authenticated to a valid user, but this user did not have
+    // permissions to perform the operation (considering RBAC settings and
+    // the user's superuser status).
+    uint64_t authentication_failures = 0;
+    uint64_t authorization_failures = 0;
    // Miscellaneous event counters
    uint64_t total_operations = 0;
    uint64_t unsupported_operations = 0;
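
Reviewer note: the counting rule described in the comment above can be sketched as follows. `auth_stats` and `check_authorization` are hypothetical stand-ins, not the real `verify_permission()`. The sketch assumes that warn-only mode counts the failure and lets the request proceed, which the diff shown here does not confirm.

```cpp
#include <cassert>   // for the usage checks described below
#include <cstdint>
#include <stdexcept>

// Hypothetical stand-in for the two counters added to alternator::stats.
struct auth_stats {
    uint64_t authentication_failures = 0;
    uint64_t authorization_failures = 0;
};

// Returns true when the request may proceed. Throws when enforcement is on
// and the user lacks permission.
bool check_authorization(bool enforce, bool warn, bool has_permission, auth_stats& stats) {
    if (!enforce && !warn) {
        return true; // no checks are performed, so nothing is counted
    }
    if (has_permission) {
        return true;
    }
    ++stats.authorization_failures;
    if (enforce) {
        throw std::runtime_error("access denied");
    }
    return true; // assumption: warn-only mode does not reject the request
}
```

With both flags false the counter stays at zero even for a user without permission; with only `warn` set the counter increments and the call returns; with `enforce` set the counter increments and the call throws.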
--- a/alternator/streams.cc
+++ b/alternator/streams.cc
@@ -827,7 +827,7 @@ future<executor::request_return_type> executor::get_records(client_state& client

    tracing::add_table_name(trace_state, schema->ks_name(), schema->cf_name());

-    co_await verify_permission(_enforce_authorization, client_state, schema, auth::permission::SELECT);
+    co_await verify_permission(_enforce_authorization, _warn_authorization, client_state, schema, auth::permission::SELECT, _stats);

    db::consistency_level cl = db::consistency_level::LOCAL_QUORUM;
    partition_key pk = iter.shard.id.to_partition_key(*schema);
@@ -1073,9 +1073,7 @@ bool executor::add_stream_options(const rjson::value& stream_specification, sche
    }

    if (stream_enabled->GetBool()) {
-        auto db = sp.data_dictionary();
-
-        if (!db.features().alternator_streams) {
+        if (!sp.features().alternator_streams) {
            throw api_error::validation("StreamSpecification: alternator streams feature not enabled in cluster.");
        }

--- a/alternator/ttl.cc
+++ b/alternator/ttl.cc
@@ -68,7 +68,7 @@ extern const sstring TTL_TAG_KEY;

 future<executor::request_return_type> executor::update_time_to_live(client_state& client_state, service_permit permit, rjson::value request) {
    _stats.api_operations.update_time_to_live++;
-    if (!_proxy.data_dictionary().features().alternator_ttl) {
+    if (!_proxy.features().alternator_ttl) {
        co_return api_error::unknown_operation("UpdateTimeToLive not yet supported. Experimental support is available if the 'alternator-ttl' experimental feature is enabled on all nodes.");
    }

@@ -93,9 +93,9 @@ future<executor::request_return_type> executor::update_time_to_live(client_state
    if (v->GetStringLength() < 1 || v->GetStringLength() > 255) {
        co_return api_error::validation("The length of AttributeName must be between 1 and 255");
    }
-    sstring attribute_name(v->GetString(), v->GetStringLength());
+    sstring attribute_name = rjson::to_sstring(*v);

-    co_await verify_permission(_enforce_authorization, client_state, schema, auth::permission::ALTER);
+    co_await verify_permission(_enforce_authorization, _warn_authorization, client_state, schema, auth::permission::ALTER, _stats);
    co_await db::modify_tags(_mm, schema->ks_name(), schema->cf_name(), [&](std::map<sstring, sstring>& tags_map) {
        if (enabled) {
            if (tags_map.contains(TTL_TAG_KEY)) {
@@ -753,7 +753,7 @@ static future<bool> scan_table(
        auto my_host_id = erm->get_topology().my_host_id();
        const auto &tablet_map = erm->get_token_metadata().tablets().get_tablet_map(s->id());
        for (std::optional tablet = tablet_map.first_tablet(); tablet; tablet = tablet_map.next_tablet(*tablet)) {
-            auto tablet_primary_replica = tablet_map.get_primary_replica(*tablet);
+            auto tablet_primary_replica = tablet_map.get_primary_replica(*tablet, erm->get_topology());
            // check if this is the primary replica for the current tablet
            if (tablet_primary_replica.host == my_host_id && tablet_primary_replica.shard == this_shard_id()) {
                co_await scan_tablet(*tablet, proxy, abort_source, page_sem, expiration_stats, scan_ctx, tablet_map);
--- a/api/CMakeLists.txt
+++ b/api/CMakeLists.txt
@@ -31,6 +31,7 @@ set(swagger_files
  api-doc/column_family.json
  api-doc/commitlog.json
  api-doc/compaction_manager.json
+  api-doc/client_routes.json
  api-doc/config.json
  api-doc/cql_server_test.json
  api-doc/endpoint_snitch_info.json
@@ -68,6 +69,7 @@ target_sources(api
  PRIVATE
    api.cc
    cache_service.cc
+    client_routes.cc
    collectd.cc
    column_family.cc
    commitlog.cc
@@ -106,5 +108,8 @@ target_link_libraries(api
    wasmtime_bindings
    absl::headers)

+if (Scylla_USE_PRECOMPILED_HEADER_USE)
+  target_precompile_headers(api REUSE_FROM scylla-precompiled-header)
+endif()
 check_headers(check-headers api
  GLOB_RECURSE ${CMAKE_CURRENT_SOURCE_DIR}/*.hh)
--- a/api/api-doc/client_routes.def.json
+++ b/api/api-doc/client_routes.def.json
@@ -0,0 +1,23 @@
+    , "client_routes_entry": {
+        "id": "client_routes_entry",
+        "summary": "An entry storing client routes",
+        "properties": {
+            "connection_id": {"type": "string"},
+            "host_id": {"type": "string", "format": "uuid"},
+            "address": {"type": "string"},
+            "port": {"type": "integer"},
+            "tls_port": {"type": "integer"},
+            "alternator_port": {"type": "integer"},
+            "alternator_https_port": {"type": "integer"}
+        },
+        "required": ["connection_id", "host_id", "address"]
+    }
+    , "client_routes_key": {
+        "id": "client_routes_key",
+        "summary": "A key of client_routes_entry",
+        "properties": {
+            "connection_id": {"type": "string"},
+            "host_id": {"type": "string", "format": "uuid"}
+        }
+    }
+
--- a/api/api-doc/client_routes.json
+++ b/api/api-doc/client_routes.json
@@ -0,0 +1,74 @@
+    , "/v2/client-routes":{
+        "get": {
+            "description":"List all client route entries",
+            "operationId":"get_client_routes",
+            "tags":["client_routes"],
+            "produces":[
+                "application/json"
+            ],
+            "parameters":[],
+            "responses":{
+                "200":{
+                    "schema":{
+                        "type":"array",
+                        "items":{ "$ref":"#/definitions/client_routes_entry" }
+                    }
+                },
+                "default":{
+                    "description":"unexpected error",
+                    "schema":{"$ref":"#/definitions/ErrorModel"}
+                }
+            }
+        },
+        "post": {
+            "description":"Upsert one or more client route entries",
+            "operationId":"set_client_routes",
+            "tags":["client_routes"],
+            "parameters":[
+                {
+                    "name":"body",
+                    "in":"body",
+                    "required":true,
+                    "schema":{
+                        "type":"array",
+                        "items":{ "$ref":"#/definitions/client_routes_entry" }
+                    }
+                }
+            ],
+            "responses":{
+                "200":{ "description": "OK" },
+                "default":{
+                    "description":"unexpected error",
+                    "schema":{ "$ref":"#/definitions/ErrorModel" }
+                }
+            }
+        },
+        "delete": {
+            "description":"Delete one or more client route entries",
+            "operationId":"delete_client_routes",
+            "tags":["client_routes"],
+            "parameters":[
+                {
+                    "name":"body",
+                    "in":"body",
+                    "required":true,
+                    "schema":{
+                        "type":"array",
+                        "items":{ "$ref":"#/definitions/client_routes_key" }
+                    }
+                }
+            ],
+            "responses":{
+                "200":{
+                    "description": "OK"
+                },
+                "default":{
+                    "description":"unexpected error",
+                    "schema":{
+                        "$ref":"#/definitions/ErrorModel"
+                    }
+                }
+            }
+        }
+    }
+
--- a/api/api-doc/storage_service.json
+++ b/api/api-doc/storage_service.json
@@ -220,6 +220,25 @@
            }
         ]
      },
+      {
+         "path":"/storage_service/nodes/excluded",
+         "operations":[
+            {
+               "method":"GET",
+               "summary":"Retrieve host ids of nodes which are marked as excluded",
+               "type":"array",
+               "items":{
+                  "type":"string"
+               },
+               "nickname":"get_excluded_nodes",
+               "produces":[
+                  "application/json"
+               ],
+               "parameters":[
+               ]
+            }
+         ]
+      },
      {
         "path":"/storage_service/nodes/joining",
         "operations":[
@@ -942,6 +961,14 @@
                          "type":"string",
                          "paramType":"query",
                          "enum": ["all", "dc", "rack", "node"]
+                      },
+                      {
+                         "name":"primary_replica_only",
+                         "description":"Load the sstables and stream to the primary replica node within the scope, if one is specified. If not, stream to the global primary replica.",
+                         "required":false,
+                         "allowMultiple":false,
+                         "type":"boolean",
+                         "paramType":"query"
                      }
                  ]
              }
@@ -1028,7 +1055,7 @@
         ]
      },
      {
-         "path":"/storage_service/cleanup_all",
+         "path":"/storage_service/cleanup_all/",
         "operations":[
            {
               "method":"POST",
@@ -1038,6 +1065,30 @@
               "produces":[
                  "application/json"
               ],
+               "parameters":[
+                  {
+                     "name":"global",
+                     "description":"true if cleanup of entire cluster is requested",
+                     "required":false,
+                     "allowMultiple":false,
+                     "type":"boolean",
+                     "paramType":"query"
+                  }
+               ]
+            }
+         ]
+      },
+      {
+         "path":"/storage_service/mark_node_as_clean",
+         "operations":[
+            {
+               "method":"POST",
+               "summary":"Mark the node as clean. After that the node will not be considered as needing cleanup during automatic cleanup which is triggered by some topology operations",
+               "type":"void",
+               "nickname":"reset_cleanup_needed",
+               "produces":[
+                  "application/json"
+               ],
               "parameters":[]
            }
         ]
@@ -1571,6 +1622,30 @@
            }
         ]
      },
+      {
+         "path":"/storage_service/exclude_node",
+         "operations":[
+            {
+               "method":"POST",
+               "summary":"Marks the node as permanently down (excluded).",
+               "type":"void",
+               "nickname":"exclude_node",
+               "produces":[
+                  "application/json"
+               ],
+               "parameters":[
+                  {
+                     "name":"hosts",
+                     "description":"Comma-separated list of host ids to exclude",
+                     "required":true,
+                     "allowMultiple":false,
+                     "type":"string",
+                     "paramType":"query"
+                  }
+               ]
+            }
+         ]
+      },
      {
         "path":"/storage_service/removal_status",
         "operations":[
@@ -2976,7 +3051,7 @@
                  },
                  {
                     "name":"incremental_mode",
-                     "description":"Set the incremental repair mode. Can be 'disabled', 'incremental', or 'full'. 'incremental': The incremental repair logic is enabled. Unrepaired sstables will be included for repair. Repaired sstables will be skipped. The incremental repair states will be updated after repair. 'full': The incremental repair logic is enabled. Both repaired and unrepaired sstables will be included for repair. The incremental repair states will be updated after repair. 'disabled': The incremental repair logic is disabled completely. The incremental repair states, e.g., repaired_at in sstables and sstables_repaired_at in the system.tablets table, will not be updated after repair. When the option is not provided, it defaults to incremental mode.",
+                     "description":"Set the incremental repair mode. Can be 'disabled', 'incremental', or 'full'. 'incremental': The incremental repair logic is enabled. Unrepaired sstables will be included for repair. Repaired sstables will be skipped. The incremental repair states will be updated after repair. 'full': The incremental repair logic is enabled. Both repaired and unrepaired sstables will be included for repair. The incremental repair states will be updated after repair. 'disabled': The incremental repair logic is disabled completely. The incremental repair states, e.g., repaired_at in sstables and sstables_repaired_at in the system.tablets table, will not be updated after repair. When the option is not provided, it defaults to 'disabled' mode.",
                     "required":false,
                     "allowMultiple":false,
                     "type":"string",
--- a/api/api-doc/tasks.json
+++ b/api/api-doc/tasks.json
@@ -42,6 +42,14 @@
                     "allowMultiple":false,
                     "type":"boolean",
                     "paramType":"query"
+                  },
+                  {
+                     "name":"consider_only_existing_data",
+                     "description":"Set to \"true\" to flush all memtables and force tombstone garbage collection to check only the sstables being compacted (false by default). The memtable, commitlog and other uncompacted sstables will not be checked during tombstone garbage collection.",
+                     "required":false,
+                     "allowMultiple":false,
+                     "type":"boolean",
+                     "paramType":"query"
                  }
               ]
            }
--- a/api/api.cc
+++ b/api/api.cc
@@ -37,6 +37,7 @@
 #include "raft.hh"
 #include "gms/gossip_address_map.hh"
 #include "service_levels.hh"
+#include "client_routes.hh"

 logging::logger apilog("api");

@@ -67,9 +68,11 @@ future<> set_server_init(http_context& ctx) {
        rb02->set_api_doc(r);
        rb02->register_api_file(r, "swagger20_header");
        rb02->register_api_file(r, "metrics");
+        rb02->register_api_file(r, "client_routes");
        rb->register_function(r, "system",
                "The system related API");
        rb02->add_definitions_file(r, "metrics");
+        rb02->add_definitions_file(r, "client_routes");
        set_system(ctx, r);
        rb->register_function(r, "error_injection",
            "The error injection API");
@@ -129,6 +132,16 @@ future<> unset_server_storage_service(http_context& ctx) {
    return ctx.http_server.set_routes([&ctx] (routes& r) { unset_storage_service(ctx, r); });
 }

+future<> set_server_client_routes(http_context& ctx, sharded<service::client_routes_service>& cr) {
+    return ctx.http_server.set_routes([&ctx, &cr] (routes& r) {
+        set_client_routes(ctx, r, cr);
+    });
+}
+
+future<> unset_server_client_routes(http_context& ctx) {
+    return ctx.http_server.set_routes([&ctx] (routes& r) { unset_client_routes(ctx, r); });
+}
+
 future<> set_load_meter(http_context& ctx, service::load_meter& lm) {
    return ctx.http_server.set_routes([&ctx, &lm] (routes& r) { set_load_meter(ctx, r, lm); });
 }
--- a/api/api_init.hh
+++ b/api/api_init.hh
@@ -29,6 +29,7 @@ class storage_proxy;
 class storage_service;
 class raft_group0_client;
 class raft_group_registry;
+class client_routes_service;

 } // namespace service

@@ -99,6 +100,8 @@ future<> set_server_snitch(http_context& ctx, sharded<locator::snitch_ptr>& snit
 future<> unset_server_snitch(http_context& ctx);
 future<> set_server_storage_service(http_context& ctx, sharded<service::storage_service>& ss, service::raft_group0_client&);
 future<> unset_server_storage_service(http_context& ctx);
+future<> set_server_client_routes(http_context& ctx, sharded<service::client_routes_service>& cr);
+future<> unset_server_client_routes(http_context& ctx);
 future<> set_server_sstables_loader(http_context& ctx, sharded<sstables_loader>& sst_loader);
 future<> unset_server_sstables_loader(http_context& ctx);
 future<> set_server_view_builder(http_context& ctx, sharded<db::view::view_builder>& vb, sharded<gms::gossiper>& g);
--- a/api/client_routes.cc
+++ b/api/client_routes.cc
@@ -0,0 +1,176 @@
+/*
+ * Copyright (C) 2025-present ScyllaDB
+ *
+ */
+
+/*
+ * SPDX-License-Identifier: LicenseRef-ScyllaDB-Source-Available-1.0
+ */
+
+#include <seastar/http/short_streams.hh>
+
+#include "client_routes.hh"
+#include "api/api.hh"
+#include "service/storage_service.hh"
+#include "service/client_routes.hh"
+#include "utils/rjson.hh"
+
+
+#include "api/api-doc/client_routes.json.hh"
+
+using namespace seastar::httpd;
+using namespace std::chrono_literals;
+using namespace json;
+
+extern logging::logger apilog;
+
+namespace api {
+
+static void validate_client_routes_endpoint(sharded<service::client_routes_service>& cr, sstring endpoint_name) {
+    if (!cr.local().get_feature_service().client_routes) {
+        apilog.warn("{}: called before the cluster feature was enabled", endpoint_name);
+        throw std::runtime_error(fmt::format("{} requires all nodes to support the CLIENT_ROUTES cluster feature", endpoint_name));
+    }
+}
+
+static sstring parse_string(const char* name, rapidjson::Value const& v) {
+    const auto it = v.FindMember(name);
+    if (it == v.MemberEnd()) {
+        throw bad_param_exception(fmt::format("Missing '{}'", name));
+    }
+    if (!it->value.IsString()) {
+        throw bad_param_exception(fmt::format("'{}' must be a string", name));
+    }
+    return {it->value.GetString(), it->value.GetStringLength()};
+}
+
+static std::optional<uint32_t> parse_port(const char* name, rapidjson::Value const& v) {
+    const auto it = v.FindMember(name);
+    if (it == v.MemberEnd()) {
+        return std::nullopt;
+    }
+    if (!it->value.IsInt()) {
+        throw bad_param_exception(fmt::format("'{}' must be an integer", name));
+    }
+    auto port = it->value.GetInt();
+    if (port < 1 || port > 65535) {
+        throw bad_param_exception(fmt::format("'{}' value={} is outside the allowed port range", name, port));
+    }
+    return port;
+}
+
+static std::vector<service::client_routes_service::client_route_entry> parse_set_client_array(const rapidjson::Document& root) {
+    if (!root.IsArray()) {
+        throw bad_param_exception("Body must be a JSON array");
+    }
+
+    std::vector<service::client_routes_service::client_route_entry> v;
+    v.reserve(root.GetArray().Size());
+    for (const auto& element : root.GetArray()) {
+        if (!element.IsObject()) { throw bad_param_exception("Each element must be an object"); }
+
+        const auto port = parse_port("port", element);
+        const auto tls_port = parse_port("tls_port", element);
+        const auto alternator_port = parse_port("alternator_port", element);
+        const auto alternator_https_port = parse_port("alternator_https_port", element);
+
+        if (!port.has_value() && !tls_port.has_value() && !alternator_port.has_value() && !alternator_https_port.has_value()) {
+            throw bad_param_exception("At least one port field ('port', 'tls_port', 'alternator_port', 'alternator_https_port') must be specified");
+        }
+
+        v.emplace_back(
+            parse_string("connection_id", element),
+            utils::UUID{parse_string("host_id", element)},
+            parse_string("address", element),
+            port,
+            tls_port,
+            alternator_port,
+            alternator_https_port
+        );
+    }
+
+    return v;
+}
+
+static
+future<json::json_return_type>
+rest_set_client_routes(http_context& ctx, sharded<service::client_routes_service>& cr, std::unique_ptr<http::request> req) {
+    validate_client_routes_endpoint(cr, "set_client_routes");
+
+    rapidjson::Document root;
+    auto content = co_await util::read_entire_stream_contiguous(*req->content_stream);
+    root.Parse(content.c_str());
+
+    co_await cr.local().set_client_routes(parse_set_client_array(root));
+    co_return seastar::json::json_void();
+}
+
+static std::vector<service::client_routes_service::client_route_key> parse_delete_client_array(const rapidjson::Document& root) {
+    if (!root.IsArray()) {
+        throw bad_param_exception("Body must be a JSON array");
+    }
+
+    std::vector<service::client_routes_service::client_route_key> v;
+    v.reserve(root.GetArray().Size());
+    for (const auto& element : root.GetArray()) {
+        v.emplace_back(
+            parse_string("connection_id", element),
+            utils::UUID{parse_string("host_id", element)}
+        );
+    }
+
+    return v;
+}
+
+static
+future<json::json_return_type>
+rest_delete_client_routes(http_context& ctx, sharded<service::client_routes_service>& cr, std::unique_ptr<http::request> req) {
+    validate_client_routes_endpoint(cr, "delete_client_routes");
+
+    rapidjson::Document root;
+    auto content = co_await util::read_entire_stream_contiguous(*req->content_stream);
+    root.Parse(content.c_str());
+
+    co_await cr.local().delete_client_routes(parse_delete_client_array(root));
+    co_return seastar::json::json_void();
+}
+
+static
+future<json::json_return_type>
+rest_get_client_routes(http_context& ctx, sharded<service::client_routes_service>& cr, std::unique_ptr<http::request> req) {
+    validate_client_routes_endpoint(cr, "get_client_routes");
+
+    co_return co_await cr.invoke_on(0, [] (service::client_routes_service& cr) -> future<json::json_return_type> {
+        co_return json::json_return_type(stream_range_as_array(co_await cr.get_client_routes(), [](const service::client_routes_service::client_route_entry & entry) {
+            seastar::httpd::client_routes_json::client_routes_entry obj;
+            obj.connection_id = entry.connection_id;
+            obj.host_id = fmt::to_string(entry.host_id);
+            obj.address = entry.address;
+            if (entry.port.has_value()) { obj.port = entry.port.value(); }
+            if (entry.tls_port.has_value()) { obj.tls_port = entry.tls_port.value(); }
+            if (entry.alternator_port.has_value()) { obj.alternator_port = entry.alternator_port.value(); }
+            if (entry.alternator_https_port.has_value()) { obj.alternator_https_port = entry.alternator_https_port.value(); }
+            return obj;
+        }));
+    });
+}
+
+void set_client_routes(http_context& ctx, routes& r, sharded<service::client_routes_service>& cr) {
+    seastar::httpd::client_routes_json::set_client_routes.set(r, [&ctx, &cr] (std::unique_ptr<seastar::http::request> req) {
+        return rest_set_client_routes(ctx, cr, std::move(req));
+    });
+    seastar::httpd::client_routes_json::delete_client_routes.set(r, [&ctx, &cr] (std::unique_ptr<seastar::http::request> req) {
+        return rest_delete_client_routes(ctx, cr, std::move(req));
+    });
+    seastar::httpd::client_routes_json::get_client_routes.set(r, [&ctx, &cr] (std::unique_ptr<seastar::http::request> req) {
+        return rest_get_client_routes(ctx, cr, std::move(req));
+    });
+}
+
+void unset_client_routes(http_context& ctx, routes& r) {
+    seastar::httpd::client_routes_json::set_client_routes.unset(r);
+    seastar::httpd::client_routes_json::delete_client_routes.unset(r);
+    seastar::httpd::client_routes_json::get_client_routes.unset(r);
+}
+
+}
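
Reviewer note: the validation rules applied to each entry of a `POST /v2/client-routes` body can be sketched without rapidjson as below. `checked_port` and `has_any_port` are hypothetical helpers that only mirror the range check in `parse_port()` and the "at least one port" rule in `parse_set_client_array()`; the exception type is also chosen for illustration.

```cpp
#include <cassert>   // for the usage checks described below
#include <cstdint>
#include <optional>
#include <stdexcept>
#include <string>

// Hypothetical standalone version of the range check in parse_port().
// The real function reads the value from a rapidjson object and throws
// bad_param_exception instead.
std::optional<uint32_t> checked_port(const std::string& name, std::optional<int> raw) {
    if (!raw) {
        return std::nullopt; // field absent: every port field is optional
    }
    if (*raw < 1 || *raw > 65535) {
        throw std::invalid_argument("'" + name + "' value=" + std::to_string(*raw)
                + " is outside the allowed port range");
    }
    return static_cast<uint32_t>(*raw);
}

// An entry is accepted only if at least one of the four ports is present.
bool has_any_port(std::optional<uint32_t> port, std::optional<uint32_t> tls_port,
        std::optional<uint32_t> alternator_port, std::optional<uint32_t> alternator_https_port) {
    return port || tls_port || alternator_port || alternator_https_port;
}
```

An entry carrying only `"port": 9042` passes both checks, while an entry with no port field at all, or with a value such as 0 or 70000, is rejected.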
--- a/api/client_routes.hh
+++ b/api/client_routes.hh
@@ -0,0 +1,20 @@
+/*
+ * Copyright (C) 2025-present ScyllaDB
+ *
+ */
+
+/*
+ * SPDX-License-Identifier: LicenseRef-ScyllaDB-Source-Available-1.0
+ */
+#pragma once
+
+#include <seastar/core/sharded.hh>
+#include <seastar/json/json_elements.hh>
+#include "api/api_init.hh"
+
+namespace api {
+
+void set_client_routes(http_context& ctx, httpd::routes& r, sharded<service::client_routes_service>& cr);
+void unset_client_routes(http_context& ctx, httpd::routes& r);
+
+}
--- a/api/column_family.cc
+++ b/api/column_family.cc
@@ -66,6 +66,13 @@ static future<json::json_return_type>  get_cf_stats(sharded<replica::database>&
    }, std::plus<int64_t>());
 }

+static future<json::json_return_type>  get_cf_stats(sharded<replica::database>& db,
+        std::function<int64_t(const replica::column_family_stats&)> f) {
+    return map_reduce_cf(db, int64_t(0), [f](const replica::column_family& cf) {
+        return f(cf.get_stats());
+    }, std::plus<int64_t>());
+}
+
 static future<json::json_return_type> for_tables_on_all_shards(sharded<replica::database>& db, std::vector<table_info> tables, std::function<future<>(replica::table&)> set) {
    return do_with(std::move(tables), [&db, set] (const std::vector<table_info>& tables) {
        return db.invoke_on_all([&tables, set] (replica::database& db) {
@@ -1066,10 +1073,14 @@ void set_column_family(http_context& ctx, routes& r, sharded<replica::database>&
    });

    ss::get_load.set(r, [&db] (std::unique_ptr<http::request> req) {
-        return get_cf_stats(db, &replica::column_family_stats::live_disk_space_used);
+        return get_cf_stats(db, [](const replica::column_family_stats& stats) {
+            return stats.live_disk_space_used.on_disk;
+        });
    });
    ss::get_metrics_load.set(r, [&db] (std::unique_ptr<http::request> req) {
-        return get_cf_stats(db, &replica::column_family_stats::live_disk_space_used);
+        return get_cf_stats(db, [](const replica::column_family_stats& stats) {
+            return stats.live_disk_space_used.on_disk;
+        });
    });

    ss::get_keyspaces.set(r, [&db] (const_req req) {
--- a/api/storage_service.cc
+++ b/api/storage_service.cc
@@ -20,6 +20,7 @@
 #include "utils/hash.hh"
 #include <optional>
 #include <sstream>
+#include <stdexcept>
 #include <time.h>
 #include <algorithm>
 #include <functional>
@@ -504,6 +505,7 @@ void set_sstables_loader(http_context& ctx, routes& r, sharded<sstables_loader>&
        auto bucket = req->get_query_param("bucket");
        auto prefix = req->get_query_param("prefix");
        auto scope = parse_stream_scope(req->get_query_param("scope"));
+        auto primary_replica_only = validate_bool_x(req->get_query_param("primary_replica_only"), false);

        rjson::chunked_content content = co_await util::read_entire_stream(*req->content_stream);
        rjson::value parsed = rjson::parse(std::move(content));
@@ -513,7 +515,7 @@ void set_sstables_loader(http_context& ctx, routes& r, sharded<sstables_loader>&
        auto sstables = parsed.GetArray() |
            std::views::transform([] (const auto& s) { return sstring(rjson::to_string_view(s)); }) |
            std::ranges::to<std::vector>();
-        auto task_id = co_await sst_loader.local().download_new_sstables(keyspace, table, prefix, std::move(sstables), endpoint, bucket, scope);
+        auto task_id = co_await sst_loader.local().download_new_sstables(keyspace, table, prefix, std::move(sstables), endpoint, bucket, scope, primary_replica_only);
        co_return json::json_return_type(fmt::to_string(task_id));
    });

@@ -545,17 +547,13 @@ void set_view_builder(http_context& ctx, routes& r, sharded<db::view::view_build
                vp.insert(b.second);
            }
        }
-        std::vector<sstring> res;
        replica::database& db = vb.local().get_db();
        auto uuid = validate_table(db, ks, cf_name);
        replica::column_family& cf = db.find_column_family(uuid);
-        res.reserve(cf.get_index_manager().list_indexes().size());
-        for (auto&& i : cf.get_index_manager().list_indexes()) {
-            if (vp.contains(secondary_index::index_table_name(i.metadata().name()))) {
-                res.emplace_back(i.metadata().name());
-            }
-        }
-        co_return res;
+        co_return cf.get_index_manager().list_indexes()
+                | std::views::transform([] (const auto& i) { return i.metadata().name(); })
+                | std::views::filter([&vp] (const auto& n) { return vp.contains(secondary_index::index_table_name(n)); })
+                | std::ranges::to<std::vector>();
    });

 }
@@ -763,8 +761,14 @@ rest_cdc_streams_check_and_repair(sharded<service::storage_service>& ss, std::un
 static
 future<json::json_return_type>
 rest_cleanup_all(http_context& ctx, sharded<service::storage_service>& ss, std::unique_ptr<http::request> req) {
-        apilog.info("cleanup_all");
-        auto done = co_await ss.invoke_on(0, [] (service::storage_service& ss) -> future<bool> {
+        bool global = true;
+        if (auto global_param = req->get_query_param("global"); !global_param.empty()) {
+            global = validate_bool(global_param);
+        }
+
+        apilog.info("cleanup_all global={}", global);
+
+        auto done = !global ? false : co_await ss.invoke_on(0, [] (service::storage_service& ss) -> future<bool> {
            if (!ss.is_topology_coordinator_enabled()) {
                co_return false;
            }
@@ -774,14 +778,35 @@ rest_cleanup_all(http_context& ctx, sharded<service::storage_service>& ss, std::
        if (done) {
            co_return json::json_return_type(0);
        }
-        // fall back to the local global cleanup if topology coordinator is not enabled
+        // fall back to the local cleanup if topology coordinator is not enabled or local cleanup is requested
        auto& db = ctx.db;
        auto& compaction_module = db.local().get_compaction_manager().get_task_manager_module();
        auto task = co_await compaction_module.make_and_start_task<compaction::global_cleanup_compaction_task_impl>({}, db);
        co_await task->done();
+
+        // Mark this node as clean
+        co_await ss.invoke_on(0, [] (service::storage_service& ss) -> future<> {
+            if (ss.is_topology_coordinator_enabled()) {
+                co_await ss.reset_cleanup_needed();
+            }
+        });
+
        co_return json::json_return_type(0);
 }

+static
+future<json::json_return_type>
+rest_reset_cleanup_needed(http_context& ctx, sharded<service::storage_service>& ss, std::unique_ptr<http::request> req) {
+        apilog.info("reset_cleanup_needed");
+        co_await ss.invoke_on(0, [] (service::storage_service& ss) {
+            if (!ss.is_topology_coordinator_enabled()) {
+                throw std::runtime_error("reset_cleanup_needed is only supported when topology over raft is enabled");
+            }
+            return ss.reset_cleanup_needed();
+        });
+        co_return json_void();
+}
+
 static
 future<json::json_return_type>
 rest_force_flush(http_context& ctx, std::unique_ptr<http::request> req) {
@@ -844,6 +869,25 @@ rest_remove_node(sharded<service::storage_service>& ss, std::unique_ptr<http::re
        });
 }

+static
+future<json::json_return_type>
+rest_exclude_node(sharded<service::storage_service>& ss, std::unique_ptr<http::request> req) {
+    auto hosts = utils::split_comma_separated_list(req->get_query_param("hosts"))
+        | std::views::transform([] (const sstring& s) { return locator::host_id(utils::UUID(s)); })
+        | std::ranges::to<std::vector<locator::host_id>>();
+
+    auto& topo = ss.local().get_token_metadata().get_topology();
+    for (auto host : hosts) {
+        if (!topo.has_node(host)) {
+            throw bad_param_exception(fmt::format("Host ID {} does not belong to this cluster", host));
+        }
+    }
+
+    apilog.info("exclude_node: hosts={}", hosts);
+    co_await ss.local().mark_excluded(hosts);
+    co_return json_void();
+}
+
 static
 future<json::json_return_type>
 rest_get_removal_status(sharded<service::storage_service>& ss, std::unique_ptr<http::request> req) {
@@ -1764,11 +1808,13 @@ void set_storage_service(http_context& ctx, routes& r, sharded<service::storage_
    ss::get_natural_endpoints_v2.set(r, rest_bind(rest_get_natural_endpoints_v2, ctx, ss));
    ss::cdc_streams_check_and_repair.set(r, rest_bind(rest_cdc_streams_check_and_repair, ss));
    ss::cleanup_all.set(r, rest_bind(rest_cleanup_all, ctx, ss));
+    ss::reset_cleanup_needed.set(r, rest_bind(rest_reset_cleanup_needed, ctx, ss));
    ss::force_flush.set(r, rest_bind(rest_force_flush, ctx));
    ss::force_keyspace_flush.set(r, rest_bind(rest_force_keyspace_flush, ctx));
    ss::decommission.set(r, rest_bind(rest_decommission, ss));
    ss::move.set(r, rest_bind(rest_move, ss));
    ss::remove_node.set(r, rest_bind(rest_remove_node, ss));
+    ss::exclude_node.set(r, rest_bind(rest_exclude_node, ss));
    ss::get_removal_status.set(r, rest_bind(rest_get_removal_status, ss));
    ss::force_remove_completion.set(r, rest_bind(rest_force_remove_completion, ss));
    ss::set_logging_level.set(r, rest_bind(rest_set_logging_level));
@@ -1841,11 +1887,13 @@ void unset_storage_service(http_context& ctx, routes& r) {
    ss::get_natural_endpoints.unset(r);
    ss::cdc_streams_check_and_repair.unset(r);
    ss::cleanup_all.unset(r);
+    ss::reset_cleanup_needed.unset(r);
    ss::force_flush.unset(r);
    ss::force_keyspace_flush.unset(r);
    ss::decommission.unset(r);
    ss::move.unset(r);
    ss::remove_node.unset(r);
+    ss::exclude_node.unset(r);
    ss::get_removal_status.unset(r);
    ss::force_remove_completion.unset(r);
    ss::set_logging_level.unset(r);
--- a/api/task_manager.cc
+++ b/api/task_manager.cc
@@ -9,6 +9,7 @@
 #include <seastar/core/chunked_fifo.hh>
 #include <seastar/core/coroutine.hh>
 #include <seastar/coroutine/exception.hh>
+#include <seastar/coroutine/maybe_yield.hh>
 #include <seastar/http/exception.hh>

 #include "task_manager.hh"
@@ -264,7 +265,7 @@ void set_task_manager(http_context& ctx, routes& r, sharded<tasks::task_manager>
                if (id) {
                    module->unregister_task(id);
                }
-                co_await maybe_yield();
+                co_await coroutine::maybe_yield();
            }
        });
        co_return json_void();
--- a/api/tasks.cc
+++ b/api/tasks.cc
@@ -38,76 +38,78 @@ static auto wrap_ks_cf(http_context &ctx, ks_cf_func f) {
    };
 }

+static future<shared_ptr<compaction::major_keyspace_compaction_task_impl>> force_keyspace_compaction(http_context& ctx, std::unique_ptr<http::request> req) {
+    auto& db = ctx.db;
+    auto [ keyspace, table_infos ] = parse_table_infos(ctx, *req, "cf");
+    auto flush = validate_bool_x(req->get_query_param("flush_memtables"), true);
+    auto consider_only_existing_data = validate_bool_x(req->get_query_param("consider_only_existing_data"), false);
+    apilog.info("force_keyspace_compaction: keyspace={} tables={}, flush={} consider_only_existing_data={}", keyspace, table_infos, flush, consider_only_existing_data);
+
+    auto& compaction_module = db.local().get_compaction_manager().get_task_manager_module();
+    std::optional<compaction::flush_mode> fmopt;
+    if (!flush && !consider_only_existing_data) {
+        fmopt = compaction::flush_mode::skip;
+    }
+    return compaction_module.make_and_start_task<compaction::major_keyspace_compaction_task_impl>({}, std::move(keyspace), tasks::task_id::create_null_id(), db, table_infos, fmopt, consider_only_existing_data);
+}
+
+static future<shared_ptr<compaction::upgrade_sstables_compaction_task_impl>> upgrade_sstables(http_context& ctx, std::unique_ptr<http::request> req, sstring keyspace, std::vector<table_info> table_infos) {
+    auto& db = ctx.db;
+    bool exclude_current_version = req_param<bool>(*req, "exclude_current_version", false);
+
+    apilog.info("upgrade_sstables: keyspace={} tables={} exclude_current_version={}", keyspace, table_infos, exclude_current_version);
+
+    auto& compaction_module = db.local().get_compaction_manager().get_task_manager_module();
+    return compaction_module.make_and_start_task<compaction::upgrade_sstables_compaction_task_impl>({}, std::move(keyspace), db, table_infos, exclude_current_version);
+}
+
+static future<shared_ptr<compaction::cleanup_keyspace_compaction_task_impl>> force_keyspace_cleanup(http_context& ctx, sharded<service::storage_service>& ss, std::unique_ptr<http::request> req) {
+    auto& db = ctx.db;
+    auto [keyspace, table_infos] = parse_table_infos(ctx, *req);
+    const auto& rs = db.local().find_keyspace(keyspace).get_replication_strategy();
+    if (rs.is_local() || !rs.is_vnode_based()) {
+        auto reason = rs.is_local() ? "require" : "support";
+        apilog.info("Keyspace {} does not {} cleanup", keyspace, reason);
+        co_return nullptr;
+    }
+    apilog.info("force_keyspace_cleanup: keyspace={} tables={}", keyspace, table_infos);
+    if (!co_await ss.local().is_vnodes_cleanup_allowed(keyspace)) {
+        auto msg = "Can not perform cleanup operation when topology changes";
+        apilog.warn("force_keyspace_cleanup: keyspace={} tables={}: {}", keyspace, table_infos, msg);
+        co_await coroutine::return_exception(std::runtime_error(msg));
+    }
+
+    auto& compaction_module = db.local().get_compaction_manager().get_task_manager_module();
+    co_return co_await compaction_module.make_and_start_task<compaction::cleanup_keyspace_compaction_task_impl>(
+        {}, std::move(keyspace), db, table_infos, compaction::flush_mode::all_tables, tasks::is_user_task::yes);
+}
+
 void set_tasks_compaction_module(http_context& ctx, routes& r, sharded<service::storage_service>& ss, sharded<db::snapshot_ctl>& snap_ctl) {
    t::force_keyspace_compaction_async.set(r, [&ctx](std::unique_ptr<http::request> req) -> future<json::json_return_type> {
-        auto& db = ctx.db;
-        auto [ keyspace, table_infos ] = parse_table_infos(ctx, *req, "cf");
-        auto flush = validate_bool_x(req->get_query_param("flush_memtables"), true);
-        apilog.debug("force_keyspace_compaction_async: keyspace={} tables={}, flush={}", keyspace, table_infos, flush);
-
-        auto& compaction_module = db.local().get_compaction_manager().get_task_manager_module();
-        std::optional<compaction::flush_mode> fmopt;
-        if (!flush) {
-            fmopt = compaction::flush_mode::skip;
-        }
-        auto task = co_await compaction_module.make_and_start_task<compaction::major_keyspace_compaction_task_impl>({}, std::move(keyspace), tasks::task_id::create_null_id(), db, table_infos, fmopt);
-
+        auto task = co_await force_keyspace_compaction(ctx, std::move(req));
        co_return json::json_return_type(task->get_status().id.to_sstring());
    });

    ss::force_keyspace_compaction.set(r, [&ctx](std::unique_ptr<http::request> req) -> future<json::json_return_type> {
-        auto& db = ctx.db;
-        auto [ keyspace, table_infos ] = parse_table_infos(ctx, *req, "cf");
-        auto flush = validate_bool_x(req->get_query_param("flush_memtables"), true);
-        auto consider_only_existing_data = validate_bool_x(req->get_query_param("consider_only_existing_data"), false);
-        apilog.info("force_keyspace_compaction: keyspace={} tables={}, flush={} consider_only_existing_data={}", keyspace, table_infos, flush, consider_only_existing_data);
-
-        auto& compaction_module = db.local().get_compaction_manager().get_task_manager_module();
-        std::optional<compaction::flush_mode> fmopt;
-        if (!flush && !consider_only_existing_data) {
-            fmopt = compaction::flush_mode::skip;
-        }
-        auto task = co_await compaction_module.make_and_start_task<compaction::major_keyspace_compaction_task_impl>({}, std::move(keyspace), tasks::task_id::create_null_id(), db, table_infos, fmopt, consider_only_existing_data);
+        auto task = co_await force_keyspace_compaction(ctx, std::move(req));
        co_await task->done();
        co_return json_void();
    });

    t::force_keyspace_cleanup_async.set(r, [&ctx, &ss](std::unique_ptr<http::request> req) -> future<json::json_return_type> {
-        auto& db = ctx.db;
-        auto [keyspace, table_infos] = parse_table_infos(ctx, *req);
-        apilog.info("force_keyspace_cleanup_async: keyspace={} tables={}", keyspace, table_infos);
-        if (!co_await ss.local().is_vnodes_cleanup_allowed(keyspace)) {
-            auto msg = "Can not perform cleanup operation when topology changes";
-            apilog.warn("force_keyspace_cleanup_async: keyspace={} tables={}: {}", keyspace, table_infos, msg);
-            co_await coroutine::return_exception(std::runtime_error(msg));
+        tasks::task_id id = tasks::task_id::create_null_id();
+        auto task = co_await force_keyspace_cleanup(ctx, ss, std::move(req));
+        if (task) {
+            id = task->get_status().id;
        }
-
-        auto& compaction_module = db.local().get_compaction_manager().get_task_manager_module();
-        auto task = co_await compaction_module.make_and_start_task<compaction::cleanup_keyspace_compaction_task_impl>({}, std::move(keyspace), db, table_infos, compaction::flush_mode::all_tables, tasks::is_user_task::yes);
-
-        co_return json::json_return_type(task->get_status().id.to_sstring());
+        co_return json::json_return_type(id.to_sstring());
    });

    ss::force_keyspace_cleanup.set(r, [&ctx, &ss](std::unique_ptr<http::request> req) -> future<json::json_return_type> {
-        auto& db = ctx.db;
-        auto [keyspace, table_infos] = parse_table_infos(ctx, *req);
-        const auto& rs = db.local().find_keyspace(keyspace).get_replication_strategy();
-        if (rs.is_local() || !rs.is_vnode_based()) {
-            auto reason = rs.is_local() ? "require" : "support";
-            apilog.info("Keyspace {} does not {} cleanup", keyspace, reason);
-            co_return json::json_return_type(0);
+        auto task = co_await force_keyspace_cleanup(ctx, ss, std::move(req));
+        if (task) {
+            co_await task->done();
        }
-        apilog.info("force_keyspace_cleanup: keyspace={} tables={}", keyspace, table_infos);
-        if (!co_await ss.local().is_vnodes_cleanup_allowed(keyspace)) {
-            auto msg = "Can not perform cleanup operation when topology changes";
-            apilog.warn("force_keyspace_cleanup: keyspace={} tables={}: {}", keyspace, table_infos, msg);
-            co_await coroutine::return_exception(std::runtime_error(msg));
-        }
-
-        auto& compaction_module = db.local().get_compaction_manager().get_task_manager_module();
-        auto task = co_await compaction_module.make_and_start_task<compaction::cleanup_keyspace_compaction_task_impl>(
-            {}, std::move(keyspace), db, table_infos, compaction::flush_mode::all_tables, tasks::is_user_task::yes);
-        co_await task->done();
        co_return json::json_return_type(0);
    });

@@ -129,25 +131,12 @@ void set_tasks_compaction_module(http_context& ctx, routes& r, sharded<service::
    }));

    t::upgrade_sstables_async.set(r, wrap_ks_cf(ctx, [] (http_context& ctx, std::unique_ptr<http::request> req, sstring keyspace, std::vector<table_info> table_infos) -> future<json::json_return_type> {
-        auto& db = ctx.db;
-        bool exclude_current_version = req_param<bool>(*req, "exclude_current_version", false);
-
-        apilog.info("upgrade_sstables: keyspace={} tables={} exclude_current_version={}", keyspace, table_infos, exclude_current_version);
-
-        auto& compaction_module = db.local().get_compaction_manager().get_task_manager_module();
-        auto task = co_await compaction_module.make_and_start_task<compaction::upgrade_sstables_compaction_task_impl>({}, std::move(keyspace), db, table_infos, exclude_current_version);
-
+        auto task = co_await upgrade_sstables(ctx, std::move(req), std::move(keyspace), std::move(table_infos));
        co_return json::json_return_type(task->get_status().id.to_sstring());
    }));

    ss::upgrade_sstables.set(r, wrap_ks_cf(ctx, [] (http_context& ctx, std::unique_ptr<http::request> req, sstring keyspace, std::vector<table_info> table_infos) -> future<json::json_return_type> {
-        auto& db = ctx.db;
-        bool exclude_current_version = req_param<bool>(*req, "exclude_current_version", false);
-
-        apilog.info("upgrade_sstables: keyspace={} tables={} exclude_current_version={}", keyspace, table_infos, exclude_current_version);
-
-        auto& compaction_module = db.local().get_compaction_manager().get_task_manager_module();
-        auto task = co_await compaction_module.make_and_start_task<compaction::upgrade_sstables_compaction_task_impl>({}, std::move(keyspace), db, table_infos, exclude_current_version);
+        auto task = co_await upgrade_sstables(ctx, std::move(req), std::move(keyspace), std::move(table_infos));
        co_await task->done();
        co_return json::json_return_type(0);
    }));
--- a/api/token_metadata.cc
+++ b/api/token_metadata.cc
@@ -62,6 +62,17 @@ void set_token_metadata(http_context& ctx, routes& r, sharded<locator::shared_to
        return addr | std::ranges::to<std::vector>();
    });

+    ss::get_excluded_nodes.set(r, [&tm](const_req req) {
+        const auto& local_tm = *tm.local().get();
+        std::vector<sstring> eps;
+        local_tm.get_topology().for_each_node([&] (auto& node) {
+            if (node.is_excluded()) {
+                eps.push_back(node.host_id().to_sstring());
+            }
+        });
+        return eps;
+    });
+
    ss::get_joining_nodes.set(r, [&tm, &g](const_req req) {
        const auto& local_tm = *tm.local().get();
        const auto& points = local_tm.get_bootstrap_tokens();
@@ -130,6 +141,7 @@ void unset_token_metadata(http_context& ctx, routes& r) {
    ss::get_leaving_nodes.unset(r);
    ss::get_moving_nodes.unset(r);
    ss::get_joining_nodes.unset(r);
+    ss::get_excluded_nodes.unset(r);
    ss::get_host_id_map.unset(r);
    httpd::endpoint_snitch_info_json::get_datacenter.unset(r);
    httpd::endpoint_snitch_info_json::get_rack.unset(r);
--- a/audit/CMakeLists.txt
+++ b/audit/CMakeLists.txt
@@ -5,6 +5,7 @@ target_sources(scylla_audit
  PRIVATE
    audit.cc
    audit_cf_storage_helper.cc
+    audit_composite_storage_helper.cc
    audit_syslog_storage_helper.cc)
 target_include_directories(scylla_audit
  PUBLIC
@@ -16,4 +17,7 @@ target_link_libraries(scylla_audit
  PRIVATE
    cql3)

+if (Scylla_USE_PRECOMPILED_HEADER_USE)
+  target_precompile_headers(scylla_audit REUSE_FROM scylla-precompiled-header)
+endif()
 add_whole_archive(audit scylla_audit)
--- a/audit/audit.cc
+++ b/audit/audit.cc
@@ -13,9 +13,11 @@
 #include "cql3/statements/batch_statement.hh"
 #include "cql3/statements/modification_statement.hh"
 #include "storage_helper.hh"
+#include "audit_cf_storage_helper.hh"
+#include "audit_syslog_storage_helper.hh"
+#include "audit_composite_storage_helper.hh"
 #include "audit.hh"
 #include "../db/config.hh"
-#include "utils/class_registrator.hh"

 #include <boost/algorithm/string/split.hpp>
 #include <boost/algorithm/string/trim.hpp>
@@ -26,6 +28,47 @@ namespace audit {

 logging::logger logger("audit");

+static std::set<sstring> parse_audit_modes(const sstring& data) {
+    std::set<sstring> result;
+    if (!data.empty()) {
+        std::vector<sstring> audit_modes;
+        boost::split(audit_modes, data, boost::is_any_of(","));
+        if (audit_modes.empty()) {
+            return {};
+        }
+        for (sstring& audit_mode : audit_modes) {
+            boost::trim(audit_mode);
+            if (audit_mode == "none") {
+                return {};
+            }
+            if (audit_mode != "table" && audit_mode != "syslog") {
+                throw audit_exception(fmt::format("Bad configuration: invalid 'audit': {}", audit_mode));
+            }
+            result.insert(std::move(audit_mode));
+        }
+    }
+    return result;
+}
+
+static std::unique_ptr<storage_helper> create_storage_helper(const std::set<sstring>& audit_modes, cql3::query_processor& qp, service::migration_manager& mm) {
+    SCYLLA_ASSERT(!audit_modes.empty() && !audit_modes.contains("none"));
+
+    std::vector<std::unique_ptr<storage_helper>> helpers;
+    for (const sstring& audit_mode : audit_modes) {
+        if (audit_mode == "table") {
+            helpers.emplace_back(std::make_unique<audit_cf_storage_helper>(qp, mm));
+        } else if (audit_mode == "syslog") {
+            helpers.emplace_back(std::make_unique<audit_syslog_storage_helper>(qp, mm));
+        }
+    }
+
+    SCYLLA_ASSERT(!helpers.empty());
+    if (helpers.size() == 1) {
+        return std::move(helpers.front());
+    }
+    return std::make_unique<audit_composite_storage_helper>(std::move(helpers));
+}
+
 static sstring category_to_string(statement_category category)
 {
    switch (category) {
@@ -103,7 +146,9 @@ static std::set<sstring> parse_audit_keyspaces(const sstring& data) {
 }

 audit::audit(locator::shared_token_metadata& token_metadata,
-             sstring&& storage_helper_name,
+             cql3::query_processor& qp,
+             service::migration_manager& mm,
+             std::set<sstring>&& audit_modes,
             std::set<sstring>&& audited_keyspaces,
             std::map<sstring, std::set<sstring>>&& audited_tables,
             category_set&& audited_categories,
@@ -112,28 +157,21 @@ audit::audit(locator::shared_token_metadata& token_metadata,
    , _audited_keyspaces(std::move(audited_keyspaces))
    , _audited_tables(std::move(audited_tables))
    , _audited_categories(std::move(audited_categories))
-    , _storage_helper_class_name(std::move(storage_helper_name))
    , _cfg(cfg)
    , _cfg_keyspaces_observer(cfg.audit_keyspaces.observe([this] (sstring const& new_value){ update_config<std::set<sstring>>(new_value, parse_audit_keyspaces, _audited_keyspaces); }))
    , _cfg_tables_observer(cfg.audit_tables.observe([this] (sstring const& new_value){ update_config<std::map<sstring, std::set<sstring>>>(new_value, parse_audit_tables, _audited_tables); }))
    , _cfg_categories_observer(cfg.audit_categories.observe([this] (sstring const& new_value){ update_config<category_set>(new_value, parse_audit_categories, _audited_categories); }))
-{ }
+{
+    _storage_helper_ptr = create_storage_helper(std::move(audit_modes), qp, mm);
+}

 audit::~audit() = default;

-future<> audit::create_audit(const db::config& cfg, sharded<locator::shared_token_metadata>& stm) {
-    sstring storage_helper_name;
-    if (cfg.audit() == "table") {
-        storage_helper_name = "audit_cf_storage_helper";
-    } else if (cfg.audit() == "syslog") {
-        storage_helper_name = "audit_syslog_storage_helper";
-    } else if (cfg.audit() == "none") {
-        // Audit is off
+future<> audit::start_audit(const db::config& cfg, sharded<locator::shared_token_metadata>& stm, sharded<cql3::query_processor>& qp, sharded<service::migration_manager>& mm) {
+    std::set<sstring> audit_modes = parse_audit_modes(cfg.audit());
+    if (audit_modes.empty()) {
        logger.info("Audit is disabled");
-
        return make_ready_future<>();
-    } else {
-        throw audit_exception(fmt::format("Bad configuration: invalid 'audit': {}", cfg.audit()));
    }
    category_set audited_categories = parse_audit_categories(cfg.audit_categories());
    std::map<sstring, std::set<sstring>> audited_tables = parse_audit_tables(cfg.audit_tables());
@@ -143,19 +181,20 @@ future<> audit::create_audit(const db::config& cfg, sharded<locator::shared_toke
                cfg.audit(), cfg.audit_categories(), cfg.audit_keyspaces(), cfg.audit_tables());

    return audit_instance().start(std::ref(stm),
-                                  std::move(storage_helper_name),
+                                  std::ref(qp),
+                                  std::ref(mm),
+                                  std::move(audit_modes),
                                  std::move(audited_keyspaces),
                                  std::move(audited_tables),
                                  std::move(audited_categories),
-                                  std::cref(cfg));
-}
-
-future<> audit::start_audit(const db::config& cfg, sharded<cql3::query_processor>& qp, sharded<service::migration_manager>& mm) {
-    if (!audit_instance().local_is_initialized()) {
-        return make_ready_future<>();
-    }
-    return audit_instance().invoke_on_all([&cfg, &qp, &mm] (audit& local_audit) {
-        return local_audit.start(cfg, qp.local(), mm.local());
+                                  std::cref(cfg))
+    .then([&cfg] {
+        if (!audit_instance().local_is_initialized()) {
+            return make_ready_future<>();
+        }
+        return audit_instance().invoke_on_all([&cfg] (audit& local_audit) {
+            return local_audit.start(cfg);
+        });
    });
 }

@@ -181,15 +220,7 @@ audit_info_ptr audit::create_no_audit_info() {
    return audit_info_ptr();
 }

-future<> audit::start(const db::config& cfg, cql3::query_processor& qp, service::migration_manager& mm) {
-    try {
-        _storage_helper_ptr = create_object<storage_helper>(_storage_helper_class_name, qp, mm);
-    } catch (no_such_class& e) {
-        logger.error("Can't create audit storage helper {}: not supported", _storage_helper_class_name);
-        throw;
-    } catch (...) {
-        throw;
-    }
+future<> audit::start(const db::config& cfg) {
    return _storage_helper_ptr->start(cfg);
 }

--- a/audit/audit.hh
+++ b/audit/audit.hh
@@ -102,7 +102,6 @@ class audit final : public seastar::async_sharded_service<audit> {
    std::map<sstring, std::set<sstring>> _audited_tables;
    category_set _audited_categories;

-    sstring _storage_helper_class_name;
    std::unique_ptr<storage_helper> _storage_helper_ptr;

    const db::config& _cfg;
@@ -125,18 +124,20 @@ public:
    static audit& local_audit_instance() {
        return audit_instance().local();
    }
-    static future<> create_audit(const db::config& cfg, sharded<locator::shared_token_metadata>& stm);
-    static future<> start_audit(const db::config& cfg, sharded<cql3::query_processor>& qp, sharded<service::migration_manager>& mm);
+    static future<> start_audit(const db::config& cfg, sharded<locator::shared_token_metadata>& stm, sharded<cql3::query_processor>& qp, sharded<service::migration_manager>& mm);
    static future<> stop_audit();
    static audit_info_ptr create_audit_info(statement_category cat, const sstring& keyspace, const sstring& table);
    static audit_info_ptr create_no_audit_info();
-    audit(locator::shared_token_metadata& stm, sstring&& storage_helper_name,
+    audit(locator::shared_token_metadata& stm,
+          cql3::query_processor& qp,
+          service::migration_manager& mm,
+          std::set<sstring>&& audit_modes,
          std::set<sstring>&& audited_keyspaces,
          std::map<sstring, std::set<sstring>>&& audited_tables,
          category_set&& audited_categories,
          const db::config& cfg);
    ~audit();
-    future<> start(const db::config& cfg, cql3::query_processor& qp, service::migration_manager& mm);
+    future<> start(const db::config& cfg);
    future<> stop();
    future<> shutdown();
    bool should_log(const audit_info* audit_info) const;
--- a/audit/audit_cf_storage_helper.cc
+++ b/audit/audit_cf_storage_helper.cc
@@ -11,7 +11,6 @@
 #include "cql3/query_processor.hh"
 #include "data_dictionary/keyspace_metadata.hh"
 #include "utils/UUID_gen.hh"
-#include "utils/class_registrator.hh"
 #include "cql3/query_options.hh"
 #include "cql3/statements/ks_prop_defs.hh"
 #include "service/migration_manager.hh"
@@ -198,7 +197,4 @@ cql3::query_options audit_cf_storage_helper::make_login_data(socket_address node
    return cql3::query_options(cql3::default_cql_config, db::consistency_level::ONE, std::nullopt, std::move(values), false, cql3::query_options::specific_options::DEFAULT);
 }

-using registry = class_registrator<storage_helper, audit_cf_storage_helper, cql3::query_processor&, service::migration_manager&>;
-static registry registrator1("audit_cf_storage_helper");
-
 }
--- a/audit/audit_composite_storage_helper.cc
+++ b/audit/audit_composite_storage_helper.cc
@@ -0,0 +1,68 @@
+/*
+ * Copyright (C) 2025 ScyllaDB
+ */
+
+/*
+ * SPDX-License-Identifier: LicenseRef-ScyllaDB-Source-Available-1.0
+ */
+
+#include <seastar/core/loop.hh>
+#include <seastar/core/future-util.hh>
+
+#include "audit/audit_composite_storage_helper.hh"
+
+#include "utils/class_registrator.hh"
+
+namespace audit {
+
+audit_composite_storage_helper::audit_composite_storage_helper(std::vector<std::unique_ptr<storage_helper>>&& storage_helpers)
+    : _storage_helpers(std::move(storage_helpers))
+{}
+
+future<> audit_composite_storage_helper::start(const db::config& cfg) {
+    auto res = seastar::parallel_for_each(
+        _storage_helpers,
+        [&cfg] (std::unique_ptr<storage_helper>& h) {
+            return h->start(cfg);
+        }
+    );
+    return res;
+}
+
+future<> audit_composite_storage_helper::stop() {
+    auto res = seastar::parallel_for_each(
+        _storage_helpers,
+        [] (std::unique_ptr<storage_helper>& h) {
+            return h->stop();
+        }
+    );
+    return res;
+}
+
+future<> audit_composite_storage_helper::write(const audit_info* audit_info,
+                                               socket_address node_ip,
+                                               socket_address client_ip,
+                                               db::consistency_level cl,
+                                               const sstring& username,
+                                               bool error) {
+    return seastar::parallel_for_each(
+        _storage_helpers,
+        [audit_info, node_ip, client_ip, cl, &username, error](std::unique_ptr<storage_helper>& h) {
+            return h->write(audit_info, node_ip, client_ip, cl, username, error);
+        }
+    );
+}
+
+future<> audit_composite_storage_helper::write_login(const sstring& username,
+                                                     socket_address node_ip,
+                                                     socket_address client_ip,
+                                                     bool error) {
+    return seastar::parallel_for_each(
+        _storage_helpers,
+        [&username, node_ip, client_ip, error](std::unique_ptr<storage_helper>& h) {
+            return h->write_login(username, node_ip, client_ip, error);
+        }
+    );
+}
+
+} // namespace audit
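
Review note: the new composite helper owns a list of storage helpers and forwards every call to each of them. Below is a minimal standalone sketch of that fan-out, assuming hypothetical `sink`, `recording_sink` and `composite_sink` types that are not part of ScyllaDB. It runs the backends sequentially, whereas the patch uses `seastar::parallel_for_each`.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for audit::storage_helper; not the ScyllaDB interface.
struct sink {
    virtual ~sink() = default;
    virtual void write(const std::string& entry) = 0;
};

// Hypothetical backend that appends every entry to a caller-owned log.
class recording_sink final : public sink {
    std::vector<std::string>& _log;
public:
    explicit recording_sink(std::vector<std::string>& log) : _log(log) {}
    void write(const std::string& entry) override { _log.push_back(entry); }
};

// Owns several sinks and forwards each call to all of them.
class composite_sink final : public sink {
    std::vector<std::unique_ptr<sink>> _sinks;
public:
    explicit composite_sink(std::vector<std::unique_ptr<sink>>&& sinks)
        : _sinks(std::move(sinks)) {
        for (const auto& s : _sinks) {
            assert(s != nullptr); // the sketch expects only valid backends
        }
    }

    void write(const std::string& entry) override {
        // Sequential in this sketch; the patch fans out concurrently.
        for (auto& s : _sinks) {
            s->write(entry);
        }
    }
};

// Usage example: one write reaches both backends.
inline void fan_out_once(std::vector<std::string>& a, std::vector<std::string>& b) {
    std::vector<std::unique_ptr<sink>> sinks;
    sinks.push_back(std::make_unique<recording_sink>(a));
    sinks.push_back(std::make_unique<recording_sink>(b));
    composite_sink all(std::move(sinks));
    all.write("LOGIN alice");
}
```

Because the composite implements the same interface as a single backend, the caller does not need to know how many audit modes are configured.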
--- a/audit/audit_composite_storage_helper.hh
+++ b/audit/audit_composite_storage_helper.hh
@@ -0,0 +1,37 @@
+/*
+ * Copyright (C) 2025 ScyllaDB
+ */
+
+/*
+ * SPDX-License-Identifier: LicenseRef-ScyllaDB-Source-Available-1.0
+ */
+#pragma once
+
+#include "audit/audit.hh"
+#include <seastar/core/future.hh>
+
+#include "storage_helper.hh"
+
+namespace audit {
+
+class audit_composite_storage_helper : public storage_helper {
+    std::vector<std::unique_ptr<storage_helper>> _storage_helpers;
+
+public:
+    explicit audit_composite_storage_helper(std::vector<std::unique_ptr<storage_helper>>&&);
+    virtual ~audit_composite_storage_helper() = default;
+    virtual future<> start(const db::config& cfg) override;
+    virtual future<> stop() override;
+    virtual future<> write(const audit_info* audit_info,
+                           socket_address node_ip,
+                           socket_address client_ip,
+                           db::consistency_level cl,
+                           const sstring& username,
+                           bool error) override;
+    virtual future<> write_login(const sstring& username,
+                                 socket_address node_ip,
+                                 socket_address client_ip,
+                                 bool error) override;
+};
+
+} // namespace audit
--- a/audit/audit_syslog_storage_helper.cc
+++ b/audit/audit_syslog_storage_helper.cc
@@ -21,7 +21,6 @@
 #include <fmt/chrono.h>

 #include "cql3/query_processor.hh"
-#include "utils/class_registrator.hh"

 namespace cql3 {

@@ -143,7 +142,4 @@ future<> audit_syslog_storage_helper::write_login(const sstring& username,
    co_await syslog_send_helper(msg.c_str());
 }

-using registry = class_registrator<storage_helper, audit_syslog_storage_helper, cql3::query_processor&, service::migration_manager&>;
-static registry registrator1("audit_syslog_storage_helper");
-
 }
--- a/auth/CMakeLists.txt
+++ b/auth/CMakeLists.txt
@@ -9,6 +9,7 @@ target_sources(scylla_auth
    allow_all_authorizer.cc
    authenticated_user.cc
    authenticator.cc
+    cache.cc
    certificate_authenticator.cc
    common.cc
    default_authorizer.cc
@@ -44,5 +45,8 @@ target_link_libraries(scylla_auth

 add_whole_archive(auth scylla_auth)

+if (Scylla_USE_PRECOMPILED_HEADER_USE)
+  target_precompile_headers(scylla_auth REUSE_FROM scylla-precompiled-header)
+endif()
 check_headers(check-headers scylla_auth
  GLOB_RECURSE ${CMAKE_CURRENT_SOURCE_DIR}/*.hh)
--- a/auth/allow_all_authenticator.cc
+++ b/auth/allow_all_authenticator.cc
@@ -9,7 +9,6 @@
 #include "auth/allow_all_authenticator.hh"

 #include "service/migration_manager.hh"
-#include "utils/alien_worker.hh"
 #include "utils/class_registrator.hh"

 namespace auth {
@@ -23,6 +22,6 @@ static const class_registrator<
        cql3::query_processor&,
        ::service::raft_group0_client&,
        ::service::migration_manager&,
-        utils::alien_worker&> registration("org.apache.cassandra.auth.AllowAllAuthenticator");
+        cache&> registration("org.apache.cassandra.auth.AllowAllAuthenticator");

 }
--- a/auth/allow_all_authenticator.hh
+++ b/auth/allow_all_authenticator.hh
@@ -12,8 +12,8 @@

 #include "auth/authenticated_user.hh"
 #include "auth/authenticator.hh"
+#include "auth/cache.hh"
 #include "auth/common.hh"
-#include "utils/alien_worker.hh"

 namespace cql3 {
 class query_processor;
@@ -29,7 +29,7 @@ extern const std::string_view allow_all_authenticator_name;

 class allow_all_authenticator final : public authenticator {
 public:
-    allow_all_authenticator(cql3::query_processor&, ::service::raft_group0_client&, ::service::migration_manager&, utils::alien_worker&) {
+    allow_all_authenticator(cql3::query_processor&, ::service::raft_group0_client&, ::service::migration_manager&, cache&) {
    }

    virtual future<> start() override {
--- a/auth/cache.cc
+++ b/auth/cache.cc
@@ -0,0 +1,188 @@
+/*
+ * Copyright (C) 2017-present ScyllaDB
+ */
+
+/*
+ * SPDX-License-Identifier: LicenseRef-ScyllaDB-Source-Available-1.0
+ */
+
+#include "auth/cache.hh"
+#include "auth/common.hh"
+#include "auth/roles-metadata.hh"
+#include "cql3/query_processor.hh"
+#include "cql3/untyped_result_set.hh"
+#include "db/consistency_level_type.hh"
+#include "db/system_keyspace.hh"
+#include "schema/schema.hh"
+#include <iterator>
+#include <seastar/core/abort_source.hh>
+#include <seastar/coroutine/maybe_yield.hh>
+#include <seastar/core/format.hh>
+
+namespace auth {
+
+logging::logger logger("auth-cache");
+
+cache::cache(cql3::query_processor& qp, abort_source& as) noexcept
+    : _current_version(0)
+    , _qp(qp)
+    , _loading_sem(1)
+    , _as(as) {
+}
+
+lw_shared_ptr<const cache::role_record> cache::get(const role_name_t& role) const noexcept {
+    auto it = _roles.find(role);
+    if (it == _roles.end()) {
+        return {};
+    }
+    return it->second;
+}
+
+future<lw_shared_ptr<cache::role_record>> cache::fetch_role(const role_name_t& role) const {
+    auto rec = make_lw_shared<role_record>();
+    rec->version = _current_version;
+
+    auto fetch = [this, &role](const sstring& q) {
+        return _qp.execute_internal(q, db::consistency_level::LOCAL_ONE,
+                internal_distributed_query_state(), {role},
+                cql3::query_processor::cache_internal::yes);
+    };
+    // roles
+    {
+        static const sstring q = format("SELECT * FROM {}.{} WHERE role = ?", db::system_keyspace::NAME, meta::roles_table::name);
+        auto rs = co_await fetch(q);
+        if (!rs->empty()) {
+            auto& r = rs->one();
+            rec->is_superuser = r.get_or<bool>("is_superuser", false);
+            rec->can_login = r.get_or<bool>("can_login", false);
+            rec->salted_hash = r.get_or<sstring>("salted_hash", "");
+            if (r.has("member_of")) {
+                auto mo = r.get_set<sstring>("member_of");
+                rec->member_of.insert(
+                        std::make_move_iterator(mo.begin()),
+                        std::make_move_iterator(mo.end()));
+            }
+        } else {
+            // role got deleted
+            co_return nullptr;
+        }
+    }
+    // members
+    {
+        static const sstring q = format("SELECT role, member FROM {}.{} WHERE role = ?", db::system_keyspace::NAME, ROLE_MEMBERS_CF);
+        auto rs = co_await fetch(q);
+        for (const auto& r : *rs) {
+            rec->members.insert(r.get_as<sstring>("member"));
+            co_await coroutine::maybe_yield();
+        }
+    }
+    // attributes
+    {
+        static const sstring q = format("SELECT role, name, value FROM {}.{} WHERE role = ?", db::system_keyspace::NAME, ROLE_ATTRIBUTES_CF);
+        auto rs = co_await fetch(q);
+        for (const auto& r : *rs) {
+            rec->attributes[r.get_as<sstring>("name")] =
+                    r.get_as<sstring>("value");
+            co_await coroutine::maybe_yield();
+        }
+    }
+    // permissions
+    {
+        static const sstring q = format("SELECT role, resource, permissions FROM {}.{} WHERE role = ?", db::system_keyspace::NAME, PERMISSIONS_CF);
+        auto rs = co_await fetch(q);
+        for (const auto& r : *rs) {
+            auto resource = r.get_as<sstring>("resource");
+            auto perms_strings = r.get_set<sstring>("permissions");
+            std::unordered_set<sstring> perms_set(perms_strings.begin(), perms_strings.end());
+            auto pset = permissions::from_strings(perms_set);
+            rec->permissions[std::move(resource)] = std::move(pset);
+            co_await coroutine::maybe_yield();
+        }
+    }
+    co_return rec;
+}
+
+future<> cache::prune_all() noexcept {
+    for (auto it = _roles.begin(); it != _roles.end(); ) {
+        if (it->second->version != _current_version) {
+            _roles.erase(it++);
+            co_await coroutine::maybe_yield();
+        } else {
+            ++it;
+        }
+    }
+    co_return;
+}
+
+future<> cache::load_all() {
+    if (legacy_mode(_qp)) {
+        co_return;
+    }
+    SCYLLA_ASSERT(this_shard_id() == 0);
+    auto units = co_await get_units(_loading_sem, 1, _as);
+
+    ++_current_version;
+
+    logger.info("Loading all roles");
+    const uint32_t page_size = 128;
+    auto loader = [this](const cql3::untyped_result_set::row& r) -> future<stop_iteration> {
+        const auto name = r.get_as<sstring>("role");
+        auto role = co_await fetch_role(name);
+        if (role) {
+            _roles[name] = role;
+        }
+        co_return stop_iteration::no;
+    };
+    co_await _qp.query_internal(format("SELECT * FROM {}.{}",
+            db::system_keyspace::NAME, meta::roles_table::name),
+            db::consistency_level::LOCAL_ONE, {}, page_size, loader);
+
+    co_await prune_all();
+    for (const auto& [name, role] : _roles) {
+        co_await distribute_role(name, role);
+    }
+    co_await container().invoke_on_others([this](cache& c) -> future<> {
+        c._current_version = _current_version;
+        co_await c.prune_all();
+    });
+}
+
+future<> cache::load_roles(std::unordered_set<role_name_t> roles) {
+    if (legacy_mode(_qp)) {
+        co_return;
+    }
+    SCYLLA_ASSERT(this_shard_id() == 0);
+    auto units = co_await get_units(_loading_sem, 1, _as);
+
+    for (const auto& name : roles) {
+        logger.info("Loading role {}", name);
+        auto role = co_await fetch_role(name);
+        if (role) {
+            _roles[name] = role;
+        } else {
+            _roles.erase(name);
+        }
+        co_await distribute_role(name, role);
+    }
+}
+
+future<> cache::distribute_role(const role_name_t& name, lw_shared_ptr<role_record> role) {
+    auto role_ptr = role.get();
+    co_await container().invoke_on_others([&name, role_ptr](cache& c) {
+        if (!role_ptr) {
+            c._roles.erase(name);
+            return;
+        }
+        auto role_copy = make_lw_shared<role_record>(*role_ptr);
+        c._roles[name] = std::move(role_copy);
+    });
+}
+
+bool cache::includes_table(const table_id& id) noexcept {
+    return id == db::system_keyspace::roles()->id()
+            || id == db::system_keyspace::role_members()->id()
+            || id == db::system_keyspace::role_attributes()->id()
+            || id == db::system_keyspace::role_permissions()->id();
+}
+
+} // namespace auth
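
Review note: `load_all()` bumps `_current_version`, stamps every freshly fetched role with it, and then `prune_all()` drops records that still carry an older version. The following single-threaded sketch shows that reload-then-prune idea in isolation. `versioned_cache` and its members are hypothetical names, it uses `std::unordered_map` and an `unsigned` counter instead of the patch's `absl::flat_hash_map` and `char` tag, and it leaves out sharding, the semaphore and yielding.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Hypothetical, single-threaded stand-in for auth::cache.
class versioned_cache {
public:
    struct record {
        std::string salted_hash;
        unsigned version = 0; // which reload last saw this role
    };

    // Replaces the cache content with `fresh` (role name, hash) pairs.
    // Entries stay readable during the reload and are pruned only afterwards.
    void reload(const std::vector<std::pair<std::string, std::string>>& fresh) {
        ++_current_version;
        assert(_current_version != 0); // the sketch does not handle wrap-around
        for (const auto& [name, hash] : fresh) {
            _roles[name] = record{hash, _current_version};
        }
        prune();
    }

    // Returns the cached record, or nullptr when the role is unknown.
    const record* get(const std::string& name) const {
        auto it = _roles.find(name);
        return it == _roles.end() ? nullptr : &it->second;
    }

    std::size_t size() const { return _roles.size(); }

private:
    // Drops every record that the latest reload did not touch.
    void prune() {
        for (auto it = _roles.begin(); it != _roles.end();) {
            if (it->second.version != _current_version) {
                it = _roles.erase(it);
            } else {
                ++it;
            }
        }
    }

    std::unordered_map<std::string, record> _roles;
    unsigned _current_version = 0;
};
```

Stamping and pruning, rather than clearing the map first, means a lookup that happens in the middle of a reload still finds the previous record for a role that continues to exist.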
--- a/auth/cache.hh
+++ b/auth/cache.hh
@@ -0,0 +1,65 @@
+/*
+ * Copyright (C) 2025-present ScyllaDB
+ */
+
+/*
+ * SPDX-License-Identifier: LicenseRef-ScyllaDB-Source-Available-1.0
+ */
+
+#pragma once
+
+#include <seastar/core/abort_source.hh>
+#include <unordered_set>
+#include <unordered_map>
+
+#include <seastar/core/sstring.hh>
+#include <seastar/core/future.hh>
+#include <seastar/core/sharded.hh>
+#include <seastar/core/shared_ptr.hh>
+#include <seastar/core/semaphore.hh>
+
+#include <absl/container/flat_hash_map.h>
+
+#include "auth/permission.hh"
+#include "auth/common.hh"
+
+namespace cql3 { class query_processor; }
+
+namespace auth {
+
+class cache : public peering_sharded_service<cache> {
+public:
+    using role_name_t = sstring;
+    using version_tag_t = char;
+
+    struct role_record {
+        bool can_login = false;
+        bool is_superuser = false;
+        std::unordered_set<role_name_t> member_of;
+        std::unordered_set<role_name_t> members;
+        sstring salted_hash;
+        std::unordered_map<sstring, sstring> attributes;
+        std::unordered_map<sstring, permission_set> permissions;
+        version_tag_t version; // used for seamless cache reloads
+    };
+
+    explicit cache(cql3::query_processor& qp, abort_source& as) noexcept;
+    lw_shared_ptr<const role_record> get(const role_name_t& role) const noexcept;
+    future<> load_all();
+    future<> load_roles(std::unordered_set<role_name_t> roles);
+    static bool includes_table(const table_id&) noexcept;
+
+private:
+    using roles_map = absl::flat_hash_map<role_name_t, lw_shared_ptr<role_record>>;
+    roles_map _roles;
+    version_tag_t _current_version;
+    cql3::query_processor& _qp;
+    semaphore _loading_sem;
+    abort_source& _as;
+
+    future<lw_shared_ptr<role_record>> fetch_role(const role_name_t& role) const;
+    future<> prune_all() noexcept;
+    future<> distribute_role(const role_name_t& name, const lw_shared_ptr<role_record> role);
+};
+
+} // namespace auth
--- a/auth/certificate_authenticator.cc
+++ b/auth/certificate_authenticator.cc
@@ -8,6 +8,7 @@
 */

 #include "auth/certificate_authenticator.hh"
+#include "auth/cache.hh"

 #include <boost/regex.hpp>
 #include <fmt/ranges.h>
@@ -34,13 +35,13 @@ static const class_registrator<auth::authenticator
    , cql3::query_processor&
    , ::service::raft_group0_client&
    , ::service::migration_manager&
-    , utils::alien_worker&> cert_auth_reg(CERT_AUTH_NAME);
+    , auth::cache&> cert_auth_reg(CERT_AUTH_NAME);

 enum class auth::certificate_authenticator::query_source {
    subject, altname
 };

-auth::certificate_authenticator::certificate_authenticator(cql3::query_processor& qp, ::service::raft_group0_client&, ::service::migration_manager&, utils::alien_worker&)
+auth::certificate_authenticator::certificate_authenticator(cql3::query_processor& qp, ::service::raft_group0_client&, ::service::migration_manager&, auth::cache&)
    : _queries([&] {
        auto& conf = qp.db().get_config();
        auto queries = conf.auth_certificate_role_queries();
@@ -75,9 +76,9 @@ auth::certificate_authenticator::certificate_authenticator(cql3::query_processor
                        throw std::invalid_argument(fmt::format("Invalid source: {}", map.at(cfg_source_attr)));
                    }
                    continue;
-                } catch (std::out_of_range&) {
+                } catch (const std::out_of_range&) {
                    // just fallthrough
-                } catch (boost::regex_error&) {
+                } catch (const boost::regex_error&) {
                    std::throw_with_nested(std::invalid_argument(fmt::format("Invalid query expression: {}", map.at(cfg_query_attr))));
                }
            }
--- a/auth/certificate_authenticator.hh
+++ b/auth/certificate_authenticator.hh
@@ -10,7 +10,6 @@
 #pragma once

 #include "auth/authenticator.hh"
-#include "utils/alien_worker.hh"
 #include <boost/regex_fwd.hpp>  // IWYU pragma: keep

 namespace cql3 {
@@ -26,13 +25,15 @@ class raft_group0_client;

 namespace auth {

+class cache;
+
 extern const std::string_view certificate_authenticator_name;

 class certificate_authenticator : public authenticator {
    enum class query_source;
    std::vector<std::pair<query_source, boost::regex>> _queries;
 public:
-    certificate_authenticator(cql3::query_processor&, ::service::raft_group0_client&, ::service::migration_manager&, utils::alien_worker&);
+    certificate_authenticator(cql3::query_processor&, ::service::raft_group0_client&, ::service::migration_manager&, cache&);
    ~certificate_authenticator();

    future<> start() override;
--- a/auth/common.cc
+++ b/auth/common.cc
@@ -94,7 +94,7 @@ static future<> create_legacy_metadata_table_if_missing_impl(
        try {
            co_return co_await mm.announce(co_await ::service::prepare_new_column_family_announcement(qp.proxy(), table, ts),
                    std::move(group0_guard), format("auth: create {} metadata table", table->cf_name()));
-        } catch (exceptions::already_exists_exception&) {}
+        } catch (const exceptions::already_exists_exception&) {}
    }
 }

--- a/auth/common.hh
+++ b/auth/common.hh
@@ -48,6 +48,10 @@ extern constinit const std::string_view AUTH_PACKAGE_NAME;

 } // namespace meta

+constexpr std::string_view PERMISSIONS_CF = "role_permissions";
+constexpr std::string_view ROLE_MEMBERS_CF = "role_members";
+constexpr std::string_view ROLE_ATTRIBUTES_CF = "role_attributes";
+
 // This is a helper to check whether auth-v2 is on.
 bool legacy_mode(cql3::query_processor& qp);

--- a/auth/default_authorizer.cc
+++ b/auth/default_authorizer.cc
@@ -37,7 +37,6 @@ std::string_view default_authorizer::qualified_java_name() const {
 static constexpr std::string_view ROLE_NAME = "role";
 static constexpr std::string_view RESOURCE_NAME = "resource";
 static constexpr std::string_view PERMISSIONS_NAME = "permissions";
-static constexpr std::string_view PERMISSIONS_CF = "role_permissions";

 static logging::logger alogger("default_authorizer");

@@ -257,7 +256,7 @@ future<> default_authorizer::revoke_all(std::string_view role_name, ::service::g
        } else {
            co_await collect_mutations(_qp, mc, query, {sstring(role_name)});
        }
-    } catch (exceptions::request_execution_exception& e) {
+    } catch (const exceptions::request_execution_exception& e) {
        alogger.warn("CassandraAuthorizer failed to revoke all permissions of {}: {}", role_name, e);
    }
 }
@@ -294,13 +293,13 @@ future<> default_authorizer::revoke_all_legacy(const resource& resource) {
                                [resource](auto ep) {
                    try {
                        std::rethrow_exception(ep);
-                    } catch (exceptions::request_execution_exception& e) {
+                    } catch (const exceptions::request_execution_exception& e) {
                        alogger.warn("CassandraAuthorizer failed to revoke all permissions on {}: {}", resource, e);
                    }

                });
            });
-        } catch (exceptions::request_execution_exception& e) {
+        } catch (const exceptions::request_execution_exception& e) {
            alogger.warn("CassandraAuthorizer failed to revoke all permissions on {}: {}", resource, e);
            return make_ready_future();
        }
--- a/auth/ldap_role_manager.cc
+++ b/auth/ldap_role_manager.cc
@@ -83,17 +83,18 @@ static const class_registrator<
    ldap_role_manager,
    cql3::query_processor&,
    ::service::raft_group0_client&,
-    ::service::migration_manager&> registration(ldap_role_manager_full_name);
+    ::service::migration_manager&,
+    cache&> registration(ldap_role_manager_full_name);

 ldap_role_manager::ldap_role_manager(
        std::string_view query_template, std::string_view target_attr, std::string_view bind_name, std::string_view bind_password,
-        cql3::query_processor& qp, ::service::raft_group0_client& rg0c, ::service::migration_manager& mm)
-        : _std_mgr(qp, rg0c, mm), _group0_client(rg0c), _query_template(query_template), _target_attr(target_attr), _bind_name(bind_name)
+        cql3::query_processor& qp, ::service::raft_group0_client& rg0c, ::service::migration_manager& mm, cache& cache)
+        : _std_mgr(qp, rg0c, mm, cache), _group0_client(rg0c), _query_template(query_template), _target_attr(target_attr), _bind_name(bind_name)
        , _bind_password(bind_password)
        , _connection_factory(bind(std::mem_fn(&ldap_role_manager::reconnect), std::ref(*this))) {
 }

-ldap_role_manager::ldap_role_manager(cql3::query_processor& qp, ::service::raft_group0_client& rg0c, ::service::migration_manager& mm)
+ldap_role_manager::ldap_role_manager(cql3::query_processor& qp, ::service::raft_group0_client& rg0c, ::service::migration_manager& mm, cache& cache)
    : ldap_role_manager(
            qp.db().get_config().ldap_url_template(),
            qp.db().get_config().ldap_attr_role(),
@@ -101,7 +102,8 @@ ldap_role_manager::ldap_role_manager(cql3::query_processor& qp, ::service::raft_
            qp.db().get_config().ldap_bind_passwd(),
            qp,
            rg0c,
-            mm) {
+            mm,
+            cache) {
 }

 std::string_view ldap_role_manager::qualified_java_name() const noexcept {
--- a/auth/ldap_role_manager.hh
+++ b/auth/ldap_role_manager.hh
@@ -14,6 +14,7 @@

 #include "ent/ldap/ldap_connection.hh"
 #include "standard_role_manager.hh"
+#include "auth/cache.hh"

 namespace auth {

@@ -43,12 +44,13 @@ class ldap_role_manager : public role_manager {
            std::string_view bind_password, ///< LDAP bind credentials.
            cql3::query_processor& qp, ///< Passed to standard_role_manager.
            ::service::raft_group0_client& rg0c, ///< Passed to standard_role_manager.
-            ::service::migration_manager& mm ///< Passed to standard_role_manager.
+            ::service::migration_manager& mm, ///< Passed to standard_role_manager.
+            cache& cache ///< Passed to standard_role_manager.
    );

    /// Retrieves LDAP configuration entries from qp and invokes the other constructor.  Required by
    /// class_registrator<role_manager>.
-    ldap_role_manager(cql3::query_processor& qp, ::service::raft_group0_client& rg0c, ::service::migration_manager& mm);
+    ldap_role_manager(cql3::query_processor& qp, ::service::raft_group0_client& rg0c, ::service::migration_manager& mm, cache& cache);

    /// Thrown when query-template parsing fails.
    struct url_error : public std::runtime_error {
--- a/auth/maintenance_socket_role_manager.cc
+++ b/auth/maintenance_socket_role_manager.cc
@@ -11,6 +11,7 @@
 #include <seastar/core/future.hh>
 #include <stdexcept>
 #include <string_view>
+#include "auth/cache.hh"
 #include "cql3/description.hh"
 #include "utils/class_registrator.hh"

@@ -23,7 +24,8 @@ static const class_registrator<
        maintenance_socket_role_manager,
        cql3::query_processor&,
        ::service::raft_group0_client&,
-        ::service::migration_manager&> registration(sstring{maintenance_socket_role_manager_name});
+        ::service::migration_manager&,
+        cache&> registration(sstring{maintenance_socket_role_manager_name});


 std::string_view maintenance_socket_role_manager::qualified_java_name() const noexcept {
--- a/auth/maintenance_socket_role_manager.hh
+++ b/auth/maintenance_socket_role_manager.hh
@@ -8,6 +8,7 @@

 #pragma once

+#include "auth/cache.hh"
 #include "auth/resource.hh"
 #include "auth/role_manager.hh"
 #include <seastar/core/future.hh>
@@ -29,7 +30,7 @@ extern const std::string_view maintenance_socket_role_manager_name;
 // system_auth keyspace, which may be not yet created when the maintenance socket starts listening.
 class maintenance_socket_role_manager final : public role_manager {
 public:
-    maintenance_socket_role_manager(cql3::query_processor&, ::service::raft_group0_client&, ::service::migration_manager&) {}
+    maintenance_socket_role_manager(cql3::query_processor&, ::service::raft_group0_client&, ::service::migration_manager&, cache&) {}

    virtual std::string_view qualified_java_name() const noexcept override;

--- a/auth/password_authenticator.cc
+++ b/auth/password_authenticator.cc
@@ -49,7 +49,7 @@ static const class_registrator<
        cql3::query_processor&,
        ::service::raft_group0_client&,
        ::service::migration_manager&,
-        utils::alien_worker&> password_auth_reg("org.apache.cassandra.auth.PasswordAuthenticator");
+        cache&> password_auth_reg("org.apache.cassandra.auth.PasswordAuthenticator");

 static thread_local auto rng_for_salt = std::default_random_engine(std::random_device{}());

@@ -63,13 +63,13 @@ std::string password_authenticator::default_superuser(const db::config& cfg) {
 password_authenticator::~password_authenticator() {
 }

-password_authenticator::password_authenticator(cql3::query_processor& qp, ::service::raft_group0_client& g0, ::service::migration_manager& mm, utils::alien_worker& hashing_worker)
+password_authenticator::password_authenticator(cql3::query_processor& qp, ::service::raft_group0_client& g0, ::service::migration_manager& mm, cache& cache)
    : _qp(qp)
    , _group0_client(g0)
    , _migration_manager(mm)
+    , _cache(cache)
    , _stopped(make_ready_future<>()) 
    , _superuser(default_superuser(qp.db().get_config()))
-    , _hashing_worker(hashing_worker)
 {}

 static bool has_salted_hash(const cql3::untyped_result_set_row& row) {
@@ -315,24 +315,31 @@ future<authenticated_user> password_authenticator::authenticate(
    const sstring password = credentials.at(PASSWORD_KEY);

    try {
-        const std::optional<sstring> salted_hash = co_await get_password_hash(username);
-        if (!salted_hash) {
-            throw exceptions::authentication_exception("Username and/or password are incorrect");
+        std::optional<sstring> salted_hash;
+        if (legacy_mode(_qp)) {
+            salted_hash = co_await get_password_hash(username);
+            if (!salted_hash) {
+                throw exceptions::authentication_exception("Username and/or password are incorrect");
+            }
+        } else {
+            auto role = _cache.get(username);
+            if (!role || role->salted_hash.empty()) {
+                throw exceptions::authentication_exception("Username and/or password are incorrect");
+            }
+            salted_hash = role->salted_hash;
        }
-        const bool password_match = co_await _hashing_worker.submit<bool>([password = std::move(password), salted_hash = std::move(salted_hash)]{
-            return passwords::check(password, *salted_hash);
-        });
+        const bool password_match = co_await passwords::check(password, *salted_hash);
        if (!password_match) {
            throw exceptions::authentication_exception("Username and/or password are incorrect");
        }
        co_return username;
-    } catch (std::system_error &) {
+    } catch (const std::system_error &) {
        std::throw_with_nested(exceptions::authentication_exception("Could not verify password"));
-    } catch (exceptions::request_execution_exception& e) {
+    } catch (const exceptions::request_execution_exception& e) {
        std::throw_with_nested(exceptions::authentication_exception(e.what()));
-    } catch (exceptions::authentication_exception& e) {
+    } catch (const exceptions::authentication_exception& e) {
        std::throw_with_nested(e);
-    } catch (exceptions::unavailable_exception& e) {
+    } catch (const exceptions::unavailable_exception& e) {
        std::throw_with_nested(exceptions::authentication_exception(e.get_message()));
    } catch (...) {
        std::throw_with_nested(exceptions::authentication_exception("authentication failed"));
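
Review note: in the non-legacy branch above, an unknown role and a role with an empty `salted_hash` are rejected with the same message. Here is a small standalone sketch of that lookup, assuming a plain `std::unordered_map` in place of `auth::cache` and `std::runtime_error` in place of `exceptions::authentication_exception`. The names `role_cache`, `hash_for_login` and `login_rejected` are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <unordered_map>

// Hypothetical stand-ins for auth::cache and its role_record.
struct role_record {
    std::string salted_hash;
};
using role_cache = std::unordered_map<std::string, role_record>;

// Returns the stored hash, or throws one generic error both for an unknown
// role and for a role that has no password hash.
inline std::string hash_for_login(const role_cache& cache, const std::string& username) {
    auto it = cache.find(username);
    if (it == cache.end() || it->second.salted_hash.empty()) {
        throw std::runtime_error("Username and/or password are incorrect");
    }
    return it->second.salted_hash;
}

// True when the lookup is rejected.
inline bool login_rejected(const role_cache& cache, const std::string& username) {
    try {
        hash_for_login(cache, username);
        return false;
    } catch (const std::runtime_error&) {
        return true;
    }
}

// Usage example.
inline void lookup_example() {
    role_cache cache;
    cache["alice"] = role_record{"$6$salt$hash"};
    cache["svc"] = role_record{""};
    assert(!login_rejected(cache, "alice"));
    assert(login_rejected(cache, "svc"));     // role exists, no password
    assert(login_rejected(cache, "mallory")); // role does not exist
}
```

Using one message for both cases keeps the error from revealing whether a role name exists.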
--- a/auth/password_authenticator.hh
+++ b/auth/password_authenticator.hh
@@ -16,8 +16,8 @@
 #include "db/consistency_level_type.hh"
 #include "auth/authenticator.hh"
 #include "auth/passwords.hh"
+#include "auth/cache.hh"
 #include "service/raft/raft_group0_client.hh"
-#include "utils/alien_worker.hh"

 namespace db {
    class config;
@@ -41,19 +41,19 @@ class password_authenticator : public authenticator {
    cql3::query_processor& _qp;
    ::service::raft_group0_client& _group0_client;
    ::service::migration_manager& _migration_manager;
+    cache& _cache;
    future<> _stopped;
    abort_source _as;
    std::string _superuser; // default superuser name from the config (may or may not be present in roles table)
    shared_promise<> _superuser_created_promise;
    // We used to also support bcrypt, SHA-256, and MD5 (ref. scylladb#24524).
    constexpr static auth::passwords::scheme _scheme = passwords::scheme::sha_512;
-    utils::alien_worker& _hashing_worker;

 public:
    static db::consistency_level consistency_for_user(std::string_view role_name);
    static std::string default_superuser(const db::config&);

-    password_authenticator(cql3::query_processor&, ::service::raft_group0_client&, ::service::migration_manager&, utils::alien_worker&);
+    password_authenticator(cql3::query_processor&, ::service::raft_group0_client&, ::service::migration_manager&, cache&);

    ~password_authenticator();

--- a/auth/passwords.cc
+++ b/auth/passwords.cc
@@ -7,6 +7,8 @@
 */

 #include "auth/passwords.hh"
+#include "utils/crypt_sha512.hh"
+#include <seastar/core/coroutine.hh>

 #include <cerrno>

@@ -21,27 +23,48 @@ static thread_local crypt_data tlcrypt = {};

 namespace detail {

+void verify_hashing_output(const char * res) {
+    if (!res || (res[0] == '*')) {
+        throw std::system_error(errno, std::system_category());
+    }
+}
+
 void verify_scheme(scheme scheme) {
    const sstring random_part_of_salt = "aaaabbbbccccdddd";

    const sstring salt = sstring(prefix_for_scheme(scheme)) + random_part_of_salt;
    const char* e = crypt_r("fisk", salt.c_str(), &tlcrypt);
-
-    if (e && (e[0] != '*')) {
-        return;
+    try {
+        verify_hashing_output(e);
+    } catch (const std::system_error& ex) {
+        throw no_supported_schemes();
    }
-
-    throw no_supported_schemes();
 }

 sstring hash_with_salt(const sstring& pass, const sstring& salt) {
    auto res = crypt_r(pass.c_str(), salt.c_str(), &tlcrypt);
-    if (!res || (res[0] == '*')) {
-        throw std::system_error(errno, std::system_category());
-    }
+    verify_hashing_output(res);
    return res;
 }

+seastar::future<sstring> hash_with_salt_async(const sstring& pass, const sstring& salt) {
+    sstring res;
+    // Only SHA-512 hashes for passphrases shorter than 256 bytes can be computed using
+    // the __crypt_sha512 method. For other computations, we fall back to the
+    // crypt_r implementation from `<crypt.h>`, which can stall.
+    if (salt.starts_with(prefix_for_scheme(scheme::sha_512)) && pass.size() <= 255) {
+        char buf[128];
+        const char * output_ptr = co_await __crypt_sha512(pass.c_str(), salt.c_str(), buf);
+        verify_hashing_output(output_ptr);
+        res = output_ptr;
+    } else {
+        const char * output_ptr = crypt_r(pass.c_str(), salt.c_str(), &tlcrypt);
+        verify_hashing_output(output_ptr);
+        res = output_ptr;
+    }
+    co_return res;
+}
+
 std::string_view prefix_for_scheme(scheme c) noexcept {
    switch (c) {
    case scheme::bcrypt_y: return "$2y$";
@@ -58,8 +81,9 @@ no_supported_schemes::no_supported_schemes()
        : std::runtime_error("No allowed hashing schemes are supported on this system") {
 }

-bool check(const sstring& pass, const sstring& salted_hash) {
-    return detail::hash_with_salt(pass, salted_hash) == salted_hash;
+seastar::future<bool> check(const sstring& pass, const sstring& salted_hash) {
+    const auto pwd_hash = co_await detail::hash_with_salt_async(pass, salted_hash);
+    co_return pwd_hash == salted_hash;
 }

 } // namespace auth::passwords
--- a/auth/passwords.hh
+++ b/auth/passwords.hh
@@ -11,6 +11,7 @@
 #include <random>
 #include <stdexcept>

+#include <seastar/core/future.hh>
 #include <seastar/core/sstring.hh>

 #include "seastarx.hh"
@@ -75,11 +76,23 @@ sstring generate_salt(RandomNumberEngine& g, scheme scheme) {

 ///
 /// Hash a password combined with an implementation-specific salt string.
+/// Deprecated in favor of `hash_with_salt_async`. This function is still used
+/// when generating password hashes for storage to ensure that
+/// `hash_with_salt` and `hash_with_salt_async` produce identical results,
+/// preserving backward compatibility.
 ///
 /// \throws \ref std::system_error when an unexpected implementation-specific error occurs.
 ///
 sstring hash_with_salt(const sstring& pass, const sstring& salt);

+///
+/// Async version of `hash_with_salt` that returns a future.
+/// If possible, hashing uses `coroutine::maybe_yield` to prevent reactor stalls.
+///
+/// \throws \ref std::system_error when an unexpected implementation-specific error occurs.
+///
+seastar::future<sstring> hash_with_salt_async(const sstring& pass, const sstring& salt);
+
 } // namespace detail

 ///
@@ -107,6 +120,6 @@ sstring hash(const sstring& pass, RandomNumberEngine& g, scheme scheme) {
 ///
 /// \throws \ref std::system_error when an unexpected implementation-specific error occurs.
 ///
-bool check(const sstring& pass, const sstring& salted_hash);
+seastar::future<bool> check(const sstring& pass, const sstring& salted_hash);

 } // namespace auth::passwords
--- a/auth/saslauthd_authenticator.cc
+++ b/auth/saslauthd_authenticator.cc
@@ -35,9 +35,9 @@ static const class_registrator<
        cql3::query_processor&,
        ::service::raft_group0_client&,
        ::service::migration_manager&,
-        utils::alien_worker&> saslauthd_auth_reg("com.scylladb.auth.SaslauthdAuthenticator");
+        cache&> saslauthd_auth_reg("com.scylladb.auth.SaslauthdAuthenticator");

-saslauthd_authenticator::saslauthd_authenticator(cql3::query_processor& qp, ::service::raft_group0_client&, ::service::migration_manager&, utils::alien_worker&)
+saslauthd_authenticator::saslauthd_authenticator(cql3::query_processor& qp, ::service::raft_group0_client&, ::service::migration_manager&, cache&)
    : _socket_path(qp.db().get_config().saslauthd_socket_path())
 {}

--- a/auth/saslauthd_authenticator.hh
+++ b/auth/saslauthd_authenticator.hh
@@ -11,7 +11,7 @@
 #pragma once

 #include "auth/authenticator.hh"
-#include "utils/alien_worker.hh"
+#include "auth/cache.hh"

 namespace cql3 {
 class query_processor;
@@ -29,7 +29,7 @@ namespace auth {
 class saslauthd_authenticator : public authenticator {
    sstring _socket_path; ///< Path to the domain socket on which saslauthd is listening.
 public:
-    saslauthd_authenticator(cql3::query_processor&, ::service::raft_group0_client&, ::service::migration_manager&, utils::alien_worker&);
+    saslauthd_authenticator(cql3::query_processor&, ::service::raft_group0_client&, ::service::migration_manager&, cache&);

    future<> start() override;

--- a/auth/service.cc
+++ b/auth/service.cc
@@ -17,6 +17,7 @@
 #include <chrono>

 #include <seastar/core/future-util.hh>
+#include <seastar/core/shard_id.hh>
 #include <seastar/core/sharded.hh>
 #include <seastar/core/shared_ptr.hh>

@@ -157,6 +158,7 @@ static future<> validate_role_exists(const service& ser, std::string_view role_n

 service::service(
        utils::loading_cache_config c,
+        cache& cache,
        cql3::query_processor& qp,
        ::service::raft_group0_client& g0,
        ::service::migration_notifier& mn,
@@ -166,6 +168,7 @@ service::service(
        maintenance_socket_enabled used_by_maintenance_socket)
            : _loading_cache_config(std::move(c))
            , _permissions_cache(nullptr)
+            , _cache(cache)
            , _qp(qp)
            , _group0_client(g0)
            , _mnotifier(mn)
@@ -188,15 +191,16 @@ service::service(
        ::service::migration_manager& mm,
        const service_config& sc,
        maintenance_socket_enabled used_by_maintenance_socket,
-        utils::alien_worker& hashing_worker)
+        cache& cache)
            : service(
                      std::move(c),
+                      cache,
                      qp,
                      g0,
                      mn,
                      create_object<authorizer>(sc.authorizer_java_name, qp, g0, mm),
-                      create_object<authenticator>(sc.authenticator_java_name, qp, g0, mm, hashing_worker),
-                      create_object<role_manager>(sc.role_manager_java_name, qp, g0, mm),
+                      create_object<authenticator>(sc.authenticator_java_name, qp, g0, mm, cache),
+                      create_object<role_manager>(sc.role_manager_java_name, qp, g0, mm, cache),
                      used_by_maintenance_socket) {
 }

@@ -221,7 +225,7 @@ future<> service::create_legacy_keyspace_if_missing(::service::migration_manager
            try {
                co_return co_await mm.announce(::service::prepare_new_keyspace_announcement(db.real_database(), ksm, ts),
                        std::move(group0_guard), seastar::format("auth_service: create {} keyspace", meta::legacy::AUTH_KS));
-            } catch (::service::group0_concurrent_modification&) {
+            } catch (const ::service::group0_concurrent_modification&) {
                log.info("Concurrent operation is detected while creating {} keyspace, retrying.", meta::legacy::AUTH_KS);
            }
        }
@@ -232,6 +236,9 @@ future<> service::start(::service::migration_manager& mm, db::system_keyspace& s
    auto auth_version = co_await sys_ks.get_auth_version();
    // version is set in query processor to be easily available in various places we call auth::legacy_mode check.
    _qp.auth_version = auth_version;
+    if (this_shard_id() == 0) {
+        co_await _cache.load_all();
+    }
    if (!_used_by_maintenance_socket) {
        // this legacy keyspace is only used by cqlsh
        // it's needed when executing `list roles` or `list users`
--- a/auth/service.hh
+++ b/auth/service.hh
@@ -21,12 +21,12 @@
 #include "auth/authorizer.hh"
 #include "auth/permission.hh"
 #include "auth/permissions_cache.hh"
+#include "auth/cache.hh"
 #include "auth/role_manager.hh"
 #include "auth/common.hh"
 #include "cql3/description.hh"
 #include "seastarx.hh"
 #include "service/raft/raft_group0_client.hh"
-#include "utils/alien_worker.hh"
 #include "utils/observable.hh"
 #include "utils/serialized_action.hh"
 #include "service/maintenance_mode.hh"
@@ -77,6 +77,7 @@ public:
 class service final : public seastar::peering_sharded_service<service> {
    utils::loading_cache_config _loading_cache_config;
    std::unique_ptr<permissions_cache> _permissions_cache;
+    cache& _cache;

    cql3::query_processor& _qp;

@@ -107,6 +108,7 @@ class service final : public seastar::peering_sharded_service<service> {
 public:
    service(
            utils::loading_cache_config,
+            cache& cache,
            cql3::query_processor&,
            ::service::raft_group0_client&,
            ::service::migration_notifier&,
@@ -128,7 +130,7 @@ public:
            ::service::migration_manager&,
            const service_config&,
            maintenance_socket_enabled,
-            utils::alien_worker&);
+            cache&);

    future<> start(::service::migration_manager&, db::system_keyspace&);

--- a/auth/standard_role_manager.cc
+++ b/auth/standard_role_manager.cc
@@ -41,21 +41,6 @@

 namespace auth {

-namespace meta {
-
-namespace role_members_table {
-
-constexpr std::string_view name{"role_members" , 12};
-
-}
-
-namespace role_attributes_table {
-
-constexpr std::string_view name{"role_attributes", 15};
-
-}
-
-}

 static logging::logger log("standard_role_manager");

@@ -64,7 +49,8 @@ static const class_registrator<
        standard_role_manager,
        cql3::query_processor&,
        ::service::raft_group0_client&,
-        ::service::migration_manager&> registration("org.apache.cassandra.auth.CassandraRoleManager");
+        ::service::migration_manager&,
+        cache&> registration("org.apache.cassandra.auth.CassandraRoleManager");

 struct record final {
    sstring name;
@@ -121,10 +107,11 @@ static bool has_can_login(const cql3::untyped_result_set_row& row) {
    return row.has("can_login") && !(boolean_type->deserialize(row.get_blob_unfragmented("can_login")).is_null());
 }

-standard_role_manager::standard_role_manager(cql3::query_processor& qp, ::service::raft_group0_client& g0, ::service::migration_manager& mm)
+standard_role_manager::standard_role_manager(cql3::query_processor& qp, ::service::raft_group0_client& g0, ::service::migration_manager& mm, cache& cache)
    : _qp(qp)
    , _group0_client(g0)
    , _migration_manager(mm)
+    , _cache(cache)
    , _stopped(make_ready_future<>())
    , _superuser(password_authenticator::default_superuser(qp.db().get_config()))
 {}
@@ -136,7 +123,7 @@ std::string_view standard_role_manager::qualified_java_name() const noexcept {
 const resource_set& standard_role_manager::protected_resources() const {
    static const resource_set resources({
            make_data_resource(meta::legacy::AUTH_KS, meta::roles_table::name),
-            make_data_resource(meta::legacy::AUTH_KS, meta::role_members_table::name)});
+            make_data_resource(meta::legacy::AUTH_KS, ROLE_MEMBERS_CF)});

    return resources;
 }
@@ -160,7 +147,7 @@ future<> standard_role_manager::create_legacy_metadata_tables_if_missing() const
            "  PRIMARY KEY (role, member)"
            ")",
            meta::legacy::AUTH_KS,
-            meta::role_members_table::name);
+            ROLE_MEMBERS_CF);
    static const sstring create_role_attributes_query = seastar::format(
            "CREATE TABLE {}.{} ("
            "  role text,"
@@ -169,7 +156,7 @@ future<> standard_role_manager::create_legacy_metadata_tables_if_missing() const
            "  PRIMARY KEY(role, name)"
            ")",
            meta::legacy::AUTH_KS,
-            meta::role_attributes_table::name);
+            ROLE_ATTRIBUTES_CF);
    return when_all_succeed(
            create_legacy_metadata_table_if_missing(
                    meta::roles_table::name,
@@ -177,12 +164,12 @@ future<> standard_role_manager::create_legacy_metadata_tables_if_missing() const
                    create_roles_query,
                    _migration_manager),
            create_legacy_metadata_table_if_missing(
-                    meta::role_members_table::name,
+                    ROLE_MEMBERS_CF,
                    _qp,
                    create_role_members_query,
                    _migration_manager),
            create_legacy_metadata_table_if_missing(
-                    meta::role_attributes_table::name,
+                    ROLE_ATTRIBUTES_CF,
                    _qp,
                    create_role_attributes_query,
                    _migration_manager)).discard_result();
@@ -205,7 +192,7 @@ future<> standard_role_manager::legacy_create_default_role_if_missing() {
                {_superuser},
                cql3::query_processor::cache_internal::no).discard_result();
        log.info("Created default superuser role '{}'.", _superuser);
-    } catch(const exceptions::unavailable_exception& e) {
+    } catch (const exceptions::unavailable_exception& e) {
        log.warn("Skipped default role setup: some nodes were not ready; will retry");
        throw e;
    }
@@ -429,7 +416,7 @@ future<> standard_role_manager::drop(std::string_view role_name, ::service::grou
    const auto revoke_from_members = [this, role_name, &mc] () -> future<> {
        const sstring query = seastar::format("SELECT member FROM {}.{} WHERE role = ?",
                get_auth_ks_name(_qp),
-                meta::role_members_table::name);
+                ROLE_MEMBERS_CF);
        const auto members = co_await _qp.execute_internal(
                query,
                consistency_for_role(role_name),
@@ -461,7 +448,7 @@ future<> standard_role_manager::drop(std::string_view role_name, ::service::grou
    const auto remove_attributes_of = [this, role_name, &mc] () -> future<> {
        const sstring query = seastar::format("DELETE FROM {}.{} WHERE role = ?",
                get_auth_ks_name(_qp),
-                meta::role_attributes_table::name);
+                ROLE_ATTRIBUTES_CF);
        if (legacy_mode(_qp)) {
            co_await _qp.execute_internal(query, {sstring(role_name)},
                cql3::query_processor::cache_internal::yes).discard_result();
@@ -517,7 +504,7 @@ standard_role_manager::legacy_modify_membership(
            case membership_change::add: {
                const sstring insert_query = seastar::format("INSERT INTO {}.{} (role, member) VALUES (?, ?)",
                        get_auth_ks_name(_qp),
-                        meta::role_members_table::name);
+                        ROLE_MEMBERS_CF);
                co_return co_await _qp.execute_internal(
                        insert_query,
                        consistency_for_role(role_name),
@@ -529,7 +516,7 @@ standard_role_manager::legacy_modify_membership(
            case membership_change::remove: {
                const sstring delete_query = seastar::format("DELETE FROM {}.{} WHERE role = ? AND member = ?",
                        get_auth_ks_name(_qp),
-                        meta::role_members_table::name);
+                        ROLE_MEMBERS_CF);
                co_return co_await _qp.execute_internal(
                        delete_query,
                        consistency_for_role(role_name),
@@ -567,12 +554,12 @@ standard_role_manager::modify_membership(
    case membership_change::add:
        modify_role_members = seastar::format("INSERT INTO {}.{} (role, member) VALUES (?, ?)",
                get_auth_ks_name(_qp),
-                meta::role_members_table::name);
+                ROLE_MEMBERS_CF);
        break;
    case membership_change::remove:
        modify_role_members = seastar::format("DELETE FROM {}.{} WHERE role = ? AND member = ?",
                get_auth_ks_name(_qp),
-                meta::role_members_table::name);
+                ROLE_MEMBERS_CF);
        break;
    default:
        on_internal_error(log, format("unknown membership_change value: {}", int(ch)));
@@ -666,7 +653,7 @@ future<role_set> standard_role_manager::query_granted(std::string_view grantee_n
 future<role_to_directly_granted_map> standard_role_manager::query_all_directly_granted(::service::query_state& qs) {
    const sstring query = seastar::format("SELECT * FROM {}.{}",
            get_auth_ks_name(_qp),
-            meta::role_members_table::name);
+            ROLE_MEMBERS_CF);

    const auto results = co_await _qp.execute_internal(
            query,
@@ -731,15 +718,21 @@ future<bool> standard_role_manager::is_superuser(std::string_view role_name) {
 }

 future<bool> standard_role_manager::can_login(std::string_view role_name) {
-    return require_record(_qp, role_name).then([](record r) {
-        return r.can_login;
-    });
+    if (legacy_mode(_qp)) {
+        const auto r = co_await require_record(_qp, role_name);
+        co_return r.can_login;
+    }
+    auto role = _cache.get(sstring(role_name));
+    if (!role) {
+        throw nonexistant_role(role_name);
+    }
+    co_return role->can_login;
 }

 future<std::optional<sstring>> standard_role_manager::get_attribute(std::string_view role_name, std::string_view attribute_name, ::service::query_state& qs) {
    const sstring query = seastar::format("SELECT name, value FROM {}.{} WHERE role = ? AND name = ?",
            get_auth_ks_name(_qp),
-            meta::role_attributes_table::name);
+            ROLE_ATTRIBUTES_CF);
    const auto result_set = co_await _qp.execute_internal(query, db::consistency_level::ONE, qs, {sstring(role_name), sstring(attribute_name)}, cql3::query_processor::cache_internal::yes);
    if (!result_set->empty()) {
        const cql3::untyped_result_set_row &row = result_set->one();
@@ -770,7 +763,7 @@ future<> standard_role_manager::set_attribute(std::string_view role_name, std::s
    }
    const sstring query = seastar::format("INSERT INTO {}.{} (role, name, value)  VALUES (?, ?, ?)",
            get_auth_ks_name(_qp),
-            meta::role_attributes_table::name);
+            ROLE_ATTRIBUTES_CF);
    if (legacy_mode(_qp)) {
        co_await _qp.execute_internal(query, {sstring(role_name), sstring(attribute_name), sstring(attribute_value)}, cql3::query_processor::cache_internal::yes).discard_result();
    } else {
@@ -785,7 +778,7 @@ future<> standard_role_manager::remove_attribute(std::string_view role_name, std
    }
    const sstring query = seastar::format("DELETE FROM {}.{} WHERE role = ? AND name = ?",
            get_auth_ks_name(_qp),
-            meta::role_attributes_table::name);
+            ROLE_ATTRIBUTES_CF);
    if (legacy_mode(_qp)) {
        co_await _qp.execute_internal(query, {sstring(role_name), sstring(attribute_name)}, cql3::query_processor::cache_internal::yes).discard_result();
    } else {
--- a/auth/standard_role_manager.hh
+++ b/auth/standard_role_manager.hh
@@ -10,6 +10,7 @@

 #include "auth/common.hh"
 #include "auth/role_manager.hh"
+#include "auth/cache.hh"

 #include <string_view>

@@ -36,13 +37,14 @@ class standard_role_manager final : public role_manager {
    cql3::query_processor& _qp;
    ::service::raft_group0_client& _group0_client;
    ::service::migration_manager& _migration_manager;
+    cache& _cache;
    future<> _stopped;
    abort_source _as;
    std::string _superuser;
    shared_promise<> _superuser_created_promise;

 public:
-    standard_role_manager(cql3::query_processor&, ::service::raft_group0_client&, ::service::migration_manager&);
+    standard_role_manager(cql3::query_processor&, ::service::raft_group0_client&, ::service::migration_manager&, cache&);

    virtual std::string_view qualified_java_name() const noexcept override;

--- a/auth/transitional.cc
+++ b/auth/transitional.cc
@@ -13,6 +13,7 @@
 #include "auth/authorizer.hh"
 #include "auth/default_authorizer.hh"
 #include "auth/password_authenticator.hh"
+#include "auth/cache.hh"
 #include "auth/permission.hh"
 #include "service/raft/raft_group0_client.hh"
 #include "utils/class_registrator.hh"
@@ -37,8 +38,8 @@ class transitional_authenticator : public authenticator {
 public:
    static const sstring PASSWORD_AUTHENTICATOR_NAME;

-    transitional_authenticator(cql3::query_processor& qp, ::service::raft_group0_client& g0, ::service::migration_manager& mm, utils::alien_worker& hashing_worker)
-            : transitional_authenticator(std::make_unique<password_authenticator>(qp, g0, mm, hashing_worker)) {
+    transitional_authenticator(cql3::query_processor& qp, ::service::raft_group0_client& g0, ::service::migration_manager& mm, cache& cache)
+            : transitional_authenticator(std::make_unique<password_authenticator>(qp, g0, mm, cache)) {
    }
    transitional_authenticator(std::unique_ptr<authenticator> a)
            : _authenticator(std::move(a)) {
@@ -80,7 +81,7 @@ public:
        }).handle_exception([](auto ep) {
            try {
                std::rethrow_exception(ep);
-            } catch (exceptions::authentication_exception&) {
+            } catch (const exceptions::authentication_exception&) {
                // return anon user
                return make_ready_future<authenticated_user>(anonymous_user());
            }
@@ -125,7 +126,7 @@ public:
            virtual bytes evaluate_response(bytes_view client_response) override {
                try {
                    return _sasl->evaluate_response(client_response);
-                } catch (exceptions::authentication_exception&) {
+                } catch (const exceptions::authentication_exception&) {
                    _complete = true;
                    return {};
                }
@@ -140,7 +141,7 @@ public:
                    return _sasl->get_authenticated_user().handle_exception([](auto ep) {
                        try {
                            std::rethrow_exception(ep);
-                        } catch (exceptions::authentication_exception&) {
+                        } catch (const exceptions::authentication_exception&) {
                            // return anon user
                            return make_ready_future<authenticated_user>(anonymous_user());
                        }
@@ -240,7 +241,7 @@ static const class_registrator<
        cql3::query_processor&,
        ::service::raft_group0_client&,
        ::service::migration_manager&,
-        utils::alien_worker&> transitional_authenticator_reg(auth::PACKAGE_NAME + "TransitionalAuthenticator");
+        auth::cache&> transitional_authenticator_reg(auth::PACKAGE_NAME + "TransitionalAuthenticator");

 static const class_registrator<
        auth::authorizer,
--- a/backlog_controller.hh
+++ b/backlog_controller.hh
@@ -15,6 +15,7 @@
 #include <cmath>

 #include "seastarx.hh"
+#include "backlog_controller_fwd.hh"

 // Simple proportional controller to adjust shares for processes for which a backlog can be clearly
 // defined.
@@ -128,11 +129,21 @@ public:
    static constexpr unsigned normalization_factor = 30;
    static constexpr float disable_backlog = std::numeric_limits<double>::infinity();
    static constexpr float backlog_disabled(float backlog) { return std::isinf(backlog); }
-    compaction_controller(backlog_controller::scheduling_group sg, float static_shares, std::chrono::milliseconds interval, std::function<float()> current_backlog)
+    static inline const std::vector<backlog_controller::control_point> default_control_points = {
+            backlog_controller::control_point{0.0, 50}, {1.5, 100}, {normalization_factor, default_compaction_maximum_shares}};
+    compaction_controller(backlog_controller::scheduling_group sg, float static_shares, std::optional<float> max_shares,
+        std::chrono::milliseconds interval, std::function<float()> current_backlog)
        : backlog_controller(std::move(sg), std::move(interval),
-          std::vector<backlog_controller::control_point>({{0.0, 50}, {1.5, 100} , {normalization_factor, 1000}}),
+          default_control_points,
          std::move(current_backlog),
          static_shares
        )
-    {}
+    {
+        if (max_shares) {
+            set_max_shares(*max_shares);
+        }
+    }
+
+    // Updates the maximum output value for control points.
+    void set_max_shares(float max_shares);
 };
--- a/backlog_controller_fwd.hh
+++ b/backlog_controller_fwd.hh
@@ -0,0 +1,13 @@
+/*
+ * Copyright (C) 2025-present ScyllaDB
+ */
+
+/*
+ * SPDX-License-Identifier: LicenseRef-ScyllaDB-Source-Available-1.0
+ */
+
+#pragma once
+
+#include <cstdint>
+
+static constexpr uint64_t default_compaction_maximum_shares = 1000;
--- a/cdc/CMakeLists.txt
+++ b/cdc/CMakeLists.txt
@@ -17,5 +17,8 @@ target_link_libraries(cdc
  PRIVATE
    replica)

+if (Scylla_USE_PRECOMPILED_HEADER_USE)
+  target_precompile_headers(cdc REUSE_FROM scylla-precompiled-header)
+endif()
 check_headers(check-headers cdc
  GLOB_RECURSE ${CMAKE_CURRENT_SOURCE_DIR}/*.hh)
--- a/cdc/generation.cc
+++ b/cdc/generation.cc
@@ -204,7 +204,7 @@ future<topology_description> topology_description::clone_async() const {

    for (const auto& entry : _entries) {
        vec.push_back(entry);
-        co_await seastar::maybe_yield();
+        co_await coroutine::maybe_yield();
    }

    co_return topology_description{std::move(vec)};
@@ -1209,7 +1209,7 @@ future<mutation> create_table_streams_mutation(table_id table, db_clock::time_po
    co_return std::move(m);
 }

-future<mutation> create_table_streams_mutation(table_id table, db_clock::time_point stream_ts, const std::vector<cdc::stream_id>& stream_ids, api::timestamp_type ts) {
+future<mutation> create_table_streams_mutation(table_id table, db_clock::time_point stream_ts, const utils::chunked_vector<cdc::stream_id>& stream_ids, api::timestamp_type ts) {
    auto s = db::system_keyspace::cdc_streams_state();

    mutation m(s, partition_key::from_single_value(*s,
@@ -1252,24 +1252,24 @@ future<> generation_service::load_cdc_tablet_streams(std::optional<std::unordere
        tables_to_process = _cdc_metadata.get_tables_with_cdc_tablet_streams() | std::ranges::to<std::unordered_set<table_id>>();
    }

-    auto read_streams_state = [this] (const std::optional<std::unordered_set<table_id>>& tables, noncopyable_function<future<>(table_id, db_clock::time_point, std::vector<cdc::stream_id>)> f) -> future<> {
+    auto read_streams_state = [this] (const std::optional<std::unordered_set<table_id>>& tables, noncopyable_function<future<>(table_id, db_clock::time_point, utils::chunked_vector<cdc::stream_id>)> f) -> future<> {
        if (tables) {
            for (auto table : *tables) {
-                co_await _sys_ks.local().read_cdc_streams_state(table, [&] (table_id table, db_clock::time_point base_ts, std::vector<cdc::stream_id> base_stream_set) -> future<> {
+                co_await _sys_ks.local().read_cdc_streams_state(table, [&] (table_id table, db_clock::time_point base_ts, utils::chunked_vector<cdc::stream_id> base_stream_set) -> future<> {
                    return f(table, base_ts, std::move(base_stream_set));
                });
            }
        } else {
-            co_await _sys_ks.local().read_cdc_streams_state(std::nullopt, [&] (table_id table, db_clock::time_point base_ts, std::vector<cdc::stream_id> base_stream_set) -> future<> {
+            co_await _sys_ks.local().read_cdc_streams_state(std::nullopt, [&] (table_id table, db_clock::time_point base_ts, utils::chunked_vector<cdc::stream_id> base_stream_set) -> future<> {
                return f(table, base_ts, std::move(base_stream_set));
            });
        }
    };

-    co_await read_streams_state(changed_tables, [this, &tables_to_process] (table_id table, db_clock::time_point base_ts, std::vector<cdc::stream_id> base_stream_set) -> future<> {
+    co_await read_streams_state(changed_tables, [this, &tables_to_process] (table_id table, db_clock::time_point base_ts, utils::chunked_vector<cdc::stream_id> base_stream_set) -> future<> {
        table_streams new_table_map;

-        auto append_stream = [&new_table_map] (db_clock::time_point stream_tp, std::vector<cdc::stream_id> stream_set) {
+        auto append_stream = [&new_table_map] (db_clock::time_point stream_tp, utils::chunked_vector<cdc::stream_id> stream_set) {
            auto ts = std::chrono::duration_cast<api::timestamp_clock::duration>(stream_tp.time_since_epoch()).count();
            new_table_map[ts] = committed_stream_set {stream_tp, std::move(stream_set)};
        };
@@ -1345,7 +1345,7 @@ future<> generation_service::query_cdc_timestamps(table_id table, bool ascending
    }
 }

-future<> generation_service::query_cdc_streams(table_id table, noncopyable_function<future<>(db_clock::time_point, const std::vector<cdc::stream_id>& current, cdc::cdc_stream_diff)> f) {
+future<> generation_service::query_cdc_streams(table_id table, noncopyable_function<future<>(db_clock::time_point, const utils::chunked_vector<cdc::stream_id>& current, cdc::cdc_stream_diff)> f) {
    const auto& all_tables = _cdc_metadata.get_all_tablet_streams();
    auto table_it = all_tables.find(table);
    if (table_it == all_tables.end()) {
@@ -1402,8 +1402,8 @@ future<> generation_service::generate_tablet_resize_update(utils::chunked_vector
        co_return;
    }

-    std::vector<cdc::stream_id> new_streams;
-    new_streams.reserve(new_tablet_map.tablet_count());
+    utils::chunked_vector<cdc::stream_id> new_streams;
+    co_await utils::reserve_gently(new_streams, new_tablet_map.tablet_count());
    for (auto tid : new_tablet_map.tablet_ids()) {
        new_streams.emplace_back(new_tablet_map.get_last_token(tid), 0);
        co_await coroutine::maybe_yield();
@@ -1425,7 +1425,7 @@ future<> generation_service::generate_tablet_resize_update(utils::chunked_vector
    muts.emplace_back(std::move(mut));
 }

-future<utils::chunked_vector<mutation>> get_cdc_stream_gc_mutations(table_id table, db_clock::time_point base_ts, const std::vector<cdc::stream_id>& base_stream_set, api::timestamp_type ts) {
+future<utils::chunked_vector<mutation>> get_cdc_stream_gc_mutations(table_id table, db_clock::time_point base_ts, const utils::chunked_vector<cdc::stream_id>& base_stream_set, api::timestamp_type ts) {
    utils::chunked_vector<mutation> muts;
    muts.reserve(2);

--- a/cdc/generation.hh
+++ b/cdc/generation.hh
@@ -143,12 +143,12 @@ stream_state read_stream_state(int8_t val);

 struct committed_stream_set {
    db_clock::time_point ts;
-    std::vector<cdc::stream_id> streams;
+    utils::chunked_vector<cdc::stream_id> streams;
 };

 struct cdc_stream_diff {
-    std::vector<stream_id> closed_streams;
-    std::vector<stream_id> opened_streams;
+    utils::chunked_vector<stream_id> closed_streams;
+    utils::chunked_vector<stream_id> opened_streams;
 };

 using table_streams = std::map<api::timestamp_type, committed_stream_set>;
@@ -220,11 +220,11 @@ future<utils::chunked_vector<mutation>> get_cdc_generation_mutations_v3(
    size_t mutation_size_threshold, api::timestamp_type mutation_timestamp);

 future<mutation> create_table_streams_mutation(table_id, db_clock::time_point, const locator::tablet_map&, api::timestamp_type);
-future<mutation> create_table_streams_mutation(table_id, db_clock::time_point, const std::vector<cdc::stream_id>&, api::timestamp_type);
+future<mutation> create_table_streams_mutation(table_id, db_clock::time_point, const utils::chunked_vector<cdc::stream_id>&, api::timestamp_type);
 utils::chunked_vector<mutation> make_drop_table_streams_mutations(table_id, api::timestamp_type ts);

 future<mutation> get_switch_streams_mutation(table_id table, db_clock::time_point stream_ts, cdc_stream_diff diff, api::timestamp_type ts);
-future<utils::chunked_vector<mutation>> get_cdc_stream_gc_mutations(table_id table, db_clock::time_point base_ts, const std::vector<cdc::stream_id>& base_stream_set, api::timestamp_type ts);
+future<utils::chunked_vector<mutation>> get_cdc_stream_gc_mutations(table_id table, db_clock::time_point base_ts, const utils::chunked_vector<cdc::stream_id>& base_stream_set, api::timestamp_type ts);
 table_streams::const_iterator get_new_base_for_gc(const table_streams&, std::chrono::seconds ttl);

 } // namespace cdc
--- a/cdc/generation_service.hh
+++ b/cdc/generation_service.hh
@@ -149,7 +149,7 @@ public:
    future<> load_cdc_tablet_streams(std::optional<std::unordered_set<table_id>> changed_tables);

    future<> query_cdc_timestamps(table_id table, bool ascending, noncopyable_function<future<>(db_clock::time_point)> f);
-    future<> query_cdc_streams(table_id table, noncopyable_function<future<>(db_clock::time_point, const std::vector<cdc::stream_id>& current, cdc::cdc_stream_diff)> f);
+    future<> query_cdc_streams(table_id table, noncopyable_function<future<>(db_clock::time_point, const utils::chunked_vector<cdc::stream_id>& current, cdc::cdc_stream_diff)> f);

    future<> generate_tablet_resize_update(utils::chunked_vector<canonical_mutation>& muts, table_id table, const locator::tablet_map& new_tablet_map, api::timestamp_type ts);

--- a/cdc/log.cc
+++ b/cdc/log.cc
@@ -25,6 +25,7 @@
 #include "locator/abstract_replication_strategy.hh"
 #include "locator/topology.hh"
 #include "replica/database.hh"
+#include "db/config.hh"
 #include "db/schema_tables.hh"
 #include "gms/feature_service.hh"
 #include "schema/schema.hh"
@@ -68,10 +69,15 @@ shared_ptr<locator::abstract_replication_strategy> generate_replication_strategy
    return locator::abstract_replication_strategy::create_replication_strategy(ksm.strategy_name(), params, topo);
 }

+// When dropping a column from a CDC log table, we set the drop timestamp
+// `column_drop_leeway` seconds into the future to ensure that for writes concurrent
+// with column drop, the write timestamp is before the column drop timestamp.
+constexpr auto column_drop_leeway = std::chrono::seconds(5);
+
 } // anonymous namespace

 namespace cdc {
-static schema_ptr create_log_schema(const schema&, const replica::database&, const keyspace_metadata&,
+static schema_ptr create_log_schema(const schema&, const replica::database&, const keyspace_metadata&, api::timestamp_type,
        std::optional<table_id> = {}, schema_ptr = nullptr);
 }

@@ -183,7 +189,7 @@ public:
        muts.emplace_back(std::move(mut));
    }

-    void on_pre_create_column_families(const keyspace_metadata& ksm, std::vector<schema_ptr>& cfms) override {
+    void on_pre_create_column_families(const keyspace_metadata& ksm, std::vector<schema_ptr>& cfms, api::timestamp_type ts) override {
        std::vector<schema_ptr> new_cfms;

        for (auto sp : cfms) {
@@ -202,7 +208,7 @@ public:
            }

            // in seastar thread
-            auto log_schema = create_log_schema(schema, db, ksm);
+            auto log_schema = create_log_schema(schema, db, ksm, ts);
            new_cfms.push_back(std::move(log_schema));
        }

@@ -249,7 +255,7 @@ public:
            }

            std::optional<table_id> maybe_id = log_schema ? std::make_optional(log_schema->id()) : std::nullopt;
-            auto new_log_schema = create_log_schema(new_schema, db, *keyspace.metadata(), std::move(maybe_id), log_schema);
+            auto new_log_schema = create_log_schema(new_schema, db, *keyspace.metadata(), timestamp, std::move(maybe_id), log_schema);

            auto log_mut = log_schema 
                ? db::schema_tables::make_update_table_mutations(_ctxt._proxy, keyspace.metadata(), log_schema, new_log_schema, timestamp)
@@ -581,11 +587,9 @@ bytes log_data_column_deleted_elements_name_bytes(const bytes& column_name) {
    return to_bytes(cdc_deleted_elements_column_prefix) + column_name;
 }

-static schema_ptr create_log_schema(const schema& s, const replica::database& db,
-        const keyspace_metadata& ksm, std::optional<table_id> uuid, schema_ptr old)
+static void set_default_properties_log_table(schema_builder& b, const schema& s,
+        const replica::database& db, const keyspace_metadata& ksm)
 {
-    schema_builder b(s.ks_name(), log_name(s.cf_name()));
-    b.with_partitioner(cdc::cdc_partitioner::classname);
    b.set_compaction_strategy(compaction::compaction_strategy_type::time_window);
    b.set_comment(fmt::format("CDC log for {}.{}", s.ks_name(), s.cf_name()));
    auto ttl_seconds = s.cdc_options().ttl();
@@ -611,13 +615,44 @@ static schema_ptr create_log_schema(const schema& s, const replica::database& db
                        std::to_string(std::max(1, window_seconds / 2))},
        });
    }
+    b.set_caching_options(caching_options::get_disabled_caching_options());
+
+    auto rs = generate_replication_strategy(ksm, db.get_token_metadata().get_topology());
+    auto tombstone_gc_ext = seastar::make_shared<tombstone_gc_extension>(get_default_tombstone_gc_mode(*rs, db.get_token_metadata(), false));
+    b.add_extension(tombstone_gc_extension::NAME, std::move(tombstone_gc_ext));
+}
+
+static void add_columns_to_cdc_log(schema_builder& b, const schema& s,
+        const api::timestamp_type timestamp, const schema_ptr old)
+{
    b.with_column(log_meta_column_name_bytes("stream_id"), bytes_type, column_kind::partition_key);
    b.with_column(log_meta_column_name_bytes("time"), timeuuid_type, column_kind::clustering_key);
    b.with_column(log_meta_column_name_bytes("batch_seq_no"), int32_type, column_kind::clustering_key);
    b.with_column(log_meta_column_name_bytes("operation"), data_type_for<operation_native_type>());
    b.with_column(log_meta_column_name_bytes("ttl"), long_type);
    b.with_column(log_meta_column_name_bytes("end_of_batch"), boolean_type);
-    b.set_caching_options(caching_options::get_disabled_caching_options());
+
+    auto validate_new_column = [&] (const sstring& name) {
+        // When dropping a column from a CDC log table, we set the drop timestamp to be
+        // `column_drop_leeway` seconds into the future (see `create_log_schema`).
+        // Therefore, when recreating a column with the same name, we need to validate
+        // that it's not recreated too soon and that the drop timestamp has passed.
+        if (old && old->dropped_columns().contains(name)) {
+            const auto& drop_info = old->dropped_columns().at(name);
+            auto create_time = api::timestamp_clock::time_point(api::timestamp_clock::duration(timestamp));
+            auto drop_time = api::timestamp_clock::time_point(api::timestamp_clock::duration(drop_info.timestamp));
+            if (drop_time > create_time) {
+                throw exceptions::invalid_request_exception(format("Cannot add column {} because a column with the same name was dropped too recently. Please retry after {} seconds",
+                        name, std::chrono::duration_cast<std::chrono::seconds>(drop_time - create_time).count() + 1));
+            }
+        }
+    };
+
+    auto add_column = [&] (sstring name, data_type type) {
+        validate_new_column(name);
+        b.with_column(to_bytes(name), type);
+    };
+
    auto add_columns = [&] (const schema::const_iterator_range_type& columns, bool is_data_col = false) {
        for (const auto& column : columns) {
            auto type = column.type;
@@ -639,9 +674,9 @@ static schema_ptr create_log_schema(const schema& s, const replica::database& db
                    }
                ));
            }
-            b.with_column(log_data_column_name_bytes(column.name()), type);
+            add_column(log_data_column_name(column.name_as_text()), type);
            if (is_data_col) {
-                b.with_column(log_data_column_deleted_name_bytes(column.name()), boolean_type);
+                add_column(log_data_column_deleted_name(column.name_as_text()), boolean_type);
            }
            if (column.type->is_multi_cell()) {
                auto dtype = visit(*type, make_visitor(
@@ -657,7 +692,7 @@ static schema_ptr create_log_schema(const schema& s, const replica::database& db
                        throw std::invalid_argument("Should not reach");
                    }
                ));
-                b.with_column(log_data_column_deleted_elements_name_bytes(column.name()), dtype);
+                add_column(log_data_column_deleted_elements_name(column.name_as_text()), dtype);
            }
        }
    };
@@ -665,15 +700,28 @@ static schema_ptr create_log_schema(const schema& s, const replica::database& db
    add_columns(s.clustering_key_columns());
    add_columns(s.static_columns(), true);
    add_columns(s.regular_columns(), true);
+}
+
+static schema_ptr create_log_schema(const schema& s, const replica::database& db,
+        const keyspace_metadata& ksm, api::timestamp_type timestamp, std::optional<table_id> uuid, schema_ptr old)
+{
+    schema_builder b(s.ks_name(), log_name(s.cf_name()));
+
+    b.with_partitioner(cdc::cdc_partitioner::classname);
+
+    if (old) {
+        // If the user reattaches the log table, do not change its properties.
+        b.set_properties(old->get_properties());
+    } else {
+        set_default_properties_log_table(b, s, db, ksm);
+    }
+
+    add_columns_to_cdc_log(b, s, timestamp, old);

    if (uuid) {
        b.set_uuid(*uuid);
    }

-    auto rs = generate_replication_strategy(ksm, db.get_token_metadata().get_topology());
-    auto tombstone_gc_ext = seastar::make_shared<tombstone_gc_extension>(get_default_tombstone_gc_mode(*rs, db.get_token_metadata()));
-    b.add_extension(tombstone_gc_extension::NAME, std::move(tombstone_gc_ext));
-
    /**
     * #10473 - if we are redefining the log table, we need to ensure any dropped
     * columns are registered in "dropped_columns" table, otherwise clients will not
@@ -683,7 +731,8 @@ static schema_ptr create_log_schema(const schema& s, const replica::database& db
        // not super efficient, but we don't do this often.
        for (auto& col : old->all_columns()) {
            if (!b.has_column({col.name(), col.name_as_text() })) {
-                b.without_column(col.name_as_text(), col.type, api::new_timestamp());
+                auto drop_ts = api::timestamp_clock::now() + column_drop_leeway;
+                b.without_column(col.name_as_text(), col.type, drop_ts.time_since_epoch().count());
            }
        }
    }
@@ -903,9 +952,6 @@ static managed_bytes merge(const abstract_type& type, const managed_bytes_opt& p
    throw std::runtime_error(format("cdc merge: unknown type {}", type.name()));
 }

-using cell_map = std::unordered_map<const column_definition*, managed_bytes_opt>;
-using row_states_map = std::unordered_map<clustering_key, cell_map, clustering_key::hashing, clustering_key::equality>;
-
 static managed_bytes_opt get_col_from_row_state(const cell_map* state, const column_definition& cdef) {
    if (state) {
        if (auto it = state->find(&cdef); it != state->end()) {
@@ -915,7 +961,12 @@ static managed_bytes_opt get_col_from_row_state(const cell_map* state, const col
    return std::nullopt;
 }

-static cell_map* get_row_state(row_states_map& row_states, const clustering_key& ck) {
+cell_map* get_row_state(row_states_map& row_states, const clustering_key& ck) {
+    auto it = row_states.find(ck);
+    return it == row_states.end() ? nullptr : &it->second;
+}
+
+const cell_map* get_row_state(const row_states_map& row_states, const clustering_key& ck) {
    auto it = row_states.find(ck);
    return it == row_states.end() ? nullptr : &it->second;
 }
@@ -1385,6 +1436,8 @@ struct process_change_visitor {
    row_states_map& _clustering_row_states;
    cell_map& _static_row_state;

+    const bool _is_update = false;
+
    const bool _generate_delta_values = true;

    void static_row_cells(auto&& visit_row_cells) {
@@ -1408,12 +1461,13 @@ struct process_change_visitor {

        struct clustering_row_cells_visitor : public process_row_visitor {
            operation _cdc_op = operation::update;
+            operation _marker_op = operation::insert;

            using process_row_visitor::process_row_visitor;

            void marker(const row_marker& rm) {
                _ttl_column = get_ttl(rm);
-                _cdc_op = operation::insert;
+                _cdc_op = _marker_op;
            }
        };

@@ -1421,6 +1475,9 @@ struct process_change_visitor {
                log_ck, _touched_parts, _builder,
                _enable_updating_state, &ckey, get_row_state(_clustering_row_states, ckey),
                _clustering_row_states, _generate_delta_values);
+        if (_is_update && _request_options.alternator) {
+            v._marker_op = operation::update;
+        }
        visit_row_cells(v);

        if (_enable_updating_state) {
@@ -1574,6 +1631,11 @@ private:

    row_states_map _clustering_row_states;
    cell_map _static_row_state;
+    // True if the mutated row existed before applying the mutation. In other
+    // words, if the preimage is enabled and it isn't empty (otherwise, we
+    // assume that the row is non-existent). Used for Alternator Streams (see
+    // #6918).
+    bool _is_update = false;

    const bool _uses_tablets;

@@ -1590,7 +1652,7 @@ public:
        : _ctx(ctx)
        , _schema(std::move(s))
        , _dk(std::move(dk))
-        , _log_schema(ctx._proxy.get_db().local().find_schema(_schema->ks_name(), log_name(_schema->cf_name())))
+        , _log_schema(_schema->cdc_schema() ? _schema->cdc_schema() : ctx._proxy.get_db().local().find_schema(_schema->ks_name(), log_name(_schema->cf_name())))
        , _options(options)
        , _clustering_row_states(0, clustering_key::hashing(*_schema), clustering_key::equality(*_schema))
        , _uses_tablets(ctx._proxy.get_db().local().find_keyspace(_schema->ks_name()).uses_tablets())
@@ -1700,6 +1762,7 @@ public:
            ._enable_updating_state = _enable_updating_state,
            ._clustering_row_states = _clustering_row_states,
            ._static_row_state = _static_row_state,
+            ._is_update = _is_update,
            ._generate_delta_values = generate_delta_values(_builder->base_schema())
        };
        cdc::inspect_mutation(m, v);
@@ -1710,6 +1773,10 @@ public:
        _builder->end_record();
    }

+    const row_states_map& clustering_row_states() const override {
+        return _clustering_row_states;
+    }
+
    // Takes and returns generated cdc log mutations and associated statistics about parts touched during transformer's lifetime.
    // The `transformer` object on which this method was called on should not be used anymore.
    std::tuple<utils::chunked_vector<mutation>, stats::part_type_set> finish() && {
@@ -1833,6 +1900,7 @@ public:
                    _static_row_state[&c] = std::move(*maybe_cell_view);
                }
            }
+            _is_update = true;
        }

        if (static_only) {
@@ -1920,6 +1988,7 @@ cdc::cdc_service::impl::augment_mutation_call(lowres_clock::time_point timeout,
                return make_ready_future<>();
            }

+            const bool alternator_increased_compatibility = options.alternator && options.alternator_streams_increased_compatibility;
            transformer trans(_ctxt, s, m.decorated_key(), options);

            auto f = make_ready_future<lw_shared_ptr<cql3::untyped_result_set>>(nullptr);
@@ -1927,7 +1996,7 @@ cdc::cdc_service::impl::augment_mutation_call(lowres_clock::time_point timeout,
                // Preimage has been fetched by upper layers.
                tracing::trace(tr_state, "CDC: Using a prefetched preimage");
                f = make_ready_future<lw_shared_ptr<cql3::untyped_result_set>>(options.preimage);
-            } else if (s->cdc_options().preimage() || s->cdc_options().postimage()) {
+            } else if (s->cdc_options().preimage() || s->cdc_options().postimage() || alternator_increased_compatibility) {
                // Note: further improvement here would be to coalesce the pre-image selects into one
                // if a batch contains several modifications to the same table. Otoh, batch is rare(?)
                // so this is premature.
@@ -1944,7 +2013,7 @@ cdc::cdc_service::impl::augment_mutation_call(lowres_clock::time_point timeout,
                tracing::trace(tr_state, "CDC: Preimage not enabled for the table, not querying current value of {}", m.decorated_key());
            }

-            return f.then([trans = std::move(trans), &mutations, idx, tr_state, &details] (lw_shared_ptr<cql3::untyped_result_set> rs) mutable {
+            return f.then([alternator_increased_compatibility, trans = std::move(trans), &mutations, idx, tr_state, &details, &options] (lw_shared_ptr<cql3::untyped_result_set> rs) mutable {
                auto& m = mutations[idx];
                auto& s = m.schema();

@@ -1959,13 +2028,13 @@ cdc::cdc_service::impl::augment_mutation_call(lowres_clock::time_point timeout,
                details.had_preimage |= preimage;
                details.had_postimage |= postimage;
                tracing::trace(tr_state, "CDC: Generating log mutations for {}", m.decorated_key());
-                if (should_split(m)) {
+                if (should_split(m, options)) {
                    tracing::trace(tr_state, "CDC: Splitting {}", m.decorated_key());
                    details.was_split = true;
-                    process_changes_with_splitting(m, trans, preimage, postimage);
+                    process_changes_with_splitting(m, trans, preimage, postimage, alternator_increased_compatibility);
                } else {
                    tracing::trace(tr_state, "CDC: No need to split {}", m.decorated_key());
-                    process_changes_without_splitting(m, trans, preimage, postimage);
+                    process_changes_without_splitting(m, trans, preimage, postimage, alternator_increased_compatibility);
                }
                auto [log_mut, touched_parts] = std::move(trans).finish();
                const int generated_count = log_mut.size();
--- a/cdc/log.hh
+++ b/cdc/log.hh
@@ -52,6 +52,9 @@ class database;

 namespace cdc {

+using cell_map = std::unordered_map<const column_definition*, managed_bytes_opt>;
+using row_states_map = std::unordered_map<clustering_key, cell_map, clustering_key::hashing, clustering_key::equality>;
+
 // cdc log table operation
 enum class operation : int8_t {
    // note: these values will eventually be read by a third party, probably not privvy to this
@@ -73,6 +76,14 @@ struct per_request_options {
    // Scylla. Currently, only TTL expiration implementation for Alternator
    // uses this.
    const bool is_system_originated = false;
+    // True if this mutation was emitted by Alternator.
+    const bool alternator = false;
+    // Sacrifice performance for the sake of better compatibility with DynamoDB
+    // Streams. The alternator_streams_increased_compatibility config flag is
+    // live-updateable, so its value may change between reads. For correctness,
+    // it is therefore important that the flag be read only once per request
+    // and that the same value be used for the rest of that request.
+    const bool alternator_streams_increased_compatibility = false;
 };

 struct operation_result_tracker;
@@ -142,4 +153,7 @@ bool is_cdc_metacolumn_name(const sstring& name);

 utils::UUID generate_timeuuid(api::timestamp_type t);

+cell_map* get_row_state(row_states_map& row_states, const clustering_key& ck);
+const cell_map* get_row_state(const row_states_map& row_states, const clustering_key& ck);
+
 } // namespace cdc
--- a/cdc/metadata.cc
+++ b/cdc/metadata.cc
@@ -54,7 +54,7 @@ cdc::stream_id get_stream(
 }

 static cdc::stream_id get_stream(
-        const std::vector<cdc::stream_id>& streams,
+        const utils::chunked_vector<cdc::stream_id>& streams,
        dht::token tok) {
    if (streams.empty()) {
        on_internal_error(cdc_log, "get_stream: streams empty");
@@ -159,7 +159,7 @@ cdc::stream_id cdc::metadata::get_vnode_stream(api::timestamp_type ts, dht::toke
    return ret;
 }

-const std::vector<cdc::stream_id>& cdc::metadata::get_tablet_stream_set(table_id tid, api::timestamp_type ts) const {
+const utils::chunked_vector<cdc::stream_id>& cdc::metadata::get_tablet_stream_set(table_id tid, api::timestamp_type ts) const {
    auto now = api::new_timestamp();
    if (ts > now + get_generation_leeway().count()) {
        throw exceptions::invalid_request_exception(seastar::format(
@@ -259,10 +259,10 @@ bool cdc::metadata::prepare(db_clock::time_point tp) {
    return !it->second;
 }

-future<std::vector<cdc::stream_id>> cdc::metadata::construct_next_stream_set(
-        const std::vector<cdc::stream_id>& prev_stream_set,
-        std::vector<cdc::stream_id> opened,
-        const std::vector<cdc::stream_id>& closed) {
+future<utils::chunked_vector<cdc::stream_id>> cdc::metadata::construct_next_stream_set(
+        const utils::chunked_vector<cdc::stream_id>& prev_stream_set,
+        utils::chunked_vector<cdc::stream_id> opened,
+        const utils::chunked_vector<cdc::stream_id>& closed) {

    if (closed.size() == prev_stream_set.size()) {
        // all previous streams are closed, so the next stream set is just the opened streams.
@@ -273,8 +273,8 @@ future<std::vector<cdc::stream_id>> cdc::metadata::construct_next_stream_set(
    // streams and removing the closed streams. we assume each stream set is
    // sorted by token, and the result is sorted as well.

-    std::vector<cdc::stream_id> next_stream_set;
-    next_stream_set.reserve(prev_stream_set.size() + opened.size() - closed.size());
+    utils::chunked_vector<cdc::stream_id> next_stream_set;
+    co_await utils::reserve_gently(next_stream_set, prev_stream_set.size() + opened.size() - closed.size());

    auto next_prev = prev_stream_set.begin();
    auto next_closed = closed.begin();
@@ -318,8 +318,8 @@ std::vector<table_id> cdc::metadata::get_tables_with_cdc_tablet_streams() const
    return _tablet_streams | std::views::keys | std::ranges::to<std::vector<table_id>>();
 }

-future<cdc::cdc_stream_diff> cdc::metadata::generate_stream_diff(const std::vector<stream_id>& before, const std::vector<stream_id>& after) {
-    std::vector<stream_id> closed, opened;
+future<cdc::cdc_stream_diff> cdc::metadata::generate_stream_diff(const utils::chunked_vector<stream_id>& before, const utils::chunked_vector<stream_id>& after) {
+    utils::chunked_vector<stream_id> closed, opened;

    auto before_it = before.begin();
    auto after_it = after.begin();
--- a/cdc/metadata.hh
+++ b/cdc/metadata.hh
@@ -49,7 +49,7 @@ class metadata final {

    container_t::const_iterator gen_used_at(api::timestamp_type ts) const;

-    const std::vector<stream_id>& get_tablet_stream_set(table_id tid, api::timestamp_type ts) const;
+    const utils::chunked_vector<stream_id>& get_tablet_stream_set(table_id tid, api::timestamp_type ts) const;

 public:
    /* Is a generation with the given timestamp already known or obsolete? It is obsolete if and only if
@@ -111,14 +111,14 @@ public:

    std::vector<table_id> get_tables_with_cdc_tablet_streams() const;

-    static future<std::vector<stream_id>> construct_next_stream_set(
-        const std::vector<cdc::stream_id>& prev_stream_set,
-        std::vector<cdc::stream_id> opened,
-        const std::vector<cdc::stream_id>& closed);
+    static future<utils::chunked_vector<stream_id>> construct_next_stream_set(
+        const utils::chunked_vector<cdc::stream_id>& prev_stream_set,
+        utils::chunked_vector<cdc::stream_id> opened,
+        const utils::chunked_vector<cdc::stream_id>& closed);

    static future<cdc_stream_diff> generate_stream_diff(
-        const std::vector<stream_id>& before,
-        const std::vector<stream_id>& after);
+        const utils::chunked_vector<stream_id>& before,
+        const utils::chunked_vector<stream_id>& after);

 };

--- a/cdc/split.cc
+++ b/cdc/split.cc
@@ -6,15 +6,28 @@
 * SPDX-License-Identifier: LicenseRef-ScyllaDB-Source-Available-1.0
 */

+#include "bytes.hh"
+#include "bytes_fwd.hh"
+#include "mutation/atomic_cell.hh"
+#include "mutation/atomic_cell_or_collection.hh"
+#include "mutation/collection_mutation.hh"
 #include "mutation/mutation.hh"
+#include "mutation/tombstone.hh"
 #include "schema/schema.hh"

+#include <seastar/core/sstring.hh>
 #include "types/concrete_types.hh"
+#include "types/types.hh"
 #include "types/user.hh"

 #include "split.hh"
 #include "log.hh"
 #include "change_visitor.hh"
+#include "utils/managed_bytes.hh"
+#include <string_view>
+#include <unordered_map>
+
+extern logging::logger cdc_log;

 struct atomic_column_update {
    column_id id;
@@ -490,6 +503,8 @@ struct should_split_visitor {
    // Otherwise we store the change's ttl.
    std::optional<gc_clock::duration> _ttl = std::nullopt;

+    virtual ~should_split_visitor() = default;
+
    inline bool finished() const { return _result; }
    inline void stop() { _result = true; }

@@ -512,7 +527,7 @@ struct should_split_visitor {

    void collection_tombstone(const tombstone& t) { visit(t.timestamp + 1); }

-    void live_collection_cell(bytes_view, const atomic_cell_view& cell) {
+    virtual void live_collection_cell(bytes_view, const atomic_cell_view& cell) {
        if (_had_row_marker) {
            // nonatomic updates cannot be expressed with an INSERT.
            return stop();
@@ -522,7 +537,7 @@ struct should_split_visitor {
    void dead_collection_cell(bytes_view, const atomic_cell_view& cell) { visit(cell); }
    void collection_column(const column_definition&, auto&& visit_collection) { visit_collection(*this); }

-    void marker(const row_marker& rm) {
+    virtual void marker(const row_marker& rm) {
        _had_row_marker = true;
        visit(rm.timestamp(), get_ttl(rm));
    }
@@ -563,7 +578,29 @@ struct should_split_visitor {
    }
 };

-bool should_split(const mutation& m) {
+// This is the same as the above, but it doesn't split a row marker away from
+// an update. As a result, updates that create an item appear as a single log
+// row.
+class alternator_should_split_visitor : public should_split_visitor {
+public:
+    ~alternator_should_split_visitor() override = default;
+
+    void live_collection_cell(bytes_view, const atomic_cell_view& cell) override {
+        visit(cell.timestamp());
+    }
+
+    void marker(const row_marker& rm) override {
+        visit(rm.timestamp());
+    }
+};
+
+bool should_split(const mutation& m, const per_request_options& options) {
+    if (options.alternator) {
+        alternator_should_split_visitor v;
+        cdc::inspect_mutation(m, v);
+        return v._result || v._ts == api::missing_timestamp;
+    }
+
    should_split_visitor v;

    cdc::inspect_mutation(m, v);
@@ -573,8 +610,109 @@ bool should_split(const mutation& m) {
        || v._ts == api::missing_timestamp;
 }

+// Returns true if the row state and the atomic and nonatomic entries represent
+// an equivalent item.
+static bool entries_match_row_state(const schema_ptr& base_schema, const cell_map& row_state, const std::vector<atomic_column_update>& atomic_entries,
+        std::vector<nonatomic_column_update>& nonatomic_entries) {
+    for (const auto& update : atomic_entries) {
+        const column_definition& cdef = base_schema->column_at(column_kind::regular_column, update.id);
+        const auto it = row_state.find(&cdef);
+        if (it == row_state.end()) {
+            return false;
+        }
+        if (to_managed_bytes_opt(update.cell.value().linearize()) != it->second) {
+            return false;
+        }
+    }
+    if (nonatomic_entries.empty()) {
+        return true;
+    }
+
+    for (const auto& update : nonatomic_entries) {
+        const column_definition& cdef = base_schema->column_at(column_kind::regular_column, update.id);
+        const auto it = row_state.find(&cdef);
+        if (it == row_state.end()) {
+            return false;
+        }
+
+        // The only collection used by Alternator is a non-frozen map.
+        auto current_raw_map = cdef.type->deserialize(*it->second);
+        map_type_impl::native_type current_values = value_cast<map_type_impl::native_type>(current_raw_map);
+
+        if (current_values.size() != update.cells.size()) {
+            return false;
+        }
+
+        std::unordered_map<sstring_view, bytes> current_values_map;
+        for (const auto& entry : current_values) {
+            const auto attr_name = std::string_view(value_cast<sstring>(entry.first));
+            current_values_map[attr_name] = value_cast<bytes>(entry.second);
+        }
+
+        for (const auto& [key, value] : update.cells) {
+            const auto key_str = to_string_view(key);
+            if (!value.is_live()) {
+                if (current_values_map.contains(key_str)) {
+                    return false;
+                }
+            } else if (current_values_map[key_str] != value.value().linearize()) {
+                return false;
+            }
+        }
+    }
+    return true;
+}
+
+bool should_skip(batch& changes, const mutation& base_mutation, change_processor& processor) {
+    const schema_ptr& base_schema = base_mutation.schema();
+    // Alternator doesn't use static updates and clustered range deletions.
+    if (!changes.static_updates.empty() || !changes.clustered_range_deletions.empty()) {
+        return false;
+    }
+
+    for (clustered_row_insert& u : changes.clustered_inserts) {
+        const cell_map* row_state = get_row_state(processor.clustering_row_states(), u.key);
+        if (!row_state) {
+            return false;
+        }
+        if (!entries_match_row_state(base_schema, *row_state, u.atomic_entries, u.nonatomic_entries)) {
+            return false;
+        }
+    }
+
+    for (clustered_row_update& u : changes.clustered_updates) {
+        const cell_map* row_state = get_row_state(processor.clustering_row_states(), u.key);
+        if (!row_state) {
+            return false;
+        }
+        if (!entries_match_row_state(base_schema, *row_state, u.atomic_entries, u.nonatomic_entries)) {
+            return false;
+        }
+    }
+
+    // Skip only if the row being deleted does not exist (i.e. the deletion is a no-op).
+    for (const auto& row_deletion : changes.clustered_row_deletions) {
+        if (processor.clustering_row_states().contains(row_deletion.key)) {
+            return false;
+        }
+    }
+
+    // Don't skip if the item exists.
+    //
+    // Increased DynamoDB Streams compatibility guarantees that single-item
+    // operations will read the item and store it in the clustering row states.
+    // If it is not found there, we may skip CDC. This is safe as long as the
+    // assumptions of this operation's write isolation are not violated.
+    if (changes.partition_deletions && processor.clustering_row_states().contains(clustering_key::make_empty())) {
+        return false;
+    }
+
+    cdc_log.trace("Skipping CDC log for mutation {}", base_mutation);
+    return true;
+}
+
 void process_changes_with_splitting(const mutation& base_mutation, change_processor& processor,
-        bool enable_preimage, bool enable_postimage) {
+        bool enable_preimage, bool enable_postimage, bool alternator_strict_compatibility) {
    const auto base_schema = base_mutation.schema();
    auto changes = extract_changes(base_mutation);
    auto pk = base_mutation.key();
@@ -586,9 +724,6 @@ void process_changes_with_splitting(const mutation& base_mutation, change_proces
    const auto last_timestamp = changes.rbegin()->first;

    for (auto& [change_ts, btch] : changes) {
-        const bool is_last = change_ts == last_timestamp;
-        processor.begin_timestamp(change_ts, is_last);
-
        clustered_column_set affected_clustered_columns_per_row{clustering_key::less_compare(*base_schema)};
        one_kind_column_set affected_static_columns{base_schema->static_columns_count()};

@@ -597,6 +732,12 @@ void process_changes_with_splitting(const mutation& base_mutation, change_proces
            affected_clustered_columns_per_row = btch.get_affected_clustered_columns_per_row(*base_mutation.schema());
        }

+        if (alternator_strict_compatibility && should_skip(btch, base_mutation, processor)) {
+            continue;
+        }
+
+        const bool is_last = change_ts == last_timestamp;
+        processor.begin_timestamp(change_ts, is_last);
        if (enable_preimage) {
            if (affected_static_columns.count() > 0) {
                processor.produce_preimage(nullptr, affected_static_columns);
@@ -684,7 +825,13 @@ void process_changes_with_splitting(const mutation& base_mutation, change_proces
 }

 void process_changes_without_splitting(const mutation& base_mutation, change_processor& processor,
-        bool enable_preimage, bool enable_postimage) {
+        bool enable_preimage, bool enable_postimage, bool alternator_strict_compatibility) {
+    if (alternator_strict_compatibility) {
+        auto changes = extract_changes(base_mutation);
+        if (should_skip(changes.begin()->second, base_mutation, processor)) {
+            return;
+        }
+    }
    auto ts = find_timestamp(base_mutation);
    processor.begin_timestamp(ts, true);

--- a/cdc/split.hh
+++ b/cdc/split.hh
@@ -9,6 +9,7 @@
 #pragma once

 #include <boost/dynamic_bitset.hpp>  // IWYU pragma: keep
+#include "cdc/log.hh"
 #include "replica/database_fwd.hh"
 #include "mutation/timestamp.hh"

@@ -65,12 +66,14 @@ public:
    // Tells processor we have reached end of record - last part
    // of a given timestamp batch
    virtual void end_record() = 0;
+
+    virtual const row_states_map& clustering_row_states() const = 0;
 };

-bool should_split(const mutation& base_mutation);
+bool should_split(const mutation& base_mutation, const per_request_options& options);
 void process_changes_with_splitting(const mutation& base_mutation, change_processor& processor,
-        bool enable_preimage, bool enable_postimage);
+        bool enable_preimage, bool enable_postimage, bool alternator_strict_compatibility);
 void process_changes_without_splitting(const mutation& base_mutation, change_processor& processor,
-        bool enable_preimage, bool enable_postimage);
+        bool enable_preimage, bool enable_postimage, bool alternator_strict_compatibility);

 }
--- a/client_data.hh
+++ b/client_data.hh
@@ -10,7 +10,9 @@
 #include <seastar/net/inet_address.hh>
 #include <seastar/core/sstring.hh>
 #include "seastarx.hh"
+#include "utils/loading_shared_values.hh"

+#include <list>
 #include <optional>

 enum class client_type {
@@ -27,6 +29,20 @@ enum class client_connection_stage {
    ready,
 };

+// We implement a keys cache using a map-like utils::loading_shared_values container by storing empty values.
+struct options_cache_value_type {};
+using client_options_cache_type = utils::loading_shared_values<sstring, options_cache_value_type>;
+using client_options_cache_entry_type = client_options_cache_type::entry_ptr;
+using client_options_cache_key_type = client_options_cache_type::key_type;
+
+// This struct represents a single OPTION key-value pair from the client's connection options.
+// Both key and value are represented by corresponding "references" to their cached values.
+// Each "reference" is effectively a lw_shared_ptr value.
+struct client_option_key_value_cached_entry {
+    client_options_cache_entry_type key;
+    client_options_cache_entry_type value;
+};
+
 sstring to_string(client_connection_stage ct);

 // Representation of a row in `system.clients'. std::optionals are for nullable cells.
@@ -37,8 +53,8 @@ struct client_data {
    client_connection_stage connection_stage = client_connection_stage::established;
    int32_t shard_id;  /// ID of server-side shard which is processing the connection.

-    std::optional<sstring> driver_name;
-    std::optional<sstring> driver_version;
+    std::optional<client_options_cache_entry_type> driver_name;
+    std::optional<client_options_cache_entry_type> driver_version;
    std::optional<sstring> hostname;
    std::optional<int32_t> protocol_version;
    std::optional<sstring> ssl_cipher_suite;
@@ -46,6 +62,7 @@ struct client_data {
    std::optional<sstring> ssl_protocol;
    std::optional<sstring> username;
    std::optional<sstring> scheduling_group_name;
+    std::list<client_option_key_value_cached_entry> client_options;

    sstring stage_str() const { return to_string(connection_stage); }
    sstring client_type_str() const { return to_string(ct); }
--- a/cmake/mode.common.cmake
+++ b/cmake/mode.common.cmake
@@ -125,10 +125,6 @@ if(target_arch)
  add_compile_options("-march=${target_arch}")
 endif()

-if(CMAKE_CXX_COMPILER_ID STREQUAL "Clang")
-  add_compile_options("SHELL:-Xclang -fexperimental-assignment-tracking=disabled")
-endif()
-
 function(maybe_limit_stack_usage_in_KB stack_usage_threshold_in_KB config)
  math(EXPR _stack_usage_threshold_in_bytes "${stack_usage_threshold_in_KB} * 1024")
  set(_stack_usage_threshold_flag "-Wstack-usage=${_stack_usage_threshold_in_bytes}")
--- a/compaction/CMakeLists.txt
+++ b/compaction/CMakeLists.txt
@@ -21,5 +21,8 @@ target_link_libraries(compaction
    mutation_writer
    replica)

+if (Scylla_USE_PRECOMPILED_HEADER_USE)
+  target_precompile_headers(compaction REUSE_FROM scylla-precompiled-header)
+endif()
 check_headers(check-headers compaction
  GLOB_RECURSE ${CMAKE_CURRENT_SOURCE_DIR}/*.hh)
--- a/compaction/compaction_group_view.hh
+++ b/compaction/compaction_group_view.hh
@@ -12,6 +12,7 @@
 #include <seastar/core/condition-variable.hh>

 #include "schema/schema_fwd.hh"
+#include "sstables/open_info.hh"
 #include "compaction_descriptor.hh"

 class reader_permit;
@@ -44,7 +45,7 @@ public:
    virtual compaction_strategy_state& get_compaction_strategy_state() noexcept = 0;
    virtual reader_permit make_compaction_reader_permit() const = 0;
    virtual sstables::sstables_manager& get_sstables_manager() noexcept = 0;
-    virtual sstables::shared_sstable make_sstable() const = 0;
+    virtual sstables::shared_sstable make_sstable(sstables::sstable_state) const = 0;
    virtual sstables::sstable_writer_config configure_writer(sstring origin) const = 0;
    virtual api::timestamp_type min_memtable_timestamp() const = 0;
    virtual api::timestamp_type min_memtable_live_timestamp() const = 0;
--- a/compaction/compaction_manager.cc
+++ b/compaction/compaction_manager.cc
@@ -416,7 +416,9 @@ future<compaction_result> compaction_task_executor::compact_sstables(compaction_
        descriptor.enable_garbage_collection(co_await sstable_set_for_tombstone_gc(t));
    }
    descriptor.creator = [&t] (shard_id) {
-        return t.make_sstable();
+        // All compaction types going through this path will work on normal input sstables only.
+        // Off-strategy, for example, waits until the sstables move out of staging state.
+        return t.make_sstable(sstables::sstable_state::normal);
    };
    descriptor.replacer = [this, &t, &on_replace, offstrategy] (compaction_completion_desc desc) {
        t.get_compaction_strategy().notify_completion(t, desc.old_sstables, desc.new_sstables);
@@ -867,8 +869,8 @@ auto fmt::formatter<compaction::compaction_task_executor>::format(const compacti

 namespace compaction {

-inline compaction_controller make_compaction_controller(const compaction_manager::scheduling_group& csg, uint64_t static_shares, std::function<double()> fn) {
-    return compaction_controller(csg, static_shares, 250ms, std::move(fn));
+inline compaction_controller make_compaction_controller(const compaction_manager::scheduling_group& csg, uint64_t static_shares, std::optional<float> max_shares, std::function<double()> fn) {
+    return compaction_controller(csg, static_shares, max_shares, 250ms, std::move(fn));
 }

 compaction::compaction_state::~compaction_state() {
@@ -1014,7 +1016,7 @@ compaction_manager::compaction_manager(config cfg, abort_source& as, tasks::task
    , _sys_ks("compaction_manager::system_keyspace")
    , _cfg(std::move(cfg))
    , _compaction_submission_timer(compaction_sg(), compaction_submission_callback())
-    , _compaction_controller(make_compaction_controller(compaction_sg(), static_shares(), [this] () -> float {
+    , _compaction_controller(make_compaction_controller(compaction_sg(), static_shares(), _cfg.max_shares.get(), [this] () -> float {
        _last_backlog = backlog();
        auto b = _last_backlog / available_memory();
        // This means we are using an unimplemented strategy
@@ -1033,6 +1035,10 @@ compaction_manager::compaction_manager(config cfg, abort_source& as, tasks::task
    , _throughput_updater(serialized_action([this] { return update_throughput(throughput_mbs()); }))
    , _update_compaction_static_shares_action([this] { return update_static_shares(static_shares()); })
    , _compaction_static_shares_observer(_cfg.static_shares.observe(_update_compaction_static_shares_action.make_observer()))
+    , _compaction_max_shares_observer(_cfg.max_shares.observe([this] (const float& max_shares) {
+        cmlog.info("Updating max shares to {}", max_shares);
+        _compaction_controller.set_max_shares(max_shares);
+    }))
    , _strategy_control(std::make_unique<strategy_control>(*this))
    , _tombstone_gc_state(_shared_tombstone_gc_state) {
    tm.register_module(_task_manager_module->get_name(), _task_manager_module);
@@ -1051,11 +1057,12 @@ compaction_manager::compaction_manager(tasks::task_manager& tm)
    , _sys_ks("compaction_manager::system_keyspace")
    , _cfg(config{ .available_memory = 1 })
    , _compaction_submission_timer(compaction_sg(), compaction_submission_callback())
-    , _compaction_controller(make_compaction_controller(compaction_sg(), 1, [] () -> float { return 1.0; }))
+    , _compaction_controller(make_compaction_controller(compaction_sg(), 1, std::nullopt, [] () -> float { return 1.0; }))
    , _backlog_manager(_compaction_controller)
    , _throughput_updater(serialized_action([this] { return update_throughput(throughput_mbs()); }))
    , _update_compaction_static_shares_action([] { return make_ready_future<>(); })
    , _compaction_static_shares_observer(_cfg.static_shares.observe(_update_compaction_static_shares_action.make_observer()))
+    , _compaction_max_shares_observer(_cfg.max_shares.observe([] (const float& max_shares) {}))
    , _strategy_control(std::make_unique<strategy_control>(*this))
    , _tombstone_gc_state(_shared_tombstone_gc_state) {
    tm.register_module(_task_manager_module->get_name(), _task_manager_module);
@@ -1842,6 +1849,10 @@ protected:
                throw make_compaction_stopped_exception();
            }
        }, false);
+        if (utils::get_local_injector().is_enabled("split_sstable_force_stop_exception")) {
+            throw make_compaction_stopped_exception();
+        }
+
        co_return co_await do_rewrite_sstable(std::move(sst));
    }
 };
@@ -2279,12 +2290,16 @@ future<compaction_manager::compaction_stats_opt> compaction_manager::perform_spl
 }

 future<std::vector<sstables::shared_sstable>>
-compaction_manager::maybe_split_sstable(sstables::shared_sstable sst, compaction_group_view& t, compaction_type_options::split opt) {
+compaction_manager::maybe_split_new_sstable(sstables::shared_sstable sst, compaction_group_view& t, compaction_type_options::split opt) {
    if (!split_compaction_task_executor::sstable_needs_split(sst, opt)) {
        co_return std::vector<sstables::shared_sstable>{sst};
    }
-    if (!can_proceed(&t)) {
-        co_return std::vector<sstables::shared_sstable>{sst};
+    // Throw an error if split cannot be performed due to e.g. out of space prevention.
+    // We don't want to prevent split because compaction is temporarily disabled on a view only for synchronization,
+    // which is unneeded against new sstables that aren't part of any set yet, so never use can_proceed(&t) here.
+    if (is_disabled()) {
+        co_return coroutine::exception(std::make_exception_ptr(std::runtime_error(format("Cannot split {} because manager has compaction disabled, " \
+                                                                                         "reason might be out of space prevention", sst->get_filename()))));
    }
    std::vector<sstables::shared_sstable> ret;

@@ -2292,8 +2307,11 @@ compaction_manager::maybe_split_sstable(sstables::shared_sstable sst, compaction
    compaction_progress_monitor monitor;
    compaction_data info = create_compaction_data();
    compaction_descriptor desc = split_compaction_task_executor::make_descriptor(sst, opt);
-    desc.creator = [&t] (shard_id _) {
-        return t.make_sstable();
+    desc.creator = [&t, sst] (shard_id _) {
+        // NOTE: preserves the sstable state, since we want the output to be on the same state as the original.
+        // For example, if base table has views, it's important that sstable produced by repair will be
+        // in the staging state.
+        return t.make_sstable(sst->state());
    };
    desc.replacer = [&] (compaction_completion_desc d) {
        std::move(d.new_sstables.begin(), d.new_sstables.end(), std::back_inserter(ret));
--- a/compaction/compaction_manager.hh
+++ b/compaction/compaction_manager.hh
@@ -80,6 +80,7 @@ public:
        scheduling_group maintenance_sched_group;
        size_t available_memory = 0;
        utils::updateable_value<float> static_shares = utils::updateable_value<float>(0);
+        utils::updateable_value<float> max_shares = utils::updateable_value<float>(0);
        utils::updateable_value<uint32_t> throughput_mb_per_sec = utils::updateable_value<uint32_t>(0);
        std::chrono::seconds flush_all_tables_before_major = std::chrono::duration_cast<std::chrono::seconds>(std::chrono::days(1));
    };
@@ -159,6 +160,7 @@ private:
    std::optional<utils::observer<uint32_t>> _throughput_option_observer;
    serialized_action _update_compaction_static_shares_action;
    utils::observer<float> _compaction_static_shares_observer;
+    utils::observer<float> _compaction_max_shares_observer;
    uint64_t _validation_errors = 0;

    class strategy_control;
@@ -291,6 +293,10 @@ public:
        return _cfg.static_shares.get();
    }

+    float max_shares() const noexcept {
+        return _cfg.max_shares.get();
+    }
+
    uint32_t throughput_mbs() const noexcept {
        return _cfg.throughput_mb_per_sec.get();
    }
@@ -370,7 +376,8 @@ public:
    // Splits a single SSTable by segregating all its data according to the classifier.
    // If SSTable doesn't need split, the same input SSTable is returned as output.
    // If SSTable needs split, then output SSTables are returned and the input SSTable is deleted.
-    future<std::vector<sstables::shared_sstable>> maybe_split_sstable(sstables::shared_sstable sst, compaction_group_view& t, compaction_type_options::split opt);
+    // Exception is thrown if the input sstable cannot be split due to e.g. out of space prevention.
+    future<std::vector<sstables::shared_sstable>> maybe_split_new_sstable(sstables::shared_sstable sst, compaction_group_view& t, compaction_type_options::split opt);

    // Run a custom job for a given table, defined by a function
    // it completes when future returned by job is ready or returns immediately