Commit Graph

5011 Commits

Author SHA1 Message Date
Nadav Har'El
75a05fc2b3 Merge 'cql3: fix stack overflow and quadratic behavior' from Avi Kivity
This series fixes two vulnerabilities:

- unbounded recursion during expression evaluation with deeply nested expressions
- quadratic computation with large WHERE clauses

The fixes simply bound the depth of recursion and the length of the WHERE clause.

The WHERE clause limits are configurable. The nesting limit is less likely to be exceeded, so it is not configurable.

Limits inspired by Common Expression Language:

https://github.com/google/cel-spec/blob/master/doc/langdef.md#syntax

Implementations are required to support at least:

- 24-32 repetitions of repeating rules
- 12 repetitions of recursive rules
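
As a rough illustration of bounding recursion depth, here is a minimal Python sketch over a toy expression tree. It is not ScyllaDB's CQL parser: the tuple-based tree, `check_depth`, `NestingTooDeep`, and the constant `MAX_NESTING_DEPTH` (set to the CEL figure quoted above) are all hypothetical.

```python
# Illustrative only: a toy expression tree, not ScyllaDB's CQL parser.
MAX_NESTING_DEPTH = 12  # hypothetical constant, taken from the CEL figure quoted above


class NestingTooDeep(Exception):
    """Raised when an expression is nested deeper than the limit."""


def check_depth(expr, limit=MAX_NESTING_DEPTH):
    """Return the nesting depth of expr, raising once it exceeds limit.

    expr is either a leaf (any non-tuple value) or a tuple of the form
    (function_name, arg1, arg2, ...). An explicit stack is used so the
    check itself cannot overflow the call stack.
    """
    deepest = 0
    stack = [(expr, 1)]
    while stack:
        node, depth = stack.pop()
        if depth > limit:
            raise NestingTooDeep(f"nesting depth exceeds {limit}")
        deepest = max(deepest, depth)
        if isinstance(node, tuple):
            # Skip element 0 (the function name) and descend into the arguments.
            for arg in node[1:]:
                stack.append((arg, depth + 1))
    return deepest
```

The point of the sketch is that the depth is checked before any recursive evaluation is attempted, so a hostile expression is rejected at a fixed, small cost.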

CVE-2026-31948
CVE-2026-31947

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1003
Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1002
Fixes https://github.com/scylladb/scylladb/issues/14472

Closes scylladb/scylladb-ghsa-m4h7-g37h-mgxf#3

* github.com:scylladb/scylladb-ghsa-m4h7-g37h-mgxf:
  cql3: limit number of relations in WHERE clause
  cql3: add max_relations_in_where_clause to dialect
  test/cqlpy: add tests for WHERE clause relation count limit
  cql3: limit nesting depth of function calls and CASTs in CQL parser
  test/cqlpy: add tests for deeply nested function calls and CASTs
2026-06-01 22:31:56 +03:00
Avi Kivity
fdcc44c425 cql3: add max_relations_in_where_clause to dialect
Add a configurable max_relations_in_where_clause parameter (default 100)
to the CQL dialect, plumbed through db::config, transport server, and
test environment. This will be used by the CQL parser to reject WHERE
clauses with too many relations that cause quadratic complexity.
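
A minimal Python sketch of the kind of check this parameter enables, assuming the WHERE clause is modelled as a plain list of relations. The names `check_where_clause` and `TooManyRelations` are hypothetical, and whether the real limit is inclusive is an assumption.

```python
DEFAULT_MAX_RELATIONS = 100  # default stated in the commit message


class TooManyRelations(Exception):
    """Raised when a WHERE clause has more relations than allowed."""


def check_where_clause(relations, max_relations=DEFAULT_MAX_RELATIONS):
    """Reject a WHERE clause (modelled as a list of relations) that is too long.

    Assumption: exactly max_relations relations are still accepted.
    """
    if len(relations) > max_relations:
        raise TooManyRelations(
            f"WHERE clause has {len(relations)} relations, limit is {max_relations}"
        )
    return relations
```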
2026-06-01 14:01:27 +03:00
Botond Dénes
bb81dbf65e Merge 'guardrails: Add replica-side large data guardrails' from Taras Veretilnyk
Adds write-path guardrails that reject or warn on mutations targeting partitions, rows, or collections that already exceed configured size thresholds, based on SSTable `large_data_record` metadata.
ScyllaDB already detects and records large partitions/rows/cells in `system.large_data_records` after compaction, but takes no preventive action on the write path. Once a partition grows past operational limits it causes latency spikes, OOM, and repair failures. These guardrails let operators set hard and soft thresholds so that writes to already-oversized data are rejected (hard) or logged as warnings (soft) before they make the problem worse.
- **Intrusive index over SSTable metadata**: A per-table `large_data_record_index` maintains three `boost::intrusive::multiset`s (partitions, rows, cells) using `auto_unlink` hooks directly on `large_data_record`. SSTable destruction automatically removes records from the index — no explicit deregistration needed.
- **Virtual dispatch for zero-cost disabled path**: `large_data_guardrail_base` → `noop_large_data_guardrail` / `large_data_guardrail`. Tables without guardrails enabled pay only a virtual call to a no-op. No index is built or maintained for disabled tables.
- **Schema storage**: The per-table flag is stored as a scylla_tables column, following the tablets pattern: only write a live cell when enabled, omit entirely when disabled. The CQL feature gate prevents enabling until all nodes are upgraded.
- **Write-path integration**: The guardrail check runs in `do_apply` after the frozen mutation is deserialized but before it is applied to the memtable. Hint replay and Paxos learn skip the check via `skip_large_data_guardrails`.
Uses existing `large_*_warn_threshold` config options as soft limits and new `large_*_fail_threshold` options as hard limits. Checked dimensions:
- Partition size (bytes)
- Partition row count
- Row size (bytes)
- Collection element count
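
The soft/hard threshold decision for a single dimension can be sketched as follows. This is a minimal Python illustration, not the C++ `large_data_guardrail`: `Verdict` and `check_dimension` are hypothetical names, and both the strict `>` comparison and the use of `None` for a disabled threshold are assumptions.

```python
from enum import Enum


class Verdict(Enum):
    OK = "ok"
    WARN = "warn"      # soft limit exceeded: log a warning, accept the write
    REJECT = "reject"  # hard limit exceeded: refuse the write


def check_dimension(recorded, warn_threshold, fail_threshold):
    """Classify one recorded size or count against its soft and hard limits.

    recorded is the value already on record for the target partition, row,
    or collection. A threshold of None is treated as disabled (assumption).
    """
    if fail_threshold is not None and recorded > fail_threshold:
        return Verdict.REJECT
    if warn_threshold is not None and recorded > warn_threshold:
        return Verdict.WARN
    return Verdict.OK
```

The hard limit is checked first so that a value above both thresholds is rejected rather than merely logged.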

Backport is not required

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-180

Closes scylladb/scylladb#29733

* github.com:scylladb/scylladb:
  test/cqlpy: add per-table toggle, LWT exemption, and multi-category tests
  test/cqlpy: add large collection guardrail tests
  test/cqlpy: add large row guardrail tests
  test/cqlpy: add large partition guardrail tests
  test/boost: add large_data_guardrail unit tests
  test/cluster: add large data guardrails rolling upgrade test
  replica: wire large_data_guardrail into the write path
  schema: add per-table large_data_guardrails_enabled flag
  db: implement large_data_guardrail
  db: implement large_data_record_index
  sstables: add intrusive index hook to large_data_record
  db: add large_collection_elements_fail_threshold config option
  db: add large_row_fail_threshold_mb config option
  db: add rows_count_fail_threshold config option
  db: add large_partition_fail_threshold_mb config option
  replica: introduce large_data_exception
2026-06-01 13:26:00 +03:00
Pavel Emelyanov
8b2ff16cae schema: Move grace_period from schema_ctxt to schema_registry
The schema_registry_grace_period field on schema_ctxt was only used by
schema_registry itself for eviction timing. Move it to be a direct member
of schema_registry, passed at init() time. This removes one db::config
dependency from schema_ctxt.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>

Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>

Closes scylladb/scylladb#30038
2026-05-29 13:42:23 +03:00
Taras Veretilnyk
23881db289 replica: wire large_data_guardrail into the write path
Thread the per-table large_data_guardrail through the write path so
that mutations exceeding configured thresholds are rejected before
being applied to the memtable.

The guardrail is selected in database::do_apply — either the table's
own guardrail or a static noop when skip_large_data_guardrails is set.
It flows through apply_in_memory → table::apply →
memtable::apply, where the check runs after partition_builder
deserializes the frozen mutation. For large mutations (>128KB), the
check runs after unfreeze_gently instead.
2026-05-29 12:18:33 +02:00
Taras Veretilnyk
5a0974e781 schema: add per-table large_data_guardrails_enabled flag
Add a per-table large_data_guardrails_enabled flag controlled via the CQL
table property WITH large_data_guardrails_enabled = true|false.

Store the flag as a boolean column in system_schema_ext.scylla_tables.
Only write a live cell when enabled; when disabled (the default), omit
the cell entirely so that old nodes that don't know this column can
still read the SSTable during rolling upgrade or rollback.  When the
property transitions from true to false via ALTER TABLE, a tombstone is
written in make_update_table_mutations to override the previous live
cell — this is safe because the CQL feature gate ensures all nodes are
upgraded before the property can be set to true.

Gate the CQL property behind the LARGE_DATA_GUARDRAILS cluster feature:
attempting to set large_data_guardrails_enabled = true before all nodes
advertise the feature raises a ConfigurationException.
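
The cell-writing rule described above can be summarized in a small Python sketch. The function name `scylla_tables_cell` and its tuple return values are hypothetical; only the three outcomes (live cell, tombstone, no cell) come from the description.

```python
def scylla_tables_cell(old_enabled, new_enabled):
    """Return the cell to write for the flag, or None to omit it entirely.

    - enabled: write a live cell
    - true -> false transition: write a tombstone to override the live cell
    - disabled and never enabled: write nothing, so old nodes never see the column
    """
    if new_enabled:
        return ("live", True)
    if old_enabled:
        return ("tombstone", None)
    return None
```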
2026-05-29 12:18:33 +02:00
Taras Veretilnyk
f7ffc64703 db: implement large_data_guardrail
Checks partition size, row count, row size, and collection element count
against config thresholds using large_data_record_index lookups.
Warns on soft limit, throws large_data_exception on hard limit.
2026-05-28 18:29:32 +02:00
Taras Veretilnyk
ae8879d2f6 db: implement large_data_record_index
Per-table index over large_data_records from all live SSTables.
Uses three intrusive multisets (partitions, rows, cells) with member
hooks directly on large_data_record.  Auto-unlink handles cleanup
when SSTables are destroyed.  Aggregation (max across SSTables for
the same key) happens at lookup time via equal_range.
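
To illustrate lookup-time aggregation, here is a minimal Python sketch that uses a sorted list and `bisect` as a stand-in for an intrusive multiset and `equal_range`. The class `LargeDataIndex` is hypothetical, and auto-unlink on SSTable destruction is not modelled.

```python
import bisect


class LargeDataIndex:
    """Toy multiset of (key, size) records, possibly several per key."""

    def __init__(self):
        self._keys = []   # kept sorted; duplicates allowed
        self._sizes = []  # parallel to _keys

    def add(self, key, size):
        """Insert one record, e.g. the entry contributed by one SSTable."""
        i = bisect.bisect_right(self._keys, key)
        self._keys.insert(i, key)
        self._sizes.insert(i, size)

    def max_size(self, key):
        """Aggregate at lookup time: max over all records with this key."""
        lo = bisect.bisect_left(self._keys, key)   # start of the equal range
        hi = bisect.bisect_right(self._keys, key)  # end of the equal range
        if lo == hi:
            return None  # no SSTable recorded this key as large
        return max(self._sizes[lo:hi])
```

Keeping one record per SSTable and aggregating on lookup means nothing has to be recomputed when an SSTable, and with it its records, goes away.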
2026-05-28 12:59:22 +02:00
Piotr Dulikowski
f44d57c7c7 Merge 'Deprecate HOST_ID_BASED_HINTED_HANDOFF feature and drop migration code' from Gleb Natapov
The feature was included in 2024.2 and is present on all supported versions. No upgrade to HEAD is possible from a version that does not have it. This means the feature can be moved to the deprecated features list and all the migration code can be dropped.

No need to backport since this is code removal.

Closes scylladb/scylladb#30087

* github.com:scylladb/scylladb:
  hints: remove hint_directory_manager and IP-based hint directory infrastructure
  hints: remove migration infrastructure
  hints: deprecate HOST_ID_BASED_HINTED_HANDOFF feature
2026-05-28 10:09:02 +02:00
Nadav Har'El
21ecc12fc6 Merge 'index: fix local vector index locality detection after schema reload' from Michał Hudobski
After schema reload, `target_parser::is_local()` did not recognize the
vector-index local target format `{"pk": [...], "tc": "..."}`, causing
local vector indexes to be treated as global. This broke duplicate
detection when both a global and a local vector index existed on the same
column. Fix by introducing `vector_index::is_local()` and dispatching
to it from `create_index_from_index_row()` based on the index class.
Also adds tests for local/global vector index coexistence.
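
A minimal Python sketch of the dispatch, assuming targets are stored as strings and that a global target is a plain column name rather than JSON. The function names and the literal class name `"vector_index"` used for dispatch are illustrative, not the C++ signatures.

```python
import json


def _parse_target(target):
    """Return the target as a dict if it is a JSON object, else None."""
    try:
        parsed = json.loads(target)
    except json.JSONDecodeError:
        return None
    return parsed if isinstance(parsed, dict) else None


def secondary_index_is_local(target):
    """Local secondary index targets look like {"pk": [...], "ck": [...]}."""
    parsed = _parse_target(target)
    return parsed is not None and "pk" in parsed and "ck" in parsed


def vector_index_is_local(target):
    """Local vector index targets look like {"pk": [...], "tc": "..."}."""
    parsed = _parse_target(target)
    return parsed is not None and "pk" in parsed and "tc" in parsed


def is_local(index_class, target):
    """Dispatch on the index class instead of teaching one parser both formats."""
    if index_class == "vector_index":
        return vector_index_is_local(target)
    return secondary_index_is_local(target)
```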

Fixes: SCYLLADB-987

backport reasoning: we added local vector index support in 2026.1

Closes scylladb/scylladb#29492

* github.com:scylladb/scylladb:
  test/cqlpy: add tests for global and local vector index coexistence
  index: fix local vector index locality detection after schema reload
2026-05-27 15:34:57 +03:00
Gleb Natapov
54a423986e hints: remove hint_directory_manager and IP-based hint directory infrastructure
Now that HOST_ID_BASED_HINTED_HANDOFF is always enabled, remove the
hint_directory_manager class and all code paths that dealt with
IP-named hint directories and IP-to-host-ID mappings.

- Remove hint_directory_manager class from hint_storage.hh/.cc
- Simplify drain_for to take only host_id (no IP parameter)
- Simplify initialize_endpoint_managers to only scan host-ID directories
- Simplify with_file_update_mutex_for to take host_id directly
- Simplify resource_manager's space_watchdog to use host_id only
- Make storage_proxy::on_leave_cluster empty (draining via on_released)
- Remove uses_host_id() checks from storage_proxy::on_released
2026-05-27 11:13:28 +03:00
Botond Dénes
555cfbcd38 Merge 'treewide: replace deprecated smp::count and smp::all_cpus() with new APIs' from Avi Kivity
Replace all uses of the deprecated seastar::smp::count with this_smp_shard_count() and smp::all_cpus() with this_smp_all_shards() across the ScyllaDB codebase (seastar submodule untouched).

Both replacement functions require a reactor thread context. All call sites were verified to run on reactor threads.

Notable cases:
- dht/token-sharding.hh: this_smp_shard_count() is used as a default parameter value. This is safe since all callers are on reactor threads, but the expression is now evaluated at each call site rather than being a reference to a global variable.
- service/storage_service.hh, locator/abstract_replication_strategy.hh, ent/encryption/encryption.cc: used in default member initializers and constructor member-init-lists. Objects are always constructed on reactor threads.
- schema_builder: sometimes called from BOOST_AUTO_TEST_CASE without a reactor. Added a pre-patch that makes the implicit shard count parameter explicit and passes 1 in those cases.

Not changed:
- scylla-gdb.py: reads smp::count as a GDB symbol (no reactor context).
- Python test files: only reference smp::count in comments/strings.

No backport: the Seastar commit that deprecated these functions hasn't made (and won't make) its way into any release branches (and the warnings are cosmetic anyway).

Closes scylladb/scylladb#29990

* github.com:scylladb/scylladb:
  treewide: replace deprecated smp::count and smp::all_cpus() with new APIs
  scylla-gdb: read shard count from smp::_this_smp instead of smp::count
  schema_builder: make shard_count an explicit constructor parameter
2026-05-27 09:42:06 +03:00
Avi Kivity
8010e408a2 treewide: replace deprecated smp::count and smp::all_cpus() with new APIs
Replace all uses of the deprecated seastar::smp::count with
this_smp_shard_count() and smp::all_cpus() with this_smp_all_shards()
across the ScyllaDB codebase (seastar submodule untouched).

Both replacement functions require a reactor thread context. All call
sites were verified to run on reactor threads.

Notable cases:
- dht/token-sharding.hh: this_smp_shard_count() is used as a default
  parameter value. This is safe since all callers are on reactor threads,
  but the expression is now evaluated at each call site rather than being
  a reference to a global variable.
- service/storage_service.hh, locator/abstract_replication_strategy.hh,
  ent/encryption/encryption.cc: used in default member initializers and
  constructor member-init-lists. Objects are always constructed on reactor
  threads.

Not changed:
- scylla-gdb.py: reads smp::count as a GDB symbol (no reactor context).
- Python test files: only reference smp::count in comments/strings.
2026-05-26 17:35:20 +03:00
Wojciech Mitros
ae0d77257f mv: fix view_update_builder losing fragments across batch boundaries
When a mutation generates more view updates than max_rows_for_view_updates
(100), view_update_builder::build_some() splits the work into multiple
batches. There was a bug in how fragments were read between batches:

When should_stop_updates() returned true, the old code called stop()
which returned stop_iteration::yes without reading the next fragments.
On the next build_some() call, read_both_next_fragments() was called
at the start, which advanced BOTH readers - skipping any fragment that
was already read but not yet consumed. A row could be not consumed if
either:
- the 100th (last in the batch) update was a row insertion and we still
  had insertions/updates remaining
- the 100th (last in the batch) update was a row deletion and we still
  had deletions/updates remaining
For the most common case where work is split in batches, i.e. range
deletions, we couldn't hit this because range delete generates only
view row deletions.
On tables with a single materialized view, we also couldn't get this
for any batches with less than 50 statements (unless the batch also
contained range deletions), because one non-range-delete update can
generate up to 2 view updates.
However, for a range of scenarios outside these two, we could lose
view updates, resulting in persistent inconsistencies.

The fix:
- read_*_next_fragment() now accept a stop_iteration parameter, so the
  next fragments are always read after consuming (even when stopping),
  but stop_iteration::yes is correctly propagated to break the loop.
- build_some() no longer re-reads fragments at the start. Instead, an
  initialize() method performs the initial read once at construction.
- because now we only advance readers after consuming, we won't advance
  readers after end_of_partition, so we extend the break condition to
  accept either readers evaluating to `false` or them being at the
  end_of_partition. We also handle the optimization with
  _skip_row_updates
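
The failure mode can be reproduced with a toy model. The Python sketch below merges two sorted streams in batches. It is a simplification, not the real view_update_builder: the class `MergingBuilder`, the `buggy` switch, and the integer "fragments" are hypothetical, and end_of_partition and `_skip_row_updates` handling are not modelled.

```python
class MergingBuilder:
    """Merges two sorted fragment streams in batches of at most batch_size."""

    def __init__(self, updates, existings, batch_size, buggy):
        self._readers = [iter(updates), iter(existings)]
        self._current = [None, None]  # one lookahead fragment per reader
        self._batch_size = batch_size
        self._buggy = buggy
        if not buggy:
            # Fixed variant: the initial read happens exactly once, here.
            self._advance(0)
            self._advance(1)

    def _advance(self, i):
        self._current[i] = next(self._readers[i], None)

    def build_some(self):
        if self._buggy:
            # Old behaviour: advance BOTH readers at the start of every call,
            # discarding a fragment that was read but not yet consumed.
            self._advance(0)
            self._advance(1)
        batch = []
        while any(f is not None for f in self._current):
            a, b = self._current
            i = 0 if b is None or (a is not None and a <= b) else 1
            batch.append(self._current[i])
            stop = len(batch) >= self._batch_size
            if self._buggy and stop:
                return batch  # stops without reading the next fragment
            self._advance(i)  # fixed: always read next after consuming
            if stop:
                return batch
        return batch


def drain(builder):
    """Call build_some() until it returns an empty batch."""
    out = []
    while True:
        batch = builder.build_some()
        if not batch:
            return out
        out.extend(batch)
```

With `updates=[1, 3, 5]`, `existings=[2, 4, 6]` and a batch size of 2, the fixed variant yields all six fragments, while the buggy variant drops the fragment that was pending in the other reader at each batch boundary.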

Fixes: scylladb/scylladb#29155

Closes scylladb/scylladb#29498
2026-05-26 14:15:12 +02:00
Avi Kivity
f165b396fd schema_builder: make shard_count an explicit constructor parameter
A recent Seastar update deprecated smp::count and introduced
this_smp_shard_count() as a replacement. One difference is that
this_smp_shard_count() wants to run on a reactor thread.

This poses a problem for non-reactor tests (BOOST_AUTO_TEST_CASE)
that nevertheless use a schema, as the schema_builder constructor
references smp::count. If we replace it with this_smp_shard_count()
then it will crash when running without a reactor.

To fix, remove the implicit this_smp_shard_count() call from raw_schema's
constructor and require callers to pass shard_count explicitly to
schema_builder. This allows tests that don't run on a reactor thread
to construct schemas without crashing.

Production code and reactor-based tests pass this_smp_shard_count().
Non-reactor test files (expr_test, keys_test, nonwrapping_interval_test,
wrapping_interval_test, bti_key_translation_test, range_tombstone_list_test)
pass a fixed shard count of 1.

Note: sstable_test.cc is a Seastar test file (SEASTAR_THREAD_TEST_CASE)
but also contains one plain BOOST_AUTO_TEST_CASE
(test_empty_key_view_comparison) that constructs a schema_builder without
a reactor context. This test also receives a fixed shard count of 1.
2026-05-26 11:55:56 +03:00
Gleb Natapov
d48b8fd1f0 hints: remove migration infrastructure
Remove migrate_ip_directories(), perform_migration(), and all
associated state: _migration_callback, _migrating_done,
_migration_mutex, state::migrating.

Make _uses_host_id a static constexpr true — the dead IP-based
branches still compile but will be removed in the next commit.
2026-05-26 11:44:57 +03:00
Gleb Natapov
10d37494ca hints: deprecate HOST_ID_BASED_HINTED_HANDOFF feature
The host_id_based_hinted_handoff feature is now guaranteed to be
enabled on all supported upgrade paths. Move it to the deprecated
features list (still advertised via gossip for compatibility) and
remove the feature checks from the hint manager startup.
2026-05-26 11:44:57 +03:00
Botond Dénes
597d4252dc types: abstract_type::from_string() switch to fragmented buffers (interface)
Change input: std::string_view -> utils::chunked_string_view.
Change return value: bytes -> managed_bytes.

This patch only changes the interface, with some to_bytes() sprinkled in
the internals to deal with recursive calls.
Internals will be updated in the next patch, to keep the churn of
updating callers separate from the actually important changes.
2026-05-26 09:08:06 +03:00
Nadav Har'El
96dd3121e7 Merge 'cql: rewrite CassIO SAI metadata index to regular secondary index' from Szymon Wasik
CassIO (the library backing LangChain's `langchain_community.vectorstores.Cassandra` integration) issues the following DDL during schema setup to create a metadata index:

```sql
CREATE CUSTOM INDEX IF NOT EXISTS eidx_metadata_s_<table>
ON <keyspace>.<table> (ENTRIES(metadata_s))
USING 'org.apache.cassandra.index.sai.StorageAttachedIndex';
```

ScyllaDB does not support Cassandra's StorageAttachedIndex (SAI) for non-vector columns and previously rejected this statement with:

```
StorageAttachedIndex (SAI) is only supported on vector columns; use a secondary index for non-vector columns
```

This blocks seamless migration of existing LangChain/CassIO applications from Cassandra to ScyllaDB — applications fail during initialization before any application-level workaround can run, even when metadata filtering is not used (`metadata_indexing="none"`).

CassIO is no longer actively maintained but remains the only official LangChain integration path for Apache Cassandra over CQL, meaning existing applications will continue using this setup pattern.

Instead of rejecting the CassIO metadata-map SAI DDL, detect the pattern and rewrite it to a standard ScyllaDB secondary index on collection entries:

- **Detection**: SAI class name + single `ENTRIES` target on a non-frozen `map` column
- **Rewrite**: Clear the custom class so the index is created through the standard secondary index path (which already fully supports indexing map entries)
- **Warning**: Emit a CQL warning informing the user that SAI is not supported by ScyllaDB, a regular secondary index was created instead, and metadata filtering behavior may differ from Cassandra SAI

The rewrite is placed early in `validate_while_executing()`, before the rf-rack-validity check, so the standard secondary index code path handles all subsequent validation naturally — no code duplication.
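
The detection rule above can be sketched in Python as follows. This is an illustration under stated assumptions, not the code in `validate_while_executing()`: the function names, the `(kind, column)` target tuples, and the case-insensitive class-name comparison are hypothetical, and gating by a configuration flag is not modelled.

```python
SAI_CLASS_NAMES = {
    "org.apache.cassandra.index.sai.storageattachedindex",
    "sai",  # short alias
}


def is_cassio_metadata_index(custom_class, targets, column_kind, column_frozen):
    """SAI class + a single ENTRIES target on a non-frozen map column."""
    if custom_class is None or custom_class.lower() not in SAI_CLASS_NAMES:
        return False
    if len(targets) != 1:
        return False
    target_kind, _column = targets[0]
    return target_kind == "ENTRIES" and column_kind == "map" and not column_frozen


def plan_create_index(custom_class, targets, column_kind, column_frozen):
    """Return (effective_custom_class, warnings) for a CREATE CUSTOM INDEX."""
    if is_cassio_metadata_index(custom_class, targets, column_kind, column_frozen):
        # Clearing the class routes creation through the regular index path.
        warning = "SAI is not supported; a regular secondary index was created instead"
        return None, [warning]
    return custom_class, []
```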

After this change, the CassIO schema setup succeeds on ScyllaDB:
- `CREATE CUSTOM INDEX ... USING 'sai'` on `ENTRIES(metadata_s)` creates a real secondary index
- The index is functional and can accelerate metadata filtering queries
- A CQL warning makes the rewrite transparent to operators
- SAI on non-vector, non-map-entries columns is still rejected as before
- Vector SAI indexes continue to be rewritten to `vector_index` as before

- `test_sai_entries_on_map_creates_regular_index` — verifies the index is created and the warning is emitted (fully-qualified SAI class name)
- `test_sai_entries_on_map_short_name` — same with the `'sai'` short alias
- `test_sai_on_regular_column_rejected` — confirms SAI on regular scalar columns is still rejected

All 148 tests in `test_vector_index.py` and `test_secondary_index.py` pass with no regressions (125 passed, 22 xfailed, 1 skipped).

Fixes: SCYLLADB-2113
Backport: 2026.2 as this is the version where the support for SAI class needed by LangChain was added.

Closes scylladb/scylladb#29981

* github.com:scylladb/scylladb:
  cql: rewrite CassIO SAI metadata index to regular secondary index
  db/config: add enable_cassio_compatibility flag
2026-05-26 00:19:03 +03:00
Avi Kivity
305346a3ec Merge 'Don't materialize collections into intermediate representations' from Botond Dénes
Collections have an age-old problem in ScyllaDB: they had to be unserialized into an intermediate representation for any access or manipulation. The intermediate representation needs effort to produce and also requires additional memory to store. Both can be significant for large collections. This intermediate representation is then either discarded immediately after use, or re-serialized again.
This problem was significant enough for us to consider the use of collections somewhat of an anti-pattern. But our customers keep using them. Alternator is also a heavy user of collections.

This PR aims to solve this problem once and for all.  The plan is as follows:
* Promote direct use of the serialized collection format:
    - Add accessor methods to `collection_mutation_view` which read from the serialized format directly: `tomb()`, `size()` and `begin()`/`end()`.
    - Add a `collection_mutation_writer` which provides container semantics for generating a serialized `collection_mutation` directly on the go (`push_back()`).
* Replace all usage of `collection_mutation_description`, `collection_mutation_view_description` and friends with use of the new infrastructure.
* Drop the old infrastructure, to avoid accidental regressions.

Continues the work started by https://github.com/scylladb/scylladb/pull/29033 and takes it to its conclusion.

To help focus review, here is a summary of the patches:
* [1, 2] preparatory refactoring: drop some unused abstract_type params
* [3, 6] introduce new infrastructure to write and read serialized collections directly; this is the meat of the PR
* [6, -1) replace all usage of old materializing infrastructure with usage of the new one
* [-1] drop old infrastructure

**Command:**
```
dbuild -it -- build/release/scylla perf-simple-query --collection=16 -c1 -m2G --default-log-level=error
```

| Metric                   |  Before |   After | Change     |
|--------------------------|--------:|--------:|------------|
| Throughput (median tps)  | 315,760 | 332,021 | **+5.1%**  |
| Instructions/op (median) |  53,776 |  48,681 | **-9.5%**  |
| CPU cycles/op (median)   |  17,365 |  16,471 | **-5.1%**  |
| Allocations/op           |    85.1 |    82.1 | **-3.5%**  |

**Significant improvement.** Throughput is up ~5%, and both instruction count and cycle count are meaningfully reduced.

---

**Command:**
```
dbuild -it -- build/release/scylla perf-simple-query --collection=16 -c1 -m2G --default-log-level=error --write
```

| Metric                   |    Before |    After | Change    |
|--------------------------|----------:|---------:|-----------|
| Throughput (median tps)  |   150,823 |  149,678 | **-0.8%** |
| Instructions/op (median) |   108,388 |  103,858 | **-4.2%** |
| CPU cycles/op (median)   |    34,860 |   35,371 | **+1.5%** |
| Allocations/op           | ~105–108  | ~102–103 | **-3.0%** |

**Mixed, mostly neutral.** Throughput is essentially flat (within noise). Instructions/op improved by ~4%, allocations dropped slightly, but cycles/op edged up marginally.

---

**Command:**
```
dbuild -it -- build/release/scylla perf-alternator --workload write --developer-mode=1 --alternator-port=8000 --alternator-write-isolation=unsafe -c1 -m2G --default-log-level=error
```

| Metric                   |  Before |   After | Change    |
|--------------------------|--------:|--------:|-----------|
| Throughput (median tps)  |  55,777 |  56,051 | **+0.5%** |
| Instructions/op (median) | 246,215 | 246,610 | **+0.2%** |
| CPU cycles/op (median)   |  77,641 |  77,020 | **-0.8%** |
| Allocations/op           |   340.4 |   335.4 | **-1.5%** |

**Essentially neutral.** All metrics are within noise margins. Slight reduction in allocations and cycles, negligible otherwise.

---

The change has a **clear, substantial positive effect on reads** (~5% throughput gain, ~9.5% fewer instructions per op).
The write and alternator paths are **unaffected in practice** — changes there are within measurement noise. No regressions are apparent.
This is expected: https://github.com/scylladb/scylladb/pull/29033 did the heavy lifting when it comes to the write path, this PR finishes the job, mostly improving reads.

Fixes: #3602

Improvement, no backport.

Closes scylladb/scylladb#29127

* github.com:scylladb/scylladb:
  mutation/collection_mutation: make collection_mutation::_data private
  mutation_collection: drop collection_mutation_description and friends
  test: move away from collection_mutation_description
  tree: move away from collection_mutation_description
  test: move away from collection_mutation_view::with_deserialized()
  tree: move away from collection_mutation_view::with_deserialized()
  types: fix indendation, left broken by previous commit
  types: move away from collection_mutation_view::with_deserialized()
  types: serialize_for_cql(): use throwing_assert() instead of SCYLLA_ASSERT()
  schema: column_computation: move away from collection_mutation_view::with_deserialized()
  mutation: move away from collection_mutation_view::with_deserialized()
  alternator: move away from collection_mutation_view::with_deserialized()
  cdc: move away from collection_mutation_view::with_deserialized()
  mutation/collection_mutation: printer: don't deserialize collections
  mutation/collection_mutation: difference(): don't deserialize collections
  mutation/collection_mutation: merge(): don't deserialize collections
  mutation/collection_mutation: extract compact_and_expire() to free function
  mutation/collection_mutation: refactor empty(), is_any_live() and last_update()
  compaction_garbage_collector: pass collection_mutation to collect()
  test/boost/mutation_test: add tests for collection_mutation_{view,writer}
  mutation/collaction_mutation: collection_mutation_view: add methods to inspect content
  mutation/collection_mutation: add collection_mutation_writer
  mutation/collection_mutation: collection_mutation(): generate valid collection
  mutation/collection_mutation: collection_mutation(): remove unused abstract_type param
  mutation/atomic_cell: drop unused type param from from_bytes()
2026-05-21 17:10:40 +03:00
Piotr Dulikowski
6148316f66 Merge 'db/view/view_building_coordinator: add flag to mark if any remote work was finished' from Michał Jadwiszczak
There is a small window, just after the view building coordinator releases
the group0 guard and before it waits on view_building_state_machine's CV,
in which the coordinator may miss a CV broadcast triggered by finished
remote work.

To fix it, this patch adds a boolean flag, which is set to true before
broadcasting the CV and is checked before awaiting on the CV.
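
The flag-plus-condition-variable pattern can be sketched with Python's `threading.Condition`. This is only an analogue of the Seastar condition variable used by the coordinator; the class `WorkNotifier` and its method names are hypothetical.

```python
import threading


class WorkNotifier:
    """A broadcast that cannot be missed, thanks to a sticky flag."""

    def __init__(self):
        self._cv = threading.Condition()
        self._work_finished = False

    def notify_finished(self):
        with self._cv:
            # Set the flag BEFORE broadcasting, so a waiter that arrives
            # late still sees that work was finished.
            self._work_finished = True
            self._cv.notify_all()

    def wait_for_work(self, timeout=None):
        """Return True if work finished, False if the timeout expired."""
        with self._cv:
            # The predicate is checked before blocking, which closes the
            # window between "decided to wait" and "actually waiting".
            finished = self._cv.wait_for(lambda: self._work_finished, timeout)
            self._work_finished = False  # consume the notification
            return finished
```

Without the flag, a `notify_all()` issued while nobody is waiting is simply lost, which is the window the patch describes.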

Fixes SCYLLADB-2029

The problem is not critical, but the fix should be backported to 2025.4 and newer versions, all of which contain the view building coordinator.

Closes scylladb/scylladb#27313

* github.com:scylladb/scylladb:
  test/cluster/test_view_building_coordinator: add reproducer
  db/view/view_building_coordinator: add flag to mark if any remote work was finished
2026-05-21 15:11:58 +02:00
Szymon Wasik
242eb96b16 db/config: add enable_cassio_compatibility flag
Add a new live-updatable boolean configuration option
'enable_cassio_compatibility' (default: false).

When enabled, it allows ScyllaDB to rewrite CassIO's SAI index DDL
on map entries to a regular secondary index, so that LangChain/CassIO
applications can run without DDL errors.

The flag is disabled by default to avoid affecting users who don't
need CassIO compatibility.
2026-05-21 13:26:02 +02:00
Pavel Emelyanov
4b13b24695 Merge 's3: make S3 connection pool size configurable per scheduling group' from Ernest Zaslavsky
The S3 client creates a separate HTTP connection pool per scheduling group. Previously, the pool size was hardcoded as shares/100, yielding 1-10 connections. This was not tunable and could under-provision connections for groups with low share counts.
**Changes**
- A missing include (short_streams.hh) in sstables_loader.cc is added first to fix CMake builds where the header is not transitively included.
- The hardcoded per-share divisor is replaced with a per-shard connection budget. The new `object_storage_connections_per_shard` config option (default 128) specifies the total number of connections available on each shard. Connections are distributed proportionally across scheduling groups based on their shares: `max_connections = budget * group_shares / total_shares`. Remainder connections are assigned to the group with the most shares. When a new scheduling group client is created, all existing groups are rebalanced via `set_maximum_connections`. Creation and rebalance are serialized with a semaphore to prevent concurrent rebalances from racing.
- The config option is made live-updateable: a `storage_manager` observer propagates changes to all existing S3 clients, triggering rebalance under the same semaphore.
Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1704
No backport needed since this change affects KS on object storage which is not operational yet.
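
The proportional split with remainder assignment can be sketched in Python as below. The function `distribute_connections` and the group names in the usage note are hypothetical. Rebalance serialization and any per-group minimum are not modelled.

```python
def distribute_connections(budget, shares):
    """Split a per-shard connection budget across scheduling groups.

    shares maps group name -> share count. Each group gets
    budget * group_shares // total_shares connections, and any remainder
    left by integer division goes to the group with the most shares.
    """
    total = sum(shares.values())
    if total <= 0:
        return {group: 0 for group in shares}
    alloc = {group: budget * s // total for group, s in shares.items()}
    remainder = budget - sum(alloc.values())
    if remainder:
        largest = max(shares, key=shares.get)
        alloc[largest] += remainder
    return alloc
```

For example, a budget of 128 split across groups with 1000, 200, and 200 shares gives 91, 18, and 18 by integer division, and the one leftover connection goes to the 1000-share group for a final split of 92, 18, and 18.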

Closes scylladb/scylladb#29719

* github.com:scylladb/scylladb:
  s3: make connections_per_shard live-updateable
  s3: distribute connection pool proportionally across scheduling groups
2026-05-21 12:12:36 +03:00
Andrzej Jackowski
f8156702de tree: add missing -present to copyright headers
~2076 files used "Copyright (C) YYYY-present ScyllaDB" while
~88 files used "Copyright (C) YYYY ScyllaDB". This
inconsistency leads to unnecessary code review discussions
and gradual spread of the less common format.

Standardize all ScyllaDB copyright headers to use -present.

Fixes SCYLLADB-1984

Closes scylladb/scylladb#29876
2026-05-21 10:57:42 +02:00
Michał Hudobski
cf372ba87b index: fix local vector index locality detection after schema reload
When index metadata was deserialized from system tables during schema
reload, target_parser::is_local() failed to recognize local vector
indexes. It only handled the non-vector JSON format {"pk": [...],
"ck": [...]}, but vector indexes serialize their targets as
{"pk": [...], "tc": "..."}. As a result, every local vector index was
incorrectly marked as global after a schema reload.

Fix this by introducing vector_index::is_local() that recognizes the
vector-specific target format, and dispatching to it from the schema
deserialization code based on the index class name. This keeps
target_parser as secondary-index-specific and follows the same dispatch
pattern already used for target serialization.

Also remove the now-unused has_vector_index_on_column() helper (its
callers were removed by #29407).
2026-05-21 10:35:48 +02:00
Botond Dénes
636e2877e2 tree: move away from collection_mutation_description
Use collection_mutation_writer instead.

Add to_managed_bytes() to cql3::raw_value to help avoid some copies.

A special note for sstables/kl/reader.cc: this conversion is not
straightforward, so we accumulate a list of cells and feed them to the
writer at the end. This is sub-optimal but this code is rarely used, so
it is best to be conservative.
2026-05-21 10:23:29 +03:00
Botond Dénes
35a776d043 tree: move away from collection_mutation_view::with_deserialized()
Use the collection_mutation_view directly.

This is the remainder after the previous patches collecting larger
changes by module.
2026-05-21 10:23:29 +03:00
Botond Dénes
24fdfa34dd mutation/collection_mutation: collection_mutation(): remove unused abstract_type param 2026-05-21 08:34:21 +03:00
Ernest Zaslavsky
86f678a592 s3: make connections_per_shard live-updateable
Wire the object_storage_connections_per_shard config option as
LiveUpdate so it can be changed at runtime without restart. When
the value changes, the storage_manager observer propagates it to
all existing S3 clients, which rebalance their connection pools
under the rebalance semaphore.
2026-05-20 20:45:14 +03:00
Ernest Zaslavsky
b9e1dcc0fe s3: distribute connection pool proportionally across scheduling groups
The S3 client creates a separate HTTP connection pool per scheduling
group. Previously, the pool size was computed per-group using a
per-share multiplier (connections = shares * multiplier), which did
not account for the total number of groups sharing the shard's
connection budget.

Replace the per-share multiplier with a per-shard connection budget:
the new object_storage_connections_per_shard config option (default
100) specifies the total number of connections available on each
shard. When a new scheduling group's client is created, connections
are distributed proportionally across all groups based on their
shares (connections = budget * group_shares / total_shares), and
existing groups are rebalanced via set_maximum_connections.

When the endpoint_config has an explicit max_connections override,
it is used directly without proportional distribution.
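
As a rough illustration of the formula above (connections = budget * group_shares / total_shares), here is a small Python sketch. The function name, the group names, and the use of integer floor division are assumptions; the real client may round differently or enforce a minimum per group.

```python
def distribute_connections(budget: int, shares_by_group: dict) -> dict:
    """Split a per-shard connection budget across groups by their shares."""
    total_shares = sum(shares_by_group.values())
    if total_shares <= 0:
        # No shares at all: nothing sensible to distribute (assumed behavior).
        return {group: 0 for group in shares_by_group}
    return {
        group: budget * shares // total_shares  # floor division is an assumption
        for group, shares in shares_by_group.items()
    }


# Hypothetical groups and share values, with the default budget of 100.
pools = distribute_connections(100, {"statement": 1000, "compaction": 200, "streaming": 200})
```

Calling the function again after a new group appears yields the rebalanced sizes for every existing group, which corresponds to the set_maximum_connections step mentioned above.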
2026-05-20 20:45:04 +03:00
Marcin Maliszkiewicz
83823149e9 Merge 'audit: implement audit_rules config' from Andrzej Jackowski
This patch series adds `audit_rules`, a new audit configuration option for fine-grained, role-aware audit filtering with per-rule sink routing. Rules can be configured in `scylla.yaml` or updated live through `system.config` without restarting the node. Each rule specifies target sinks (`table`, `syslog`), statement categories, qualified table name patterns, and role patterns. Table and role patterns use POSIX `fnmatch` with extended glob syntax. For table-scoped categories (`DML`, `DDL`, `QUERY`), a rule matches only when the category, role, and qualified table name all match. For table-independent categories (`AUTH`, `ADMIN`, `DCL`), the table filter is ignored. Empty category or role lists match nothing; an empty table list matches nothing only for table-scoped categories. The new rules are additive with the existing `audit_categories`, `audit_keyspaces`, and `audit_tables` settings: both mechanisms are evaluated for each audit event, and the final sink set is the union of all matches.

To avoid evaluating glob patterns on every audit event, audit rules use a preprocessed cache of known roles and tables. The cache is kept in sync through group0 role/table snapshots, role-change notifications, and schema migration notifications. For known entities, rule matching uses precomputed role/table rule sets; unknown entities fall back to direct rule evaluation. When `audit_rules` is empty, per-event rule matching returns immediately and does not evaluate glob patterns. Audit still keeps known role/table metadata in sync while audit is enabled, so rules can be enabled later through live configuration updates without restarting the node.
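
The matching semantics described above can be sketched in Python as follows. This is an illustration only: the `AuditRule` class and helper names are hypothetical, Python's `fnmatchcase` stands in for POSIX `fnmatch` with extended glob syntax (the two are not identical), and the preprocessed cache and the union with the legacy settings are left out.

```python
from dataclasses import dataclass
from fnmatch import fnmatchcase

# Categories for which the table filter applies, per the description above.
TABLE_SCOPED = {"DML", "DDL", "QUERY"}


@dataclass
class AuditRule:
    sinks: list
    categories: list
    qualified_table_names: list
    roles: list


def rule_matches(rule: AuditRule, category: str, role: str, table: str = "") -> bool:
    if category not in rule.categories:      # an empty category list matches nothing
        return False
    if not any(fnmatchcase(role, p) for p in rule.roles):  # same for an empty role list
        return False
    if category in TABLE_SCOPED:
        return any(fnmatchcase(table, p) for p in rule.qualified_table_names)
    return True                              # table filter ignored for AUTH, ADMIN, DCL


def sinks_for_event(rules, category: str, role: str, table: str = "") -> set:
    """Union of the sinks of every matching rule."""
    sinks = set()
    for rule in rules:
        if rule_matches(rule, category, role, table):
            sinks.update(rule.sinks)
    return sinks
```

For example, a rule with categories `["DML", "AUTH"]`, tables `["ks.*"]`, and roles `["*"]` matches a DML statement on `ks.t` and any AUTH event, but not a DML statement on `other.t`.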

**Performance**
Measured with `perf-simple-query --smp 1 --duration 100` against a null syslog socket. Results show no regression when audit is disabled, and audit-rules performance has at most 1% more instructions than legacy config for equivalent workloads:

```
===============================================================================================================================================================================
Configuration                                     | Binary     |         throughput (tps) | insns/op                 | cpu_cycles/op            | alloc/op | logal/op | task/op
===============================================================================================================================================================================
audit=none [1]                                    | baseline   |                 206922.4 |                  36591.6 |                  15348.3 |     58.1 |      0.0 |    14.1
audit=none [1]                                    | this PR    |        207856.4  (+0.5%) |         36544.9  (-0.1%) |         15274.0  (-0.5%) |     58.1 |      0.0 |    14.1
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
audit=syslog keyspaces=ks [2]                     | baseline   |                  94871.8 |                  54163.0 |                  27172.4 |     72.0 |      0.0 |    24.0
audit=syslog keyspaces=ks [2]                     | this PR    |         96138.4  (+1.3%) |         54072.3  (-0.2%) |         26699.3  (-1.7%) |     72.0 |      0.0 |    24.0
audit=syslog audit-rules=ks [3]                   | this PR    |         95142.1  (+0.3%) |         54457.8  (+0.5%) |         26953.8  (-0.8%) |     72.0 |      0.0 |    24.0
-------------------------------------------------------------------------------------------------------------------------------------------------------------------------------
audit=syslog keyspaces=ks-non-existent [4]        | baseline   |                 213997.8 |                  36735.6 |                  14848.1 |     58.1 |      0.0 |    14.1
audit=syslog keyspaces=ks-non-existent [4]        | this PR    |        219297.2  (+2.5%) |         36667.3  (-0.2%) |         14500.1  (-2.3%) |     58.1 |      0.0 |    14.1
audit=syslog audit-rules=ks-non-existent [5]      | this PR    |        211038.7  (-1.4%) |         36999.7  (+0.7%) |         15048.6  (+1.4%) |     58.1 |      0.0 |    14.1
===============================================================================================================================================================================

[1] ./scylla perf-simple-query --smp 1 --duration 100 --audit "none"
[2] ./scylla perf-simple-query --smp 1 --duration 100 --audit "syslog" --audit-keyspaces "ks" --audit-categories "DCL,DDL,AUTH,DML,QUERY" --audit-unix-socket-path "/tmp/audit-null.sock"
[3] ./scylla perf-simple-query --smp 1 --duration 100 --audit "syslog" --audit-rules '[{"sinks":["syslog"],"categories":["DCL","DDL","AUTH","DML","QUERY"],"qualified_table_names":["ks.*"],"roles":["*"]}]' --audit-unix-socket-path "/tmp/audit-null.sock"
[4] ./scylla perf-simple-query --smp 1 --duration 100 --audit "syslog" --audit-keyspaces "ks-non-existent" --audit-categories "DCL,DDL,AUTH,DML,QUERY" --audit-unix-socket-path "/tmp/audit-null.sock"
[5] ./scylla perf-simple-query --smp 1 --duration 100 --audit "syslog" --audit-rules '[{"sinks":["syslog"],"categories":["DCL","DDL","AUTH","DML","QUERY"],"qualified_table_names":["ks-non-existent.*"],"roles":["*"]}]' --audit-unix-socket-path "/tmp/audit-null.sock"

audit-null.sock was created with `socat -u UNIX-RECV:/tmp/audit-null.sock,type=2 OPEN:/dev/null`
```

Fixes: SCYLLADB-1430
No backport: new feature

Closes scylladb/scylladb#29267

* github.com:scylladb/scylladb:
  test: alternator: audit: rules filtering and batch bypass
  test: perf: add --audit-rules option to perf-simple-query
  docs: add audit rules section to the auditing guide
  test: audit: cover role and schema cache notifications
  test: audit: cover audit rules cluster behavior
  audit: rebuild rule caches on group0 snapshot and role changes
  audit: refresh rule caches on schema, role, and config changes
  audit: route matching rules to configured sinks
  test: cover preprocessed audit rule cache
  audit: add preprocessed rule matching cache
  audit: pass sink targets to storage helpers
  test: audit: cover rule matching semantics
  audit: add rule matching and sink helpers
  test: audit: cover audit_rules configuration
  config: add live audit_rules option
  test: cover audit rule parsing and validation
  audit: define audit_rule type with parsing and validation
2026-05-20 14:10:45 +02:00
Avi Kivity
6df04c9e5b Update seastar submodule
Changed seastar::http::experimental to seastar::http to reflect
graduation of the seastar http API.

Changed calls to seastar::rename_file() (in sstables/storage.cc,
sstables/sstable_directory.cc, sstables/sstables.cc and
db/hints/internal/hint_storage.cc) to reflect new default parameter.

Updated scylla_gdb test helper get_task() to work with updated
accept loop in Seastar. This is just test code (attempts to find
a task to operate on), not used in real scylla-gdb.py work, but
nevertheless the adjustment keeps backward compatibility.

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1798
Fixes https://scylladb.atlassian.net/browse/SCYLLADB-2043

* seastar 485a62b2...510f3148 (43):
  > reactor_backend: fix iocb double-free and shutdown hang during AIO teardown
  > file: fix default DMA alignment
  > http: add to_reply() to redirect_exception with extra-header support
  > core: propagate syscall errors via `coroutine::exception`
  > file: assert dma alignments are powers of two
  > doc: Document undocumented io_tester features and fix output example
  > backtrace: print the build_id along with the backtrace
  > reactor: default to oneline backtraces
  > Merge 'json: formatter: support types with user-defined conversion to sstring' from Benny Halevy
    tests: json_formatter: test formatter::write with string types
    json: formatter: support types with user-defined conversion to sstring
  > httpd_test: fix build failure with Seastar_SSTRING=OFF
  > net/tls: introduce ssl_call wrapper for SSL I/O
  > build: disable unused command line argument error for C++ module
  > coroutine/generator: fix setup of generator's waiting task
  > tests/tls: set 1000-day validity for self-signed CA cert
  > net: tls: openssl: disable certificate compression
  > reactor: reduce steady_clock::now() calls per scheduling quantum
  > fair_queue: remove notify_request_finished()
  > loop: use small_vector for parallel_for_each_state incomplete futures
  > dodge false sharing in spinlock
  > Merge 'Handle nowait support for reads and writes independently' from Pavel Emelyanov
    file: Change nowait_works mode detection
    file: Introduce read-only nowait_mode
    filesystem: Make nowait_works bit a enum class too
    file: Make nowait_works bit a enum class
  > Merge 'net/tls: improve OpenSSL error queue hygiene' from Gellért Peresztegi-Nagy
    net/tls: assert clean error queue before SSL operations
    net/tls: clear error queue after successful SSL operations
    net/tls: clear error queue after successful SSL_CTX_new
    net/tls: drain error queue on unexpected error codes
    net/tls: use make_openssl_error for BIO creation failure
  > vla.hh: add missing includes
  > Merge 'smp: make smp::count non-static' from Avi Kivity
    smp: convert all smp::count usages to instance-aware alternatives
    smp: add per-instance shard_count and this_smp() infrastructure
    disk_params: document pre-init smp::count access with explicit 0
    reactor_backend: document pre-init smp::count access with explicit 0
    tests: alien_test: pass shard count to alien thread explicitly
  > build: fix cmake missing ninja on Ubuntu 26.04
  > rpc: Fix uint64 wraparound of expired timeout in send_entry()
  > Merge 'Generalize some RPC tests' from Pavel Emelyanov
    tests: Generalize async connection-based scheduling RPC tests
    tests: Generalize sync connection-based scheduling RPC tests
    tests: Remove redundant variadic/nonvariadic RPC tuple tests
    tests: Generalize max timeout RPC tests
  > net: tls: openssl: Share BIO ptrs across shards
  > http: fix compilation on clang 22 with c++26
  > build: openssl tools needed for test cert generation
  > reactor: support rename2
  > future: fix forwarding of reference types
  > Merge 'Zero-copy http chunked data sink' from Pavel Emelyanov
    http: Make chunked data sink zero-copy
    tests/prometheus_http: Rewrite on top of http::client
    tests/httpd: Rewrite content_length_limit on top of http::client
  > tests: Replace ad-hoc http_consumer with production HTTP parser
  > Merge 'co_return to accept same expressions and types as return' from Alexey Bashtanov
    tests/unit/{coroutines,futures}: strict types on co_return and set_value
    api: introduce version 10:
    core/{coroutine,future}: make `co_return` more strict with types
    core/{coroutine,future}: preparations to fix `co_return` type semantics
  > Merge 'Perftune.py: add special handling for mlx5 rss queues number calculation' from Vladislav Zolotarov
    perftune.py: NetPerfTuner: enhance RSS (a.k.a. "Rx") queues accounting for mlx5 devices
    perftune.py: update docstring of NetPerfTuner.__get_rps_cpus() method
    perftune.py: add a method that parses and models the output of the 'ethtool -l' command for a given interface
  > httpd: rewrite do_accepts/do_accept_one as coroutines
  > file: add mmap support to file
  > http: Move client code out of experimental namespace
  > file: add hugetlbfs support to file system detection
  > tests: Replace test_source_impl with util::as_input_stream
  > tests: Replace buf_source_impl with util::as_input_stream
  > Merge 'rpc_tester: expose throuput for rpc tester' from Marcin Szopa
    rpc_tester: remove unused payload size variable from job_rpc_streaming class
    rpc_tester: add start time tracking for throughput calculation, print throughput and msg/s for job_rpc
    rpc_tester: refactor result emission to use dedicated functions for messages and throughput
  > iostream: cast first argument of `std::min` to `size_t`

Closes scylladb/scylladb#29952
2026-05-20 13:47:12 +03:00
Andrzej Jackowski
f3a7e2e3dc config: add live audit_rules option
Operators need to configure audit rules through YAML, CQL,
and CLI with live-update support so routing can be
reconfigured without restart.

Add audit_rules as a LiveUpdate config option with YAML
decoding, JSON parsing for CQL updates, CLI --audit-rules
flag, and a custom serializer that avoids double-quoting
the JSON array.
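
The double-quoting problem can be illustrated with a short Python sketch. It does not reflect the actual serializer in db::config; the helper names are hypothetical and only show why serializing already-serialized JSON text as a string produces a quoted, escaped value instead of an array.

```python
import json

rules = [{"sinks": ["syslog"], "categories": ["DML"],
          "qualified_table_names": ["ks.*"], "roles": ["*"]}]


def serialize_naive(rules_as_json_text: str) -> str:
    # Treats the JSON text as an ordinary string, so it gets quoted again.
    return json.dumps(rules_as_json_text)


def serialize_rules(rule_list) -> str:
    # Emits the array itself, so readers get a list back when they parse it.
    return json.dumps(rule_list)


double_quoted = serialize_naive(json.dumps(rules))
plain = serialize_rules(rules)
```

Parsing `plain` gives back the list of rules, while parsing `double_quoted` gives back a string that must be parsed a second time.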

Refs SCYLLADB-1430
2026-05-20 06:55:14 +02:00
Szymon Malewski
6b2fce03f9 alternator: optional stripping of http response headers
In Alternator's HTTP API, response headers can dominate bandwidth for
small payloads. The Server, Date, and Content-Type headers were sent on
every response but many clients never use them.

This patch introduces three Alternator config options:
  - alternator_http_response_server_header,
  - alternator_http_response_disable_date_header,
  - alternator_http_response_disable_content_type_header,
which allow customizing or suppressing the respective HTTP response
headers. All three options support live update (no restart needed).
The Server header is no longer sent by default; the Date and
Content-Type defaults preserve the existing behavior.
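
A minimal Python sketch of how the three options could shape the response headers is shown below. It is not the Alternator implementation: the function, its parameter names, the convention that an empty Server value means "do not send", and the Content-Type value are all assumptions for illustration.

```python
from email.utils import formatdate


def build_response_headers(server_header: str = "",
                           disable_date_header: bool = False,
                           disable_content_type_header: bool = False) -> dict:
    """Headers attached to a response under the given option values."""
    headers = {}
    if server_header:
        # Assumed: an empty value (the default here) suppresses the header.
        headers["Server"] = server_header
    if not disable_date_header:
        headers["Date"] = formatdate(usegmt=True)
    if not disable_content_type_header:
        # Illustrative value only.
        headers["Content-Type"] = "application/x-amz-json-1.0"
    return headers
```

With the defaults in this sketch, only Date and Content-Type are sent; setting both disable flags leaves a response with none of the three headers.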

The Server and Date header suppression uses Seastar's
set_server_header() and set_generate_date_header() APIs added in
https://github.com/scylladb/seastar/pull/3217. This patch also
fixes deprecation warnings from older Seastar HTTP APIs.

Tests are in test/alternator/test_http_headers.py.

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-70

Closes scylladb/scylladb#28288
2026-05-19 10:47:13 +03:00
Piotr Dulikowski
26671d4d5f Merge 'Refactor view_update_builder' from Wojciech Mitros
This series improves the readability and structure of
view_update_builder, the component that generates materialized view
updates from base-table mutations.

The first four patches are pure renames and refactoring with no
semantic changes:

  1. Document that the builder operates on a single base partition.
  2. Rename member fields to clearly distinguish readers (the
     mutation_reader streams) from the cached fragments (the last
     mutation_fragment_v2 read from each stream).
  3. Rename advance/on_results methods to names that describe what
     they actually do: read the next fragment, or generate view
     updates.
  4. Extract partition-start handling into its own method.

The next two patches are minor optimizations:

  5. Simplify clustering-row handling by moving the row out of the
     fragment before applying the tombstone, avoiding an unnecessary
     memory-usage recalculation in the reader permit.
  6. Replace deep copies with moves in the existing-only tail path,
     matching the pattern used everywhere else.

Finally, patch 7 deduplicates the fragment-consuming logic by
extracting the three repeated blocks into consume_both_fragments(),
consume_update_fragment(), and consume_existing_fragment().
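
The three cases can be pictured as a merge of two key-ordered streams. The Python sketch below is a generic illustration under that assumption, not a translation of view_update_builder: the function name, the `(key, value)` tuples, and the action labels are hypothetical, and tombstone handling is omitted.

```python
def merge_fragments(updates, existings):
    """Walk two key-sorted streams and classify each key.

    Returns a list of (action, key), where action says which stream(s)
    contributed a fragment for that key.
    """
    update_it, existing_it = iter(updates), iter(existings)
    u, e = next(update_it, None), next(existing_it, None)
    actions = []
    while u is not None or e is not None:
        if e is None or (u is not None and u[0] < e[0]):
            actions.append(("update_only", u[0]))      # only the update stream has this key
            u = next(update_it, None)
        elif u is None or e[0] < u[0]:
            actions.append(("existing_only", e[0]))    # only the existing stream has this key
            e = next(existing_it, None)
        else:
            actions.append(("both", u[0]))             # same key in both streams
            u, e = next(update_it, None), next(existing_it, None)
    return actions
```

Keeping each case in its own helper, as the patch does with its three consume methods, means the main loop and the tail paths can share the same code.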

Code reorganization - no backport needed

Closes scylladb/scylladb#29497

* github.com:scylladb/scylladb:
  mv: deduplicate code for consuming fragments in view_update_builder
  mv: avoid unnecessary copies of existing rows in generate_updates()
  mv: simplify clustering row handling in generate_updates()
  mv: rename methods in view_update_builder for clarity
  mv: rename view_update_builder readers and cached fragments
  mv: drop redundant std::move from partition key extraction
  mv: document single-partition builder scope
2026-05-18 15:52:26 +02:00
Piotr Dulikowski
5efb43195e Merge 'db/schema_tables: don't emit empty view_building_tasks mutation on ALTER TABLE' from Michał Jadwiszczak
After a recent change (1a32ccd), `make_update_indices_mutations()` unconditionally adds a mutation for `system.view_building_tasks`, even when no indices are being dropped.

In a mixed-version cluster, the older node may not have this table, causing the Raft schema applier to fail with 'Can't find a column family with UUID ...'.

This patch fixes the bug by emitting the mutation only when indices are actually dropped (i.e., when the view building cleanup code path was entered).

Fixes: SCYLLADB-2026
Refs: scylladb#26557

scylladb#26557 wasn't backported, so this patch also doesn't need to be.

Closes scylladb/scylladb#29908

* github.com:scylladb/scylladb:
  db/schema_tables: don't emit empty view_building_tasks mutation on ALTER TABLE
  db/view_building_task_mutation_builder: add `empty()` method
2026-05-18 15:37:02 +02:00
Michał Jadwiszczak
a9b2baf36b db/schema_tables: don't emit empty view_building_tasks mutation on ALTER TABLE
After a recent change (1a32ccd), `make_update_indices_mutations()` unconditionally
adds a mutation for `system.view_building_tasks`, even when no indices are being dropped.

In a mixed-version cluster, the older node may not have this table,
causing the Raft schema applier to fail with 'Can't find a column
family with UUID ...'.

This patch fixes the bug by emitting the mutation only when indices are actually
dropped (i.e., when the view building cleanup code path was entered).

Fixes: SCYLLADB-2026
Refs: scylladb#26557
2026-05-18 10:01:21 +02:00
Michał Jadwiszczak
82eb5611ab db/view_building_task_mutation_builder: add empty() method
The method allows checking whether the builder contains any changes,
which makes it possible to skip emitting an empty mutation.
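
A small Python sketch of the pattern is given below. The builder class, its `del_task()` method, and the placeholder mutation values are hypothetical; only the `empty()` check and the idea of skipping an empty mutation come from this series.

```python
class TaskMutationBuilder:
    """Collects changes destined for a single-partition table."""

    def __init__(self):
        self._changes = []

    def del_task(self, task_id):
        self._changes.append(("delete", task_id))
        return self

    def empty(self) -> bool:
        # True when nothing was recorded, so build() would yield an empty mutation.
        return not self._changes

    def build(self):
        return list(self._changes)


def make_update_indices_mutations(dropped_index_tasks):
    mutations = ["indexes-mutation"]          # placeholder for the regular schema mutation
    builder = TaskMutationBuilder()
    for task_id in dropped_index_tasks:
        builder.del_task(task_id)
    if not builder.empty():                   # emit only when something actually changed
        mutations.append(builder.build())
    return mutations
```

With no dropped indices, the result contains only the regular schema mutation, so a node that lacks the extra table never sees a mutation for it.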
2026-05-18 09:54:26 +02:00
Michał Jadwiszczak
c767ac7ef3 test/cluster/test_view_building_coordinator: add reproducer
Add a test that reproduces scylladb/scylladb#27298
2026-05-18 09:18:56 +02:00
Michał Jadwiszczak
c7f65131bf db/view/view_building_coordinator: add flag to mark if any remote work was finished
In the main coordinator loop (`view_building_coordinator::run()`),
there is a small window just after the view building coordinator releases
the group0 guard and before it waits on view_building_state_machine's CV,
during which the coordinator may miss a CV broadcast triggered by finished
remote work (`view_building_coordinator::work_on_tasks()`).

To fix it, this patch adds a boolean flag, which is set to true by a
finished or failed RPC call before it broadcasts the CV,
and is checked before waiting on the CV.

Fixes scylladb/scylladb#27298
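
The missed-wakeup fix can be illustrated with an asyncio analogy in Python. This is not the coordinator's code: the class and method names are hypothetical, and `asyncio.Condition` merely stands in for the condition variable mentioned above.

```python
import asyncio


class Coordinator:
    def __init__(self):
        self.cv = asyncio.Condition()
        self.remote_work_finished = False

    async def on_remote_work_done(self):
        # Set the flag first, then broadcast; a waiter that arrives later still sees the flag.
        self.remote_work_finished = True
        async with self.cv:
            self.cv.notify_all()

    async def wait_for_work(self) -> str:
        async with self.cv:
            if self.remote_work_finished:
                # The broadcast happened before we started waiting: do not block.
                self.remote_work_finished = False
                return "flag"
            await self.cv.wait()
            self.remote_work_finished = False
            return "cv"


async def broadcast_before_wait() -> str:
    coordinator = Coordinator()
    await coordinator.on_remote_work_done()   # nobody is waiting yet
    return await asyncio.wait_for(coordinator.wait_for_work(), timeout=1)
```

Without the flag check, `broadcast_before_wait()` would block until the timeout, because the notification was delivered while no one was waiting.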
2026-05-18 09:18:07 +02:00
Petr Gusev
9e3209e4a3 cql: refactor add_tablet_info to take tablet_routing_info directly
Change add_tablet_info() to accept locator::tablet_routing_info instead
of destructured (tablet_replica_set, token_range) pair. This simplifies
all three call sites.

Remove the empty-replicas guard inside add_tablet_info(): the only
producer of tablet_routing_info is tablet ERM's check_locality(), which
returns either nullopt (correctly routed) or info with replicas copied
from tablet_info — a tablet always has replicas. All callers already
check for nullopt before calling add_tablet_info(), so by the time we
enter the function replicas are guaranteed non-empty.
2026-05-15 12:28:33 +02:00
Michał Jadwiszczak
b175f5b97d db/view/view_building_worker: add more logs when flushing base table
Add debug logs around flushing the base table to see how long it
takes in case of stalls in view building.

Refs SCYLLADB-1261
2026-05-14 10:23:42 +02:00
Piotr Dulikowski
3c2c814215 Merge 'db/view/view_building: replace system keyspace functions with mutation builder' from Michał Jadwiszczak
`system.view_building_tasks` is a single partition table, so it makes more sense to use a mutation builder and generate 1 mutation per group0 command instead of generating multiple mutations.

This PR removes all `make_..._mutation()` system keyspace functions related to view building tasks and replaces them with mutation builder.

Refs https://github.com/scylladb/scylladb/issues/25929

This patch doesn't fix any bug; it only reduces the number of generated mutations, so there is no need to backport it.

Closes scylladb/scylladb#26557

* github.com:scylladb/scylladb:
  db/system_keyspace: replace `make_remove_view_building_task_mutation()` with mutation builder
  db/view/view_building_task_mutation_builder: make uuid generator optional
  db/system_keyspace: replace `make_view_building_task_mutation()` with mutation builder
  db/view/view_building_task_mutation_builder: add helper method
2026-05-13 16:10:55 +02:00
Taras Veretilnyk
ed8817724a db: add large_collection_elements_fail_threshold config option
Reject writes targeting a collection whose element count already
exceeds this threshold.  Code default is 0 (disabled) for existing
clusters; scylla.yaml ships 20000 for new deployments.
2026-05-13 13:47:45 +02:00
Taras Veretilnyk
ca2b8352ac db: add large_row_fail_threshold_mb config option
Reject writes targeting a row whose on-disk size already exceeds
this threshold (MB).  Code default is 0 (disabled) for existing
clusters; scylla.yaml ships 20 for new deployments.
2026-05-13 13:47:32 +02:00
Taras Veretilnyk
64fc53cb7c db: add rows_count_fail_threshold config option
Reject writes targeting a partition whose on-disk row count already
exceeds this threshold.  Code default is 0 (disabled) for existing
clusters; scylla.yaml ships 200000 for new deployments.
2026-05-13 13:47:17 +02:00
Taras Veretilnyk
c12b16603d db: add large_partition_fail_threshold_mb config option
Reject writes targeting a partition whose on-disk size already exceeds
this threshold (MB).  Code default is 0 (disabled) for existing
clusters; scylla.yaml ships 2000 for new deployments.
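
The four fail-threshold options share one convention: 0 disables the check, and a write is rejected when the existing value already exceeds the threshold. A minimal Python sketch follows, using the option names from these commits and the scylla.yaml values quoted in them; the class, the function, and the shape of the "existing" measurements are assumptions.

```python
from dataclasses import dataclass


@dataclass
class FailThresholds:
    # Values quoted as shipped in scylla.yaml; the code default for each is 0 (disabled).
    large_partition_fail_threshold_mb: int = 2000
    rows_count_fail_threshold: int = 200000
    large_row_fail_threshold_mb: int = 20
    large_collection_elements_fail_threshold: int = 20000


def exceeds(current: int, threshold: int) -> bool:
    """True when the guardrail is enabled and the existing value is already over it."""
    return threshold > 0 and current > threshold


def should_reject_write(cfg: FailThresholds, partition_mb: int, rows: int,
                        row_mb: int, collection_elements: int) -> bool:
    return (exceeds(partition_mb, cfg.large_partition_fail_threshold_mb)
            or exceeds(rows, cfg.rows_count_fail_threshold)
            or exceeds(row_mb, cfg.large_row_fail_threshold_mb)
            or exceeds(collection_elements, cfg.large_collection_elements_fail_threshold))
```

Constructing `FailThresholds(0, 0, 0, 0)` models the code defaults, under which no write is rejected regardless of existing sizes.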
2026-05-13 13:47:02 +02:00
Michał Jadwiszczak
1a32ccd8f6 db/system_keyspace: replace make_remove_view_building_task_mutation() with mutation builder
Again, get rid of the system keyspace method in favor of the mutation builder,
because `system.view_building_tasks` is a single partition table.
2026-05-13 10:06:18 +02:00
Michał Jadwiszczak
2561cc1546 db/view/view_building_task_mutation_builder: make uuid generator optional
After scylladb/scylladb#28929 `task_uuid_generator` became a necessary
dependency of `view_building_task_mutation_builder`.

However, to create the generator we need `view_building_state`, which in
some parts of the code (schema_tables.cc, migration_manager.cc) requires
obtaining the remote proxy.

But sometimes we need the mutation builder just to remove some view
building task. In those cases, we don't need the uuid generator and the
remote proxy requirement is not necessary.
2026-05-13 09:58:27 +02:00
Michał Jadwiszczak
e002665aa7 db/system_keyspace: replace make_view_building_task_mutation() with mutation builder
`system.view_building_tasks` is a single partition table, so it makes
more sense to use a mutation builder and generate 1 mutation per group0
command instead of generating multiple mutations.
2026-05-12 21:49:18 +02:00