Commit Graph

449 Commits

Author SHA1 Message Date
Botond Dénes
3613d9a07d test/cqlpy/nodetool: fix NameError in compact_keyspace nodetool path
compact_keyspace() operates on a whole keyspace and has no 'cf' variable
in scope, but the nodetool fallback branch mistakenly passed it to
args.extend([ks, cf]), which would raise NameError whenever that path
was taken. Fix by passing only the keyspace.
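
A minimal sketch of the corrected argument building, assuming a hypothetical helper named `nodetool_compact_args` (the name and shape are illustrative, not the test suite's actual code):

```python
def nodetool_compact_args(ks, cf=None):
    """Build the argument list for a `nodetool compact` invocation.

    Hypothetical helper: with no table given, only the keyspace is passed,
    which is the whole-keyspace case the fix is about.
    """
    args = ["compact"]
    if cf is None:
        # Whole-keyspace compaction: there is no table name to append.
        args.append(ks)
    else:
        args.extend([ks, cf])
    return args
```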

Closes scylladb/scylladb#30097
2026-06-02 08:44:06 +03:00
Nadav Har'El
75a05fc2b3 Merge 'cql3: fix stack overflow and quadratic behavior' from Avi Kivity
This series fixes two vulnerabilities:

- unbounded recursion during expression evaluation with deeply nested expressions
- quadratic computation with large WHERE clauses

The fixes simply bound the depth of recursion and the length of the WHERE clause.

The WHERE clause limits are configurable. The nesting limit is less likely to be exceeded, so it is not configurable.

Limits inspired by Common Expression Language:

https://github.com/google/cel-spec/blob/master/doc/langdef.md#syntax

Implementations are required to support at least:

- 24-32 repetitions of repeating rules
- 12 repetitions of recursive rules

CVE-2026-31948
CVE-2026-31947

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1003
Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1002
Fixes https://github.com/scylladb/scylladb/issues/14472

Closes scylladb/scylladb-ghsa-m4h7-g37h-mgxf#3

* github.com:scylladb/scylladb-ghsa-m4h7-g37h-mgxf:
  cql3: limit number of relations in WHERE clause
  cql3: add max_relations_in_where_clause to dialect
  test/cqlpy: add tests for WHERE clause relation count limit
  cql3: limit nesting depth of function calls and CASTs in CQL parser
  test/cqlpy: add tests for deeply nested function calls and CASTs
2026-06-01 22:31:56 +03:00
Avi Kivity
520b130b97 cql3: limit number of relations in WHERE clause
A WHERE clause with many relations (e.g. hundreds of AND-ed conditions)
can cause quadratic complexity. Check the relation count during parsing
and reject queries exceeding the configurable max_relations_in_where_clause
limit (default 100) with a SyntaxException.
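
The check described above can be sketched as follows. This is a Python stand-in for the parser-side check, assuming the relations are already available as a list; `check_relation_count` and the local `SyntaxException` class are illustrative names, not the actual C++ code:

```python
class SyntaxException(Exception):
    """Stand-in for the CQL SyntaxException mentioned in the commit."""


def check_relation_count(relations, max_relations_in_where_clause=100):
    # Reject the query as soon as the WHERE clause is known to be too long,
    # before any expensive processing of the relations takes place.
    if len(relations) > max_relations_in_where_clause:
        raise SyntaxException(
            f"WHERE clause has {len(relations)} relations, "
            f"limit is {max_relations_in_where_clause}"
        )
    return relations
```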

The changes to IDL don't cause problems during upgrade, because
CQL forwarding is not in any released version, and because
it is part of an experimental feature.

CVE-2026-31947

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1002
2026-06-01 14:01:27 +03:00
Avi Kivity
1ad1c8ef7f test/cqlpy: add tests for WHERE clause relation count limit
Add tests that verify the CQL parser rejects WHERE clauses with too
many relations (e.g. WHERE a=1 AND b=1 AND ... repeated 200 times),
and that a reasonable number of relations (50) is still accepted.
2026-06-01 14:01:25 +03:00
Botond Dénes
bb81dbf65e Merge 'guardrails: Add replica-side large data guardrails' from Taras Veretilnyk
Adds write-path guardrails that reject or warn on mutations targeting partitions, rows, or collections that already exceed configured size thresholds, based on SSTable `large_data_record` metadata.
ScyllaDB already detects and records large partitions/rows/cells in `system.large_data_records` after compaction, but takes no preventive action on the write path. Once a partition grows past operational limits it causes latency spikes, OOM, and repair failures. These guardrails let operators set hard and soft thresholds so that writes to already-oversized data are rejected (hard) or logged as warnings (soft) before they make the problem worse.
- **Intrusive index over SSTable metadata**: A per-table `large_data_record_index` maintains three `boost::intrusive::multiset`s (partitions, rows, cells) using `auto_unlink` hooks directly on `large_data_record`. SSTable destruction automatically removes records from the index — no explicit deregistration needed.
- **Virtual dispatch for zero-cost disabled path**: `large_data_guardrail_base` → `noop_large_data_guardrail` / `large_data_guardrail`. Tables without guardrails enabled pay only a virtual call to a no-op. No index is built or maintained for disabled tables.
- **Schema storage**: The per-table flag is stored as a scylla_tables column, following the tablets pattern: only write a live cell when enabled, omit entirely when disabled. The CQL feature gate prevents enabling until all nodes are upgraded.
- **Write-path integration**: The guardrail check runs in `do_apply` after the frozen mutation is deserialized but before it is applied to the memtable. Hint replay and Paxos learn skip the check via `skip_large_data_guardrails`.
Uses existing `large_*_warn_threshold` config options as soft limits and new `large_*_fail_threshold` options as hard limits. Checked dimensions:
- Partition size (bytes)
- Partition row count
- Row size (bytes)
- Collection element count
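
The soft/hard threshold decision can be sketched as below. This is a simplified model, assuming that a threshold of 0 disables the corresponding limit (the tests mention a "disabled-when-zero" case) and that the comparison is strict; `Verdict` and `check_guardrail` are hypothetical names:

```python
from enum import Enum


class Verdict(Enum):
    OK = "ok"
    WARN = "warn"      # soft limit exceeded: log a warning, accept the write
    REJECT = "reject"  # hard limit exceeded: refuse the write


def check_guardrail(recorded_value, warn_threshold, fail_threshold):
    """Compare an already-recorded size or count against both thresholds."""
    # Assumption: 0 means "limit disabled" for either threshold.
    if fail_threshold > 0 and recorded_value > fail_threshold:
        return Verdict.REJECT
    if warn_threshold > 0 and recorded_value > warn_threshold:
        return Verdict.WARN
    return Verdict.OK
```

The same function would apply to each checked dimension, with the recorded value taken from the SSTable large-data metadata rather than from the incoming mutation.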

Backport is not required

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-180

Closes scylladb/scylladb#29733

* github.com:scylladb/scylladb:
  test/cqlpy: add per-table toggle, LWT exemption, and multi-category tests
  test/cqlpy: add large collection guardrail tests
  test/cqlpy: add large row guardrail tests
  test/cqlpy: add large partition guardrail tests
  test/boost: add large_data_guardrail unit tests
  test/cluster: add large data guardrails rolling upgrade test
  replica: wire large_data_guardrail into the write path
  schema: add per-table large_data_guardrails_enabled flag
  db: implement large_data_guardrail
  db: implement large_data_record_index
  sstables: add intrusive index hook to large_data_record
  db: add large_collection_elements_fail_threshold config option
  db: add large_row_fail_threshold_mb config option
  db: add rows_count_fail_threshold config option
  db: add large_partition_fail_threshold_mb config option
  replica: introduce large_data_exception
2026-06-01 13:26:00 +03:00
Taras Veretilnyk
9abf594397 test/cqlpy: add per-table toggle, LWT exemption, and multi-category tests
Per-table toggle: disabled-at-create, alter-disable, alter-reenable.
LWT exemption: Paxos learn must bypass the guardrail.
Multi-category independence: all three guardrails warn/reject
independently when SSTable records span partition, row, and collection
categories.
2026-05-29 12:51:43 +02:00
Taras Veretilnyk
7d365844a3 test/cqlpy: add large collection guardrail tests
Tests for collection element-count guardrail: hard-limit rejection,
disabled-when-zero, soft-limit log warning, and no-warning below
threshold.
2026-05-29 12:51:43 +02:00
Taras Veretilnyk
19a9e45da8 test/cqlpy: add large row guardrail tests
Tests for row-size guardrail: hard-limit rejection, disabled-when-zero,
soft-limit log warning, and no-warning below threshold.
2026-05-29 12:51:42 +02:00
Taras Veretilnyk
67b659e2bf test/cqlpy: add large partition guardrail tests
Tests for partition size and row-count guardrails: hard-limit rejection,
disabled-when-zero, soft-limit log warnings, and no-warning below
threshold.  Includes shared helpers and log assertion utilities used by
subsequent commits.
2026-05-29 12:51:42 +02:00
Nadav Har'El
21ecc12fc6 Merge 'index: fix local vector index locality detection after schema reload' from Michał Hudobski
After schema reload, `target_parser::is_local()` did not recognize the
vector-index local target format `{"pk": [...], "tc": "..."}`, causing
local vector indexes to be treated as global. This broke duplicate
detection when both a global and a local vector index existed on the same
column. Fix by introducing `vector_index::is_local()` and dispatching
to it from `create_index_from_index_row()` based on the index class.
Also adds tests for local/global vector index coexistence.
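
A rough sketch of the locality check, assuming the stored target is either a plain column name (global) or a JSON object with `pk` and `tc` keys (local), as in the format quoted above. `looks_like_local_vector_target` is a hypothetical name, and the real `vector_index::is_local()` may differ:

```python
import json


def looks_like_local_vector_target(target: str) -> bool:
    """Return True if the target string has the local vector-index shape."""
    try:
        parsed = json.loads(target)
    except ValueError:
        # Not JSON at all: treated as a plain column name in this sketch.
        return False
    return isinstance(parsed, dict) and "pk" in parsed and "tc" in parsed
```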

Fixes: SCYLLADB-987

backport reasoning: we added local vector index support in 2026.1

Closes scylladb/scylladb#29492

* github.com:scylladb/scylladb:
  test/cqlpy: add tests for global and local vector index coexistence
  index: fix local vector index locality detection after schema reload
2026-05-27 15:34:57 +03:00
Wojciech Mitros
ae0d77257f mv: fix view_update_builder losing fragments across batch boundaries
When a mutation generates more view updates than max_rows_for_view_updates
(100), view_update_builder::build_some() splits the work into multiple
batches. There was a bug in how fragments were read between batches:

When should_stop_updates() returned true, the old code called stop()
which returned stop_iteration::yes without reading the next fragments.
On the next build_some() call, read_both_next_fragments() was called
at the start, which advanced BOTH readers - skipping any fragment that
was already read but not yet consumed. A row could remain unconsumed if
either:
- the 100th (last in the batch) update was a row insertion and we still
  had insertions/updates remaining
- the 100th (last in the batch) update was a row deletion and we still
  had deletions/updates remaining
For the most common case where work is split in batches, i.e. range
deletions, we couldn't hit this because range delete generates only
view row deletions.
On tables with a single materialized view, we also couldn't get this
for any batches with fewer than 50 statements (unless the batch also
contained range deletions), because one non-range-delete update can
generate up to 2 view updates.
However, for a range of scenarios outside these two, we could lose
view updates, resulting in persistent inconsistencies.

The fix:
- read_*_next_fragment() now accept a stop_iteration parameter, so the
  next fragments are always read after consuming (even when stopping),
  but stop_iteration::yes is correctly propagated to break the loop.
- build_some() no longer re-reads fragments at the start. Instead, an
  initialize() method performs the initial read once at construction.
- because we now advance readers only after consuming, we won't advance
  readers after end_of_partition, so we extend the break condition to
  accept readers either evaluating to `false` or being at
  end_of_partition. We also handle the optimization with
  _skip_row_updates.
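
The batch-boundary problem can be illustrated with a toy model. The sketch below is not the real view_update_builder: it merges two sorted key streams ("updates" and "existings") into batches, and the `reread_at_start` flag switches between the old behavior (stop without advancing, then advance both readers at the start of the next batch) and the fixed one (read once up front, always advance after consuming):

```python
def build_batches(updates, existings, limit, reread_at_start):
    up, ex = iter(updates), iter(existings)
    u = e = None
    if not reread_at_start:
        u, e = next(up, None), next(ex, None)  # initial read, done once
    batches, done = [], False
    while not done:
        if reread_at_start:
            # Old behavior: advances BOTH readers, even if one of them
            # still holds a fragment that was read but never consumed.
            u, e = next(up, None), next(ex, None)
        batch = []
        while True:
            if u is None and e is None:
                done = True
                break
            take_u = u is not None and (e is None or u <= e)
            take_e = e is not None and (u is None or e <= u)
            kind = "update" if take_u and take_e else "insert" if take_u else "delete"
            batch.append((kind, u if take_u else e))
            stop = len(batch) >= limit
            if stop and reread_at_start:
                break  # old behavior: stop without reading the next fragments
            if take_u:
                u = next(up, None)
            if take_e:
                e = next(ex, None)
            if stop:
                break
        if batch:
            batches.append(batch)
    return batches
```

With `updates=[1, 2]`, `existings=[3]` and `limit=1`, the fixed variant yields events for keys 1, 2 and 3, while the old variant never emits key 3: it was read alongside key 1, left unconsumed at the batch boundary, and then skipped by the re-read.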

Fixes: scylladb/scylladb#29155

Closes scylladb/scylladb#29498
2026-05-26 14:15:12 +02:00
Nadav Har'El
f65a52f3ec Merge 'vector_search: test: migrate rescoring tests from C++/Boost to pytest' from Szymon Malewski
Migrate mock-based rescoring and oversampling tests from
test/vector_search/rescoring_test.cc to pytest and delete the C++ file.
Index option validation tests go to test_vector_index.py; rescoring tests
go to a new test_vector_search_rescoring.py which introduces shared
infrastructure (EmbeddingRow dataclass, TEST_DATA dict,
reversed_ann_response() helper, rescoring_test_table() context manager).

Two tests have updated assertions (semantic change):
filters_invalid_similarity_scores now uses per-function expected result
sets including a zero-vector row, and rescoring_with_zerovector_query
asserts empty results after NaN filtering (cosine only). Both are marked
xfail pending SCYLLADB-924.

Follow-up to #29593.

Does not require backport - simple refactoring of tests

Closes scylladb/scylladb#29906

* github.com:scylladb/scylladb:
  test/vector_search: migrate zero-vector query rescoring test to pytest; delete rescoring_test.cc
  test/vector_search: migrate invalid similarity score filtering test to pytest
  test/vector_search: migrate non-ANN similarity argument rescoring test to pytest
  test/vector_search: migrate wildcard select rescoring test to pytest
  test/vector_search: migrate similarity_function rescoring test to pytest
  test/vector_search: migrate rescoring and f32 quantization tests to pytest
  test/vector_search: migrate oversampling tests to pytest
  test/vector_search: migrate vector_index option validation tests to pytest
2026-05-26 09:45:40 +03:00
Szymon Malewski
2151a4fac3 test/vector_search: migrate zero-vector query rescoring test to pytest; delete rescoring_test.cc
Migrate rescoring_with_zerovector_query from rescoring_test.cc to pytest
as test_rescoring_with_zerovector_query. Tested with cosine similarity only
because zero vectors produce NaN only for cosine; other functions yield
valid scores.

The test is marked xfail: similarity_cosine now returns NaN for zero vectors
(SCYLLADB-456 fix) and rescoring should filter out NaN scores, yielding an
empty result set.

Semantic change: the test now asserts the desired empty-result behavior
instead of asserting that the query does not throw.

Delete rescoring_test.cc now that all tests have been migrated and remove
its entries from configure.py and test/vector_search/CMakeLists.txt.
2026-05-26 00:37:54 +02:00
Szymon Malewski
533a8e65fe test/vector_search: migrate invalid similarity score filtering test to pytest
Migrate no_nulls_in_rescored_results from rescoring_test.cc to pytest,
renamed to test_filters_invalid_similarity_scores_in_rescored_results.

The test now also inserts a zero-vector row (id=14) to cover the case
introduced when similarity_cosine was changed to return NaN for zero
vectors instead of throwing (SCYLLADB-456). The expected surviving set
of rows is refined per similarity function based on which inputs produce
valid (non-NaN, non-Infinity) similarity scores. Marked xfail because
rescoring does not yet filter rows with invalid scores.

Semantic change: the expected surviving row set is updated per the
behavior described above.
2026-05-26 00:37:54 +02:00
Szymon Malewski
63d9b7445f test/vector_search: migrate non-ANN similarity argument rescoring test to pytest
Migrate select_similarity_function_other_than_ann_ordering from
rescoring_test.cc to pytest. The test verifies that similarity scores in
SELECT are computed against the explicitly supplied argument vector rather
than the ANN ordering vector. No semantic change.
2026-05-26 00:37:54 +02:00
Szymon Malewski
0cb557695a test/vector_search: migrate wildcard select rescoring test to pytest
Migrate wildcard_select_is_correctly_rescored from rescoring_test.cc to
pytest. The test verifies that SELECT * with rescoring returns rows in
the correct similarity order with correct embedding values, covering a
slightly different processing path from the explicit-column SELECT test.
No semantic change.
2026-05-26 00:37:53 +02:00
Szymon Malewski
cae816a8c6 test/vector_search: migrate similarity_function rescoring test to pytest
Migrate similarity_function_returns_correctly_rescored_results from
rescoring_test.cc to pytest. The test verifies that similarity scores
in the SELECT clause are computed correctly after rescoring, for both
argument orderings of the similarity function. No semantic change.
2026-05-26 00:37:53 +02:00
Szymon Malewski
78d72309b8 test/vector_search: migrate rescoring and f32 quantization tests to pytest
Introduce shared test infrastructure in test_vector_search_rescoring.py:
EmbeddingRow dataclass, TEST_DATA dict keyed by similarity function name,
ANN_QUERY_VECTOR, reversed_ann_response() helper, and rescoring_test_table()
context manager.

Migrate result_returned_by_vector_store_is_rescored and
f32_quantization_disables_rescoring from rescoring_test.cc. No semantic change.
2026-05-26 00:37:53 +02:00
Szymon Malewski
400c0dbb22 test/vector_search: migrate oversampling tests to pytest
Migrate oversampling_multiplies_limit_for_vector_store_query and
oversampled_vector_store_results_are_limited_to_cql_limit from
rescoring_test.cc to test_vector_search_rescoring_with_mock.py.
No semantic change.
2026-05-26 00:37:53 +02:00
Szymon Malewski
9f632182fb test/vector_search: migrate vector_index option validation tests to pytest
CREATE INDEX option tests for quantization, oversampling, and rescoring
are moved from rescoring_test.cc to test_vector_index.py alongside the
existing index option tests. These tests exercise only option parsing and
validation - no vector store mock needed. No semantic change.
2026-05-26 00:37:52 +02:00
Nadav Har'El
96dd3121e7 Merge 'cql: rewrite CassIO SAI metadata index to regular secondary index' from Szymon Wasik
CassIO (the library backing LangChain's `langchain_community.vectorstores.Cassandra` integration) issues the following DDL during schema setup to create a metadata index:

```sql
CREATE CUSTOM INDEX IF NOT EXISTS eidx_metadata_s_<table>
ON <keyspace>.<table> (ENTRIES(metadata_s))
USING 'org.apache.cassandra.index.sai.StorageAttachedIndex';
```

ScyllaDB does not support Cassandra's StorageAttachedIndex (SAI) for non-vector columns and previously rejected this statement with:

```
StorageAttachedIndex (SAI) is only supported on vector columns; use a secondary index for non-vector columns
```

This blocks seamless migration of existing LangChain/CassIO applications from Cassandra to ScyllaDB — applications fail during initialization before any application-level workaround can run, even when metadata filtering is not used (`metadata_indexing="none"`).

CassIO is no longer actively maintained but remains the only official LangChain integration path for Apache Cassandra over CQL, meaning existing applications will continue using this setup pattern.

Instead of rejecting the CassIO metadata-map SAI DDL, detect the pattern and rewrite it to a standard ScyllaDB secondary index on collection entries:

- **Detection**: SAI class name + single `ENTRIES` target on a non-frozen `map` column
- **Rewrite**: Clear the custom class so the index is created through the standard secondary index path (which already fully supports indexing map entries)
- **Warning**: Emit a CQL warning informing the user that SAI is not supported by ScyllaDB, a regular secondary index was created instead, and metadata filtering behavior may differ from Cassandra SAI
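
The detection rule can be sketched as a predicate. This is an illustrative Python model, not the C++ code in `validate_while_executing()`; `IndexTarget`, `is_cassio_metadata_index`, the string-based type check, and the exact-match class-name comparison are all assumptions made for the example:

```python
from dataclasses import dataclass

# Both spellings appear in the description: the full class name and 'sai'.
SAI_CLASS_NAMES = {
    "org.apache.cassandra.index.sai.StorageAttachedIndex",
    "sai",
}


@dataclass
class IndexTarget:
    column_type: str  # e.g. "map<text, text>" or "frozen<map<text, text>>"
    kind: str         # "ENTRIES", "KEYS", "VALUES", ...


def is_cassio_metadata_index(custom_class, targets):
    """True for: SAI class + a single ENTRIES target on a non-frozen map."""
    if custom_class not in SAI_CLASS_NAMES:
        return False
    if len(targets) != 1:
        return False
    target = targets[0]
    return target.kind == "ENTRIES" and target.column_type.startswith("map<")
```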

The rewrite is placed early in `validate_while_executing()`, before the rf-rack-validity check, so the standard secondary index code path handles all subsequent validation naturally — no code duplication.

After this change, the CassIO schema setup succeeds on ScyllaDB:
- `CREATE CUSTOM INDEX ... USING 'sai'` on `ENTRIES(metadata_s)` creates a real secondary index
- The index is functional and can accelerate metadata filtering queries
- A CQL warning makes the rewrite transparent to operators
- SAI on non-vector, non-map-entries columns is still rejected as before
- Vector SAI indexes continue to be rewritten to `vector_index` as before

- `test_sai_entries_on_map_creates_regular_index` — verifies the index is created and the warning is emitted (fully-qualified SAI class name)
- `test_sai_entries_on_map_short_name` — same with the `'sai'` short alias
- `test_sai_on_regular_column_rejected` — confirms SAI on regular scalar columns is still rejected

All 148 tests in `test_vector_index.py` and `test_secondary_index.py` pass with no regressions (125 passed, 22 xfailed, 1 skipped).

Fixes: SCYLLADB-2113
Backport: 2026.2 as this is the version where the support for SAI class needed by LangChain was added.

Closes scylladb/scylladb#29981

* github.com:scylladb/scylladb:
  cql: rewrite CassIO SAI metadata index to regular secondary index
  db/config: add enable_cassio_compatibility flag
2026-05-26 00:19:03 +03:00
Michał Hudobski
1d17d2144f index, vector_index: limit primary key columns to 255
The vector-store's InvariantKey type supports at most 255 key
components. Reject vector index creation when the base table's
primary key (partition + clustering columns) exceeds this limit.

Fixes: VECTOR-553

Closes scylladb/scylladb#29317
2026-05-25 19:24:17 +03:00
Szymon Wasik
5ee339b11d cql: rewrite CassIO SAI metadata index to regular secondary index
When CassIO creates a SAI ENTRIES index on a map column,
ScyllaDB now rewrites it to a regular secondary index and emits
a CQL warning. This allows LangChain/CassIO applications to work
without DDL errors.

The rewrite is gated behind the enable_cassio_compatibility flag
(disabled by default).

Refs: SCYLLADB-2113
2026-05-25 15:11:43 +02:00
Dmitry Kropachev
74fa423271 transport: report host id in SUPPORTED
Currently the driver builds the network layout (node IP addresses and ports)
from `system.local`, `system.peers`, and `system.client_routes`, and then
assumes that this layout is correct without verifying it.
If, for example, a node's IP/port (say, on a proxy) does not match what
the driver calculated, the mismatch goes unnoticed.

The goal of this feature is to report the host id to the driver in the
SUPPORTED frame, so that it knows which node it connected to and can decide
whether to keep the connection or drop it.

- add `SCYLLA_HOST_ID` to the CQL `SUPPORTED` response
- add a regression test that hooks the Python driver handshake and
  verifies the reported host id
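
On the driver side, the check this enables might look like the sketch below. It assumes the SUPPORTED options are available as a mapping from option name to a list of values; `connected_to_expected_host` is a hypothetical helper, and no existing driver API is implied:

```python
def connected_to_expected_host(supported_options, expected_host_id):
    """Compare the host id reported in SUPPORTED with the expected one.

    Returns None when the server does not report SCYLLA_HOST_ID, so the
    caller can fall back to its previous behavior.
    """
    reported = supported_options.get("SCYLLA_HOST_ID", [])
    if not reported:
        return None
    return reported[0] == expected_host_id
```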

- `python3.12 -m py_compile test/cqlpy/test_protocol_exceptions.py`
- syntax-only compile of `transport/server.cc` with the repo toolchain
  flags inside `dbuild`

Refs #27452
Refs https://scylladb.atlassian.net/browse/DRIVER-610

Closes scylladb/scylladb#29809
2026-05-25 14:36:53 +03:00
Nadav Har'El
f8aaeb5e87 cql: atomic add/subtract operations with LWT
ScyllaDB has special counter columns for which atomic add/subtract
operations like `SET a = a + 1` are allowed. Such operations have not
been allowed on ordinary non-counter columns, as they would not be
properly atomic - the read and the write are separate, and concurrent
operations can have incorrect results.

This patch makes it allowed to use such atomic add/subtract operations
in *LWT* statements. Some examples:

        UPDATE ... SET a = a - 1 IF a > 0

        UPDATE ... SET a = a + 1 IF EXISTS

        UPDATE ... SET a = a + 1 IF a != NULL

The row updated in the operation, and the updated column (a) should
be initialized before the update - arithmetic operations on missing
column values silently leave the column null (no error is generated).
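
The semantics can be modeled with a toy function. In this single-threaded sketch the condition check and the write happen in one call, standing in for the atomicity that LWT provides; `lwt_add` and the dict-based row are illustrative only:

```python
def lwt_add(row, column, delta, condition):
    """Apply `column = column + delta` if `condition(row)` holds.

    Returns True if the update was applied, False otherwise.
    """
    if not condition(row):
        return False
    old = row.get(column)
    # Arithmetic on a missing value leaves the column null, with no error.
    row[column] = None if old is None else old + delta
    return True


row = {"a": 1}
positive = lambda r: r["a"] is not None and r["a"] > 0
first = lwt_add(row, "a", -1, positive)   # applied, a becomes 0
second = lwt_add(row, "a", -1, positive)  # not applied, a stays 0
```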

These add/subtract operations are allowed on any numeric column -
integer or floating point of any size.

The ability of LWT to fetch the old value of a column and use it to
calculate the new value has long been available in our internal CAS
implementation - and has been in use for years in Alternator - but until
this patch it was not exposed in CQL's LWT.

This patch does not add new syntax to CQL - the "SET a = a + b"
and "SET a = a - b" syntax that already existed for counters is now
allowed for non-counters.

This is a new Scylla-only feature that does not exist in Cassandra.

Fixes #10568

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-05-25 10:09:11 +03:00
Yaniv Michael Kaul
bb69ae5a02 test: assert ALTER TYPE RENAME rejected on frozen PK UDTs
Add assertion that ALTER TYPE RENAME is rejected when the UDT is used
as a frozen partition key column. The existing test only covered ALTER
TYPE ADD. This closes the coverage gap from dtest
udtencoding_test.py::test_udt_change_in_partition_key, enabling its
removal.

Refs: SCYLLADB-1929

Closes scylladb/scylladb#29840
2026-05-22 12:29:43 +02:00
Avi Kivity
e35c388f65 cql3: limit nesting depth of function calls and CASTs in CQL parser
Deeply nested expressions like f(f(f(...))) can overflow the evaluator
stack. Add depth tracking in the recursive entry points of the CQL
grammar (unaliasedSelector, term, relation), rejecting expressions that
exceed the max_expression_nesting limit (12) with a SyntaxException.
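
Depth tracking of this kind can be sketched as follows. The example counts parentheses as a stand-in for the grammar's recursive entry points, and assumes that a depth of exactly 12 is still accepted; `check_nesting` and `NestingTooDeep` are hypothetical names:

```python
MAX_EXPRESSION_NESTING = 12


class NestingTooDeep(Exception):
    """Stand-in for the SyntaxException raised by the parser."""


def check_nesting(expr, limit=MAX_EXPRESSION_NESTING):
    depth = 0
    for ch in expr:
        if ch == "(":
            depth += 1  # entering a nested call, cast, or relation
            if depth > limit:
                raise NestingTooDeep(f"nesting depth exceeds {limit}")
        elif ch == ")":
            depth -= 1  # leaving it again
    return True
```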

CVE-2026-31948

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1003
2026-05-21 23:24:03 +03:00
Avi Kivity
c27e32299d test/cqlpy: add tests for deeply nested function calls and CASTs
Deeply nested expressions like f(f(f(f(f(x))))) overflow the evaluator
stack. Add tests that verify such expressions are rejected by the parser.

Tests cover all recursive paths in Cql.g:
  - function calls in selectors (unaliasedSelector -> selectionFunctionArgs)
  - CAST in selectors (unaliasedSelector -> K_CAST -> unaliasedSelector)
  - function calls in terms (term -> functionArgs -> term)
  - C-style casts in terms (term -> '(' comparatorType ')' -> term)
  - parenthesized relations (relation -> '(' relation ')')

Marked skip pending the parser fix.
2026-05-21 23:22:30 +03:00
Dario Mirovic
f9e8518776 cql: fix request-side custom payload parsing
When a CQL client sends a request with the CUSTOM_PAYLOAD flag (0x04)
set, the frame body starts with a [bytes map] before the message.
Scylla never implemented parsing of this map on the request side.
This caused it to fail parsing with protocol errors such as
"truncated frame: expected 65546 bytes".

Fix this by skipping over the custom payload [bytes map] from the frame
body before dispatching to opcode-specific handlers. The payload
contents are discarded since Scylla has no pluggable QueryHandler.
Cassandra's default QueryHandler also discards them.
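
Skipping a [bytes map] can be sketched in Python as below, following the native protocol's definition: a [short] pair count, then for each pair a [string] key ([short] length plus bytes) and a [bytes] value ([int] length plus bytes, where a negative length means null). `skip_bytes_map` is an illustrative name, not the function in `transport/server.cc`:

```python
import struct


def skip_bytes_map(body: bytes, offset: int = 0) -> int:
    """Return the offset of the first byte after the [bytes map]."""
    try:
        (count,) = struct.unpack_from(">H", body, offset)
        offset += 2
        for _ in range(count):
            (key_len,) = struct.unpack_from(">H", body, offset)
            offset += 2 + key_len          # skip the [string] key
            (value_len,) = struct.unpack_from(">i", body, offset)
            offset += 4
            if value_len > 0:              # negative length means null
                offset += value_len        # skip the [bytes] value
    except struct.error as exc:
        raise ValueError("truncated frame") from exc
    if offset > len(body):
        raise ValueError("truncated frame")
    return offset
```

The opcode-specific handler would then start parsing at the returned offset.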

Fixes SCYLLADB-745
2026-05-21 18:36:37 +02:00
Dario Mirovic
8e6d2d0631 test/cqlpy: add tests for request-side custom payload handling
Add tests that verify Scylla's handling of CQL native protocol
requests with the CUSTOM_PAYLOAD flag (0x04) set. Each test asserts
the specific parse error that the unfixed server produces.

A separate CQL session is used for each test. The protocol error
kills the driver connection, and we need to catch it properly.

Refs SCYLLADB-745
2026-05-21 18:34:43 +02:00
Michał Hudobski
119ef942f8 test/cqlpy: add tests for global and local vector index coexistence
Add integration tests verifying that both a global and a local vector
index can be created on the same column without triggering a spurious
"duplicate custom index" error. This was fixed by #29407.

Tests cover:
- Creating global+local and local+global index pairs on the same column.
- Duplicate detection still rejects a second index of the same locality.
- IF NOT EXISTS is a no-op for a duplicate same-locality index (and
  verifies no extra index is created).
- IF NOT EXISTS with a different locality creates both indexes.
- Two indexes with the same name on different tables are rejected
  (partially validates VECTOR-643).

Fixes: SCYLLADB-987
2026-05-21 10:35:48 +02:00
Dawid Pawlik
6387c61506 test/cqlpy: add duplicate and view tests for fulltext index
Verify that fulltext indexes, which have no backing materialized view,
correctly reject duplicate index creation and respect IF NOT EXISTS
semantics. Named indexes must not be created twice under the same name;
unnamed indexes on the same column must be detected as duplicates.
IF NOT EXISTS must silently succeed rather than create a second index,
including the known edge cases where the same name is reused across
different tables or columns in the same keyspace (VECTOR-641).
2026-05-19 08:52:47 +02:00
Dawid Pawlik
232b1a3725 cql3: generalize viewless index handling in CREATE INDEX statement
Replace the `vector_index`-specific checks in `create_index_statement`
with a generic `is_viewless_custom_class()` helper that queries the
index factory to determine whether an index type creates a backing
materialized view.

This covers both existing (`vector_index`) and new (`fulltext_index`)
viewless index types:
- Reject view properties (WITH clause) for any viewless index
- Use name-based duplicate detection for named viewless indexes,
  since they have no backing view table for `has_schema()` to find
  (issue #26672)
2026-05-19 08:52:47 +02:00
Dawid Pawlik
215a1e3f00 test/cqlpy: add CDC validation tests for fulltext index
Verify that fulltext index creation and ALTER TABLE enforce the
CDC requirements: creation is rejected when TTL is below the 24-hour
minimum, or when the delta mode is neither 'full' nor compensated
by postimage. Also verify that enabling postimage or full delta mode
allows index creation to succeed, that DROP INDEX works,
and that ALTER TABLE cannot disable CDC while a fulltext index
is present.
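
The CDC requirements listed above can be summarized as a predicate. This is a simplified sketch that assumes the TTL is given in seconds and ignores any special TTL values; `cdc_options_allow_fulltext_index` is a hypothetical name:

```python
MIN_CDC_TTL_SECONDS = 24 * 60 * 60  # the 24-hour minimum from the commit


def cdc_options_allow_fulltext_index(ttl_seconds, delta, postimage):
    """True if the CDC options satisfy the fulltext index requirements."""
    if ttl_seconds < MIN_CDC_TTL_SECONDS:
        return False
    # Delta mode must be 'full', unless postimage compensates for it.
    return delta == "full" or postimage
```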
2026-05-19 08:52:47 +02:00
Dawid Pawlik
558de64773 test/cqlpy: add tablet requirement test for fulltext index
Add `test_create_fulltext_index_requires_tablets` to verify that
creating a fulltext index on a keyspace with tablets disabled is
rejected.
2026-05-19 08:52:47 +02:00
Dawid Pawlik
69dc62c373 fulltext_index: require tablet storage for fulltext indexes
Fulltext indexes, like vector indexes, require the base table's
keyspace to use tablets. Add `check_uses_tablets()` validation to
`fulltext_index::validate()` that rejects index creation when the
keyspace does not use tablet storage.

Also add `skip_without_tablets` fixture to all existing fulltext index
tests so they are skipped in environments where tablets are not
available.
2026-05-19 08:52:47 +02:00
Dawid Pawlik
61d658106a index: introduce external_index base class for VS/FTS indexes
Add `external_index` as a common base for `vector_index` and `fulltext_index`,
both of which are backed by an external Vector Store engine and share CDC
requirements.
2026-05-19 08:52:47 +02:00
Dawid Pawlik
c2d27d1a50 index: remove Chinese, Japanese, and Korean language analyzers
Remove "chinese", "japanese", and "korean" from the list of accepted
full-text search analyzer options. Exposing these options commits
ScyllaDB to supporting them long-term — if we ever switch from one
backend search engine to another, CJK analyzers are the most likely
to lose out-of-the-box support, unlike the popular European languages
that are broadly available across text analysis libraries.

Restrict the accepted set now, while FTS is still new, to avoid a
future compatibility burden.

Add a test to check if the CJK language analyzer options are rejected.

Fixes: VECTOR-672

Closes scylladb/scylladb#29877
2026-05-18 18:20:47 +03:00
Szymon Malewski
15493872b2 vector_search: fix decimal/varint precision loss in filter value_to_json()
value_to_json() converts CQL values to JSON for vector search filters.
For decimal and varint types, it used rjson::parse() on the JSON string,
which parses through a double and silently loses precision for values
exceeding ~15 significant digits — producing wrong filter results.
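
The precision loss is easy to reproduce in Python, where `float` is an IEEE 754 double. This only illustrates the effect and does not use rjson:

```python
def parse_via_double(text: str) -> int:
    # Lossy path: the value passes through a 64-bit double.
    return int(float(text))


def parse_exact(text: str) -> int:
    # Exact path: arbitrary-precision integer parsing.
    return int(text)


big = "12345678901234567890"
lossy = parse_via_double(big)
exact = parse_exact(big)
# The two differ in the low-order digits, so an equality filter built from
# the lossy value would not match the stored varint.
```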

Additionally, for decimal type we need an exact string representation
that preserves the original (unscaled, scale) pair, because partition
keys use byte-level identity: different serialized representations of
the same numeric value are distinct rows, so the filter must reproduce
the exact representation stored in the key.

Add big_decimal::to_string_canonical() which follows the Java BigDecimal
toString() spec (JDK 8+), producing a bijective string representation
that uses exponential notation for extreme scales instead of expanding
trailing zeros (which could cause OOM). This could replace to_string(),
but doing so has wider consequences (e.g. hash/equality contract for
decimal_type) described in SCYLLADB-1574. Use it in value_to_json() for
decimal_type, and use rjson::from_string() for varint_type, both
bypassing the lossy double parse path.
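
The Java BigDecimal toString() rule referred to above can be sketched in Python as follows. This is an independent illustration of that rule for an (unscaled, scale) pair, not a port of `big_decimal::to_string_canonical()`:

```python
def to_string_canonical(unscaled: int, scale: int) -> str:
    sign = "-" if unscaled < 0 else ""
    digits = str(abs(unscaled))
    adjusted = len(digits) - 1 - scale  # exponent of the leading digit
    if scale >= 0 and adjusted >= -6:
        # Plain notation: insert the decimal point `scale` digits from the right.
        if scale == 0:
            return sign + digits
        if len(digits) > scale:
            return sign + digits[:-scale] + "." + digits[-scale:]
        return sign + "0." + "0" * (scale - len(digits)) + digits
    # Exponential notation avoids expanding long runs of zeros.
    mantissa = digits[0] + ("." + digits[1:] if len(digits) > 1 else "")
    return f"{sign}{mantissa}E{adjusted:+d}"
```

Because the output depends on both the unscaled value and the scale, numerically equal decimals with different scales, such as (1, 0) and (10, 1), map to different strings ("1" and "1.0").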

Tests cover the new to_string_canonical() and the filter fix, as well as
existing decimal type behavior (key representation, clustering order,
toJson) that we rely on and must not break. The CQL decimal type tests
(test_type_decimal.py) also pass against Cassandra.

Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1583
Refs: https://scylladb.atlassian.net/browse/SCYLLADB-1574

Closes scylladb/scylladb#29505
2026-05-18 17:07:26 +03:00
Evgeniy Naydanov
39a10d6d67 test: remove dead suite subclasses and legacy execution pipeline
After all test suites migrated to test_config.yaml with type: Python,
the specialized suite classes (Topology, CQLApproval, Run, Tool) and
the legacy execution pipeline (find_tests, run_test, TestSuite.run,
Test.run) became unreachable. Remove all this dead code.

Deleted files:
- suite/topology.py, suite/cql_approval.py, suite/run.py, suite/tool.py

Simplified:
- base.py: remove run_test(), read_log(), TestSuite.run(),
  add_test_list(), build_test_list(), all_tests(), test_count(),
  SUITE_CONFIG_FILENAME, disabled/flaky test tracking, and dead
  Test attributes (args, core_args, valid_exit_codes, allure_dir,
  is_flaky, is_cancelled, etc.)
- python.py: remove PythonTestSuite.run(), PythonTest.run(),
  _prepare_pytest_params(), pattern, test_file_ext, xmlout,
  server_log, scylla_env setup, and shlex import.
  Simplify run_ctx() to take no parameters.
- runner.py: remove --scylla-log-filename option,
  print_scylla_log_filename fixture, SUITE_CONFIG_FILENAME import,
  and suite.yaml probe in TestSuiteConfig.from_pytest_node().
- __init__.py: remove re-exports of deleted classes.
- test_config.yaml: Topology -> Python, Approval -> Python.
- conftest files: run_ctx(options=...) -> run_ctx().
- docs/dev/testing.md: update to reflect current pytest-based
  architecture, log paths, and removed features.

Co-Authored-By: Claude Opus 4.6 (200K context) <noreply@anthropic.com>

Closes scylladb/scylladb#29613
2026-05-17 22:16:31 +03:00
Marcin Maliszkiewicz
ec8f8e3a5b Merge 'test: make test_vector_search_with_vector_store_mock 30 times faster!' from Nadav Har'El
Before this patch,
```
test/cqlpy/run test_vector_search_with_vector_store_mock.py
```

Took 34 seconds.

After this patch, it takes **1 second**.

Look at the individual patches for how the magic happened. The first patch lowers the test duration from 34 to 5 seconds, the second patch lowers it further to 1 second.

Closes scylladb/scylladb#29891

* github.com:scylladb/scylladb:
  test/cqlpy: make test_vector_search_with_vector_store_mock faster
  vector-search: reset DNS timeout after changing host
2026-05-14 17:12:47 +02:00
Botond Dénes
1403f18240 Merge 'alternator: add more vector search features' from Nadav Har'El
Recently (in commit 37fc1507f0) we added vector search support for Alternator.
That implementation was functional, but did not yet support all the features that we had envisioned.

This patch series adds some of the missing features to Alternator's vector search. Each feature is described in more detail in its own patch.

* Metrics related to vector search usage in Alternator.
* `SimilarityFunction` option when creating a vector index to choose the similarity function. Defaults to `COSINE` (the existing default). Other options are `DOT_PRODUCT` and `EUCLIDEAN`.
* An optimized vector type, `{"FLOAT32VECTOR": [1.0, 2.0, ..]}`, which is stored on disk efficiently as 32-bit floats, **not** a JSON.
* A Query VectorSearch option `ReturnScores` asking to return the similarity score calculated for each returned result (the results are sorted in decreasing similarity score - the highest similarity is the best and returned first).

Closes scylladb/scylladb#29554

* github.com:scylladb/scylladb:
  alternator: add ReturnScores option to VectorSearch
  vector_store_client: read and return similarity_scores
  alternator: add optimized vector type for vector search
  alternator: add SimilarityFunction option to vector index creation
  alternator: add vector search metrics
2026-05-14 10:41:41 +03:00
Nadav Har'El
5c065c7746 test/cqlpy: make test_vector_search_with_vector_store_mock faster
The previous patch made test_vector_search_with_vector_store_mock
significantly faster, but at 5 seconds for 7 tests, it was still not
fast enough.

It turns out that the reason why the tests were slow is that each test
used a function-scoped fixture, which set up the vector store mock
again and again, separately for each test. This - especially waiting
for the client in Scylla to recognize the new server - took time
(before the previous patch it was 5 seconds, after the patch it
went down to 0.5 seconds - but still too slow).

The solution is simple:
1. Create a *module* scoped fixture that creates the mock and connects
   it to Scylla just once for all the tests in that file.
2. The *function* scoped fixture just uses the module-scoped one but
   resets the saved responses, to avoid one test influencing the other.

After this patch, the time to run this test file is down to 1 second (!).
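
The two-level fixture arrangement described above might look roughly like the sketch below. FakeVectorStore and both fixture names are hypothetical stand-ins, not the names used in the test file, and the real mock also has to be connected to Scylla.

```python
import pytest


class FakeVectorStore:
    """Hypothetical stand-in for the vector store mock."""

    def __init__(self):
        self.responses = []  # canned responses queued by a test

    def reset(self):
        self.responses.clear()


@pytest.fixture(scope="module")
def vector_store_module():
    # Expensive setup happens once per test file. In the real test this is
    # where the mock is started and Scylla is pointed at it.
    store = FakeVectorStore()
    yield store


@pytest.fixture
def vector_store(vector_store_module):
    # Cheap per-test step: reuse the shared mock, but drop saved responses
    # so that one test cannot influence the next.
    vector_store_module.reset()
    return vector_store_module
```

Tests keep requesting the function-scoped fixture, so they need no changes beyond the fixture definitions.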

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-05-13 14:57:56 +03:00
Nadav Har'El
c56361a6d7 vector_store_client: read and return similarity_scores
The vector store returns for every ANN search, in addition to the keys
of the matching items, two additional vectors - "distances" and
"similarity_scores". The "distances" are raw distance metrics - lower
scores are better matches, while "similarity_scores" are modified
such that higher scores are better matches.

Traditionally, search scores in systems like Cassandra and OpenSearch
use the "similarity scores" approach (higher is better, results are
returned in decreasing similarity order), so this is the more interesting
vector of the two.

But before this patch, our vector_store_client::ann() inspected
only "distances". But... then, it didn't return even that to the
caller :-)

So in this patch, we:

1. Ignore "distances" and instead look at "similarity_scores",
   which is what users really want based on their experience with
   other vector and non-vector search engines.

2. Return the similarity score of each match together with the match.
   We already have this score (the vector store returns it) and we
   can add it to the existing primary_key structure of each result.
   So each result is a "struct primary_key" which has fields partition,
   clustering, and after this patch - similarity.

Existing callers in CQL and Alternator vector search will ignore this
"similarity" field in each result, and not notice it was added.
But in the next patch, we'll allow Alternator's vector search to
return this similarity in each result.

The existing unit tests for vector_store_client.cc mocked vector-store
responses with "distances", without "similarity_scores", so no longer
represent what we actually expect the vector store to do. So this patch
also contains modifications for these tests, to mock and to test
"similarity_scores" - not "distances". The more interesting tests, in
the next patch, use the real vector store and check that we really do
get a "similarity_scores" response from it.

This patch also handles a small corner case for DOT_PRODUCT, which is
the only unbounded similarity function. If the similarity overflows
the 32-bit float, the vector store returns a JSON "null" instead of
a JSON number (since JSON doesn't support infinite numbers). Our
existing vector-store client code errored out when it saw this "null",
which is wrong - the request should be allowed to proceed. So in this
patch when we see a "null" JSON for similarity, we return +Inf.
This is usually correct because the top results really have +Inf, not
-Inf, but if we ask for all items we can reach those with similarity
-Inf and incorrectly assign +Inf to them (we have a test for this case
in the next patch). But this problem won't happen when Limit is low,
and in any case it's better than aborting the request after it had
already succeeded.
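
As a rough illustration of the "null" handling, here is a Python sketch. The actual client is C++; the function name is hypothetical, and the response is assumed to be a JSON object whose "similarity_scores" field is an array.

```python
import json
import math


def parse_similarity_scores(body: str) -> list[float]:
    """Read similarity scores, treating a JSON null as +Inf (an overflow)."""
    scores = json.loads(body)["similarity_scores"]
    # JSON cannot represent infinities, so an overflowed score arrives as
    # null. Mapping it to +Inf keeps the request alive instead of failing it.
    return [math.inf if s is None else float(s) for s in scores]
```

The sketch shares the limitation described above: a score that really overflowed towards -Inf would also be reported as +Inf.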

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-05-13 14:19:17 +03:00
Nadav Har'El
51c35c05e2 test/cqlpy: teach run-cassandra to use Docker
The test/cqlpy/run-cassandra script makes it quite easy to run test/cqlpy
tests against Cassandra, which is important for checking compatibility.

Unfortunately, because modern Linux distributions like Fedora do not have
either Cassandra or the old version of Java that it needs, the user needs
to download those manually. This is fairly easy, and explained in detail
in test/cqlpy/README.md, but nevertheless is a non-trivial manual step.

So this patch adds an even simpler alternative, the "--docker" option
which tells the script to run the official Cassandra docker image,
complete with the version of Java that it prefers - the user does not
need to download or install Cassandra or Java. The image is efficiently
cached by Docker, so running run-cassandra again doesn't need to
download it again. Moreover, trying several different versions of
Cassandra only needs to download and store the shared parts (base image
and Java) once.

test/cqlpy/run-cassandra --docker test_file.py::test_function

By default this runs the latest Cassandra 5 release. You can also use
"--docker=4" to get the latest Cassandra 4 release, "--docker=3.11"
to get the latest Cassandra 3.11 patch release, or "--docker=3.11.1"
to get a specific patch release.

In addition to the "--docker" option, this patch also introduces a
second option, "--java-docker", which takes *only* Java from docker,
but runs your locally installed Cassandra (to which you should point
with the CASSANDRA environment variable, as before). This option can
be useful if your host does not have a suitable version of Java, but
you want to run a locally-installed or locally-modified version of
Cassandra. The "--java-docker" option defaults to Java 11; to use
another version, pass for example "--java-docker=17".
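
One way to parse options of this shape is sketched below. This is not the script's actual code, and split_docker_options is a hypothetical name. The value is accepted only in the "--option=value" form, so a bare "--docker" does not swallow the test name that follows it.

```python
def split_docker_options(argv):
    """Separate --docker / --java-docker from the remaining arguments."""
    # Defaults taken from the description above: Cassandra 5 and Java 11.
    defaults = {"--docker": "5", "--java-docker": "11"}
    opts = {"docker": None, "java_docker": None}
    rest = []
    for arg in argv:
        name, sep, value = arg.partition("=")
        if name in defaults:
            key = name[2:].replace("-", "_")
            # A bare flag selects the default version.
            opts[key] = value if sep else defaults[name]
        else:
            rest.append(arg)
    return opts, rest
```

For example, split_docker_options(["--docker=3.11", "test_file.py"]) selects version "3.11" and leaves "test_file.py" for pytest.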

Fixes #25826.

Closes scylladb/scylladb#29860
2026-05-13 11:57:18 +02:00
Yaniv Michael Kaul
c359a09189 test: add UDF/UDA keyspace isolation and UDT tests
Port 3 tests from scylla-dtest user_functions_test.py:
- test_udf_with_udt: UDF taking frozen UDT arg, verifies DROP TYPE blocked
- test_udf_with_udt_keyspace_isolation: cross-keyspace UDT references rejected
- test_aggregate_with_udt_keyspace_isolation: cross-keyspace UDT in UDA rejected

All tests use Lua (Scylla's supported UDF language).
Reproduces CASSANDRA-9409.

Closes scylladb/scylladb#1928

Closes scylladb/scylladb#29843
2026-05-12 14:57:14 +03:00
Piotr Smaron
1018710e38 test/cqlpy: un-xfail oversized indexed value build test
Issue #8627 is fixed, so test_too_large_indexed_value_build now passes and should run normally instead of XPASSing under strict xfail.

Fixes: SCYLLADB-1938

Closes scylladb/scylladb#29853
2026-05-12 11:40:53 +02:00
Botond Dénes
8d6f031a4a schema: fix DESCRIBE showing NullCompactionStrategy when compaction is disabled
When a table's compaction is disabled via 'enabled': 'false', the DESCRIBE
output incorrectly showed NullCompactionStrategy instead of the actual strategy.
This happened because schema_properties() called compaction_strategy(), which
returns compaction_strategy_type::null when compaction is disabled. Fix it by
using configured_compaction_strategy(), which always returns the real strategy
type - consistent with how schema_tables.cc serializes it to disk.

Fixes SCYLLADB-1353

Closes scylladb/scylladb#29804
2026-05-12 12:38:25 +03:00
Pavel Emelyanov
150345cc52 Merge 'test: per-bucket isolation for S3/GCS object storage tests' from Ernest Zaslavsky
This series adds per-test bucket isolation to all S3 and GCS object storage tests. Previously, every test shared a single pre-created bucket, which meant tests could interfere with each other through leftover objects and could not run concurrently across multiple `test.py` processes without risking collisions.

New `create_bucket`, `delete_bucket`, and `delete_bucket_with_objects` methods on `s3::client`, following the existing `make_request` pattern. `create_bucket` handles the `BUCKET_ALREADY_OWNED_BY_YOU` error gracefully.

A new `s3_test_fixture` RAII class for C++ Boost tests that creates a uniquely-named bucket on construction (derived from the Boost test name and pid) and tears down everything — objects, bucket, client — on destruction. All S3 tests in `s3_test.cc` are migrated to use it, removing manual `deferred_delete_object` and `deferred_close` boilerplate. The minio server policy is broadened to allow dynamic bucket creation/deletion.

A `client::make` overload that accepts a custom `retry_strategy`, used in tests with a fast 1ms retry delay instead of exponential backoff, significantly reducing test runtime for transient errors during bucket lifecycle operations.

Python-side (`test/cluster/object_store`): each pytest fixture (`object_storage`, `s3_storage`, `s3_server`) now creates a unique bucket per test function via `create_test_bucket()` and destroys it on teardown. Bucket names are sanitized from the pytest node name with a short UUID suffix for uniqueness.
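
A minimal sketch of such name sanitization follows. The helper name make_bucket_name is hypothetical and the series may do this differently; the sketch assumes the usual S3 naming constraints (lowercase letters, digits and hyphens, at most 63 characters).

```python
import re
import uuid


def make_bucket_name(node_name: str) -> str:
    """Derive a bucket name from a pytest node name plus a short UUID."""
    # Collapse every run of disallowed characters into a single hyphen.
    base = re.sub(r"[^a-z0-9]+", "-", node_name.lower()).strip("-") or "test"
    suffix = uuid.uuid4().hex[:8]
    # Leave room for "-<suffix>" within the 63-character limit.
    base = base[: 63 - len(suffix) - 1].rstrip("-")
    return f"{base}-{suffix}"
```

The random suffix is what keeps names distinct when the same test runs concurrently in several `test.py` processes.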

Object storage helpers (`S3Server`, `MinioWrapper`, `GSFront`, `GSServerImpl`, factory functions, CQL helpers, `s3_server` fixture) are extracted from `test/cluster/object_store/conftest.py` into a shared `test/pylib/object_storage.py` module, eliminating duplication across test suites. The conftest becomes a thin re-export wrapper. Old class names are preserved as aliases for backward compatibility.

| Test Name                                                    | New retry strategy (ms) | Original (ms) |   Δ (ms) | Speedup |
|--------------------------------------------------------------|----------------:|-------------:|---------:|--------:|
| test_client_upload_file_multi_part_with_remainder_proxy      |          19,261 |       61,395 | −42,134  | **3.2×** |
| test_client_upload_file_multi_part_without_remainder_proxy   |          16,901 |       53,688 | −36,787  | **3.2×** |
| test_client_upload_file_single_part_proxy                    |           3,478 |        6,789 |  −3,311  | **2.0×** |
| test_client_multipart_copy_upload_proxy                      |           1,303 |        1,619 |    −316  | 1.2×    |
| test_client_put_get_object_proxy                             |             150 |          365 |    −215  | **2.4×** |
| test_client_readable_file_stream_proxy                       |             125 |          327 |    −202  | **2.6×** |
| test_small_object_copy_proxy                                 |             205 |          389 |    −184  | 1.9×    |
| test_client_put_get_tagging_proxy                            |             181 |          350 |    −169  | 1.9×    |
| test_client_multipart_upload_proxy                           |           1,252 |        1,416 |    −164  | 1.1×    |
| test_client_list_objects_proxy                               |             729 |          881 |    −152  | 1.2×    |
| test_chunked_download_data_source_with_delays_proxy          |             830 |          960 |    −130  | 1.2×    |
| test_client_readable_file_proxy                              |             148 |          279 |    −131  | 1.9×    |
| test_client_upload_file_multi_part_with_remainder_minio      |           3,358 |        3,170 |    +188  | 0.9×    |
| test_client_upload_file_multi_part_without_remainder_minio   |           3,131 |        2,929 |    +202  | 0.9×    |
| test_client_upload_file_single_part_minio                    |             519 |          421 |     +98  | 0.8×    |
| test_download_data_source_proxy                              |             180 |          237 |     −57  | 1.3×    |
| test_client_list_objects_incomplete_proxy                     |             590 |          641 |     −51  | 1.1×    |
| test_large_object_copy_proxy                                 |             952 |          991 |     −39  | 1.0×    |
| test_client_multipart_upload_fallback_proxy                  |             148 |          185 |     −37  | 1.3×    |
| test_client_multipart_copy_upload_minio                      |             641 |          674 |     −33  | 1.1×    |

No backport needed — this is a test infrastructure improvement with no production code impact beyond the new `s3::client` methods.

Closes scylladb/scylladb#29508

* github.com:scylladb/scylladb:
  test: extract object storage helpers to test/pylib/object_storage.py
  test: add per-test bucket isolation to object_store fixtures
  s3: add client::make overload with custom retry strategy
  test: add s3_test_fixture and migrate tests to per-bucket isolation
  s3: add create_bucket and delete_bucket to client
2026-05-12 12:38:24 +03:00
Piotr Smaron
71542206bc cql: return InvalidRequest for oversized partition/clustering keys
When a partition key or clustering key value exceeds the 64 KiB limit
(65535 bytes serialized), Scylla used to raise a generic
std::runtime_error "Key size too large: N > M" from the low-level
compound-key serializer. That error surfaced to clients as a CQL
server error (code 0x0000, "NoHostAvailable"-looking), which is both
ugly and incompatible with Cassandra - Cassandra returns a clean
InvalidRequest with the message "Key length of N is longer than
maximum of M".

Fix this at the single chokepoint: compound_type::serialize_value in
keys/compound.hh. The serializer is on every path that materializes a
key - INSERT/UPDATE/DELETE/BATCH build mutations through it, and
SELECT builds partition and clustering ranges through it - so a single
throw replacement produces a clean InvalidRequest consistently across
all paths and all key shapes (single, compound PK, composite CK).
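
The idea of a single chokepoint can be shown with a simplified Python model. The real check lives in C++ in compound_type::serialize_value; the InvalidRequest class and check_key_size function here are hypothetical stand-ins, and only the limit and the message text come from the description above.

```python
MAX_KEY_SIZE = 65535  # the limit quoted above


class InvalidRequest(Exception):
    """Hypothetical stand-in for the CQL InvalidRequest error."""


def check_key_size(serialized: bytes) -> None:
    """Reject an oversized serialized key with a Cassandra-style message."""
    size = len(serialized)
    if size > MAX_KEY_SIZE:
        raise InvalidRequest(
            f"Key length of {size} is longer than maximum of {MAX_KEY_SIZE}"
        )
```

Because every write and read path serializes keys through one function, a check placed there covers all of them without duplicating it per statement type.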

The previous approach on this PR branch patched three call sites in
cql3/restrictions/statement_restrictions.cc, which only covered
SELECT, duplicated the check, and placed it mid-restrictions code
(flagged in review). Dropping those changes in favour of the
root-cause fix here.

Un-xfail the tests this fixes:
- test/cqlpy/test_key_length.py: test_insert_65k_pk, test_insert_65k_ck,
  test_where_65k_pk, test_where_65k_ck, test_insert_65k_ck_composite,
  test_insert_total_compound_pk_err, test_insert_total_composite_ck_err.
- test/cqlpy/cassandra_tests/.../insert_test.py: testPKInsertWithValueOver64K,
  testCKInsertWithValueOver64K.
- test/cqlpy/cassandra_tests/.../select_test.py: testPKQueryWithValueOver64K.

test_insert_65k_pk_compound stays xfail: its oversized value gets
rejected by the Python driver's CQL wire-protocol encoder (see
CASSANDRA-19270) before reaching the server, so the fix can't apply.
Updated its reason. testCKQueryWithValueOver64K stays xfail with an
updated reason: Cassandra silently returns empty for an oversized
clustering key in WHERE, while Scylla now throws InvalidRequest - a
deliberate choice mirroring the partition-key case, documented in
the discussion on #10366.

Add three tight-boundary tests (addressing review feedback on the
previous revision) that pin MAX+1 behaviour for SELECT and INSERT of
both partition and clustering keys.

Update test/cluster/dtest/limits_test.py to match the new message
("Key length of \\d+ is longer than maximum of 65535").

fixes #10366
fixes #12247

Co-authored-by: Alexander Turetskiy <someone.tur@gmail.com>

Closes scylladb/scylladb#23433
2026-05-11 16:56:35 +03:00