Commit Graph

62 Commits

Author SHA1 Message Date
Szymon Malewski
15493872b2 vector_search: fix decimal/varint precision loss in filter value_to_json()
value_to_json() converts CQL values to JSON for vector search filters.
For decimal and varint types, it used rjson::parse() on the JSON string,
which parses through a double and silently loses precision for values
exceeding ~15 significant digits — producing wrong filter results.

Additionally, for decimal type we need an exact string representation
that preserves the original (unscaled, scale) pair, because partition
keys use byte-level identity: different serialized representations of
the same numeric value are distinct rows, so the filter must reproduce
the exact representation stored in the key.

Add big_decimal::to_string_canonical() which follows the Java BigDecimal
toString() spec (JDK 8+), producing a bijective string representation
that uses exponential notation for extreme scales instead of expanding
trailing zeros (which could cause OOM). This could replace to_string(),
but doing so has wider consequences (e.g. hash/equality contract for
decimal_type) described in SCYLLADB-1574. Use it in value_to_json() for
decimal_type, and use rjson::from_string() for varint_type, both
bypassing the lossy double parse path.
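
The precision loss is easy to reproduce outside Scylla. The following Python sketch is only an illustration: it uses Python's `float` and `decimal.Decimal` as stand-ins for the double parse path and for `big_decimal`, and none of it is Scylla code.

```python
from decimal import Decimal

# A varint that does not fit in a double's 53-bit mantissa.
varint_text = "12345678901234567890"
exact_varint = int(varint_text)
via_double = int(float(varint_text))  # what a double-based parser would keep
varint_survives = (via_double == exact_varint)

# A decimal with more significant digits than a double can hold.
decimal_text = "12345678901234567890.123456789"
decimal_survives = (Decimal(float(decimal_text)) == Decimal(decimal_text))

# Equal numeric values can still have different (unscaled, scale) pairs,
# and therefore different serialized forms.
a, b = Decimal("1.0"), Decimal("1.00")
same_value = (a == b)
same_representation = (a.as_tuple() == b.as_tuple())

print(varint_survives, decimal_survives, same_value, same_representation)
```

Neither round trip through `float` preserves the value, and `1.0` and `1.00` compare equal while keeping distinct digit/exponent pairs, which is why the filter needs an exact string form rather than a parsed number.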

Tests cover the new to_string_canonical() and the filter fix, as well as
existing decimal type behavior (key representation, clustering order,
toJson) that we rely on and must not break. The CQL decimal type tests
(test_type_decimal.py) also pass against Cassandra.

Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1583
Refs: https://scylladb.atlassian.net/browse/SCYLLADB-1574

Closes scylladb/scylladb#29505
2026-05-18 17:07:26 +03:00
Piotr Smaron
71542206bc cql: return InvalidRequest for oversized partition/clustering keys
When a partition key or clustering key value exceeds the 64 KiB limit
(65535 bytes serialized), Scylla used to raise a generic
std::runtime_error "Key size too large: N > M" from the low-level
compound-key serializer. That error surfaced to clients as a CQL
server error (code 0x0000, "NoHostAvailable"-looking), which is both
ugly and incompatible with Cassandra - Cassandra returns a clean
InvalidRequest with the message "Key length of N is longer than
maximum of M".

Fix this at the single chokepoint: compound_type::serialize_value in
keys/compound.hh. The serializer is on every path that materializes a
key - INSERT/UPDATE/DELETE/BATCH build mutations through it, and
SELECT builds partition and clustering ranges through it - so a single
throw replacement produces a clean InvalidRequest consistently across
all paths and all key shapes (single, compound PK, composite CK).
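
As a rough illustration of the single-chokepoint idea, here is a Python sketch. The class `InvalidRequest`, the function names, and the length-prefixed layout are assumptions made for the example and are not the code in keys/compound.hh; only the limit and the message text come from this commit message.

```python
MAX_KEY_LENGTH = 65535  # limit quoted in the commit message


class InvalidRequest(Exception):
    """Hypothetical stand-in for the CQL InvalidRequest error."""


def serialize_key_component(value: bytes) -> bytes:
    """Hypothetical chokepoint: every key component is serialized here."""
    if len(value) > MAX_KEY_LENGTH:
        raise InvalidRequest(
            f"Key length of {len(value)} is longer than maximum of {MAX_KEY_LENGTH}"
        )
    # Illustrative layout: 2-byte big-endian length prefix, then the raw bytes.
    return len(value).to_bytes(2, "big") + value


def try_serialize(value: bytes) -> str:
    """Return "ok" or the error message, mimicking what a client would see."""
    try:
        serialize_key_component(value)
        return "ok"
    except InvalidRequest as e:
        return str(e)


print(try_serialize(b"x" * MAX_KEY_LENGTH))
print(try_serialize(b"x" * (MAX_KEY_LENGTH + 1)))
```

Because writes and reads both build keys through the same function in this sketch, the check does not need to be repeated at each call site.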

The previous approach on this PR branch patched three call sites in
cql3/restrictions/statement_restrictions.cc, which only covered
SELECT, duplicated the check, and placed it mid-restrictions code
(flagged in review). Dropping those changes in favour of the
root-cause fix here.

Un-xfail the tests this fixes:
- test/cqlpy/test_key_length.py: test_insert_65k_pk, test_insert_65k_ck,
  test_where_65k_pk, test_where_65k_ck, test_insert_65k_ck_composite,
  test_insert_total_compound_pk_err, test_insert_total_composite_ck_err.
- test/cqlpy/cassandra_tests/.../insert_test.py: testPKInsertWithValueOver64K,
  testCKInsertWithValueOver64K.
- test/cqlpy/cassandra_tests/.../select_test.py: testPKQueryWithValueOver64K.

test_insert_65k_pk_compound stays xfail: its oversized value gets
rejected by the Python driver's CQL wire-protocol encoder (see
CASSANDRA-19270) before reaching the server, so the fix can't apply.
Updated its reason. testCKQueryWithValueOver64K stays xfail with an
updated reason: Cassandra silently returns empty for an oversized
clustering key in WHERE, while Scylla now throws InvalidRequest - a
deliberate choice mirroring the partition-key case, documented in
the discussion on #10366.

Add three tight-boundary tests (addressing review feedback on the
previous revision) that pin MAX+1 behaviour for SELECT and INSERT of
both partition and clustering keys.

Update test/cluster/dtest/limits_test.py to match the new message
("Key length of \\d+ is longer than maximum of 65535").

fixes #10366
fixes #12247

Co-authored-by: Alexander Turetskiy <someone.tur@gmail.com>

Closes scylladb/scylladb#23433
2026-05-11 16:56:35 +03:00
Piotr Smaron
959f67b345 cql: verify tuples length in multi-column IN restriction
When a multi-column IN restriction contains tuples with a different
number of elements than the number of restricted columns (e.g.
`(b, c, d) IN ((1, 2), (2, 1, 4))`), Scylla would either produce an
inconsistent error message or, for over-sized tuples, an internal
type-mismatch error referencing the list literal representation.

Validate each tuple's arity against the number of restricted columns
while building the IN restriction and raise a clear
"Expected N elements in value tuple, but got M" error in both the
under- and over-sized cases.
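
The arity check can be sketched in a few lines of Python. The function names and the use of `ValueError` are assumptions for the example; only the error text is taken from this commit message.

```python
def check_in_tuples(columns, tuples):
    """Reject any value tuple whose length differs from the column count."""
    for t in tuples:
        if len(t) != len(columns):
            raise ValueError(
                f"Expected {len(columns)} elements in value tuple, but got {len(t)}"
            )


def describe(columns, tuples):
    """Return "ok" or the validation error message."""
    try:
        check_in_tuples(columns, tuples)
        return "ok"
    except ValueError as e:
        return str(e)


# Mirrors `(b, c, d) IN ((1, 2), (2, 1, 4))`: the first tuple is too short.
print(describe(("b", "c", "d"), [(1, 2), (2, 1, 4)]))
# An over-sized tuple produces the same style of message.
print(describe(("b", "c"), [(1, 2, 3)]))
```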

Fixes #13241

Co-authored-by: Alexander Turetskiy <someone.tur@gmail.com>

Closes scylladb/scylladb#18407
2026-05-11 16:55:09 +03:00
Botond Dénes
eb3326b417 Merge 'test.py: migrate all bare skips to typed skip markers' from Artsiom Mishuta
should be merged after #29235

Complete the typed skip markers migration started in the plugin PR.
Every bare `@pytest.mark.skip` decorator and `pytest.skip()` runtime call
across the test suite is replaced with a typed equivalent, making skip
reasons machine-readable in JUnit XML and Allure reports.

**62 files changed** across 8 commits, covering ~127 skip sites in total.

Bare `pytest.skip` provides only a free-text reason string. CI dashboards
(JUnit, Allure) cannot distinguish between a test skipped due to a known
bug, a missing feature, a slow test, or an environment limitation. This
makes it hard to track skip debt, prioritize fixes, or filter dashboards
by skip category.

The typed markers (`skip_bug`, `skip_not_implemented`, `skip_slow`,
`skip_env`) introduced by the `skip_reason_plugin` solve this by embedding
a `skip_type` field into every skip report entry.

| Type | Count | Files | Description |
|------|-------|-------|-------------|
| `skip_bug` | 24 | 16 | Skip reason references a known bug/issue |
| `skip_not_implemented` | 10 | 5 | Feature not yet implemented in Scylla |
| `skip_slow` | 4 | 3 | Test too slow for regular CI runs |
| `skip_not_implemented` (bare) | 2 | 1 | Bare `@pytest.mark.skip` with no reason (COMPACT STORAGE, #3882) |

| Type | Count | Files | Description |
|------|-------|-------|-------------|
| `skip_env` | ~85 | 34 | Feature/config/topology not available at runtime |
| `skip_bug` | 2 | 2 | Known bugs: Streams on tablets (#23838), coroutine task not found (#22501) |

- **Comments**: 7 comments/docstrings across 5 files updated from `pytest.skip()` to `skip()`
- **Plugin hardened**: `warnings.warn()` → `pytest.UsageError` for bare `@pytest.mark.skip` at collection time — bare skips are now a hard error, not a warning
- **Guard tests**: New `test/pylib_test/test_no_bare_skips.py` with 3 tests that prevent regression:
  - AST scan for bare `@pytest.mark.skip` decorators
  - AST scan for bare `pytest.skip()` runtime calls
  - Real `pytest --collect-only` against all Python test directories
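
An AST scan of the kind listed above might look roughly like the following. This is a sketch under assumptions, not the contents of `test_no_bare_skips.py`: the function name `find_bare_skips` and the sample source are invented for illustration, and it relies on `ast.unparse` (Python 3.9+).

```python
import ast


def find_bare_skips(source: str):
    """Return (lineno, kind) for bare pytest.mark.skip decorators and pytest.skip() calls."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            for dec in node.decorator_list:
                # A decorator is either `@pytest.mark.skip` or `@pytest.mark.skip(...)`.
                target = dec.func if isinstance(dec, ast.Call) else dec
                if ast.unparse(target) == "pytest.mark.skip":
                    hits.append((dec.lineno, "decorator"))
        elif isinstance(node, ast.Call) and ast.unparse(node.func) == "pytest.skip":
            hits.append((node.lineno, "call"))
    return sorted(hits)


SAMPLE = '''
import pytest

@pytest.mark.skip(reason="flaky")
def test_a():
    pass

@pytest.mark.skip_bug(reason="#123")
def test_b():
    pass

def test_c():
    pytest.skip("no tablets")
'''

print([kind for _, kind in find_bare_skips(SAMPLE)])
```

Typed markers such as `skip_bug` are not flagged because the dotted name is compared exactly rather than by prefix.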

Runtime skip sites use the convenience wrappers from `test.pylib.skip_types`:
```python
from test.pylib.skip_types import skip_env
```

Usage:
```python
skip_env("Tablets not enabled")
```

1. **test: migrate @pytest.mark.skip to @pytest.mark.skip_bug for known bugs** — 24 decorator sites, 16 files
2. **test: migrate @pytest.mark.skip to @pytest.mark.skip_not_implemented** — 10 decorator sites, 5 files
3. **test: migrate @pytest.mark.skip to @pytest.mark.skip_slow** — 4 decorator sites, 3 files
4. **test: migrate bare @pytest.mark.skip to skip_not_implemented** — 2 bare decorators, 1 file
5. **test: migrate runtime pytest.skip() to typed skip_env()** — ~85 sites, 34 files
6. **test: migrate runtime pytest.skip() to typed skip_bug()** — 2 sites, 2 files
7. **test: update comments referencing pytest.skip() to skip()** — 7 comments, 5 files
8. **test/pylib: reject bare pytest.mark.skip and add codebase guards** — plugin hardening + 3 guard tests

- All 60 plugin + guard tests pass (`test/pylib_test/`)
- No bare `@pytest.mark.skip` or `pytest.skip()` calls remain in the codebase
- `pytest --collect-only` succeeds across all test directories with the hardened plugin

SCYLLADB-1349

Closes scylladb/scylladb#29305

* github.com:scylladb/scylladb:
  test/alternator: replace bare pytest.skip() with typed skip helpers
  test: migrate new bare skips introduced by upstream after rebase
  test/pylib: reject bare pytest.mark.skip and add codebase guards
  test: update comments referencing pytest.skip() to skip_env()
  test: migrate runtime pytest.skip() to typed skip_bug()
  test: migrate runtime pytest.skip() to typed skip_env()
  test: migrate bare @pytest.mark.skip to skip_not_implemented
  test: migrate @pytest.mark.skip to @pytest.mark.skip_slow
  test: migrate @pytest.mark.skip to @pytest.mark.skip_not_implemented
  test: migrate @pytest.mark.skip to @pytest.mark.skip_bug for known bugs
2026-04-22 15:48:27 +03:00
Łukasz Paszkowski
d18eb9479f cql/statement: Create keyspace_metadata with correct initial_tablets count
In `ks_prop_defs::as_ks_metadata(...)` a default initial tablets count
is set to 0, when tablets are enabled and the replication strategy
is NetworkTopologyStrategy.

This effectively sets _uses_tablets = false in abstract_replication_strategy
for the remaining strategies when no `tablets = {...}` options are specified.
As a consequence, it is possible to create vnode-based keyspaces even
when tablets are enforced with `tablets_mode_for_new_keyspaces`.

The patch sets a default initial tablets count to zero regardless of
the chosen replication strategy. Then each replication strategy
validates the options and raises a configuration exception when tablets
are not supported.

All tests are altered in the following way:
+ whenever it was correct, SimpleStrategy was replaced with NetworkTopologyStrategy
+ otherwise, tablets were explicitly disabled with ` AND tablets = {'enabled': false}`

Fixes https://github.com/scylladb/scylladb/issues/25340

Closes scylladb/scylladb#25342
2026-04-20 17:57:38 +03:00
Artsiom Mishuta
8a80e2c3be test: migrate runtime pytest.skip() to typed skip_env()
Migrate runtime pytest.skip() calls across 34 files to use the typed
skip_env() wrapper from test.pylib.skip_types.

These sites skip at runtime because a required feature, config option,
library version, build mode, or runtime topology is not available.

Also fixes 'raise pytest.skip(...)' in test_audit.py — skip_env()
already raises internally, so the explicit raise was incorrect.

Each file gains one new import:
  from test.pylib.skip_types import skip_env
2026-04-19 11:09:29 +02:00
Artsiom Mishuta
fb0974a329 test: migrate bare @pytest.mark.skip to skip_not_implemented
Migrate 2 bare @pytest.mark.skip decorators (no reason string) to
@pytest.mark.skip_not_implemented with an explicit reason referencing
issue #3882 (COMPACT STORAGE not implemented).
2026-04-19 11:06:30 +02:00
Artsiom Mishuta
a39fb9d29a test: migrate @pytest.mark.skip to @pytest.mark.skip_slow
Migrate 4 @pytest.mark.skip decorator sites to @pytest.mark.skip_slow
across 3 test files where the skip reason indicates a slow test.
2026-04-19 11:06:30 +02:00
Artsiom Mishuta
638efedc3c test: migrate @pytest.mark.skip to @pytest.mark.skip_not_implemented
Migrate 10 @pytest.mark.skip decorator sites to
@pytest.mark.skip_not_implemented across 5 test files where the
skip reason indicates a feature not yet implemented.
2026-04-19 11:06:30 +02:00
Artsiom Mishuta
465636bc53 test: migrate @pytest.mark.skip to @pytest.mark.skip_bug for known bugs
Migrate 24 @pytest.mark.skip decorator sites to @pytest.mark.skip_bug
across 16 test files where the reason references a known bug or issue.
2026-04-19 11:06:30 +02:00
Piotr Dulikowski
9fc2c65d18 Merge 'cql3: implement WRITETIME() and TTL() of individual elements of map, set, and UDT' from Nadav Har'El
In commit 727f68e0f5 we added the ability to SELECT:

* Individual elements of a map: `SELECT map_col[key]`.
* Individual elements of a set: `SELECT set_col[key]` returns key if the key exists in the set, or null if it doesn't, which makes it possible to check whether the element exists in the set.
* Individual pieces of a UDT: `SELECT udt_col.field`.

But at the time, we didn't provide any way to retrieve the **meta-data** for this value, namely its timestamp and TTL. We did not support `SELECT WRITETIME(collection[key])`, or `SELECT WRITETIME(udt.field)`.

Users requested to support such SELECTs in the past (see issue #15427), and Cassandra 5.0 added support for this feature - for both maps and sets and udts - so we also need this feature for compatibility. This feature was also requested recently by vector-search developers, who wanted to read Alternator columns - stored as map elements, not individual columns - with their WRITETIME information.

The first four patches in this series add the feature (as four smaller patches instead of one big one), the fifth and sixth patches add tests (cqlpy and boost tests, respectively). The seventh patch adds documentation.

All the new tests pass on Cassandra 5, failed on Scylla before the present fix, and pass with it.

The fix was surprisingly difficult. Our existing implementation (from 727f68e0f5 building on earlier machinery) doesn't just "read" `map_col[key]` and allow us to return just its timestamp. Rather, the implementation reads the entire map, serializes it in some temporary format that does **not** include the timestamps and ttls, and then takes the subscript key, at which point we no longer have the timestamp or ttl of the element. So the fix had to cross all these layers of the implementation.

While adding support for UDT fields in a pre-existing grammar nonterminal "subscriptExpr", we unintentionally added support for UDT fields also in LWT expressions (which used this nonterminal). LWT missing support for UDT fields was a long-time known compatibility issue (#13624) so we unintentionally fixed it :-) Actually, to completely fix it we needed another small change in the expression implementation, so the eighth patch in this series does this.

Fixes #15427
Fixes #13624

Closes scylladb/scylladb#29134

* github.com:scylladb/scylladb:
  cql3: support UDT fields in LWT expressions
  cql3: document WRITETIME() and TTL() for elements of map, set or UDT
  test/boost: test WRITETIME() and TTL() on map collection elements
  test/cqlpy: test WRITETIME() and TTL() on element of map, set or UDT
  cql3: prepare and evaluate WRITETIME/TTL on collection elements and UDT fields
  cql3: parse per-element timestamps/TTLs in the selection layer
  cql3: add extended wire format for per-element timestamps and TTLs
  cql3: extend WRITETIME/TTL grammar to accept collection and UDT elements
2026-04-14 12:35:46 +02:00
Avi Kivity
0ae22a09d4 LICENSE: Update to version 1.1
Updated terms of non-commercial use (must be a never-customer).
2026-04-12 19:46:33 +03:00
Nadav Har'El
33dbb63aef cql3: support UDT fields in LWT expressions
In an earlier patch, we used the CQL grammar's "subscriptExpr" in
the rule for WRITETIME() and TTL(). But since we also wanted these
to support UDT fields (x.a), not just collection subscripts (x[3]),
we expanded subscriptExpr to also support the field syntax.

But LWT expressions already used this subscriptExpr, which meant
that LWT expressions unintentionally gained support for UDT fields.
Missing support for UDT fields in LWT is a long-standing known
Cassandra-compatibility bug (#13624), and now our grammar finally
supports the missing syntax.

But supporting the syntax is not enough for correct implementation
of this feature - we also need to fix the expression handling:

Two bugs prevented expressions like `v.a = 0` from working in LWT IF
clauses, where `v` is a column of user-defined type.

The first bug was in get_lhs_receiver() in prepare_expr.cc: it lacked
a handler for field_selection nodes, causing an "unexpected expression"
internal error when preparing a condition like `IF v.a = 0`. The fix
adds a handler that returns a column_specification whose type is taken
from the prepared field_selection's type field.

The second bug was in search_and_replace() in expression.cc: when
recursing into a field_selection node it reconstructed it with only
`structure` and `field`, silently dropping the `field_idx` and `type`
fields that are set during preparation. As a result, any transformation
that uses search_and_replace() on a prepared expression containing a
field_selection — such as adjust_for_collection_as_maps() called from
column_condition_prepare() — would zero out those fields. At evaluation
time, type_of() on the field_selection returned a null data_type
pointer, causing a segmentation fault when the comparison operator tried
to call ->equal() through it. The fix preserves field_idx and type when
reconstructing the node.
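
The second bug follows a common pattern: rebuilding a node from a subset of its fields. The Python sketch below mimics it with a dataclass. The class and function names are hypothetical and only loosely mirror the C++ `field_selection` struct.

```python
from dataclasses import dataclass, replace
from typing import Optional


@dataclass(frozen=True)
class FieldSelection:
    """Hypothetical model of a field_selection expression node."""
    structure: str
    field: str
    field_idx: Optional[int] = None  # filled in during preparation
    type: Optional[str] = None       # filled in during preparation


def rebuild_lossy(node: FieldSelection, new_structure: str) -> FieldSelection:
    # Mirrors the bug: only structure and field are carried over.
    return FieldSelection(structure=new_structure, field=node.field)


def rebuild_preserving(node: FieldSelection, new_structure: str) -> FieldSelection:
    # Mirrors the fix: copy the node and change only what was rewritten.
    return replace(node, structure=new_structure)


prepared = FieldSelection(structure="v", field="a", field_idx=0, type="int")
print(rebuild_lossy(prepared, "v2"))
print(rebuild_preserving(prepared, "v2"))
```

In the lossy variant `type` comes back as `None`, which corresponds to the null `data_type` pointer described above.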

Fixes #13624.
2026-04-12 14:28:01 +03:00
Artsiom Mishuta
b1e9c0b867 test/pylib: add typed skip markers plugin
Add skip_reason_plugin.py — a framework-agnostic pytest plugin that
provides typed skip markers (skip_bug, skip_not_implemented, skip_slow,
skip_env) so that the reason a test is skipped is machine-readable in
JUnit XML and Allure reports.  Bare untyped pytest.mark.skip now
triggers a warning (to become an error after full migration).  Runtime
skips via skip() are also enriched by parsing the [type] prefix from
the skip message.
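
Parsing a `[type]` prefix out of a skip message could look roughly like this. It is a sketch under assumptions: the exact message format, the enum values, and the function name `parse_skip_type` are guesses for illustration and are not copied from the plugin.

```python
import re
from enum import Enum
from typing import Optional, Tuple


class SkipType(Enum):
    """Hypothetical enum values; the real ones live in test/pylib/skip_types.py."""
    BUG = "bug"
    NOT_IMPLEMENTED = "not_implemented"
    SLOW = "slow"
    ENV = "env"


# Assumed format: "[type] free-text reason".
_PREFIX = re.compile(r"^\[(?P<type>\w+)\]\s*(?P<reason>.*)$", re.DOTALL)


def parse_skip_type(message: str) -> Tuple[Optional[SkipType], str]:
    """Split a skip message into (skip type, reason); type is None if untyped."""
    m = _PREFIX.match(message)
    if m:
        try:
            return SkipType(m.group("type")), m.group("reason")
        except ValueError:
            pass  # unknown type name: treat the message as untyped
    return None, message


print(parse_skip_type("[env] Tablets not enabled"))
print(parse_skip_type("plain reason"))
```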

The plugin is a class (SkipReasonPlugin) that receives the concrete
SkipType enum and an optional report_callback from conftest.py, keeping
it decoupled from allure and project-specific types.

Extract SkipType enum and convenience runtime skip wrappers (skip_bug,
skip_env, etc.) into test/pylib/skip_types.py so callers only need a
single import instead of importing both SkipType and skip() separately.
conftest.py imports SkipType from the new module and registers the
plugin instance unconditionally (for all test runners).

New files:
- test/pylib/skip_reason_plugin.py: core plugin — typed marker
  processing, bare-skip warnings, JUnit/Allure report enrichment
  (including runtime skip() parsing via _parse_skip_type helper)
- test/pylib/skip_types.py: SkipType enum and convenience wrappers
  (skip_bug, skip_not_implemented, skip_slow, skip_env)
- test/pylib_test/test_skip_reason_plugin.py: 17 pytester-based
  test functions (51 cases across 3 build modes) covering markers,
  warnings, reports, callbacks, and skip_mode interaction

Infrastructure changes:
- test/conftest.py: import SkipType from skip_types, register
  SkipReasonPlugin with allure report callback
- test/pylib/runner.py: set SKIP_TYPE_KEY/SKIP_REASON_KEY stash keys
  for skip_mode so the report hook can enrich JUnit/Allure with
  skip_type=mode without longrepr parsing
- test/pytest.ini: register typed marker definitions (required for
  --strict-markers even when plugin is not loaded)

Migrated test files (representative samples):
- test/cluster/test_tablet_repair_scheduler.py:
  skip -> skip_bug (#26844), skip -> skip_not_implemented
- test/cqlpy/.../timestamp_test.py: skip -> skip_slow
- test/cluster/dtest/schema_management_test.py: skip -> skip_not_implemented
- test/cluster/test_change_replication_factor_1_to_0.py: skip -> skip_bug (#20282)
- test/alternator/conftest.py: skip -> skip_env
- test/alternator/test_https.py: use skip_env() wrapper

Fixes SCYLLADB-79

Closes scylladb/scylladb#29235
2026-04-08 10:38:56 +03:00
Szymon Malewski
3116db6c2d test: fix testJsonOrdering
The `test/cqlpy/cassandra_tests/validation/entities/json_test.py::testJsonOrdering` was failing because of differences between Cassandra and Scylla in printing
JSON floating point values - e.g. Cassandra prints 30.0, where Scylla prints 30.
Both are valid, so in this patch, instead of comparing strings, we compare parsed JSON using `EquivalentJson`.
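
The idea of comparing parsed JSON rather than strings can be shown with the standard `json` module. This is only an illustration with made-up row values; it does not reproduce the `EquivalentJson` helper.

```python
import json

cassandra_out = '{"k": 1, "v": 30.0}'  # made-up output in Cassandra's style
scylla_out = '{"k": 1, "v": 30}'       # made-up output in Scylla's style

strings_equal = (cassandra_out == scylla_out)
parsed_equal = (json.loads(cassandra_out) == json.loads(scylla_out))

print(strings_equal, parsed_equal)
```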

Fixes #28467

Closes scylladb/scylladb#28924
2026-03-12 09:07:08 +01:00
Szymon Malewski
f9d213547f cql3: selection: fix add_column_for_post_processing for ORDER BY
The purpose of `add_column_for_post_processing` is to add columns that are required for processing of a query,
but are not part of SELECT clause and shouldn't be returned. They are added to the final result set, but later are not serialized.
Mainly it is used for filtering and grouping columns, with a special case of `WHERE primary_key IN ...  ORDER BY ...` when the whole result set needs additional final sorting,
and ordering columns must be added as well.
There was a bug that manifested in #9435, #8100 and was actually identified in #22061.
In case of selection with processing (e.g. when functions are involved), the result set row is formed in two stages.
Initially it is a list of columns fetched from replicas - on which filtering and grouping is performed.
After that the actual selection is resolved and the final number of columns can change.
Ordering is performed on this final shape, but the ordering column index returned by `add_column_for_post_processing` referred to the initial shape.
If the selection referred to the same column twice (e.g. `v, TTL(v)` as in #9435), the final row was longer than the initial one and ordering referred to an incorrect column.
If a function in the selection referred to multiple columns (e.g. as_json(.., ..) which #8100 effectively uses), the final row was shorter
and ordering tried to use a non-existing column.

This patch fixes the problem by making sure that column index of the final result set is used for ordering.
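
The index mismatch can be reproduced with plain tuples. The sketch below is a simplified model, not Scylla's selection code: the row layout, the TTL values, and the variable names are all made up for illustration.

```python
# Rows as fetched from replicas: (v, ck). The ordering column ck was added
# for post-processing and sits at index 1 of this initial shape.
fetched = [(10, 3), (20, 1), (30, 2)]
ttl_of = {10: 5, 20: 9, 30: 1}  # made-up TTL values for the sketch

# Final shape for a selection like `v, TTL(v)` plus the hidden ordering column.
final_rows = [(v, ttl_of[v], ck) for v, ck in fetched]

stale_index = 1  # index of ck in the initial shape
final_index = 2  # index of ck in the final shape

wrong = sorted(final_rows, key=lambda row: row[stale_index])  # sorts by TTL(v)
right = sorted(final_rows, key=lambda row: row[final_index])  # sorts by ck


def visible(rows):
    """Drop the hidden ordering column before returning rows to the client."""
    return [row[:2] for row in rows]


print(visible(wrong))
print(visible(right))
```

With the stale index the rows come back ordered by `TTL(v)` instead of by the clustering column, which is the kind of wrong ordering the fix addresses.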

The previously crashing test `cassandra_tests/validation/entities/json_test.py::testJsonOrdering` doesn't have to be skipped, but now it is failing on issue #28467.

Fixes #9435
Fixes #8100
Fixes #22061

Closes scylladb/scylladb#28472
2026-03-05 19:22:34 +02:00
Nadav Har'El
a1475dbeb9 test/cqlpy: make test testMapWithLargePartition faster
Right now the slowest test in the test/cqlpy directory is

   cassandra_tests/validation/entities/collections_test.py::
      testMapWithLargePartition

This test (translated from Cassandra's unit test), just wants to verify
that we can write and flush a partition with a single large map - with
200 items totalling around 2MB in size.

200 items totalling 2MB is large, but not huge, and is not the reason
why this test was so slow (around 9 seconds). It turns out that most
of the test time was spent in Python code, preparing a 2MB random string
the slowest possible way. But there is no need for this string to be
random at all - we only care about the large size of the value, not the
specific characters in it!

Making the characters written in this text constant instead of random
made it 20 times faster - it now takes less than half a second.
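
The difference between the two ways of building a large value can be sketched as follows. The helper names and the size are assumptions for illustration; the actual test code is not reproduced here.

```python
import random
import string

SIZE = 10_000  # kept small for the sketch; the test writes about 2MB in total


def random_value(n: int) -> str:
    # One Python-level call per character: slow for large n.
    return "".join(random.choice(string.ascii_letters) for _ in range(n))


def constant_value(n: int) -> str:
    # A single string repetition: the content is irrelevant, only the size matters.
    return "x" * n


print(len(random_value(SIZE)), len(constant_value(SIZE)))
```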

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#28271
2026-02-18 10:12:16 +03:00
Nadav Har'El
a63ad48b0f test/cqlpy: remove xfail from tests for fixed issue 7972
The test test_to_json_double used to fail due to #7972, but this issue
was already fixed in Scylla 5.1 and we didn't notice.
So remove the xfail marker from this test, and also update another test
which still xfails but no longer due to this issue.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-02-02 23:49:32 +02:00
Nadav Har'El
10b81c1e97 test/cqlpy: remove xfail from tests for fixed issue 10358
The tests testWithUnsetValues and testFilteringWithoutIndices used to fail
due to #10358, but this issue was already fixed three years ago, when the
UNSET-checking code was cleaned up, and the test is now passing.
So remove the xfail marker from these tests.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-02-02 23:49:31 +02:00
Nadav Har'El
508bb97089 test/cqlpy: remove xfail from passing test testInvalidNonFrozenUDTRelation
The test testInvalidNonFrozenUDTRelation used to fail due to #10632
(an incorrectly-printed column name in an error message) and was marked
"xfail". But this issue has already been fixed two years ago, and
the test is now passing. So remove the xfail marker.
2026-02-02 23:49:31 +02:00
Avi Kivity
3d1558be7e test: remove xfail markers from SELECT JSON count(*) tests
These were marked xfail due to #8077 (the column name was wrong),
but it was fixed long ago for 5.4 (exact commit not known).

Remove the xfail markers to prevent regressions.

Closes scylladb/scylladb#28432
2026-01-29 21:56:00 +02:00
Nadav Har'El
1454228a05 test/cqlpy: fix "assertion rewriting" in translated Cassandra tests
One of the best features of the pytest framework is "assertion
rewriting": If your test does for example "assert a + 1 == b", the
assertion is "rewritten" so that if it fails it tells you not only
that "a+1" and "b" are not equal, but also what the non-equal values
are, how they differ (e.g., which elements of two arrays differ) and
how each side of the equality was calculated.

But pytest can only "rewrite" assertions that it sees. If you call a
utility function checksomething() from another module and that utility
function calls assert, it will not be able to rewrite it, and you'll
get ugly, hard-to-debug, assertion failures.

This problem is especially noticeable in tests we translated from
Cassandra, in test/cqlpy/cassandra_tests. Those tests use a bunch of
assertion-performing utility functions like assertRows() et al.
Those utility functions are defined in a separate source file,
porting.py, so by default do not get their assertions rewritten.

We had a solution for this: test/cqlpy/cassandra_tests/__init__.py had:

    pytest.register_assert_rewrite("cassandra_tests.porting")

This tells pytest to rewrite assertions in porting.py the first time
that it is imported.

It used to work well, but recently it stopped working. This is because
we changed the module paths recently, and it should now be written as
test.cqlpy.cassandra_tests.porting.

I verified by editing one of the cassandra_tests to make a bad check
that indeed this statement stopped working, and fixing the module
path in this way solves it, and makes assertion rewriting work
again.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#28411
2026-01-28 18:34:57 +02:00
Yaron Kaikov
7c49711906 test/cqlpy: Remove redundant pytest.register_assert_rewrite call
During test.py run, noticed this warning:
```
10:38:22  test/cqlpy/cassandra_tests/validation/operations/insert_update_if_condition_test.py:14: 32 warnings
10:38:22    /jenkins/workspace/releng-testing/scylla-ci/scylla/test/cqlpy/cassandra_tests/validation/operations/insert_update_if_condition_test.py:14: PytestAssertRewriteWarning: Module already imported so cannot be rewritten: test.cqlpy.cassandra_tests.porting
10:38:22      pytest.register_assert_rewrite('test.cqlpy.cassandra_tests.porting')
```

The insert_update_if_condition_test.py was calling
pytest.register_assert_rewrite() for the porting module, but this
registration is already handled by cassandra_tests/__init__.py which
is automatically loaded before any test runs.

Closes scylladb/scylladb#28409
2026-01-28 13:17:05 +02:00
Avi Kivity
ec70cea2a1 test/cqlpy: restore LWT tests marked XFAIL for tablets
Commit 0156e97560 ("storage_proxy: cas: reject for
tablets-enabled tables") marked a bunch of LWT tests as
XFAIL with tablets enabled, pending resolution of #18066.
But since that event is now in the past, we undo the XFAIL
markings (or in some cases, use an any-keyspace fixture
instead of a vnodes-only fixture).

Ref #18066.

Closes scylladb/scylladb#28336
2026-01-26 12:27:19 +02:00
Dawid Pawlik
4a7e20953a test/cqlpy: remove the xfail mark from already passing tests
Since #28109 was merged, those tests started to pass as we allow
the filtering on primary key columns within ANN vector queries.

Closes scylladb/scylladb#28231
2026-01-21 08:47:20 +02:00
Nadav Har'El
d86d5b33aa test/cqlpy: translate Cassandra's unit tests for LWT
This is a translation of Cassandra's CQL unit test source file
validation/operations/InsertUpdateIfConditionTest.java into our cqlpy
framework.

This test file checks various LWT conditional updates. After that
file became too big, the Cassandra developers split parts from it -
moving tests for LWT with collections, UDTs, and static columns to
separate test files - which I already translated (pull request #13663).
This patch translates the remaining, main, LWT tests.

Strangely, this test file also has, in the middle of the file, several
tests for conditional schema changes, like CREATE KEYSPACE IF NOT EXISTS,
a feature which has *nothing* to do with LWT so really didn't belong in
this file. But I translated those as well.

These new tests all pass on both ScyllaDB and Cassandra, and have not
uncovered any new bug.

However these tests do demonstrate yet again something that users and
developers of ScyllaDB's LWT must be aware of: Whereas usually
ScyllaDB's goal has been compatibility with Cassandra's CQL, in LWT
this has *not* been the case: ScyllaDB deviated from Cassandra's
behavior in its LWT implementation in several places. These intentional
deviations were documented in docs/kb/lwt-differences.rst.

Accordingly, the tests here include almost a hundred (!) modifications
(search for "if is_scylla") to allow the same test to pass on both
ScyllaDB and Cassandra, as well as many comments explaining the types
of differences we're seeing.

Although these deviations from Cassandra compatibility are known and
intentional, it's worth listing here the ones re-discovered by these
new tests:

1. On a successful conditional write, Cassandra returns just true, Scylla
   also returns the old contents of the row.

2. Similarly, in an IF EXISTS write that failed (the row did not exist),
   Cassandra returns just false, Scylla also returns extra null values for
   each and every column of the row.

3. Cassandra allows in "IF v IN (?, ?)" to bind individual values to
   UNSET_VALUE and skips them, Scylla treats this as an error. Refs #13659.

4. When there are static columns, Scylla's LWT response returns the static
   column first, Cassandra returns the modified column first. Since both
   also say which columns they return, neither is more correct than the other,
   and normally users will address specific columns by name, not by position.

5. docs/kb/lwt-differences.rst explains that "the returned result set
   contains an old row for every conditional statement in the batch".
   Beyond this difference, even non-conditional updates in the batch will
   also get a row in Scylla's result. Refs #27955.

6. For batch statement, ScyllaDB allows mixing `IF EXISTS`, `IF NOT EXISTS`,
   and other conditions for the same row. Cassandra doesn't, so checks that
   these combinations are not allowed were commented out.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#27961
2026-01-19 09:46:04 +02:00
Nadav Har'El
3e138a2685 test/cqlpy: Add our copyright/license to translated Cassandra tests
All the tests under test/cqlpy/cassandra_tests/ were translated from
Cassandra's unit tests originally written in Java into our own test
framework, and accordinly carry a clear mention of their origin and
original license.

However, we did modify these original tests - even if the modification
was slight and mostly straightforward. Therefore I was asked to also
mention our own copyright (and license) for these modifications.

So this patch adds to every file in test/cqlpy/cassandra_tests/ text like:

   # Modifications: Copyright 2026-present ScyllaDB
   # SPDX-License-Identifier: LicenseRef-ScyllaDB-Source-Available-1.0

with the appropriate year instead of 2026.

Fixes #28215

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#28216
2026-01-18 16:25:28 +01:00
Nadav Har'El
a1f198d453 test/cqlpy: translate Cassandra's test InsertInvalidateSizedRecordsTest
This is a translation of Cassandra's CQL unit test source file
validation/operations/InsertInvalidateSizedRecordsTest.java into our
cqlpy framework.

This is one of the tests added to Cassandra as part of the vector
search work, but actually has nothing to do with vector search -
it checks what happens when key columns of different types exceed
their maximum size (64KB).
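
As a rough illustration of the kind of check these tests exercise, here is a
minimal Python sketch. It assumes the limit is 65535 bytes of serialized key
(the largest length that fits in an unsigned 16-bit field); the function name
and the exception type are hypothetical and are not ScyllaDB's actual code.

```python
# Assumed limit: 64 KiB minus one byte, i.e. the largest unsigned 16-bit value.
MAX_KEY_SIZE = 65535


def check_key_size(serialized_key: bytes) -> None:
    """Raise ValueError if the serialized key is longer than MAX_KEY_SIZE.

    Hypothetical helper for illustration only.
    """
    size = len(serialized_key)
    if size > MAX_KEY_SIZE:
        raise ValueError(f"Key size too large: {size} > {MAX_KEY_SIZE}")


# A key of exactly the maximum size is accepted; one byte more is rejected.
check_key_size(b"x" * MAX_KEY_SIZE)
```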

Unfortunately, each one of the tests added here *fails* on ScyllaDB,
providing more reproducers for two already known issues (which
already had plenty of reproducers...):

Refs  #8627 Cleanly reject updates with indexed values where value > 64k
Refs #12247 Better error reporting for oversized keys during INSERT

One of the tests also fails on Cassandra, due to CASSANDRA-19270.
It is not clear to me how this unit test actually passed on Cassandra,
I can only guess that the Python driver somehow makes the request
differently than what the Java unit tests use to make requests to
Cassandra.

One of the tests in the original Cassandra source file I did not
translate, readingEmptyStringsForDifferentTypes, because it tests
cqlsh, not pure CQL.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#27944
2026-01-13 08:59:36 +02:00
Michał Hudobski
92c988514c vector_search: allow all where clauses in vector search queries
To prepare for implementation of filtering we skip validation
of where clauses in vector search queries. All queries that would
be blocked by the lack of ALLOW FILTERING now will pass through.

Fixes: VECTOR-410

Closes scylladb/scylladb#27758
2026-01-11 12:56:44 +02:00
Nadav Har'El
74a57d2872 test/cqlpy: remove unused imports
Remove many unused "import" statements or parts of import statements.
All of them were detected by Copilot, but I verified each one manually
and prepared this patch.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#27675
2025-12-24 13:31:41 +02:00
Pavel Emelyanov
c4496dd63c Merge 'test/cqlpy: rename tests with duplicate name' from Nadav Har'El
When translating Cassandra's unit tests, in a couple of places I accidentally used the same name for two tests, resulting in the first of each pair never being run.

Let's fix the name of the second of each pair to be the real name it had in the original Cassandra test.

Closes scylladb/scylladb#27644

* github.com:scylladb/scylladb:
  test/cqlpy: rename test with duplicate name
  test/cqlpy: rename test with duplicate name
2025-12-16 19:32:20 +03:00
Nadav Har'El
f287484f4d test/cqlpy: rename test with duplicate name
When translating Cassandra's test validation/operations/CreateTest.java
I accidentally used the same name for two tests, resulting in the first
of them never being run.

Let's fix the name of the second of the two to be the real name it had
in the original Cassandra test.

After this patch pytest reports 16 tests in this file, instead of 15
before this patch. The previously-ignored test was correct, and it
now passes in both Scylla and Cassandra.
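
For readers unfamiliar with why a duplicate name hides a test, the following
small Python sketch shows the underlying mechanism: a second "def" with the
same name rebinds the module-level name, so only the later function remains
visible to anything that inspects the module. The test names and module name
here are made up for the illustration.

```python
import types

# Source of a hypothetical test file with the same test name used twice.
source = '''
def test_create():
    return "first"

def test_create():   # same name: this rebinds test_create
    return "second"
'''

# Execute the source in a fresh module object, as an import would.
module = types.ModuleType("example_tests")
exec(source, module.__dict__)

# Only one test_* name survives, and it refers to the second definition.
collected = [name for name in vars(module) if name.startswith("test_")]
print(collected)
print(module.test_create())
```

Renaming the second function gives each definition its own name, which is
why the reported test count goes up by one.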

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2025-12-15 14:19:24 +02:00
Nadav Har'El
a9442e6d56 test/cqlpy: rename test with duplicate name
When translating Cassandra's test validation/operations/DeleteTest.java
I accidentally used the same name for two tests, resulting in the first
of them never being run.

Let's fix the name of the second of the two to be the real name it had
in the original Cassandra test.

After this patch pytest reports 52 tests in this file, instead of 51
before this patch. The previously-ignored test was correct, and it
now passes in both Scylla and Cassandra.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2025-12-15 12:02:59 +02:00
Michał Hudobski
12483d8c3c vector_search: throw an error when we restrict primary in vector search
We currently allow restrictions on a single-column primary key,
but we ignore the restriction and return all results.
This can confuse the users. We change it so such a restriction
will throw an error and add a test to validate it.

Fixes: VECTOR-331

Closes scylladb/scylladb#27143
2025-12-15 09:45:56 +02:00
Aleksandra Martyniuk
76174d1f7a cql3: reject ALTER KEYSPACE if rf of datacenter with tablets is omitted
In ALTER KEYSPACE, when a datacenter name is omitted, its replication
factor is implicitly set to zero with vnodes, while with tablets,
it remains unchanged.

ALTER KEYSPACE should behave the same way for tablets as it does
for vnodes. However, this can be dangerous as we may mistakenly
drop the whole datacenter.

Reject ALTER KEYSPACE if it changes replication factor, but omits
a datacenter that currently contains tablet replicas.
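
The rule can be sketched in Python as follows. This is only an illustration
under assumptions: the function name, the dictionary-based inputs and the
exception are hypothetical, and "changes replication factor" is interpreted
here as "some datacenter named in the statement gets a value different from
its current one".

```python
def validate_alter_keyspace(current_rf, new_rf, dcs_with_tablet_replicas):
    """Reject an RF-changing ALTER that omits a datacenter holding tablet replicas.

    current_rf / new_rf: dict mapping datacenter name -> replication factor.
    dcs_with_tablet_replicas: set of datacenters that currently hold replicas.
    All names are hypothetical; this is not the actual cql3 implementation.
    """
    # Assumed interpretation: the statement changes RF if any listed
    # datacenter is given a value that differs from the current one.
    changes_rf = any(current_rf.get(dc) != rf for dc, rf in new_rf.items())
    if not changes_rf:
        return
    omitted = sorted(dc for dc in dcs_with_tablet_replicas if dc not in new_rf)
    if omitted:
        raise ValueError(
            "ALTER KEYSPACE omits datacenters with tablet replicas: "
            + ", ".join(omitted)
        )


# Changing dc1 while listing dc2 explicitly is accepted.
validate_alter_keyspace({"dc1": 2, "dc2": 3}, {"dc1": 3, "dc2": 3}, {"dc1", "dc2"})
```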

Fixes: https://github.com/scylladb/scylladb/issues/25549.

Closes scylladb/scylladb#25731
2025-11-24 06:36:51 +02:00
Nadav Har'El
b659dfcbe9 test/cqlpy: comment out Cassandra check that is no longer relevant
In the test translated from Cassandra validation/operations/alter_test.py
we had two lines in the beginning of an unrelated test that verified
that CREATE KEYSPACE is not allowed without replication parameters.
But starting recently, ScyllaDB does have defaults and does allow these
CREATE KEYSPACE. So comment out these two test lines.

We didn't notice that this test started to fail, because it was already
marked xfail, because in the main part of this test, it reproduces a
different issue!

The annoying side effect of these no-longer-passing checks was that
because the test expected a CREATE KEYSPACE to fail, it didn't bother
to delete this keyspace when it finished, which causes test.py to
report that there's a problem because some keyspaces still exist at the
end of the test. Now that we fixed this problem, we no longer need to
list this test in test/cqlpy/suite.yaml as a test that leaves behind
undeleted keyspaces.

Fixes #26292

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#26341
2025-11-11 10:34:27 +02:00
Michael Litvak
1dbf53ca29 test: enable counters tests with tablets
Enable all counters-related tests that were disabled for tablets because
counters was not supported with tablets until now.

Some tests were parametrized to run with both vnodes and tablets, and
the tablets case was skipped, in order to not lose coverage. We change
them to run with the default configuration since now counters is
supported with both vnodes and tablets, and the implementation is the
same, so there is no benefit in running them with both configurations.
2025-11-03 16:04:37 +01:00
Benny Halevy
e8b9f13061 test: Prepare for handling errors specific to rack list path
2025-10-29 23:32:58 +01:00
Szymon Wasik
ccfe80ab97 cql3: Update error messages to be in line with documentation.
ANN (approximate nearest neighbor) is just the name of the type
of algorithm used to perform vector search. For this reason the error
messages should refer to vector queries rather than ANN queries.
2025-09-26 17:01:10 +02:00
Michał Hudobski
1690e5265a vector search: correct column name formatting
This patch corrects the column name formatting whenever
an "Undefined column name" exception is thrown.
Until now we used the `name()` function which
returns a bytes object. This resulted in a message
with a garbled ascii bytes column name instead of
a proper string. We switch to the `text()` function
that returns a sstring instead, making the message
readable.
Tests are adjusted to confirm this behavior.

Fixes: VECTOR-228

Closes scylladb/scylladb#26120
2025-09-20 07:02:53 +02:00
Michael Litvak
778dec2630 test/cqlpy: adjust cdc tests for tablets
update cdc-related tests in test/cqlpy for cdc with tablets.

* test_cdc_log_entries_use_cdc_streams: this test depends on the
  implementation of the cdc tables, which is different for tablets, so
  it's changed to run for both vnodes and tablets keyspaces, and we add
  the implementation for tablets.

* some cdc-related tests are unskipped for tablets so they will be run with
  both tablets and vnodes keyspaces. these are tests where the
  implementation may be different between tablets and vnodes and we want
  to have coverage of both.

* other cdc-related tests do not depend on the implementation
  differences between tablets and vnodes, so we can just enable them to
  run with the default configuration. previously they were disabled for
  tablets keyspaces because it wasn't supported, so now we remove this.
2025-09-17 14:47:13 +02:00
Pavel Emelyanov
b0aa2d61d9 Merge 'cql3: add default replication factor to create_keyspace_statement' from Dario Mirovic
When creating a new keyspace, replication factor must be stated.
For example:
`CREATE KEYSPACE ks WITH REPLICATION { 'class': 'NetworkTopologyStrategy', 'replication_factor': 3 };`

This patch changes it in the following way - if there is no
replication factor specified, use default replication factor.
Default replication factor is equal to the number of racks that
are not arbiter-only, i.e. racks that have at least one non-arbiter node.
The following syntax is now valid:
`CREATE KEYSPACE ks WITH REPLICATION { 'class': 'NetworkTopologyStrategy' };`
`CREATE KEYSPACE ks WITH REPLICATION { };`
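
A minimal Python sketch of the rule described above, counting racks that have
at least one non-arbiter node. The function name and the input shape (rack
name mapped to a list of "is arbiter" flags, one per node) are assumptions
made for the illustration and do not reflect the actual cql3 code.

```python
def default_replication_factor(racks):
    """Count racks that contain at least one non-arbiter node.

    racks: dict mapping rack name -> list of booleans, one per node,
    where True means the node is an arbiter. Hypothetical input shape.
    """
    return sum(
        1
        for node_flags in racks.values()
        if any(not is_arbiter for is_arbiter in node_flags)
    )


topology = {
    "rack1": [False, False],  # two regular nodes
    "rack2": [True],          # arbiter-only rack, not counted
    "rack3": [False, True],   # mixed rack, counted
}
print(default_replication_factor(topology))
```

With this topology the arbiter-only rack is skipped, so the default would be 2.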

Fixes #16028

Backport is not needed. This is an enhancement for future releases.

Closes scylladb/scylladb#25570

* github.com:scylladb/scylladb:
  docs/cql: update documentation for default replication factor
  test/cqlpy: add keyspace creation default replication factor tests
  cql3: add default replication factor to `create_keyspace_statement`
2025-09-03 12:31:53 +03:00
Dario Mirovic
fd84da7a50 test/cqlpy: add keyspace creation default replication factor tests
Add test cases for create keyspace default replication factor.
It is expected that the default replication factor is equal to the
number of racks containing at least some non-zero-token nodes
in the test suite.

Refs: #16028
2025-08-28 01:42:34 +02:00
Dawid Pawlik
461d820fbb test/cqlpy: run vector_index tests only on vnodes
When creating an index on a vector column using the 'vector_index' class,
a CDC log is created, as it is required for Vector Search.

Because CDC does not yet work with tablets enabled (Refs #16317),
we have to mark the tests as failing on tablets and run them on vnodes
to make sure the vector index tests continue to pass.
2025-08-20 12:38:52 +02:00
Jan Łakomy
8b2ed0f014 cassandra_tests: translate vector_invalid_query_test
Translate vector_invalid_query_test which tests parsing of ANN OF syntax.

Co-authored-by: Dawid Pawlik <dawid.pawlik@scylladb.com>
2025-08-01 12:08:50 +02:00
Jan Łakomy
eec47d9059 cassandra_tests: copy vector_invalid_query_test from Cassandra
Copy over this test's code from Cassandra, commented out, so that it can be translated later.
2025-08-01 12:08:50 +02:00
Karol Nowacki
4577c66a04 cql, schema: Extend name length limit from 48 to 192 bytes
This commit increases the maximum length of names for keyspaces, tables, materialized views, and indexes from 48 to 192 bytes.
The previous 48-byte limit was inherited from Cassandra 3 for compatibility. However, this validation was removed in Cassandra 4 and 5 (see CASSANDRA-20389)
and some usage scenarios (such as some feature store workflows generating long table names) now depend on this relaxed constraint.
This change brings ScyllaDB's behavior in line with modern Cassandra versions and better supports these use cases.

The new limit of 192 bytes is derived from underlying filesystem limitations to prevent runtime errors when creating directories for table data.
When a new table is created, ScyllaDB generates a directory for its SSTables. The directory name is constructed from the table name, a dash, and a 32-character UUID.
For a CDC-enabled table, an associated log table is also created, which has the suffix `_scylla_cdc_log` appended to its name.
The directory name for this log table becomes the longest possible representation.
Additionally we reserve 15 bytes for future use, allowing for potential future extensions without breaking existing schemas.
To guarantee that directory creation never fails due to exceeding filesystem name limits, the maximum name length is calculated as follows:
  255 bytes (common filesystem limit for a path component)
-  32 bytes (for the 32-character UUID string)
-   1 byte  (for the '-' separator)
-  15 bytes (for the '_scylla_cdc_log' suffix)
-  15 bytes (reserved for future use)
----------
= 192 bytes (Maximum allowed name length)
This calculation is similar in principle to the one proposed for Cassandra to fix related directory creation failures (see apache/cassandra/pull/4038).
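
The arithmetic above can be double-checked with a short Python sketch. The
constant and function names are made up for the illustration, and the
directory name layout (table name, CDC suffix, dash, 32-character UUID)
follows the description in this message rather than the actual source code.

```python
import uuid

FILESYSTEM_COMPONENT_LIMIT = 255   # common limit for one path component
UUID_CHARS = 32                    # UUID rendered as 32 hex characters
SEPARATOR = "-"
CDC_LOG_SUFFIX = "_scylla_cdc_log"
RESERVED_BYTES = 15                # kept free for future use

MAX_NAME_LENGTH = (
    FILESYSTEM_COMPONENT_LIMIT
    - UUID_CHARS
    - len(SEPARATOR)
    - len(CDC_LOG_SUFFIX)
    - RESERVED_BYTES
)


def cdc_log_directory_name(table_name: str) -> str:
    """Build the longest directory name described above (hypothetical helper)."""
    return f"{table_name}{CDC_LOG_SUFFIX}{SEPARATOR}{uuid.uuid4().hex}"


longest = cdc_log_directory_name("t" * MAX_NAME_LENGTH)
print(MAX_NAME_LENGTH)   # 192
print(len(longest))      # 240, leaving the 15 reserved bytes below 255
```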

This patch also updates/adds all associated tests to validate the new 192-byte limit.
The documentation has been updated accordingly.
2025-06-18 14:08:38 +02:00
Dawid Mędrek
6bde01bb59 test/cqlpy/cassandra_tests: Adjust to RF-rack-valid keyspaces
We adjust three existing Cassandra tests so that they don't create
RF-rack-invalid keyspaces. We modify the replication factor used
in the problematic tests. The changes don't affect the tests as
the value of the RF is unrelated to what they verify. Thanks to
that, we can run them now even with enforced RF-rack-valid keyspaces.

The drawback is that the modified ALTER statements do not modify
the RF at all. However, since the tests seem to verify that the code
responsible for VALIDATING a request works as intended, that should
have little to no impact on them.
2025-04-11 14:20:14 +02:00
Paweł Zakrzewski
d483051e44 cql3/select_statement: reject aggregate functions when PER PARTITION LIMIT is present
Before this patch we silently allowed and ignored PER PARTITION LIMIT.
While using aggregate functions in conjunction with PER PARTITION LIMIT
can make sense, we want to disable it until we can offer proper
implementation, see #9879 for discussion.

We want to match Cassandra, and for queries with aggregate functions it
behaves as follows:
- it silently ignores PER PARTITION LIMIT if GROUP BY is present, which
  matches our previous implementation.
- rejects PER PARTITION LIMIT when GROUP BY is *not* present.

This patch adds rejection of the second group.
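
The resulting behavior for a SELECT that specifies PER PARTITION LIMIT can be
summarized by the following Python sketch. The function name, the returned
labels and the error text are hypothetical and serve only to spell out the
three cases; this is not the select_statement code.

```python
def per_partition_limit_action(has_aggregate: bool, has_group_by: bool) -> str:
    """Decide how PER PARTITION LIMIT is treated in a SELECT that uses it.

    Returns "apply" or "ignore", or raises ValueError for the rejected case.
    Hypothetical helper for illustration only.
    """
    if not has_aggregate:
        return "apply"    # ordinary query: the limit is honored
    if has_group_by:
        return "ignore"   # aggregate with GROUP BY: silently ignored
    # Aggregate without GROUP BY: the case this patch starts rejecting.
    raise ValueError(
        "PER PARTITION LIMIT is not allowed with aggregate functions "
        "unless GROUP BY is present"
    )


print(per_partition_limit_action(has_aggregate=True, has_group_by=True))
```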

Fixes #9879

Closes scylladb/scylladb#23086
2025-03-13 10:29:53 +02:00
Paweł Zakrzewski
854d2917a1 cql3/select_statement: reject PER PARTITION LIMIT with SELECT DISTINCT
Before this patch we silently allowed and ignored PER PARTITION LIMIT.
SELECT DISTINCT requires all the partition key columns, which means that
setting PER PARTITION LIMIT is redundant - only one result will be
returned from every partition anyway.

Cassandra behaves the same way, so this patch also ensures
compatibility.

Fixes scylladb/scylladb#15109

Closes scylladb/scylladb#22950
2025-02-24 14:50:18 +02:00