With the latest changes, there are a lot of code that is redundant in the test.py. This PR just cleans this code.
Also, it narrows using dynamic scope for fixtures to test/alternator and test/cqlpy. All the rest by default will have module scope.
test.py will be a wrapper for pytest mostly for CI use. As for now test.py have important part of calculating the number of threads to start pytest with. This is not possible to do in pytest itself.
No backport needed, framework enhancement only.
Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-666Closesscylladb/scylladb#28852
* github.com:scylladb/scylladb:
test.py: remove testpy_test_fixture_scope
test.py: add logger for 3rd party service
test.py: delete dead code in test.py
Users ought to have possibility to create the local index for Vector Search
based only on a part of the partition key. This commits provides this by
removing requirements of 'full partition key only' for custom local index.
The commit updates docs to explain that local vector index can use only a part
of the partition key.
The commit implements cqlpy test to check fixed functionality.
Fixes: SCYLLADB-953
Needs to be backported to 2026.1 as it is a fix for local vector indexes.
Closesscylladb/scylladb#28931
With migration to pyest this fixture is useless. Removing and setting
the session to the module for the most of the tests.
Add dynamic_scope function to support running alternator fixtures in
session scope, while Test and TestSuite are not deleted. This is for
migration period, later on this function should be deleted.
Several tests in test_select_from_mutation_fragments.py assume that all
mutations end up in a single SSTable. This assumption can be violated
by background memtable flushes triggered by commitlog disk pressure.
Since the Scylla node is taken from a pool, it may carry unflushed data
from prior tests that prevents closed segments from being recycled,
thereby increasing the commitlog disk usage. A main source of such
pressure is keyspace-level flushes from earlier tests in this module,
which rotate commitlog segments without flushing system tables (e.g.,
`system.compaction_history`), leaving closed segments dirty.
Additionally, prior tests in the same module may have left unflushed
data on the shared test table (`test_table` fixture), keeping commitlog
segments dirty on its behalf as well. When commitlog disk usage exceeds
its threshold, the system flushes the test table to reclaim those
segments, potentially splitting a running test's mutations across
multiple SSTables.
This was observed in CI, where test_paging failed because its data was
split across two SSTables, resulting in more mutation fragments than the
hardcoded expected count.
This patch fixes the affected tests in two ways:
1. Where possible, tests are reworked to not assume a single SSTable:
- test_paging
- test_slicing_rows
- test_many_partition_scan
2. Where rework is impractical, major compaction is added after writes
and before validation to ensure that only one SSTable will exist:
- test_smoke
- test_count
- test_metadata_and_value
- test_slicing_range_tombstone_changes
Fixes SCYLLADB-1375.
Signed-off-by: Nikos Dragazis <nikolaos.dragazis@scylladb.com>
Closesscylladb/scylladb#29389
This series allows creating multiple vector indexes on the same column so users can rebuild an index without losing query availability.
The intended flow is:
1. Create a new vector index on a column that already has one.
2. Keep serving ANN queries from the old index while the new one is being built.
3. Verify the new index is ready.
4. Automatically switch to the remaining index.
5. Drop the old index.
To make that deterministic, `index_version` is changed from the base table schema version to a real creation timeuuid. When multiple vector indexes exist on the same column, ANN query planning now picks the index according to the routing implemented in Vector Store (newest serving index). This keeps queries on the old index until it the new one is up and ready.
This patch also removes the create-time restriction that rejected a second vector index on the same column. Name collisions are still rejected as before.
Test coverage is updated accordingly:
- Scylla now verifies that two vector indexes can coexist on the same column.
- Cassandra/SAI behavior is still covered and is still expected to reject duplicate indexes on the same column.
Fixes: VECTOR-610
Closesscylladb/scylladb#29407
* github.com:scylladb/scylladb:
docs: document vector index metadata and duplicate handling
test/cqlpy: cover vector index duplicate creation rules
vector_index: allow multiple named indexes on one column
vector_index: store `index_version` as creation timeuuid
execute_batch_without_checking_exception_message() inserted entries
into the authorized prepared cache before verifying that
check_access() succeeded. A failed BATCH therefore left behind
cached 'authorized' entries that later let a direct EXECUTE of the
same prepared statement skip the authorization check entirely.
Move the cache insertion after the access check so that entries are
only cached on success. This matches the pattern already used by
do_execute_prepared() for individual EXECUTE requests.
Introduced in 98f5e49ea8
Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1221
Backport: all supported versions
Closesscylladb/scylladb#29432
* github.com:scylladb/scylladb:
test/cqlpy: add reproducer for BATCH prepared auth cache bypass
cql3: fix authorization bypass via BATCH prepared cache poisoning
In commit 727f68e0f5 we added the ability to SELECT:
* Individual elements of a map: `SELECT map_col[key]`.
* Individual elements of a set: `SELECT set_col[key]` returns key if the key exists in the set, or null if it doesn't, allowing to check if the element exists in the set.
* Individual pieces of a UDT: `SELECT udt_col.field`.
But at the time, we didn't provide any way to retrieve the **meta-data** for this value, namely its timestamp and TTL. We did not support `SELECT TIMESTAMP(collection[key])`, or `SELECT TIMESTAMP(udt.field)`.
Users requested to support such SELECTs in the past (see issue #15427), and Cassandra 5.0 added support for this feature - for both maps and sets and udts - so we also need this feature for compatibility. This feature was also requested recently by vector-search developers, who wanted to read Alternator columns - stored as map elements, not individual columns - with their WRITETIME information.
The first four patches in this series adds the feature (in four smaller patches instead one big one), the fifth and sixth patches add tests (cqlpy and boost tests, respectively). The seventh patch adds documentation.
All the new tests pass on Cassandra 5, failed on Scylla before the present fix, and pass with it.
The fix was surprisingly difficult. Our existing implementation (from 727f68e0f5 building on earlier machinery) doesn't just "read" `map_col[key]` and allow us to return just its timestamp. Rather, the implementation reads the entire map, serializes it in some temporary format that does **not** include the timestamps and ttls, and then takes the subscript key, at which point we no longer have the timestamp or ttl of the element. So the fix had to cross all these layers of the implementation.
While adding support for UDT fields in a pre-existing grammar nonterminal "subscriptExpr", we unintentionally added support for UDT fields also in LWT expressions (which used this nonterminal). LWT missing support for UDT fields was a long-time known compatibility issue (#13624) so we unintentionally fixed it :-) Actually, to completely fix it we needed another small change in the expression implementation, so the eighth patch in this series does this.
Fixes#15427Fixes#13624Closesscylladb/scylladb#29134
* github.com:scylladb/scylladb:
cql3: support UDT fields in LWT expressions
cql3: document WRITETIME() and TTL() for elements of map, set or UDT
test/boost: test WRITETIME() and TTL() on map collection elements
test/cqlpy: test WRITETIME() and TTL() on element of map, set or UDT
cql3: prepare and evaluate WRITETIME/TTL on collection elements and UDT fields
cql3: parse per-element timestamps/TTLs in the selection layer
cql3: add extended wire format for per-element timestamps and TTLs
cql3: extend WRITETIME/TTL grammar to accept collection and UDT elements
Add cqlpy tests for the current CREATE INDEX behavior of vector indexes.
Cover named and unnamed duplicates, IF NOT EXISTS, coexistence of
multiple named vector indexes on the same column, interactions between
named and unnamed indexes, and the same-name-on-different-table case.
An unprivileged user could bypass authorization checks by exploiting
the BATCH prepared statement cache:
1. Prepare an INSERT on a table the user has no access to
2. Execute it inside a BATCH — gets Unauthorized
3. Execute the same prepared INSERT directly — succeeds
In an earlier patch, we used the CQL grammar's "subscriptExpr" in
the rule for WRITETIME() and TTL(). But since we also wanted these
to support UDT fields (x.a), not just collection subscripts (x[3]),
we expanded subscriptExpr to also support the field syntax.
But LWT expressions already used this subscriptExpr, which meant
that LWT expressions unintentionally gained support for UDT fields.
Missing support for UDT fields in LWT is a long-standing known
Cassandra-compatibility bug (#13624), and now our grammar finally
supports the missing syntax.
But supporting the syntax is not enough for correct implementation
of this feature - we also need to fix the expression handling:
Two bugs prevented expressions like `v.a = 0` from working in LWT IF
clauses, where `v` is a column of user-defined type.
The first bug was in get_lhs_receiver() in prepare_expr.cc: it lacked
a handler for field_selection nodes, causing an "unexpected expression"
internal error when preparing a condition like `IF v.a = 0`. The fix
adds a handler that returns a column_specification whose type is taken
from the prepared field_selection's type field.
The second bug was in search_and_replace() in expression.cc: when
recursing into a field_selection node it reconstructed it with only
`structure` and `field`, silently dropping the `field_idx` and `type`
fields that are set during preparation. As a result, any transformation
that uses search_and_replace() on a prepared expression containing a
field_selection — such as adjust_for_collection_as_maps() called from
column_condition_prepare() — would zero out those fields. At evaluation
time, type_of() on the field_selection returned a null data_type
pointer, causing a segmentation fault when the comparison operator tried
to call ->equal() through it. The fix preserves field_idx and type when
reconstructing the node.
Fixes#13624.
This patch adds many tests verifying the behavior of WRITETIME() and
TTL() on individual elements of maps, sets and UDTs, serving as a
regression test for issue #15427. We also add tests verifying our
understanding of related issues like WRITETIME() and TTL() of entire
collections and of individual elements of *frozen* collections.
All new tests pass on Cassandra 5.0, helping to verify that our
implementation is compatible with Cassandra. They also pass on
ScyllaDB after the previous patch (most didn't before that patch).
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Vector indexes currently store the base table schema version in
`index_version`. That value is name-based, not time-based,
so it does not represent when the index was created.
Store a timeuuid instead and change the relevant interfaces from
`table_schema_version` to `utils::UUID`. This is a prerequisite
for supporting multiple vector indexes on the same column where
the oldest index must be selected deterministically via routing
implemented in Vector Store.
Update the cqlpy tests to check the new semantics directly:
recreating the index changes `index_version`, while ALTER TABLE does not.
Accept the Cassandra SAI 'source_model' option for vector indexes.
This option is used by Cassandra libraries (e.g., CassIO, LangChain)
to tag vector indexes with the name of the embedding model that
produced the vectors.
ScyllaDB does not use the source_model value but stores it and
includes it in the DESCRIBE INDEX output for Cassandra compatibility.
Additionally, extend vector_index::describe() to emit a
WITH OPTIONS = {...} clause containing all user-provided index options
(filtering out system keys: target, class_name, index_version).
This makes options like similarity_function, source_model, etc.
visible in DESCRIBE output.
Libraries such as CassIO, LangChain, and LlamaIndex create vector
indexes using Cassandra's StorageAttachedIndex (SAI) class name.
This commit lets ScyllaDB accept these statements without modification.
When a CREATE CUSTOM INDEX statement specifies an SAI class name on a
vector column, ScyllaDB automatically rewrites it to the native
vector_index implementation. Accepted class names (case-insensitive):
- org.apache.cassandra.index.sai.StorageAttachedIndex
- StorageAttachedIndex
- sai
SAI on non-vector columns is rejected with a clear error directing
users to a secondary index instead.
The SAI detection and rewriting logic is extracted into a dedicated
static function (maybe_rewrite_sai_to_vector_index) to keep the
already-long validate_while_executing method manageable.
Multi-column (local index) targets and nonexistent columns are
skipped with continue — the former are treated as filtering columns
by vector_index::check_target(), and the latter are caught later by
vector_index::validate().
Tests that exercise features common to both backends (basic creation,
similarity_function, IF NOT EXISTS, bad options, etc.) now use the
SAI class name with the skip_on_scylla_vnodes fixture so they run
against both ScyllaDB and Cassandra. ScyllaDB-specific tests continue
to use USING 'vector_index' with scylla_only.
- Change 'Reproduces' to 'Validates fix for' in test comments to
reflect that the referenced issues are already fixed.
- Condense the VECTOR-179 comment to two lines.
- Replace the xfailed test_ann_query_with_restriction_works_only_on_pk
with a focused test (test_ann_query_with_pk_restriction) that creates
a vector index on a table with a PK column restriction, validating
the VECTOR-374 fix.
This series fixes two related inconsistencies around secondary-index
names.
1. `DESCRIBE INDEX ... WITH INTERNALS` returned the backing
materialized-view name in the `name` column instead of the logical
index name.
2. The snapshot REST API accepted backing table names for MV-backed
secondary indexes, but not the logical index names exposed to users.
The snapshot side now resolves logical secondary-index names to backing
table names where applicable, reports logical index names in snapshot
details, rejects vector index names with HTTP 400, and keeps multi-keyspace
DELETE atomic by resolving all keyspaces before deleting anything.
The tests were also extended accordingly, and the snapshot test helper
was fixed to clean up multi-table snapshots using one DELETE per table.
Fixes: SCYLLADB-1122
Minor bugfix, no need to backport.
Closesscylladb/scylladb#29083
* github.com:scylladb/scylladb:
cql3: fix DESCRIBE INDEX WITH INTERNALS name
test: add snapshot REST API tests for logical index names
test: fix snapshot cleanup helper
api: clarify snapshot REST parameter descriptions
api: surface no_such_column_family as HTTP 400
db: fix clear_snapshot() atomicity and use C++23 lambda form
db: normalize index names in get_snapshot_details()
db: add resolve_table_name() to snapshot_ctl
DESCRIBE INDEX ... WITH INTERNALS returned the name of
the backing materialized view in the name column instead
of the logical index name.
Return the logical index name from schema::describe()
for index schemas so all callers observe the
user-facing name consistently.
Fixes: SCYLLADB-1122
The test_create_index_synchronous_updates test in test_secondary_index_properties.py
was intermittently failing with 'assert found_wanted_trace' because the expected
trace event 'Forcing ... view update to be synchronous' was missing from the
trace events returned by get_query_trace().
Root cause: trace events are written asynchronously to system_traces.events.
The Python driver's populate() method considers a trace complete once the
session row in system_traces.sessions has duration IS NOT NULL, then reads
events exactly once. Since the session row and event rows are written as
separate mutations with no transactional guarantee, the driver can read an
incomplete set of events.
Evidence from the failed CI run logs:
- The entire test (CREATE TABLE through DROP TABLE) completed in ~300ms
(01:38:54,859 - 01:38:55,157)
- The INSERT with tracing happened in a ~50ms window between the second
CREATE INDEX completing (01:38:55,108) and DROP TABLE starting
(01:38:55,157)
- The 'Forcing ... synchronous' trace message is generated during the
INSERT write path (db/view/view.cc:2061), so it was produced, but
not yet flushed to system_traces.events when the driver read them
- This matches the known limitation documented in test/alternator/
test_tracing.py: 'we have no way to know whether the tracing events
returned is the entire trace'
Fix: replace the single-shot trace.events read with a retry loop that
directly queries system_traces.events until the expected event appears
(with a 30s timeout). Use ConsistencyLevel.ONE since system_traces has
RF=2 and cqlpy tests run on a single-node cluster.
The same race condition pattern exists in test_mv_synchronous_updates in
test_materialized_view.py (which this test was modeled after), so the
same fix is proactively applied there as well.
Fixes SCYLLADB-1314
Closesscylladb/scylladb#29374
Add skip_reason_plugin.py — a framework-agnostic pytest plugin that
provides typed skip markers (skip_bug, skip_not_implemented, skip_slow,
skip_env) so that the reason a test is skipped is machine-readable in
JUnit XML and Allure reports. Bare untyped pytest.mark.skip now
triggers a warning (to become an error after full migration). Runtime
skips via skip() are also enriched by parsing the [type] prefix from
the skip message.
The plugin is a class (SkipReasonPlugin) that receives the concrete
SkipType enum and an optional report_callback from conftest.py, keeping
it decoupled from allure and project-specific types.
Extract SkipType enum and convenience runtime skip wrappers (skip_bug,
skip_env, etc.) into test/pylib/skip_types.py so callers only need a
single import instead of importing both SkipType and skip() separately.
conftest.py imports SkipType from the new module and registers the
plugin instance unconditionally (for all test runners).
New files:
- test/pylib/skip_reason_plugin.py: core plugin — typed marker
processing, bare-skip warnings, JUnit/Allure report enrichment
(including runtime skip() parsing via _parse_skip_type helper)
- test/pylib/skip_types.py: SkipType enum and convenience wrappers
(skip_bug, skip_not_implemented, skip_slow, skip_env)
- test/pylib_test/test_skip_reason_plugin.py: 17 pytester-based
test functions (51 cases across 3 build modes) covering markers,
warnings, reports, callbacks, and skip_mode interaction
Infrastructure changes:
- test/conftest.py: import SkipType from skip_types, register
SkipReasonPlugin with allure report callback
- test/pylib/runner.py: set SKIP_TYPE_KEY/SKIP_REASON_KEY stash keys
for skip_mode so the report hook can enrich JUnit/Allure with
skip_type=mode without longrepr parsing
- test/pytest.ini: register typed marker definitions (required for
--strict-markers even when plugin is not loaded)
Migrated test files (representative samples):
- test/cluster/test_tablet_repair_scheduler.py:
skip -> skip_bug (#26844), skip -> skip_not_implemented
- test/cqlpy/.../timestamp_test.py: skip -> skip_slow
- test/cluster/dtest/schema_management_test.py: skip -> skip_not_implemented
- test/cluster/test_change_replication_factor_1_to_0.py: skip -> skip_bug (#20282)
- test/alternator/conftest.py: skip -> skip_env
- test/alternator/test_https.py: use skip_env() wrapper
Fixes SCYLLADB-79
Closesscylladb/scylladb#29235
The vector-search feature introduced the somewhat confusing feature of
enabling CDC without explicitly enabling CDC: When a vector index is
enabled on a table, CDC is "enabled" for it even if the user didn't
ask to enable CDC.
For this, write-path code began to use a new cdc_enabled() function
instead of checking schema.cdc_options.enabled() directly. This
cdc_enabled() function checks if either this enabled() is true, or
has_vector_index() is true.
Unfortunately, LWT writes continued to use cdc_options.enabled() instead
of the new cdc_enabled(). This means that if a vector index is used and
a vector is written using an LWT write, the new value is not indexed.
This patch fixes this bug. It also adds a regression test that fails
before this patch and passes afterwards - the new test verifies that
when a table has a vector index (but no explicit CDC enabled), the CDC
log is updated both after regular writes and after successful LWT writes.
This patch was also tested in the context of the upcoming vector-search-
for-Alternator pull request, which has a test reproducing this bug
(Alternator uses LWT frequently, so this is very important there).
It will also be tested by the vector-store test suite ("validator").
Fixes SCYLLADB-1342
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#29300
Queries against local vector indexes were failing with the error:
```ANN ordering by vector requires the column to be indexed using 'vector_index'```
This was a regression introduced by 15788c3734, which incorrectly
assumed the first column in the targets list is always the vector column.
For local vector indexes, the first column is the partition key, causing
the failure.
Previously, serialization logic for the target index option was shared
between vector and secondary indexes. This is no longer viable due to
the introduction of local vector indexes and vector indexes with filtering
columns, which have different target format.
This commit introduces a dedicated JSON-based serialization format for
vector index targets, identifying the target column (tc), filtering
columns (fc), and partition key columns (pk). This ensures unambiguous
serialization and deserialization for all vector index types.
This change is backward compatible for regular vector indexes. However,
it breaks compatibility for local vector indexes and vector indexes with
filtering columns created in version 2026.1.0. To mitigate this, usage
of these specific index types will be blocked in the 2026.1.0 release
by failing ANN queries against them in vector-store service.
Fixes: SCYLLADB-895
Backport to 2026.1 is required as this issue occurs also on this branch.
Closesscylladb/scylladb#28862
* github.com:scylladb/scylladb:
index: fix DESC INDEX for vector index
vector_search: test: refactor boilerplate setup
vector_search: fix SELECT on local vector index
index: test: vector index target option serialization test
index: test: secondary index target option serialization test
The estimate() function in the size_estimates virtual reader only
considered sstables local to the shard that happened to own the
keyspace's partition key token. Since sstables are distributed across
shards, this caused partition count estimates to be approximately
1/smp_count of the actual value.
This bug has been present since the virtual reader was introduced in
225648780d.
Use db.container().map_reduce0() to aggregate sstable estimates
across all shards. Each shard contributes its local count and
estimated_histogram, which are then merged to produce the correct
total.
Also fix the `test_partitions_estimate_full_overlap` test which becomes
flaky (xpassing ~1% of runs) because autocompaction could merge the
two overlapping sstables before the size estimate was read. Wrap the
test body in nodetool.no_autocompaction_context to prevent this race.
Fixes https://scylladb.atlassian.net/browse/SCYLLADB-1179
Refs https://github.com/scylladb/scylladb/issues/9083Closesscylladb/scylladb#29286
The `DESC INDEX` command returned incorrect results for local vector
indexes and for vector indexes that included filtering columns.
This patch corrects the implementation to ensure `DESCRIBE INDEX`
accurately reflects the index configuration.
This was a pre-existing issue, not a regression from recent
serialization schema changes for vector index target options.
Queries against local vector indexes were failing with the error:
"ANN ordering by vector requires the column to be indexed using 'vector_index'"
This was a regression introduced by 15788c3734, which incorrectly
assumed the first column in the targets list is always the vector column.
For local vector indexes, the first column is the partition key, causing
the failure.
Previously, serialization logic for the target index option was shared
between vector and secondary indexes. This is no longer viable due to
the introduction of local vector indexes and vector indexes with filtering
columns, which have different target format.
This commit introduces a dedicated JSON-based serialization format for
vector index targets, identifying the target column (tc), filtering
columns (fc), and partition key columns (pk). This ensures unambiguous
serialization and deserialization for all vector index types.
This change is backward compatible for regular vector indexes. However,
it breaks compatibility for local vector indexes and vector indexes with
filtering columns created in version 2026.1.0. To mitigate this, usage
of these specific index types will be blocked in the 2026.1.0 release
by failing ANN queries against them in vector-store service.
Fixes: SCYLLADB-895
This test ensures that the serialization format for vector index target
options remains stable. Maintaining backward compatibility is critical
because the index is restored from this property on startup.
Any unintended changes to the serialization schema could break existing
indexes after an upgrade.
This option is also an interface for the vector-store service,
which uses it to identify the indexed column.
Target option serialization must remain stable for backward compatibility.
The index is restored from this property on startup, so unintentional
changes to the serialization schema can break indexes after upgrade.
This patch series introduces a new documentation for exiting guardrails.
Moreover:
- Warning / failure messages of recently added write CL guardrails (SCYLLADB-259) are rephrased, so all guardrails have similar messages.
- Some new tests are added, to help verify the correctness of the documentation and avoid situations where the documentation and implementation diverge.
Fixes: [SCYLLADB-257](https://scylladb.atlassian.net/browse/SCYLLADB-257)
No backport, just new docs and tests.
[SCYLLADB-257]: https://scylladb.atlassian.net/browse/SCYLLADB-257?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQClosesscylladb/scylladb#29011
* github.com:scylladb/scylladb:
test: add new guardrail tests matching documentation scenarios
test: add metric assertions to guardrail replication strategy tests
test: use regex matching in guardrail replication strategy tests
test: extract ks_opts helper in test_guardrail_replication_strategy
docs: document CQL guardrails
cql: improve write consistency level guardrail messages
Since 7564a56dc8, all tables default to
repair-mode tombstone-gc, which is identical to immediate-mode for RF=1
tables. Consequently the tombstones written by the tests in this test
file are immediately collectible and with some unlucky timing, some of
them can be collected before the end of the test, failing the empty-page
prefix check because the empty pages prefix will be smaller than
expected based on the number of tombstones written.
Disable tombstone-gc to remove this source of flakyness.
Fixes: SCYLLADB-1062
Closesscylladb/scylladb#29077
Trie-based sstable indexes are supposed to be (hopefully) a better default than the old BIG indexes.
Make the new format a new default for new clusters by naming ms in the default scylla.yaml.
New functionality. No backport needed.
This PR is basically Michał's one https://github.com/scylladb/scylladb/pull/26377, Jakub's https://github.com/scylladb/scylladb/pull/27332 fixing `sstables_manager::get_highest_supported_format()` and one test fix.
Closesscylladb/scylladb#28960
* github.com:scylladb/scylladb:
db/config: announce ms format as highest supported
db/config: enable `ms` sstable format by default
cluster/dtest/bypass_cache_test: switch from highest_supported_sstable_format to chosen_sstable_format
api/system: add /system/chosen_sstable_version
test/cluster/dtest: reduce num_tokens to 16
Add tests for RF guardrails (min/max warn/fail, RF=0 bypass,
threshold=-1 disable, ALTER KEYSPACE) and write consistency level
guardrails to cover all scenarios described in guardrails.rst.
Test runtime (dev):
test_guardrail_replication_strategy - 6s
test_guardrail_write_consistency_level - 5s
Refs: SCYLLADB-257
Replace loose substring assertions with regex-based matching against
the exact server message formats. Add regex constants for all
guardrail messages and rewrite create_ks_and_assert_warnings_and_errors()
to verify count and content of warnings and failures.
Refs: SCYLLADB-257
Factor out ks_opts() to build keyspace options with tablets handling
and use it across all existing replication strategy guardrail tests.
No behavioral changes.
This facilitates further modification of the tests later in this
patch series.
Refs: SCYLLADB-257
This PR adds integrity verification for SSTable component files during loading. When component digests are present in Scylla metadata, the loader now validates each component's CRC32 digest against the stored expected value, catching silent corruption of component files. Index, Rows and Partitions components digests are also validated duriung scrub in validate mode
Added corruption tests that write an SSTable, flip a bit in a specific component file, then verify that reloading the SSTable detects the corruption and throws the expected exception.
Depends on https://github.com/scylladb/scylladb/pull/28338
Backport is not required, this is new feature
Fixes https://github.com/scylladb/scylladb/issues/20103Closesscylladb/scylladb#28761
* github.com:scylladb/scylladb:
test/cqlpy: test --ignore-component-digest-mismatch flag in scylla sstable upgrade
docs: document --ignore-component-digest-mismatch flag for scylla sstable upgrade
sstables: propagate ignore_component_digest_mismatch config to all load sites
sstables: add option to ignore component digest mismatches
sstable_compaction_test: Add scrub validate test for corrupted index
sstables: add tests for component digest validation on corrupted SSTables
sstables: validate index components digests during SSTable scrub in validate mode
sstables: verify component digests on SSTable load
sstables: add digest_file_random_access_reader for CRC32 digest computation
The `test/cqlpy/cassandra_tests/validation/entities/json_test.py::testJsonOrdering` was failing because of differences between Cassandra and Scylla in printing
JSON floating point values - e.g. Cassandra prints 30.0, where Scylla prints 30.
Both are valid, so in this patch, instead of comparing strings, we compare parsed JSON using `EquivalentJson`.
Fixes#28467Closesscylladb/scylladb#28924
Recently, in commit 7b30a39, we added to pytest.ini the option xfail_strict.
This option causes every time a test XPASSes, i.e., an XFAIL test actually
passes, to be considered an error and fail the test.
While this has some benefits, it's a big problem when running tests
against a reference implementation like DynamoDB or Cassandra: We
typically mark a test "xfail" if the test shows a known bug - i.e., if
the test fails on Scylla but passes on the reference system (DynamoDB
or Cassandra). This means that when running "test/cqlpy/run-cassandra"
or "test/alternator/run --aws", we expect to see many tests XPASS,
and now this will cause these runs to "fail".
So in this patch we add the xfail_strict=false to cqlpy/run-cassandra
and alternator/run --aws. This option is not added to cqlpy/run or
to alternator/run without --aws, and also doesn't affect test.py or
Jenkins.
P.S. This is another nail in the coffin of doing "cd test/alternator;
pytest --aws". You should get used to running Alternator tests through
test/alternator/run, even if you don't need to run Scylla (the "--aws"
option doesn't run Scylla).
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#28973
Verify that scylla sstable upgrade fails when an sstable has a
corrupted Statistics component digest, and succeeds when the
--ignore-component-digest-mismatch flag is provided.
There is a race condition in driver that raises the RuntimeException.
This pollutes the output, so this PR is just silencing this exception.
Fixes: SCYLLADB-900
Closesscylladb/scylladb#28957
A few days ago, in commit 7b30a3981b we added to pytest.ini the option
xfail_strict. This option causes every time a test XPASSes, i.e., an
xfail test actually passes - to be considered an error and fail the test.
But some tests demonstrate a timing-related bug and do not reproduce the
bug every single time. An example we noticed in one CI run is:
test/cqlpy/test_secondary_index.py::test_unbuilt_index_not_used
This test reproduces a timing-related bug (if you read from a secondary
index "too quickly" you can get wrong results), but only about 90% of the
time, not 100% of the time.
The solution is to add "strict=False" for the xfail marker on this
specific test. This undoes the xfail_strict for this specific test,
accepting that this specific test can either pass or fail. Note that
this does NOT make this test worthless - we still see this test failing
most of the time, and when a developer finally fixes this issue, the
test will begin to pass all the time.
Fixes https://scylladb.atlassian.net/browse/SCYLLADB-956
(we'll probably need to follow up this fix with the same fix for other
xfail tests that can sometime pass).
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#28942
Trie-based sstable indexes are supposed to be (hopefully)
a better default than the old BIG indexes.
Make them the new default.
If we change our mind, this change can be reverted later.
The purpose of `add_column_for_post_processing` is to add columns that are required for processing of a query,
but are not part of SELECT clause and shouldn't be returned. They are added to the final result set, but later are not serialized.
Mainly it is used for filtering and grouping columns, with a special case of `WHERE primary_key IN ... ORDER BY ...` when the whole result set needs additional final sorting,
and ordering columns must be added as well.
There was a bug that manifested in #9435, #8100 and was actually identified in #22061.
In case of selection with processing (e.g functions involved), result set row is formed in two stages.
Initially it is a list of columns fetched from replicas - on which filtering and grouping is performed.
After that the actual selection is resolved and the final number of columns can change.
Ordering is performed on this final shape, but the ordering column index returned by `add_column_for_post_processing` refereed to initial shape.
If selection refereed to the same column twice (e.g. `v, TTL(v)` as in #9435) final row was longer than initial and ordering refereed to incorrect column.
If a function in selection refereed to multiple columns (e.g. as_json(.., ..) which #8100 effectively uses) the final row was shorter
and ordering tried to use a non-existing column.
This patch fixes the problem by making sure that column index of the final result set is used for ordering.
The previously crashing test `cassandra_tests/validation/entities/json_test.py::testJsonOrdering` doesn't have to be skipped, but now it is failing on issue #28467.
Fixes#9435Fixes#8100Fixes#22061Closesscylladb/scylladb#28472
Currently, repair-mode tombstone-gc cannot be used on tables with RF=1. We want to make repair-mode the default for all tablet tables (and more, see https://github.com/scylladb/scylladb/issues/22814), but currently a keyspace created with RF=1 and later altered to RF>1 will end up using timeout-mode tombstone gc. This is because the repair-mode tombstone-gc code relies on repair history to determine the gc-before time for keys/ranges. RF=1 tables cannot run repairs so they will have empty repair history and consequently won't be able to purge tombstones.
This PR solves this by keeping a registry of RF=1 tables and consulting this registry when creating `tombstone_gc_state` objects. If the table is RF=1, tombstone-gc will work as if the table used immediate-mode tombstone-gc. The registry is updated on each replication update. As soon as the table is not RF=1 anymore, the tombstone-gc reverts to the natural repair-mode behaviour.
After this PR, tombstone-gc defaults to repair-mode for all tables, regardless of RF and tablets/vnodes.
Fixes: SCYLLADB-106.
New feature, no backport required.
Closesscylladb/scylladb#22945
* github.com:scylladb/scylladb:
test/{boost,cluster}: add test for tombstone gc mode=repair with RF=1
tombstone_gc: allow use of repair-mode for RF=1 tables
replica/table: update rf=1 table registry in shared tombstone-gc state
tombstone_gc: tombstone_gc_before_getter: consider RF when getting gc before time
tombstone_gc: unpack per_table_history_maps
tombstone_gc: extract _group0_gc_time from per_table_history_map
tombstone_gc: drop tombstone_gc_state(nullptr) ctor and operator bool()
test/lib/random_schema: use timeout-mode tombstone_gc
tombstone_gc_options: add C++ friendly constructor
test: move away from tombstone_gc_state(nullptr) ctor
treewide: move away from tombstone_gc_state(nullptr) ctor
sstable: move away from tombstone_gc_mode::operator bool()
replica/table: add get_tombstone_gc_state()
compaction: use tombstone_gc_state with value semantics
db/row_cache: use tombstone_gc_state with value semantics
tombstone_gc: introduce tombstone_gc_state::for_tests()
This patch series implements `write_consistency_levels_warned` and `write_consistency_levels_disallowed` guardrails, allowing the configuration of which consistency levels are unwanted for writes. The motivation for these guardrails is to forbid writing with consistency levels that don't provide high durability guarantees (like CL=ANY, ONE, or LOCAL_ONE).
Neither guardrail is enabled by default, so as not to disrupt clusters that are currently using any of the CLs for writes. The warning guardrail may seem harmless, as it only adds a warning to the CQL response; however, enabling it can significantly increase network traffic (as a warning message is added to each response) and also decrease throughput due to additional allocations required to prepare the warning. Therefore, both guardrails should be enabled with care. The newly added `writes_per_consistency_level` metric, which is incremented unconditionally, can help decide whether a guardrail can be safely enabled in an existing cluster.
This commit adds additional `if` instructions on the critical path. However, based on the `perf_simple_query` benchmark for writes, the difference is marginal (~40 additional instructions, which is a relative difference smaller than 0.001).
BEFORE:
```
291443.35 tps ( 53.3 allocs/op, 16.0 logallocs/op, 14.2 tasks/op, 48067 insns/op, 18885 cycles/op, 0 errors)
throughput:
mean= 289743.07 standard-deviation=6075.60
median= 291424.69 median-absolute-deviation=1702.56
maximum=292498.27 minimum=261920.06
instructions_per_op:
mean= 48072.30 standard-deviation=21.15
median= 48074.49 median-absolute-deviation=12.07
maximum=48119.87 minimum=48019.89
cpu_cycles_per_op:
mean= 18884.09 standard-deviation=56.43
median= 18877.33 median-absolute-deviation=14.71
maximum=19155.48 minimum=18821.57
```
AFTER:
```
290108.83 tps ( 53.3 allocs/op, 16.0 logallocs/op, 14.2 tasks/op, 48121 insns/op, 18988 cycles/op, 0 errors)
throughput:
mean= 289105.08 standard-deviation=3626.58
median= 290018.90 median-absolute-deviation=1072.25
maximum=291110.44 minimum=274669.98
instructions_per_op:
mean= 48117.57 standard-deviation=18.58
median= 48114.51 median-absolute-deviation=12.08
maximum=48162.18 minimum=48087.18
cpu_cycles_per_op:
mean= 18953.43 standard-deviation=28.76
median= 18945.82 median-absolute-deviation=20.84
maximum=19023.93 minimum=18916.46
```
Fixes: SCYLLADB-259
Refs: SCYLLADB-739
No backport, it's a new feature
Closesscylladb/scylladb#28570
* github.com:scylladb/scylladb:
scylla.yaml: add write CL guardrails to scylla.yaml
scylla.yaml: reorganize guardrails config to be in one place
test: add cluster tests for write CL guardrails
test: implement test_guardrail_write_consistency_level
cql3: start using write CL guardrails
cql3/query_processor: implement metrics to track CL of writes
db: cql3/query_processor: add write_consistency_levels enum_sets
config: add write_consistency_levels_* guardrails configuration
Recently we started to rely on the options "--auth-superuser-name"
and "--auth-superuser-salted-password" to ensure that a
cassandra/cassandra user exists for tests - without those options
a default superuser no longer exists.
This broke "test/cqlpy/run --release" for old releases, earlier
than 5.4 (in the enterprise stream, 2024.1 or earlier), because
those old release didn't have this option.
So in this patch we fix the "--release" logic that removes these
options from the command line when running these old versions.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closesscylladb/scylladb#28894
`isclose` function checks if returned similarity floats are close enough to expected value, but it doesn't `assert` by itself.
Several tests missed that `assert`, effectively always passing.
With this patch similarity values checks are wrapped in helper function `assert_similarity` with predefined tolerance.
Fixes https://scylladb.atlassian.net/browse/SCYLLADB-877Closesscylladb/scylladb#28748
Modify the methods which calculate the default gc mode as well as that
which validates whether repair-mode can be used at all, so both accepts
use of repair-mode on RF=1 tables.
This de-facto changes the default tombstone-gc to repair-mode for all
tables. Documentation is updated accordingly.
Some tests need adjusting:
* cqlpy/test_select_from_mutation_fragments.py: disable GC for some test
cases because this patch makes tombstones they write subject to GC
when using defaults.
* test/cluster/test_mv.py::test_mv_tombstone_gc_not_inherited used
repair-mode as a non-default for the base table and expected the MV to
revert to default. Another mode has to be used as the non-default
(immediate).
* test/cqlpy/test_tools.py::test_scylla_sstable_dump_schema: don't
compare tombstone_gc schema extension when comparing dumped schema vs.
original. The tool's schema loader doesn't have access to the keyspace
definition so it will come up with different defaults for
tombstone-gc.
* test/boost/row_cache_test.cc::test_populating_cache_with_expired_and_nonexpired_tombstones
sets tombstone expiry assuming the tombstone-gc timeout-mode default.
Change the CREATE TABLE statement to set the expected mode.
Implement basic tests for write consistency level guardrails,
verifying that they work for each type of write request (inserts,
updates, deletes, logged batches, unlogged batches, conditional batches,
and counter operations).
All tests are marked as Scylla-only because they currently don't
pass with Cassandra due to differences in handling superusers (see:
SCYLLADB-882).
Tests execution time:
- Dev: 3s
- Debug: 14s
Refs: SCYLLADB-259
Refs: SCYLLADB-882
Changes the behavior of default superuser creation.
Previously, without configuration 'cassandra:cassandra' credentials
were used. Now default superuser creation is skipped if not configured.
The two ways to create default superuser are:
- Config file - auth_superuser_name and auth_superuser_salted_password fields
- Maintenance socket - connect over maintenance socket and CREATE/ALTER ROLE ...
Behavior changes:
Old behavior:
- No config - 'cassandra:cassandra' created
- auth_superuser_name only - <name>:cassandra created
- auth_superuser_salted_password only - 'cassandra:<password>' created
- Both specified - '<name>:<password>' created
New behavior:
- No config - no default superuser
- Requires maintenance socket setup
- auth_superuser_name only - '<name>:' created WITHOUT password
- Requires maintenance socket setup
- auth_superuser_salted_password only - no default superuser
- Both specified - '<name>:<password>' created
Fixes SCYLLADB-409