630 Commits

Author SHA1 Message Date
Szymon Malewski
6b2fce03f9 alternator: optional stripping of http response headers
In Alternator's HTTP API, response headers can dominate bandwidth for
small payloads. The Server, Date, and Content-Type headers were sent on
every response but many clients never use them.

This patch introduces three Alternator config options:
  - alternator_http_response_server_header,
  - alternator_http_response_disable_date_header,
  - alternator_http_response_disable_content_type_header,
which allow customizing or suppressing the respective HTTP response
headers. All three options support live update (no restart needed).
The Server header is no longer sent by default; the Date and
Content-Type defaults preserve the existing behavior.

The Server and Date header suppression uses Seastar's
set_server_header() and set_generate_date_header() APIs added in
https://github.com/scylladb/seastar/pull/3217. This patch also
fixes deprecation warnings from older Seastar HTTP APIs.

Tests are in test/alternator/test_http_headers.py.

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-70

Closes scylladb/scylladb#28288
2026-05-19 10:47:13 +03:00
Nadav Har'El
cd61a44ab8 test/alternator: test response compression of tiny responses
This patch adds to the existing collection of tests for Alternator
response compression another test with a tiny response being compressed.
This test serves two purposes:

1. It verifies setting alternator_response_compression_threshold_in_bytes
   to a tiny number like 1 really means that tiny responses would be
   compressed.

2. It verifies that our compression code, which has a special code path
   for the small chunk at the end of the compression, works correctly.

The original motivation for writing this test was a false alarm by
Claude Code which claimed that Alternator's response compression code
has a serious, exploitable, memory overrun bug, because it set the
wrong size limit on that last chunk. Claude was wrong, there is no such
bug. We did set an oversized limit on the last chunk (so this patch
fixes this typo), but it didn't matter - because the code used
deflateBound - the guaranteed maximum size of the compressed data -
for the buffer's size, so the buffer was unconditionally big enough:
no matter which avail_out limit we passed to deflate(), it could never
overflow.

The included test passes even before this patch, even with ASAN
enabled to detect memory overflows - no overflow was happening.
It also passes after the typo correction in this patch.
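
The point about tiny responses can be illustrated with Python's standard `zlib` module. This is only a sketch, not Alternator's code: Python does not expose `deflateBound()`, so the snippet merely shows that a tiny payload round-trips and that its compressed form is larger than the input, which is why the output buffer has to be sized from a bound on the compressed size rather than from the input size. The helper name `compress_tiny` is hypothetical.

```python
import zlib


def compress_tiny(payload: bytes) -> bytes:
    """Compress a payload of any size, including a tiny one, as a zlib stream."""
    compressor = zlib.compressobj()
    # compress() may buffer all of the input; flush() emits the final chunk.
    return compressor.compress(payload) + compressor.flush()


tiny = b"{}"
compressed = compress_tiny(tiny)
# The stream must decompress back to the original bytes.
assert zlib.decompress(compressed) == tiny
print(len(tiny), len(compressed))
```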

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#29718
2026-05-19 10:02:26 +03:00
Szymon Malewski
cb8e11653f test/alternator: Number normalization tests
DynamoDB normalizes Number values, so different string representations
of the same number (e.g., "1000" vs "1e3") should be treated as the
same value in all contexts.
In Alternator this is true in most cases, thanks to implicit normalization in
the Decimal `to_string()` function.
However, this is fragile, and in fact this function should be fixed
because of an OOM vulnerability in CQL use (#8002).
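
What "normalization" means here can be sketched with Python's `decimal` module. This is an illustration under assumptions, not Alternator's implementation: the helper name `normalize_number` is hypothetical, and the 38-digit precision is chosen to match DynamoDB's documented Number precision.

```python
from decimal import Decimal, localcontext


def normalize_number(text: str) -> str:
    """Return one canonical string for every spelling of the same number."""
    with localcontext() as ctx:
        ctx.prec = 38  # assumption: DynamoDB-style 38 significant digits
        # normalize() strips trailing zeros, so equal values share one form.
        return str(Decimal(text).normalize())


for spelling in ("1000", "1e3", "1000.0", "1E+3"):
    print(spelling, "->", normalize_number(spelling))
```

Comparing the canonical strings (or the `Decimal` values themselves) treats `"1000"` and `"1e3"` as the same number, whereas comparing the raw strings byte by byte does not.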

This patch adds tests that should prevent regression in cases
that work currently.

Unfortunately, not all contexts work currently: mainly, HASH keys
are not normalized and the backend handles them by byte representation.
The added tests replicate this incorrect behaviour.

All added tests pass with DynamoDB, with one exception: weirdly,
DynamoDB doesn't recognise unnormalized numbers in BatchGetItem
as duplicate keys.

Ref SCYLLADB-1575

Closes scylladb/scylladb#29501
2026-05-18 09:42:33 +03:00
Nadav Har'El
4082fdf350 alternator: add ReturnScores option to VectorSearch
A vector search operation in Alternator (VectorSearch option to Query)
returns items sorted by decreasing similarity to the searched vector.

Although the items are sorted by decreasing similarity scores, before
this patch the user had no way to see the values of these scores.
This patch adds a new VectorSearch option, `ReturnScores`. This option
defaults to `NONE`. But if set to `SIMILARITY`, the query will return
an array `Scores` with the same length as `Items`, which gives the
similarity score for each item.
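
Because `Scores` has the same length as `Items`, a client can pair them positionally. The following is a minimal sketch that works on an already-decoded response dictionary; the helper name `items_with_scores`, the sample response, and the representation of scores as plain JSON numbers are assumptions made for illustration.

```python
def items_with_scores(response):
    """Pair each returned item with its score; the score is None when absent."""
    items = response.get("Items", [])
    scores = response.get("Scores")
    if scores is None:
        # ReturnScores was NONE (the default), so there is nothing to pair.
        return [(item, None) for item in items]
    if len(scores) != len(items):
        raise ValueError("Scores and Items must have the same length")
    return list(zip(items, scores))


# Hypothetical response, shaped only after the description above.
example = {
    "Items": [{"p": {"S": "a"}}, {"p": {"S": "b"}}],
    "Scores": [0.98, 0.75],
}
for item, score in items_with_scores(example):
    print(item, score)
```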

As usual, this patch includes the implementation, the documentation,
and tests for the new feature.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-05-13 14:19:17 +03:00
Nadav Har'El
85c6cafb1d alternator: add optimized vector type for vector search
Today in Alternator vector search, vectors are presented to the API as
lists of numbers. I.e., in JSON a vector is sent in requests and responses
as:

     {"L": [{"N": "3.14159"}, {"N": "6.7"}]}

This format is verbose and inefficient for long vectors. Even worse,
because the "N" number format has precision guarantees in DynamoDB,
we cannot optimize the storage of such vectors by, for example, storing
the numbers as 32-bit floats. We actually store these vectors as JSON,
exactly as shown above.

So in this patch we introduce a new DynamoDB type, "FLOAT32VECTOR", for
vectors. The above vector will look like this in JSON:

     {"FLOAT32VECTOR": [3.14159, 6.7]}

Note that each number is an unquoted JSON number, not a JSON string.
Importantly, the definition of the "FLOAT32VECTOR" type specifies that
components of the vector only have 32-bit precision. This means that
Scylla may store these vectors internally as lists of 32-bit floats -
not as a JSON. And indeed, this patch includes this optimization:
Top-level vector attributes are now encoded in an optimized way,
as a byte 5 (alternator_type::FLOAT32VECTOR) followed by the elements
of the vector, just 4 bytes each (the 4-byte big-endian IEEE 754
representation of each floating-point component).
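
The byte layout described above (a type byte of 5 followed by one 4-byte big-endian IEEE 754 float per component) can be sketched with Python's `struct` module. This is an illustration of the layout as described in this message, not the actual serialization code; the function names are hypothetical.

```python
import struct

FLOAT32VECTOR_TAG = 5  # alternator_type::FLOAT32VECTOR, per the text above


def encode_float32_vector(components):
    """Encode a vector as the tag byte plus big-endian 32-bit floats."""
    count = len(components)
    return bytes([FLOAT32VECTOR_TAG]) + struct.pack(f">{count}f", *components)


def decode_float32_vector(blob):
    """Decode the layout produced by encode_float32_vector()."""
    if not blob or blob[0] != FLOAT32VECTOR_TAG:
        raise ValueError("not a FLOAT32VECTOR encoding")
    payload = blob[1:]
    if len(payload) % 4 != 0:
        raise ValueError("payload is not a whole number of 32-bit floats")
    return list(struct.unpack(f">{len(payload) // 4}f", payload))


blob = encode_float32_vector([3.14159, 6.7])
print(len(blob), decode_float32_vector(blob))
```

A two-component vector takes 9 bytes in this layout, and the decoded values carry only 32-bit precision, so they are close to, but not exactly, the inputs.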

This patch also includes documentation, and extensive tests that the
new "FLOAT32VECTOR" type works (which also serve as an example of how to
use it in the boto3 SDK), that it is indeed encoded internally as 32-bit
floats and not wasteful JSON strings, and that vector search on such items
works. The last thing requires cooperation from the vector store, of
course - it needs to be able to understand the new optimized encoding
of vector attributes in addition to the old unoptimized one.

Note that the old unoptimized ("list of numbers") vectors are still
supported. Although not recommended for general use, some users might
still want to use the unoptimized type if they have pre-existing data
created on DynamoDB or Alternator without vector search in mind, and
the vectors already exist as lists of numbers.

Although this is less important, the new vector type "FLOAT32VECTOR"
is also allowed in a Query's QueryVector.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-05-13 11:57:45 +03:00
Nadav Har'El
ea910acdd4 alternator: add SimilarityFunction option to vector index creation
Before this patch, vector search always used the COSINE similarity
function. In this patch we add the ability to choose a different
similarity function when creating a new vector index (with CreateTable
or UpdateTable) by using the SimilarityFunction option. We still default
to "COSINE" if SimilarityFunction isn't specified.

Allowed similarity functions are COSINE, DOT_PRODUCT, and EUCLIDEAN.
DescribeTable can also retrieve a vector index's SimilarityFunction.
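
To see why the choice of function matters, here is a sketch of the textbook definitions of the three measures. It is not the vector store's code, and how the store turns a Euclidean distance into a similarity score is not stated here, so the example only compares which item is nearest.

```python
import math


def dot_product(a, b):
    """Sum of component-wise products."""
    return sum(x * y for x, y in zip(a, b))


def cosine(a, b):
    """Cosine of the angle between two non-zero vectors."""
    norms = math.sqrt(dot_product(a, a)) * math.sqrt(dot_product(b, b))
    if norms == 0:
        raise ValueError("cosine similarity is undefined for a zero vector")
    return dot_product(a, b) / norms


def euclidean_distance(a, b):
    """Straight-line distance; smaller means more similar."""
    return math.dist(a, b)


query = [1.0, 0.0]
far_but_aligned = [10.0, 0.0]
near_but_tilted = [1.0, 0.1]
for name, item in (("aligned", far_but_aligned), ("tilted", near_but_tilted)):
    print(name, cosine(query, item), dot_product(query, item),
          euclidean_distance(query, item))
```

With these sample vectors, cosine and dot product prefer the aligned item, while Euclidean distance prefers the tilted one, so the same data can be ordered differently depending on the index's SimilarityFunction.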

As usual, this patch also includes documentation for the new feature,
and tests. Some of the tests can run without a vector store - verifying
the API syntax and which similarity function is supported - but we
also add tests that require the vector store and check that the different
similarity functions actually sort the nearest items in the expected
order.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-05-13 11:57:45 +03:00
Nadav Har'El
70283967d3 alternator: add vector search metrics
Before this patch, we did not have any special metrics for vector search
in Alternator. We had a count of "Query" operations, but there was no
distinction between "standard" queries - of a base table or GSI/LSI -
and vector-search queries.

This patch adds four new metrics:

 * vector_search_query - counting how many Query requests are actually
   vector searches.

 * vector_search_query_returned_items - counting how many items were
   returned by vector searches.

 * vector_search_query_items_from_vs - counting how many results were
   retrieved from the vector-store backend.

 * vector_search_query_items_from_base_table - counting how many items
   were read from the base table during vector-search queries. Some
   vector search queries using SELECT=ALL_PROJECTED_ATTRIBUTES or COUNT
   are optimized to not need to read items from the base table.

This patch also includes documentation for the four new metrics, and
tests that they count what we want them to count.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-05-13 11:57:44 +03:00
Piotr Szymaniak
459c1dc32f test/alternator: stop avoiding tablets in Streams tests
Alternator Streams now supports tablets, so stop skipping the TTL Streams test in tablet mode and stop forcing vnodes in the Streams audit test.

Refs SCYLLADB-463

Closes scylladb/scylladb#29697
2026-05-10 22:13:15 +03:00
Nadav Har'El
df8c9b17b8 Merge 'alternator: Graduate Alternator Streams from experimental' from Piotr Szymaniak
As a final step for https://scylladb.atlassian.net/browse/SCYLLADB-461 we need to graduate Alternator Streams from experimental.
So let's remove `--experimental-features=alternator-streams` and map the obsolete config string to `UNUSED` for backward compatibility. Also, remove the related gating of the feature.
Finally, stop providing the config flag in test configs.

Fixes SCYLLADB-1680
Fixes #16367

The documentation work tracked by https://scylladb.atlassian.net/browse/SCYLLADB-462 still remains.

This PR needs to hit 2026.2, so (only) if it branches before the PR is merged to `master`, we'd need to backport.

Closes scylladb/scylladb#29604

* github.com:scylladb/scylladb:
  test: Stop providing alternator-streams experimental flag
  alternator: Graduate Alternator Streams from experimental
2026-05-10 22:10:03 +03:00
Nadav Har'El
63927e07ea Merge 'alternator/streams: keep disabled streams usable and purge on re-enable' from Piotr Szymaniak
When an Alternator stream is disabled, the data should continue to be accessible so that consumers can finish reading. When the stream is later re-enabled, a new StreamArn is produced and only then the old data is purged.

On disable, the existing CDC options (including preimage and postimage) are preserved so that DescribeStream can still report StreamViewType. All stream APIs continue to work on the disabled stream, with all shards reported as closed (EndingSequenceNumber set). No new CDC records are written; existing data expires via TTL after 24 hours.

On re-enable, the old CDC log table is dropped as a separate Raft group0 schema change and a fresh one is created with a new UUID, giving a new StreamArn. This is Alternator-specific — CQL CDC keeps reusing the log table. Re-enabling is the only way to immediately purge old stream data.

Old stream data is removed immediately upon re-enable (a discrepancy with DynamoDB, which keeps it readable for 24 hours through the old StreamArn).

Tests updated to cover the new disable and re-enable behavior.

Fixes #7239
Fixes SCYLLADB-523

Closes scylladb/scylladb#29413

* github.com:scylladb/scylladb:
  alternator/streams: remove dead next_iter in get_records
  test/alternator: fix stream wait timeouts to use wall-clock time
  docs/alternator: document stream disable/re-enable behavior
  alternator/streams: keep disabled streams usable and purge on re-enable
2026-05-10 22:04:35 +03:00
Piotr Szymaniak
744848a85f test/alternator: fix stream wait timeouts to use wall-clock time
Both disable_stream and wait_for_active_stream used time.process_time()
for their timeouts, but process_time measures CPU time, not wall-clock
time. Since these loops spend most of their time sleeping and waiting on
API calls, the timeouts could last far longer than intended. Use
time.time() instead to enforce actual wall-clock deadlines.
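
The difference matters because `time.process_time()` does not advance while the process sleeps. Below is a minimal sketch of a polling loop with a wall-clock deadline; the helper name `wait_until` and its parameters are hypothetical and not taken from the test code.

```python
import time


def wait_until(predicate, timeout, poll_interval=0.05):
    """Poll predicate() until it is true or `timeout` wall-clock seconds pass."""
    # time.time() keeps advancing during sleep; time.process_time() counts
    # only CPU time, so a deadline based on it is almost never reached.
    deadline = time.time() + timeout
    while True:
        if predicate():
            return
        if time.time() >= deadline:
            raise TimeoutError(f"condition not met within {timeout} seconds")
        time.sleep(poll_interval)


wait_until(lambda: True, timeout=1.0)  # returns immediately
```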
2026-05-07 14:45:42 +02:00
Piotr Szymaniak
38bd068f78 alternator/streams: keep disabled streams usable and purge on re-enable
Previously, disabling Alternator Streams would create a blank
cdc::options with only enabled=false, which also meant losing access
to the stream's stored data (including preimage and postimage).

Now, when a stream is disabled:
- The existing CDC options are preserved (only 'enabled' is flipped to
  false), so StreamViewType remains available.
- DescribeStream enumerates all shards with EndingSequenceNumber set,
  indicating they are closed.
- GetRecords omits NextShardIterator for disabled streams.
- DescribeTable (supplement_table_stream_info) reports the stream ARN
  and StreamEnabled: false when the CDC log table still exists.
- ListStreams uses get_base_table instead of is_log_for_some_table so
  that disabled streams whose log table still exists are listed.

When a stream is re-enabled on an Alternator table that has an existing
(disabled) CDC log table, the old log table is dropped and a fresh one
is created with a new UUID, producing a new StreamArn. This is
Alternator-specific behavior; CQL CDC tables continue to reuse the
existing log table.

The old stream data is lost immediately upon re-enable. DynamoDB keeps
it readable for 24 hours.

Tests:
- test_streams_closed_read, test_streams_disabled_stream: remove xfail
  now that disabled streams are usable.
- test_streams_reenable: new test verifying that re-enabling produces
  a new ARN and the old data is still readable via the old ARN (xfail
  because Scylla currently purges old data on re-enable).

Fixes scylladb/scylladb#7239
2026-05-07 14:45:42 +02:00
Piotr Szymaniak
9a86044c63 test: Stop providing alternator-streams experimental flag
Now that alternator-streams is no longer an experimental feature,
stop passing it in test configurations.
2026-04-22 15:25:37 +02:00
Botond Dénes
eb3326b417 Merge 'test.py: migrate all bare skips to typed skip markers' from Artsiom Mishuta
should be merged after #29235

Complete the typed skip markers migration started in the plugin PR.
Every bare `@pytest.mark.skip` decorator and `pytest.skip()` runtime call
across the test suite is replaced with a typed equivalent, making skip
reasons machine-readable in JUnit XML and Allure reports.

**62 files changed** across 8 commits, covering ~127 skip sites in total.

Bare `pytest.skip` provides only a free-text reason string. CI dashboards
(JUnit, Allure) cannot distinguish between a test skipped due to a known
bug, a missing feature, a slow test, or an environment limitation. This
makes it hard to track skip debt, prioritize fixes, or filter dashboards
by skip category.

The typed markers (`skip_bug`, `skip_not_implemented`, `skip_slow`,
`skip_env`) introduced by the `skip_reason_plugin` solve this by embedding
a `skip_type` field into every skip report entry.

| Type | Count | Files | Description |
|------|-------|-------|-------------|
| `skip_bug` | 24 | 16 | Skip reason references a known bug/issue |
| `skip_not_implemented` | 10 | 5 | Feature not yet implemented in Scylla |
| `skip_slow` | 4 | 3 | Test too slow for regular CI runs |
| `skip_not_implemented` (bare) | 2 | 1 | Bare `@pytest.mark.skip` with no reason (COMPACT STORAGE, #3882) |

| Type | Count | Files | Description |
|------|-------|-------|-------------|
| `skip_env` | ~85 | 34 | Feature/config/topology not available at runtime |
| `skip_bug` | 2 | 2 | Known bugs: Streams on tablets (#23838), coroutine task not found (#22501) |

- **Comments**: 7 comments/docstrings across 5 files updated from `pytest.skip()` to `skip()`
- **Plugin hardened**: `warnings.warn()` → `pytest.UsageError` for bare `@pytest.mark.skip` at collection time — bare skips are now a hard error, not a warning
- **Guard tests**: New `test/pylib_test/test_no_bare_skips.py` with 3 tests that prevent regression:
  - AST scan for bare `@pytest.mark.skip` decorators
  - AST scan for bare `pytest.skip()` runtime calls
  - Real `pytest --collect-only` against all Python test directories

Runtime skip sites use the convenience wrappers from `test.pylib.skip_types`:
```python
from test.pylib.skip_types import skip_env
```

Usage:
```python
skip_env("Tablets not enabled")
```

1. **test: migrate @pytest.mark.skip to @pytest.mark.skip_bug for known bugs** — 24 decorator sites, 16 files
2. **test: migrate @pytest.mark.skip to @pytest.mark.skip_not_implemented** — 10 decorator sites, 5 files
3. **test: migrate @pytest.mark.skip to @pytest.mark.skip_slow** — 4 decorator sites, 3 files
4. **test: migrate bare @pytest.mark.skip to skip_not_implemented** — 2 bare decorators, 1 file
5. **test: migrate runtime pytest.skip() to typed skip_env()** — ~85 sites, 34 files
6. **test: migrate runtime pytest.skip() to typed skip_bug()** — 2 sites, 2 files
7. **test: update comments referencing pytest.skip() to skip()** — 7 comments, 5 files
8. **test/pylib: reject bare pytest.mark.skip and add codebase guards** — plugin hardening + 3 guard tests

- All 60 plugin + guard tests pass (`test/pylib_test/`)
- No bare `@pytest.mark.skip` or `pytest.skip()` calls remain in the codebase
- `pytest --collect-only` succeeds across all test directories with the hardened plugin

SCYLLADB-1349

Closes scylladb/scylladb#29305

* github.com:scylladb/scylladb:
  test/alternator: replace bare pytest.skip() with typed skip helpers
  test: migrate new bare skips introduced by upstream after rebase
  test/pylib: reject bare pytest.mark.skip and add codebase guards
  test: update comments referencing pytest.skip() to skip_env()
  test: migrate runtime pytest.skip() to typed skip_bug()
  test: migrate runtime pytest.skip() to typed skip_env()
  test: migrate bare @pytest.mark.skip to skip_not_implemented
  test: migrate @pytest.mark.skip to @pytest.mark.skip_slow
  test: migrate @pytest.mark.skip to @pytest.mark.skip_not_implemented
  test: migrate @pytest.mark.skip to @pytest.mark.skip_bug for known bugs
2026-04-22 15:48:27 +03:00
Radosław Cybulski
6f7bf30a14 alternator: increase wait time to tablet sync
When forcing tablet count change via cql command, the underlying
tablet machinery takes some time to adjust. Original code waited
at most 0.1s for tablet data to be synchronized. This seems to be
not enough on debug builds, so we add exponential backoff and increase
maximum waiting time. Now the code waits 0.1s the first time and
keeps waiting, doubling the delay each time, for a maximum of 6 waits,
or a total of ~6s.
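
The waiting schedule can be sketched as follows. This is an illustration of the backoff arithmetic only; the helper name `wait_with_backoff` and its parameters are hypothetical, and the `sleep` argument exists so the schedule can be inspected without actually sleeping.

```python
import time


def wait_with_backoff(check, initial_delay=0.1, max_waits=6, sleep=time.sleep):
    """Retry check(), sleeping 0.1s, 0.2s, 0.4s, ... between attempts."""
    delay = initial_delay
    for _ in range(max_waits):
        if check():
            return True
        sleep(delay)
        delay *= 2  # exponential backoff: double the delay each time
    return check()


# Record the delays instead of sleeping, to inspect the schedule.
delays = []
wait_with_backoff(lambda: False, sleep=delays.append)
print(delays, sum(delays))
```

Six doubling waits starting at 0.1s add up to 0.1 * (2^6 - 1) = 6.3s, which matches the "~6s" total mentioned above.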

Fixes: SCYLLADB-1655

Closes scylladb/scylladb#29573
2026-04-21 17:38:07 +02:00
Marcin Maliszkiewicz
9f11920b15 Merge 'alternator: fix remaining problems with new Stream ARN format' from Nadav Har'El
This small series includes a few followups to the patch that changed Alternator Stream ARNs from using our own UUID format to something that resembles Amazon's Stream ARNs (and the KCL library won't reject as bogus-looking ARNs).

The first patch is the most important one, fixing ListStreams's LastEvaluatedStreamArn to also use the new ARN format. It fixes SCYLLADB-539.

The following patches are additional cleanups and tests for the new ARN code.

Closes scylladb/scylladb#29474

* github.com:scylladb/scylladb:
  alternator: fix ListStreams paging if table is deleted during paging
  test/alternator: test DescribeStream on non-existent table
  alternator: ListStreams: on last page, avoid LastEvaluatedStreamArn
  alternator: remove dead code stream_shard_id
  alternator: fix ListStreams to return real ARN as LastEvaluatedStreamArn
2026-04-20 14:42:28 +02:00
Artsiom Mishuta
dce0c24a02 test/alternator: replace bare pytest.skip() with typed skip helpers 2026-04-19 17:34:41 +02:00
Avi Kivity
9fb67e3e96 Revert "alternator: optional stripping of http response headers"
This reverts commit 73f0deef6d. It
prevents 2943d30b0c, which causes high flakiness, from being
reverted.
2026-04-19 15:14:48 +03:00
Artsiom Mishuta
0b6b380b80 test: update comments referencing pytest.skip() to skip_env()
Update 7 comments/docstrings across 5 files that still referenced
pytest.skip() to reference the typed skip_env() wrapper for
consistency with the migrated code.
2026-04-19 11:14:03 +02:00
Artsiom Mishuta
b10028e556 test: migrate runtime pytest.skip() to typed skip_bug()
Migrate 2 runtime pytest.skip() calls referencing known bugs to use
the typed skip_bug() wrapper from test.pylib.skip_types:

- test/alternator/test_ttl.py: Streams on tablets (#23838)
- test/scylla_gdb/test_task_commands.py: coroutine task not found (#22501)
2026-04-19 11:10:42 +02:00
Artsiom Mishuta
8a80e2c3be test: migrate runtime pytest.skip() to typed skip_env()
Migrate runtime pytest.skip() calls across 34 files to use the typed
skip_env() wrapper from test.pylib.skip_types.

These sites skip at runtime because a required feature, config option,
library version, build mode, or runtime topology is not available.

Also fixes 'raise pytest.skip(...)' in test_audit.py — skip_env()
already raises internally, so the explicit raise was incorrect.

Each file gains one new import:
  from test.pylib.skip_types import skip_env
2026-04-19 11:09:29 +02:00
Szymon Malewski
73f0deef6d alternator: optional stripping of http response headers
In Alternator's HTTP API, response headers can dominate bandwidth for
small payloads. The Server, Date, and Content-Type headers were sent on
every response but many clients never use them.

This patch introduces three Alternator config options:
  - alternator_http_response_server_header,
  - alternator_http_response_disable_date_header,
  - alternator_http_response_disable_content_type_header,
which allow customizing or suppressing the respective HTTP response
headers. All three options support live update (no restart needed).
The Server header is no longer sent by default; the Date and
Content-Type defaults preserve the existing behavior.

The Server and Date header suppression uses Seastar's
set_server_header() and set_generate_date_header() APIs added in
https://github.com/scylladb/seastar/pull/3217. This patch also
fixes deprecation warnings from older Seastar HTTP APIs.

Tests are in test/alternator/test_http_headers.py.

Fixes https://scylladb.atlassian.net/browse/SCYLLADB-70

Closes scylladb/scylladb#28288
2026-04-19 09:22:04 +03:00
Nadav Har'El
0d05e3b4a4 alternator: fix ListStreams paging if table is deleted during paging
Currently, ListStreams paging works by looking in the list of tables
for ExclusiveStartStreamArn and starting there. But it's possible
that during the paging process, one of the tables got deleted and
ExclusiveStartStreamArn no longer points to an existing table. In
the current implementation this caused the paging to stop (thinking
it had reached the end).

The solution is simple: ListStreams will now sort the list of tables
by name (it anyway needs to be sorted by something to be consistent
across pages), and will look with std::upper_bound for the first
table *after* the ExclusiveStartStreamArn - we don't need to find
that table name itself.
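
The same idea can be sketched in Python, where `bisect.bisect_right` plays the role of `std::upper_bound`. This is an illustration with made-up table names, not the Alternator code; the function name `tables_after` is hypothetical.

```python
from bisect import bisect_right


def tables_after(sorted_table_names, exclusive_start_name):
    """Return the tables sorted strictly after the start name.

    The start name itself does not have to exist in the list anymore.
    """
    start = bisect_right(sorted_table_names, exclusive_start_name)
    return sorted_table_names[start:]


# "b" was deleted between two pages, but paging still resumes at "c".
remaining = tables_after(["a", "c", "d"], "b")
print(remaining)
```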

The patch also includes a test reproducing this bug. As usual, the
test passes on DynamoDB, fails on Alternator before this patch,
and passes with the patch.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-04-19 09:12:02 +03:00
Nadav Har'El
930fb4c330 test/alternator: test DescribeStream on non-existent table
We already had a test verifying that DescribeStream called on a bogus ARN
returns a ValidationException. But if the ARN is more legitimate-looking
but refers to a non-existent table (e.g., an ARN taken in the
past from a table that no longer exists), we should return
ResourceNotFoundException. In this patch we add a test that verifies
we indeed do this correctly.

Moreover, Alternator's current stream ARNs include both a keyspace
name and a table name, and either one being incorrect should lead
to ResourceNotFoundException, and indeed the new test validates
that it works as expected - there is no bug here (AI guessed we
have a bug in the missing *keyspace* case, but this guess was wrong).
2026-04-19 09:12:02 +03:00
Nadav Har'El
02d474fca8 alternator: ListStreams: on last page, avoid LastEvaluatedStreamArn
When ListStreams is on its last page and ran out of streams to list,
it shouldn't return a paging cookie (LastEvaluatedStreamArn) at all.
Before this patch it did, forcing the user to make another call
just to get another empty page, which is silly.

This patch includes a fix and a reproducer test (that, as usual, passes
on DynamoDB and fails on Alternator before the patch and succeeds
after).
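
The intended paging behavior can be sketched like this. It is a simplified model, not the Alternator implementation: the function name `list_streams_page` is hypothetical, and streams are represented as plain ARN strings rather than the full response entries.

```python
from bisect import bisect_right


def list_streams_page(sorted_arns, limit, exclusive_start_arn=None):
    """Return one page; include a paging cookie only if more streams remain."""
    start = 0
    if exclusive_start_arn is not None:
        start = bisect_right(sorted_arns, exclusive_start_arn)
    page = sorted_arns[start:start + limit]
    result = {"Streams": page}
    if start + limit < len(sorted_arns):
        # More streams follow, so hand back the last ARN we returned.
        result["LastEvaluatedStreamArn"] = page[-1]
    return result


arns = ["arn1", "arn2", "arn3"]
first = list_streams_page(arns, limit=2)
second = list_streams_page(arns, 2, first["LastEvaluatedStreamArn"])
print(first, second)
```

In this model the second page carries no `LastEvaluatedStreamArn`, so the caller knows it is done without fetching an extra empty page.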

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-04-19 09:12:02 +03:00
Nadav Har'El
1ac910c2ab alternator: fix ListStreams to return real ARN as LastEvaluatedStreamArn
Alternator Streams' "ListStreams" does paging by returning a "cookie"
LastEvaluatedStreamArn from one request, that the user passes to the
next request as ExclusiveStartStreamArn.

In the past, Alternator's stream ARNs were UUIDs, but we recently
changed them to match DynamoDB's ARN format which the KCL library
requires. However, we didn't change ListStreams's cookie format,
and it remained UUIDs.

This, however, goes against the documentation of DynamoDB, which
states that LastEvaluatedStreamArn should be "the stream ARN of
the item where the operation stopped". It shouldn't be some weird
opaque cookie.

So in this patch we add a test that confirms that indeed, in DynamoDB
the LastEvaluatedStreamArn is really the last returned ARN and not
an opaque cookie. The new test passes on DynamoDB, and fails on
Alternator before the simple fix that this patch then does.

Fixes SCYLLADB-539.
2026-04-19 09:12:01 +03:00
Piotr Szymaniak
4b6937b570 alternator/streams: Block tablet merges when Alternator Streams are enabled
DynamoDB Streams API can only convey a single parent per stream shard.
Tablet merges produce 2 parents, which is incompatible. When streams
are requested on a tablet table, block tablet merges via
tablet_merge_blocked (the allocator suppresses new merge decisions and
revokes any active merge decision).

add_stream_options() sets tablet_merge_blocked=true alongside
enabled=true, so CreateTable needs no special handling — the flag
is inert on vnode tables and immediately effective on tablet tables.

For UpdateTable, CDC enablement is deferred: store the user's intent
via enable_requested, and let the topology coordinator finalize
enablement once no in-progress merges remain. A new helper,
defer_enabling_streams_block_tablet_merges(), amends the CDC options
to this deferred state.

Disabling streams clears all flags, immediately re-allowing merges.

The tablet allocator accesses the merge-blocked flag through a
schema::tablet_merges_forbidden() accessor rather than reaching into
CDC options directly.

Mark test_parent_children_merge as xfail and remove downward
(merge) steps from tablet_multipliers in test_parent_filtering and
test_get_records_with_alternating_tablets_count.
2026-04-19 03:54:33 +02:00
Nadav Har'El
31e0315710 Merge 'alternator: fix unnecesary cdc log entries' from Radosław Cybulski
Fix cdc writing unnecessary entries to its log, for example when Alternator deletes an item which in reality doesn't exist.

Originally @wps0 tackled this issue. This patch is an extension of his work. His work involved adding a `should_skip` function to cdc, which would process a `mutation` object and decide whether changes in the object should be added to the cdc log or not.

The issue with his approach is that a `mutation` object might contain changes for more than one row. If, for example, the `mutation` object contains two changes, a delete of a non-existing row and a create of a non-existing row, the `should_skip` function will detect changes in the second item and allow the whole `mutation` (BOTH items) to be added. For example (using Python's boto3), running this on an empty table:
```python
with table.batch_writer() as batch:
    batch.put_item({'p': 'p', 'c': 'c0'})
    batch.delete_item(Key={'p': 'p', 'c': 'c1'})
```
will emit two events ("put" event and "delete" event), even though the item with `c` set to `c1` does not exist (thus can't be deleted). Note that both entries in the batch write must use the same partition key, otherwise the upper layer will split them into separate `mutation` objects and the issue will not happen.

The solution is to do similar processing, but consider each change separately from the others. This is tricky to implement due to the way cdc works. When cdc processes a `mutation` object (containing X changes), it emits cdc entries in phases. Phase 1 - emit `preimage` (old state) for each change (if requested). Phase 2 - for each change emit the actual "diff" (update / delete and so on). Phase 3 - emit `postimage` (new state).

We will know if a change needs to be skipped during phase 2. By that time phase 1 is completed and the preimage for the change is emitted. At that moment we set a flag that the change (identified by clustering key value) needs to be skipped - we add the clustering key to an `ignore-rows` set (`_alternator_clustering_keys_to_ignore` variable) and continue normally. Once all phases finish we add a `postprocess` phase (`clean_up_noop_rows` function). It will go through the generated cdc mutations and skip all modifications for which the clustering key is in the `ignore-rows` set. After skipping we need to do a "cleanup" operation - each generated cdc mutation contains an index (incremented by one); if we skipped some parts, the index is not consecutive anymore, so we reindex the final changes.
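
The postprocess step described above can be modeled in a few lines of Python. This is only a conceptual sketch, not the C++ implementation: the function name `drop_ignored_rows` and the row shape (dictionaries with `ck`, `index` and `op` fields) are hypothetical.

```python
def drop_ignored_rows(log_rows, clustering_keys_to_ignore):
    """Drop log rows for ignored clustering keys, then renumber the rest."""
    kept = [row for row in log_rows if row["ck"] not in clustering_keys_to_ignore]
    # Skipping rows leaves gaps in the index, so assign consecutive indexes.
    return [dict(row, index=new_index) for new_index, row in enumerate(kept)]


log_rows = [
    {"ck": "c1", "index": 0, "op": "preimage"},
    {"ck": "c1", "index": 1, "op": "delete"},
    {"ck": "c0", "index": 2, "op": "put"},
    {"ck": "c0", "index": 3, "op": "postimage"},
]
# The delete of "c1" turned out to be a no-op, so all its rows are dropped.
cleaned = drop_ignored_rows(log_rows, {"c1"})
print(cleaned)
```

Note that the preimage row for the skipped change is dropped too, since it was already emitted in phase 1 before the change was known to be a no-op.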

There's a special case worth mentioning - Alternator tables without clustering keys. In that case the `mutation` object passed to cdc can contain exactly one change (since different partition keys are split by upper layers and Alternator will never emit a `mutation` object containing two (or more) changes with the same primary key). Here, when we decide the change is to be skipped, we add an empty `bytes` object to the `ignore-rows` set. When checking the `ignore-rows` set, we check if it's empty or not (we don't check for the presence of the empty `bytes` object).

Note: there might be some confusion between this patch and the #28452 patch. Both started from the same error observation and use similar tests for validation, as both are easily triggered by BatchWrite commands (both need the `mutation` object passed to cdc to contain more than one change). This issue, though, is about wrong data written to the cdc log and is fixed in cdc, whereas #28452 is about a wrong way of parsing correct cdc data and is fixed on the Alternator side of things. Note that we need #28452 to truly verify this fix (otherwise we will emit correct cdc entries, but Alternator will incorrectly parse them).

Note: to benefit / notice this patch you need `alternator_streams_increased_compatibility` flag turned on.

Note: rework is quite "broad" and covers a lot of ground - every operation, that might result in a no-change to the database state should be tested. An additional test was added - trying to remove a column from non-existing item, as well as trying to remove non-existing column from existing item.

Fixes: #28368
Fixes: SCYLLADB-1528
Fixes: SCYLLADB-538

Closes scylladb/scylladb#28544

* github.com:scylladb/scylladb:
  alternator: remove unnecesary code
  alternator: fix Alternator writing unnecesary cdc entries
  alternator: add failing tests for Streams
2026-04-18 00:07:51 +03:00
Radosław Cybulski
9a6aed721b alternator: add streams with tablets tests
Add tests for Streams, when table uses tablets underneath.
One test verifies filtering using CHILD_SHARDS feature.
The other makes sure we read all data while the table
undergoes a tablet count change.

Add `--tablet-load-stats-refresh-interval-in-seconds=1`
to `alternator/run` script, as otherwise newly added tests will fail.
The setting changes how often scylla refreshes tablet metadata.
This can't be done using `scylla_config_temporary`, as
1) the default is 60 seconds
2) scylla will wait the full timeout (60s) before reading the configuration variable again.
2026-04-17 18:58:27 +02:00
Radosław Cybulski
6be16cf224 alternator: remove antitablet guards when using Streams
Remove the `if` condition that prevented tables with tablets
from working with Streams.
Remove a test that verifies that Alternator rejects
tables with tablets underneath when the Streams feature is enabled
on them.
Update a few tests that were expected to fail on tablets to enable their
normal execution.
2026-04-17 18:58:26 +02:00
Radosław Cybulski
6e5aaa85b6 alternator: fix Alternator writing unnecesary cdc entries
The work in this patch is the result of two bugs: a spurious MODIFY event when
a remove-column operation is used in `update_item` on a non-existing item, and
spurious events when a batch write item mixes no-op operations with
operations involving actual changes (the former would still emit
cdc log entries).
The latter issue required a rework of Piotr Wieczorek's algorithm,
which fixed the former issue as well.

Piotr Wieczorek previously wrote checks that should
prevent unnecessary cdc events from being written. His implementation
missed the fact that a single `mutation` object passed to the cdc code
to be analysed for cdc log entries can contain modifications for
multiple rows (with the same timestamp - for example as a result
of a BatchWriteItem call). His code tries to skip the whole `mutation`,
which in such a case is not possible, because BatchWriteItem might have
one item that does nothing and a second item that does a modification
(this is the reason for the second bug).

His algorithm was extended and moved. Originally it worked
as follows: the user would send a `mutation` object with some changes to
be "augmented". The cdc would process those changes and build a set of
cdc log changes based on them, which would be added to the cdc log table.
Piotr added a `should_skip` function, which processed the user's changes and
tried to determine whether they should all be dropped or not.
The new version, instead of trying to skip adding rows to
the cdc log `mutation` object, builds a rows-to-ignore set.
After the whole cdc log `mutation` object is completed, the code goes
through it row by row. Any row that was previously added to
the `rows_to_ignore` set is removed. The remaining rows are written to
a new cdc log `mutation` with a new clustering key
(the `cdc$batch_seq_no` index value should probably be consecutive -
we just want to be safe here), and the new `mutation` object is returned to
be sent to the cdc log table.
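
The filtering step described above can be sketched roughly as follows.
This is a minimal illustration in Python, not the actual C++ cdc code:
`LogRow`, `drop_ignored_rows` and the idea that the ignore set is keyed
by the base-table clustering key are all assumptions made for the example.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class LogRow:
    """Hypothetical stand-in for one row of a cdc log mutation."""
    batch_seq_no: int  # models the cdc$batch_seq_no clustering column
    base_key: bytes    # clustering key of the base-table row that changed
    payload: str


def drop_ignored_rows(log_rows, rows_to_ignore):
    """Drop rows whose base key is in rows_to_ignore, then renumber the
    survivors so that their sequence numbers are consecutive again."""
    kept = [row for row in log_rows if row.base_key not in rows_to_ignore]
    return [replace(row, batch_seq_no=i) for i, row in enumerate(kept)]


rows = [
    LogRow(0, b"a", "put a"),
    LogRow(1, b"b", "delete of a missing item (no-op)"),
    LogRow(2, b"c", "put c"),
]
result = drop_ignored_rows(rows, rows_to_ignore={b"b"})
```

The no-op row for key `b"b"` is dropped and the row for `b"c"` is
renumbered, so the surviving rows carry sequence numbers 0 and 1.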

The first bug is fixed as a side effect of the new algorithm,
which contains more precise checks detecting whether a given
mutation actually made a difference.

Fixes: #28368
Fixes: SCYLLADB-538
Fixes: SCYLLADB-1528
Refs: #28452
2026-04-17 18:00:25 +02:00
Radosław Cybulski
2894542e57 alternator: add failing tests for Streams
Add failing tests for Streams functionality.
Trying to remove a column from a non-existing item produces
a MODIFY event (while it should produce none).
Doing a batch write with operations working on the same partition,
where one operation has no side effects and the second one does,
will produce events for both operations, even though the first changes nothing.

The first test has two versions - with and without a clustering key.
The second has only a version with a clustering key, as without one we
can't produce a batch write with two items for the same partition -
batch write can't use a primary key more than once in a single call.
We also add a test for batch write where one of three operations
has no observable side effects and should not show up in Streams
output, but in the current scylla version it does.
2026-04-17 16:28:14 +02:00
Botond Dénes
facb50cbf9 Merge 'test.py: refactor test.py' from Andrei Chekun
With the latest changes, there is a lot of redundant code in test.py. This PR just cleans this code up.
Also, it narrows the use of dynamic scope for fixtures to test/alternator and test/cqlpy. All the rest will have module scope by default.
test.py will mostly be a wrapper for pytest for CI use. For now, test.py still has the important job of calculating the number of threads to start pytest with. This is not possible to do in pytest itself.

No backport needed, framework enhancement only.

Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-666

Closes scylladb/scylladb#28852

* github.com:scylladb/scylladb:
  test.py: remove testpy_test_fixture_scope
  test.py: add logger for 3rd party service
  test.py: delete dead code in test.py
2026-04-17 12:51:14 +03:00
Piotr Dulikowski
37fc1507f0 Merge 'Alternator: Add vector search support' from Nadav Har'El
This series adds support for vector search in Alternator based on the existing implementation in CQL.

The series adds APIs for `CreateTable` and `UpdateTable` to add or remove vector indexes to Alternator tables, `DescribeTable` to list them and check the indexing status, and `Query` to perform a vector search - which contacts the vector store for the actual ANN (approximate nearest neighbor) search.

Correct functionality of these features depends on some features of the vector store that were already done (see https://github.com/scylladb/vector-store/pull/394).

This initial implementation is fully functional, and can already be useful, but we do not yet support all the features we hope to eventually support. Here are things that we have **not** done yet, and plan to do later in follow-up pull requests:

1. Support a new optimized vector type ("V") - in addition to the "list of numbers" type supported in this version.
2. Allow choosing a different similarity function when creating an index, by SimilarityFunction in VectorIndex definition.
3. Allow choosing quantization (f32/f16/bf16/i8/b1) to ask the vector index to compress stored vectors.
4. Support oversampling and rescoring, defined per-index and per-query.
5. Support HNSW tuning parameters — maximum_node_connections, construction_beam_width, search_beam_width.
6. Support pre-filtering over key columns, which are available at the vector store, by sending the filter to the vector store (translated from DynamoDB filter syntax to the vector store's filter syntax). A decision still needs to be made whether this will use KeyConditionExpression or FilterExpression. This version supports only post-filtering (with `FilterExpression`).
7. Support projecting non-key attributes into the index (Projection=INCLUDE and Projection=ALL), and then 1. pre-filtering using these attributes, and 2. efficiently return these attributes (using Select=ALL_PROJECTED_ATTRIBUTES, which today returns just the key columns).
8. Optimize the performance of `Query`, which today is inefficient for Select=ALL_ATTRIBUTES because it serially retrieves the matching items one at a time.
9. Returning the similarity scores with the items (the design proposes ReturnVectorSearchSimilarity).
10. Add more vector-search-specific metrics, beyond the metric we already have counting Query requests. For example separate latency and request-count metrics for vector-search Queries (distinct from GSI/LSI queries), and a metric accumulating the total Limit (K) across all vector search queries.
11. Consider how (and if at all) we want to run the tests in test/alternator/test_vector.py that need the vector store in the CI. Currently they are skipped in CI and only run manually (with `test/alternator/run --vs test_vector`).
12. UpdateTable 'Update' operation to modify index parameters. Only some can be modified, e.g., Oversampling.
13. Support for "local index" (separate index for each partition).
14. Make sure that vector search and Streams can be enabled concurrently on the same table - both need CDC but we need to verify that one doesn't confuse the other or disables options that the other needs. We can only do this after we have Alternator Streams running on tablets (since vector store requires tablets).

Testing the new Alternator vector search end-to-end requires running both Scylla and the vector store together. We will have such end-to-end tests in the vector store repository (see https://github.com/scylladb/vector-store/pull/392), but we also add in this pull request many end-to-end tests written in Python, that can be run with the command "test/alternator/run --vs test_vector.py". The "--vs" option tells the run script to run both Scylla and the vector store (currently assumed to be in `.../vector-store/target/release/vector-store`). About 65% of the tests in this pull request check supported syntax and error paths so can run without the vector store, while about 35% of the tests do perform actual Query operations and require the vector store to be running. Currently, the tests that do require the vector store will not get run by CI, but can be easily re-run manually with `test/alternator/run --vs test_vector.py`.

In total, this series includes 78 functional tests in 2200 lines of Python code.

This series also includes documentation for the new Alternator feature and the new APIs introduced. You can see a more detailed design document here: https://docs.google.com/document/d/1cxLI7n-AgV5hhH1DTyU_Es8_f-t8Acql-1f58eQjZLY/edit

Two patches in this series split the huge alternator/executor.cc, after this series continued to grow it and it reached a whopping 7,000 lines. These patches are just a reorganization of code, with no functional changes. But it's time that we finally do this (Refs #5783); we can't just continue to grow executor.cc with no end...

Closes scylladb/scylladb#29046

* github.com:scylladb/scylladb:
  test/alternator: add option to "run" script to run with vector search
  alternator: document vector search
  test/alternator: fix retries in new_dynamodb_session
  test/alternator: test for allowed characters in attribute names
  test/alternator: tests for vector index support
  alternator, vector: add validation of non-finite numbers in Query
  alternator: Query: improve error message when VectorSearch is missing
  alternator: add per-table metrics for vector query
  alternator: clean up duplicated code
  alternator: fix default Select of Query
  alternator: split executor.cc even more
  alternator: split alternator/executor.cc
  alternator: validate vector index attribute values on write
  alternator: DescribeTable for vector index: add IndexStatus and Backfilling
  alternator: implement Query with a vector index
  alternator: fix bug in describe_multi_item()
  alternator: prevent adding GSI conflicting with a vector index
  alternator: implement UpdateTable with a vector index
  alternator: implement DescribeTable with a vector index
  alternator: implement CreateTable with a vector index
  alternator: reject empty attribute names
  cdc: fix on_pre_create_column_families to create CDC log for vector search
2026-04-17 10:25:45 +02:00
Andrei Chekun
745debe9ec test.py: remove testpy_test_fixture_scope
With the migration to pytest this fixture is useless. Remove it and set
the fixture scope to module for most of the tests.
Add a dynamic_scope function to support running alternator fixtures in
session scope while Test and TestSuite are not yet deleted. This is for
the migration period; later on this function should be deleted.
2026-04-16 22:08:33 +02:00
Radosław Cybulski
c5ed6b22ae alternator: add CHILD_SHARDS filtering
Add a `CHILD_SHARDS` filter to the `DescribeStream` command.
When it is used, the user needs to pass a parent stream shard id in
the JSON ShardFilter.ShardId field. DescribeStream will
then return only the list of stream shards that are direct
descendants of the passed parent stream shard.

Each stream shard covers a consecutive part of the token space.
A stream shard Q is considered to be a child of stream shard W
when at least one token belongs to the token spaces of both stream shards.
The filtering algorithm itself is somewhat complicated - more details
are in comments in streams.cc.

CHILD_SHARDS is an Amazon functionality and is required by KCL.
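
The parent/child rule above boils down to a range-overlap check. The
following Python sketch only illustrates that rule; the `Shard` type,
the inclusive integer token ranges and the function names are
assumptions for the example, and it ignores wrap-around and everything
else the real algorithm in streams.cc has to handle.

```python
from typing import NamedTuple


class Shard(NamedTuple):
    """Hypothetical stream shard covering tokens first..last (inclusive)."""
    shard_id: str
    first: int
    last: int


def overlaps(a: Shard, b: Shard) -> bool:
    """True when at least one token belongs to both ranges."""
    return a.first <= b.last and b.first <= a.last


def child_shards(parent: Shard, next_generation: list) -> list:
    """Ids of the next-generation shards that share a token with parent."""
    return [s.shard_id for s in next_generation if overlaps(parent, s)]


parent = Shard("W", 0, 99)
next_generation = [
    Shard("Q1", 0, 49),
    Shard("Q2", 50, 149),
    Shard("Q3", 150, 199),
]
children = child_shards(parent, next_generation)
```

Here `Q1` and `Q2` share tokens with `W` and are reported as children,
while `Q3` does not overlap and is filtered out.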

Add unit tests.

Fixes: #25160

Closes scylladb/scylladb#28189
2026-04-16 18:27:55 +03:00
Piotr Szymaniak
d0c3f78d76 test/alternator: extend local TTL streams timeout
Increase the non-AWS wait in the TTL streams test to reduce vnode CI flakes caused by delayed expiration visibility.

Fixes SCYLLADB-1556

Closes scylladb/scylladb#29516
2026-04-16 15:53:35 +03:00
Nadav Har'El
d3d5db37d7 test/alternator: add option to "run" script to run with vector search
Add to test/alternator/run the option "--vs", which runs a vector store
alongside Scylla, to allow running Alternator tests with vector
indexing.

To get the vector store, do

	git clone git@github.com:scylladb/vector-store.git
	cargo build --release

"run --vs" looks for an executable in ../vector-store/target/*/vector-store
but this can also be overridden by the VECTOR_STORE environment variable.

test/alternator/run runs the vector store exactly like it runs Scylla -
in a temporary directory, on a temporary IP address in the localhost
subnet (127.0.0.0/8), killing it when the test ends, and showing the output
of both programs (Scylla and vector store). These transient runs of
Scylla and vector store are configured to be able to communicate with
each other.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-04-16 14:30:18 +03:00
Nadav Har'El
164b0e37e1 test/alternator: fix retries in new_dynamodb_session
The new_dynamodb_session() function had a bug which we never noticed
because we hardly used it, but it became more noticeable when the new
test/alternator/test_vector.py started to use it:

By default, boto3 retries a request up to 9 times when it encounters
a retriable error (such as an Internal Server Error). We don't want such
retries in our tests - it makes failures slower, but more importantly
it can hide "flaky" bugs by retrying 9 times until it happens to succeed.
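
To illustrate why client-side retries can hide a flaky failure, here is
a small self-contained Python simulation. It does not use boto3 at all:
`InternalServerError`, `call_with_retries` and `make_flaky_operation`
are hypothetical names invented for this sketch. (In boto3 itself the
retry count is normally controlled through the `retries` option of
`botocore.config.Config`.)

```python
class InternalServerError(Exception):
    """Hypothetical stand-in for a retriable server-side error."""


def call_with_retries(operation, max_retries):
    """Call operation(); on failure, retry up to max_retries more times."""
    for attempt in range(max_retries + 1):
        try:
            return operation()
        except InternalServerError:
            if attempt == max_retries:
                raise  # out of retries: let the failure surface


def make_flaky_operation(failures_before_success):
    """Build an operation that fails a fixed number of times, then succeeds."""
    state = {"calls": 0}

    def operation():
        state["calls"] += 1
        if state["calls"] <= failures_before_success:
            raise InternalServerError("simulated flaky failure")
        return "ok"

    return operation
```

With `max_retries=9`, an operation that fails twice still returns
"ok", so a test would pass and the flakiness would go unnoticed; with
`max_retries=0` the first failure is raised to the caller.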

The new_dynamodb_session() had code (copied from the dynamodb fixture)
to set boto3's "max_attempts" configuration to 0, to disable this retry.
But this code had an incorrect "if" that applied the setting only when
testing on "localhost". This is wrong: we almost never use "localhost" as
the target of the test; both test/cqlpy/run and test.py pick an IP address
in the localhost subnet (127/8) and use that IP address - not the string
"localhost".

This bug only existed in new_dynamodb_session() - the more commonly used
"dynamodb" fixture didn't have this bug.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-04-16 14:30:17 +03:00
Nadav Har'El
858dee0b30 test/alternator: test for allowed characters in attribute names
One of the tests in the previous patch checked that strange characters
are allowed in attribute names used for vector indexing. It turns out
we never had a test that verifies that regardless of vector indexes -
any character whatsoever is allowed in attribute names. This is
different from table names which are much more limited.

So this patch adds the missing test.

As usual, the new test also passes on DynamoDB, showing that these
strange characters in attribute names are also allowed by DynamoDB.
2026-04-16 14:30:17 +03:00
Nadav Har'El
58538e18e8 test/alternator: tests for vector index support
In this patch we add a large collection of basic functional tests for the
vector index support, covering the CreateTable, UpdateTable, DescribeTable
and Query operations and the various ways in which those are allowed to
work - or expected to fail. These tests were written in parallel with
writing the code so they (hopefully) cover all the corner cases considered
during development, and make sure these corner cases are all handled
correctly and will not regress in the future.

Some of these tests do not involve querying of the index and focus on
the structure of requests and the kind of syntax allowed. But other tests
are end-to-end, requiring the vector store to be running and trying to
index Alternator data and query it. These tests are marked
"needs_vector_store", and are immediately skipped if Scylla is not
configured to connect to a vector store. In a later patch we'll add
an option to test/alternator/run to be able to run these end-to-end
tests by automatically running both Scylla and the Vector Store.

We'll have additional end-to-end tests in the vector-store repository.

Note that vector search is a new API feature that doesn't exist in DynamoDB,
so we are adding new parameters and outputs to existing operations. The AWS
SDKs don't normally allow doing that, so the test added here begins by
teaching the Python SDK to use the new APIs we added. This piece of code
can also be used by end-users to use vector search (at least in Python...)
before we officially add this support to ScyllaDB's SDK wrappers.
2026-04-16 14:30:17 +03:00
Nadav Har'El
f932f94422 alternator: add per-table metrics for vector query
The per-table metrics for Query were not incremented for the
vector variant of the Query operation; only the global metrics were
incremented. This patch fixes this oversight, and adds a test that
reproduces it (the new test fails before this patch, and passes after).
2026-04-16 14:30:16 +03:00
Nadav Har'El
f15c6634a7 alternator: fix default Select of Query
In earlier patches, when Query'ing a vector index, we set the default
Select to ALL_ATTRIBUTES. However, according to the DynamoDB documentation
for Query,

   "If neither Select nor ProjectionExpression are specified, DynamoDB
    defaults to ALL_ATTRIBUTES when accessing a table, and
    ALL_PROJECTED_ATTRIBUTES when accessing an index."

This default should also apply to vector indexes, so this patch fixes this.
The new behavior is not only more compatible with DynamoDB, it is also
much more efficient by default, as ALL_PROJECTED_ATTRIBUTES does not need
to read from the base table - it returns the results that the vector store
returned. Of course, if the user needs the less efficient ALL_ATTRIBUTES,
this option is still available - it's just no longer the default.
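
The defaulting rule quoted above can be written down as a tiny Python
function. This is only a sketch of the rule as stated in the DynamoDB
documentation, not Alternator's C++ code; the function name and its
parameters are made up for the illustration.

```python
def default_select(select=None, projection_expression=None, index_name=None):
    """Pick the effective Select value for a Query request."""
    if select is not None:
        return select  # an explicit Select always wins
    if projection_expression is not None:
        return "SPECIFIC_ATTRIBUTES"
    # Neither Select nor ProjectionExpression given: the default depends
    # on whether the Query targets the table itself or an index.
    if index_name is not None:
        return "ALL_PROJECTED_ATTRIBUTES"
    return "ALL_ATTRIBUTES"
```

For example, `default_select(index_name="vec_idx")` (a hypothetical
index name) yields ALL_PROJECTED_ATTRIBUTES, while a plain table query
yields ALL_ATTRIBUTES.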

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-04-16 14:30:16 +03:00
Nadav Har'El
0afc730b7b alternator: reject empty attribute names
Alternator has a function validate_attr_name_length() used to validate an
attribute name passed in different operations like PutItem, UpdateItem,
GetItem, etc. It fails the request if the attribute name is longer than
65535 characters.

It turns out that we forgot to check if the attribute name length isn’t 0 -
which should be forbidden as well!
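
As an illustration, the corrected check could look like the Python
sketch below. The real function is C++; measuring the length in UTF-8
bytes and raising `ValueError` are assumptions of this example (the
text above only says "characters").

```python
MAX_ATTR_NAME_LENGTH = 65535


def validate_attr_name_length(name: str) -> None:
    """Reject attribute names that are empty or too long."""
    length = len(name.encode("utf-8"))  # assumption: limit counted in bytes
    if length == 0:
        raise ValueError("attribute name must not be empty")
    if length > MAX_ATTR_NAME_LENGTH:
        raise ValueError(
            f"attribute name is {length} bytes long, "
            f"the limit is {MAX_ATTR_NAME_LENGTH}"
        )
```

Before the fix only the upper bound was enforced, so an empty name
would have passed through a check like this one.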

This patch fixes the validation code, and also adds a test that confirms
that after this patch empty attribute names are rejected - just like DynamoDB
does - whereas before this patch they were silently accepted.

We want to fix this issue now, because in a later patch we intend to use
the same validation function also for vector indexes - and want it to be
accurate.

Fixes SCYLLADB-1069.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-04-16 13:28:15 +03:00
Piotr Szymaniak
4c93c2af62 audit/alternator: support audit_tables=alternator.<table> shorthand
The real keyspace name of an Alternator table T is "alternator_T".
Expand the "alternator.T" format used in the audit_tables config flag
to the real keyspace name at parse time, so users don't need to spell
out the internal "alternator_T.T" form.
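
A minimal Python sketch of this expansion, for illustration only: the
function name is hypothetical and the real parsing is done in C++.

```python
ALTERNATOR_KEYSPACE_PREFIX = "alternator_"


def expand_audit_table(entry: str) -> str:
    """Expand the "alternator.T" shorthand to "alternator_T.T".

    Any other "keyspace.table" entry is returned unchanged.
    """
    # Split on the first dot only, so a table name that itself
    # contains dots stays intact.
    keyspace, sep, table = entry.partition(".")
    if sep and keyspace == "alternator" and table:
        return f"{ALTERNATOR_KEYSPACE_PREFIX}{table}.{table}"
    return entry
```

With this, `expand_audit_table("alternator.T")` gives "alternator_T.T",
while an entry that already uses the internal form is left alone.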
2026-04-15 12:29:15 +02:00
Piotr Szymaniak
0714d8aded audit/alternator: Add negative audit tests
Add tests for the unhappy path of Alternator audit logging:
- Category filtering: operations are not logged when their category
  (DML, QUERY, DDL) is excluded from audit_categories.
- Keyspace filtering: operations on a keyspace not listed in
  audit_keyspaces are not logged.
- Error entries: a failed operation (thrown exception after audit_info
  is set) produces an audit entry with error=true.
- Empty-keyspace bypass: global operations like ListTables and
  DescribeEndpoints are logged regardless of audit_keyspaces because
  should_log() short-circuits on an empty keyspace.
2026-04-15 12:29:15 +02:00
Piotr Szymaniak
ad05b44931 audit/alternator: Add testing of auditing
Add a new test file, `test/alternator/test_audit.py`.
The file contains a suite of tests covering all auditing operations.
2026-04-15 12:29:15 +02:00
Avi Kivity
0ae22a09d4 LICENSE: Update to version 1.1
Updated terms of non-commercial use (must be a never-customer).
2026-04-12 19:46:33 +03:00
Artsiom Mishuta
b1e9c0b867 test/pylib: add typed skip markers plugin
Add skip_reason_plugin.py — a framework-agnostic pytest plugin that
provides typed skip markers (skip_bug, skip_not_implemented, skip_slow,
skip_env) so that the reason a test is skipped is machine-readable in
JUnit XML and Allure reports.  Bare untyped pytest.mark.skip now
triggers a warning (to become an error after full migration).  Runtime
skips via skip() are also enriched by parsing the [type] prefix from
the skip message.
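
Parsing a `[type]` prefix out of a runtime skip message could look
roughly like the Python sketch below. It is not the plugin's actual
`_parse_skip_type` helper: the enum values, the regular expression and
the return shape are assumptions made for illustration.

```python
import re
from enum import Enum


class SkipType(Enum):
    """Hypothetical values; the real enum lives in test/pylib/skip_types.py."""
    BUG = "bug"
    NOT_IMPLEMENTED = "not_implemented"
    SLOW = "slow"
    ENV = "env"


_PREFIX = re.compile(r"^\[(\w+)\]\s*(.*)$", re.DOTALL)


def parse_skip_type(message: str):
    """Split "[type] reason" into (SkipType, reason).

    Returns (None, message) when there is no recognised prefix.
    """
    match = _PREFIX.match(message)
    if match:
        try:
            return SkipType(match.group(1)), match.group(2)
        except ValueError:
            pass  # prefix present, but not a known skip type
    return None, message
```

A message such as "[bug] crashes, see #26844" is split into the BUG
type and the remaining reason text; a message without a known prefix is
returned untouched with no type.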

The plugin is a class (SkipReasonPlugin) that receives the concrete
SkipType enum and an optional report_callback from conftest.py, keeping
it decoupled from allure and project-specific types.

Extract SkipType enum and convenience runtime skip wrappers (skip_bug,
skip_env, etc.) into test/pylib/skip_types.py so callers only need a
single import instead of importing both SkipType and skip() separately.
conftest.py imports SkipType from the new module and registers the
plugin instance unconditionally (for all test runners).

New files:
- test/pylib/skip_reason_plugin.py: core plugin — typed marker
  processing, bare-skip warnings, JUnit/Allure report enrichment
  (including runtime skip() parsing via _parse_skip_type helper)
- test/pylib/skip_types.py: SkipType enum and convenience wrappers
  (skip_bug, skip_not_implemented, skip_slow, skip_env)
- test/pylib_test/test_skip_reason_plugin.py: 17 pytester-based
  test functions (51 cases across 3 build modes) covering markers,
  warnings, reports, callbacks, and skip_mode interaction

Infrastructure changes:
- test/conftest.py: import SkipType from skip_types, register
  SkipReasonPlugin with allure report callback
- test/pylib/runner.py: set SKIP_TYPE_KEY/SKIP_REASON_KEY stash keys
  for skip_mode so the report hook can enrich JUnit/Allure with
  skip_type=mode without longrepr parsing
- test/pytest.ini: register typed marker definitions (required for
  --strict-markers even when plugin is not loaded)

Migrated test files (representative samples):
- test/cluster/test_tablet_repair_scheduler.py:
  skip -> skip_bug (#26844), skip -> skip_not_implemented
- test/cqlpy/.../timestamp_test.py: skip -> skip_slow
- test/cluster/dtest/schema_management_test.py: skip -> skip_not_implemented
- test/cluster/test_change_replication_factor_1_to_0.py: skip -> skip_bug (#20282)
- test/alternator/conftest.py: skip -> skip_env
- test/alternator/test_https.py: use skip_env() wrapper

Fixes SCYLLADB-79

Closes scylladb/scylladb#29235
2026-04-08 10:38:56 +03:00
Nadav Har'El
a0e79f391f Merge 'alternator: fix batch write item squashing cdc entries' from Radosław Cybulski
When `BatchWriteItem` operates on multiple items sharing the same partition key in `always_use_lwt` write isolation mode, all CDC log entries are emitted under a single timestamp. The previous `get_records` parsing algorithm in `alternator/streams.cc` assumed that all CDC log entries sharing the same timestamp correspond to a single DynamoDB item change. As a result, it would incorrectly squash multiple distinct item changes into a single Streams record — producing wrong event data (e.g., one INSERT instead of four, with mismatched key/attribute values).

Note: the bug is specific to `always_use_lwt` mode because only in LWT mode does the entire batch share a single timestamp. In non-LWT modes, each item in the batch receives a separate timestamp, so the entries naturally stay separate.

**Commit 1: alternator: add BatchWriteItem Streams test**

- Adds new tests `test_streams_batchwrite_no_clustering_deletes_non_existing_items` and `test_streams_batchwrite_no_clustering_deletes_existing_items` that cover the corner cases of batch-deleting an existing and a non-existing item in a table without a clustering key. CDC tables without clustering keys are handled differently, and this path was previously untested for delete operations.
- Adds a new test `test_streams_batchwrite_into_the_same_partition_will_report_wrong_stream_data`, which is a simple way to trigger the bug.
- Adds a new test `test_streams_batchwrite_into_the_same_partition_deletes_existing_items`, that validates various combinations of puts and deletes in a single BatchWrite against the same partition.
- Adds a new `test_table_ss_new_and_old_images_write_isolation_always` fixture and extends `create_table_ss` to accept `additional_tags`, enabling tests with a specific write isolation mode.

**Commit 2: alternator: fix BatchWriteItem squashed Streams entries**

The core fix rewrites the CDC log entry parsing in `get_records` to distinguish items by their clustering key:

- Introduces `managed_bytes_ptr_hash` and `managed_bytes_ptr_equal` helper structs for pointer-based hash map lookups on `managed_bytes`.
- Replaces the single `record`/`dynamodb` pair with a `std::unordered_map<const managed_bytes*, Record, ...>` (`records_map`) keyed by the base table's clustering key value from each CDC log row. For tables without a clustering key, all entries map to a single sentinel key.
- Adds a validation that Alternator tables have at most one clustering key column (as required by the DynamoDB data model).
- On end-of-record (`eor`), flushes all accumulated per-clustering-key records into the output, each with a unique `eventID` (the `event_id` format now includes an index suffix).
- Adjusts the limit check: since a single CDC timestamp bucket can now produce multiple output records, the limit may be slightly exceeded to avoid breaking mid-batch.
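
The grouping idea from the list above can be illustrated with a short
Python sketch. It is not the C++ code from `alternator/streams.cc`: the
row layout, the sentinel object, the function name and the `eventID`
suffix format are all assumptions of this example.

```python
# Sentinel key used when the base table has no clustering key.
NO_CLUSTERING_KEY = object()


def flush_records(cdc_rows, base_event_id):
    """Group CDC log rows that share one timestamp by clustering key and
    emit one Streams record per key, each with a unique eventID."""
    records = {}  # dicts keep insertion order, so output order is stable
    for row in cdc_rows:
        key = row["ck"] if row["ck"] is not None else NO_CLUSTERING_KEY
        records.setdefault(key, []).append(row["change"])
    return [
        {"eventID": f"{base_event_id}-{index}", "key": key, "changes": changes}
        for index, (key, changes) in enumerate(records.items())
    ]


rows = [
    {"ck": b"1", "change": "put item 1"},
    {"ck": b"2", "change": "put item 2"},
    {"ck": b"1", "change": "post-image of item 1"},
]
out = flush_records(rows, base_event_id="t0")
```

The three rows share a timestamp but belong to two items, so two
records come out instead of one squashed record.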

Fixes #28439
Fixes: SCYLLADB-540

Closes scylladb/scylladb#28452

* github.com:scylladb/scylladb:
  alternator/test: explain why 'always' write isolation mode is used in tests
  alternator/test: add scylla_only to always write isolation fixture
  alternator: fix BatchWriteItem squashed Streams entries
  alternator: add BatchWriteItem test (failing)
2026-04-07 17:49:23 +03:00