Commit Graph

603 Commits

Author SHA1 Message Date
Nadav Har'El
2e274bbdba alternator: split executor.cc even more
This patch continues the effort to split the huge executor.cc (5000
lines before this patch) even more.

In this patch we introduce a new source file, executor_util.cc, for
various utility functions that are used for many different operations
and therefore are useful to have in a header file. These utility
functions will now be in executor_util.cc and executor_util.hh -
instead of executor.cc and executor.hh.

Various source files, including executor.cc, the executor_read.cc
introduced in the previous patch, as well as older source files like
as streams.cc, ttl.cc and serialization.cc, use the new header file.

This patch removes over 700 lines of code from executor.cc, and
also removes a large amount of utility functions declerations from
executor.hh. Originally, executor.hh was meant to be about the
interface that the Alternator server needs to *execute* the different
DynamoDB API operations - and after this patch it returns closer to
this original goal.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-04-16 14:30:16 +03:00
Nadav Har'El
751da00692 alternator: split alternator/executor.cc
Already six years ago, in #5783, we noticed that alternator/executor.cc
has grown too large. The previous patches added hundreds of more lines
to it to implement vector search, and it reached a whopping 7,000 lines
of code. This is too much.

This patch splits from executor.cc two major chunks:

1. The implementation of **read** requests - GetItem, BatchGetItem,
   Query (base table, GSI/LSI, and vector-search), and Scan - was
   moved to a new source file alternator/executor_read.cc.
   The new file has 2,000 lines.

2. Moved 250 lines of template functions dealing with attribute paths
   and maps of them to a new header file, attribute_path.hh.
   These utilities are used for many different operations - various
   read operations use them for ProjectionExpression, and UpdateItem
   uses them for modifications to nested attributes, so we need the
   new header file from both executor.cc and executor_read.cc

The remaining executor.cc is still pretty big, 5,000 lines, and
contains write operations (PutItem, UpdateItem, DeleteItem,
BatchWriteItem) as well as various table and other operations, and
also many utility functions used by many types of operations, so
we can later continue this refactoring effort.

Refs #5783

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-04-16 14:30:10 +03:00
Nadav Har'El
83670d2493 alternator: validate vector index attribute values on write
When a table has a vector index, writes to the indexed attribute
(via PutItem, UpdateItem, or BatchWriteItem) must supply a value that
is a vector of the appropriate length: It must be a list of exactly the
declared number of elements, where each element is a numeric type ("N")
representable as a 32-bit float. Before this patch, invalid values were
silently accepted and the item was simply not indexed (it was skipped
by the vector store when it read this item). Now these writes are
rejected with a ValidationException.

This is analogous to the existing validation of GSI/LSI key attribute
values - in DynamoDB after a certain attribute becomes the key of a
GSI or LSI, the user is no longer allowed to write the same type.

The implementation we add here is also analogous to the implementation
of the GSI/LSI key validation. The GSI/LSI key validation is done
by validate_value_if_index_key / si_key_attributes, and in this
patch we add the vector-index parallels: vector_index_attributes()
collects the attribute name and declared dimensions for every vector
index in the schema, and validate_value_if_vector_index_attribute()
enforces the type limitations.

For efficiency in the common case where a table has no vector indexes
and no GSIs/LSIs, both validation functions are out-of-line and each
call site guards the call with an explicit empty() check, so no
function-call overhead is incurred when there is nothing to validate.
For UpdateItem, the map of vector index attributes is cached in
update_item_operation (alongside the existing _key_attributes cache)
to avoid recomputing it on every call to update_attribute().
2026-04-16 13:31:49 +03:00
Nadav Har'El
aea7b6a66b alternator: DescribeTable for vector index: add IndexStatus and Backfilling
Add to DescribeTable's output for VectorIndexes two fields - IndexStatus
and Backfilling - which are intended to exactly mirror these two fields
that exist for GlobalSecondaryIndexes:

When a vector index is added, IndexStatus is "CREATING" before the index
is usable, and "ACTIVE" when it is finally usable for a Query. During
"CREATING" phase, "Backfilling" may be set to true when the index is
currently being backfilled (the table is scaned and an index is built).

A user is expected to call DescribeTable in a loop after creating a
vector index (via either CreateTable and UpdateTable) and only call
Query on the index after the IndexStatus is finally ACTIVE. Calling
Query earlier, while IndexStatus is still CREATING, will result in an
error.

In the current implementation, Alternator does not track the state of the
vector index, so it needs to contact the vector store to inquire about
the state of the index - using a new function introduced in this patch
that uses an existing vector-store API. This makes DescribeTable slower
on tables that have vector indexes, because the vector store is contacted
on every DescribeTable call.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-04-16 13:31:49 +03:00
Nadav Har'El
e43a2e5086 alternator: implement Query with a vector index
We introduce to the Query request a new "VectorSearch" parameter, which
take a mandatory "QueryVector" (a value which must be a numeric vector
of the right length) and "Limit".

The "Limit" of a vector search (Query with VectorSearch) determines the
number of nearest neighbors to return, and does not allow pagination
(ExclusiveKeyStart is not allowed). ConsistentRead=True is also not
allowed on a vector search query.

The "Select"/"ProjectionExpression"/"AttributesToGet" parameters are
also supported, requesting which attributes to fetch. Using Select=
ALL_PROJECTED_ATTRIBUTES means read only the attributes found in the
vector index - currently only the key columns - so it is significantly
faster than ALL_ATTRIBUTES because it doesn't require reading the items
from the base table.

The "FilterExpression" parameter is also supported. Like in DynamoDB's
traditional Query, this does post-filtering, i.e., removing some of the
results returned by the vector index that don't match the filter, and
as a result fewer than Limit results may be returned.

Pre-filtering (done on the vector store, and always returns Limit
results) is not yet implemented.
2026-04-16 13:31:47 +03:00
Nadav Har'El
68e34c57e1 alternator: fix bug in describe_multi_item()
In commit a55c5e9ec7, the function
describe_multi_item() got a new item_callback parameter, that can
be used to calculate the size of the item. This new parameter
has a default, an empty noncopyable_function. But an empty
noncopyable_function shouldn't be called - exactly like std::function,
it throws std::bad_function_call if called when empty.

So describe_multi_item() should only call this item_callback if
it's not empty.

This became a problem in the next patch, implementing vector search
query, which called describe_multi_item with the default item_callback.
But in general, the function should be usable with the default parameter
(or we shouldn't have defined a default value for this parameter!).

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-04-16 13:30:02 +03:00
Nadav Har'El
ffe1029b7c alternator: prevent adding GSI conflicting with a vector index
All the "indexes" we implement in Alternator - GSI, LSI and the new
vector index - share the same IndexName namespace, which we'll use in
Query to refer to the index. In the previous patch we already prevented
adding a vector index with the same name as an existing GSI or LSI.
In this patch we also prevent the reverse - adding a GSI with the name
of an existing vector index.

Additionally, one cannot add a GSI on a key that is already the key of
a vector index: The types conflict: The key of a vector index must be a
vector column, while the key of a GSI must have a standard key type
(string, binary or number).

We have tests for this later, this the big test patch.
2026-04-16 13:30:02 +03:00
Nadav Har'El
82de16f92c alternator: implement UpdateTable with a vector index
After an earlier patch allowed CreateTable to create vector indexes
together with a table, in this patch we add to UpdateTable the ability
to add a new vector index to an existing table, as well as the ability
to delete a vector index from an existing table.

The implementation is inspired by DynamoDB's syntax for GSI - just like
GSI has GlobalSecondaryIndexUpdates with "Create" and "Delete" operations,
for vector indexes we have VectorIndexUpdates supporting Create and
Delete. "Update" is not yet supported - we didn't implement yet any
parameter that can be updated - but we can easily implement it in the
future.
2026-04-16 13:30:02 +03:00
Nadav Har'El
217090a996 alternator: implement DescribeTable with a vector index
In this patch we add to DescribeTable the ability to list the vector indexes
enabled on an Alternator table.
2026-04-16 13:30:02 +03:00
Nadav Har'El
e156d67177 alternator: implement CreateTable with a vector index
ScyllaDB supports the "vector search" feature in CQL.
In this patch we start the path to adding vector search support also to
Alternator.

In this patch, we implement CreateTable support - allowing the user to enable
vector search in a new table. The following patches will enable additional
operations like UpdateTable (adding a vector index to an existing table or
deleting a vector index to an existing table) and DescribeTable.

Extensive tests for all these features will come at the end of the series.
Those tests were written in parallel with writing this implementation so cover
(hopefully) every nook and cranny of the imlementation.
2026-04-16 13:29:58 +03:00
Piotr Szymaniak
6913efab5c audit/alternator: Audit requests
Both the successful ones as well as the failed ones are audited.

Each Alternator operation sets up audit metadata via an
executor::maybe_audit() helper, which checks will_log() and only
heap-allocates audit_info_alternator when auditing is enabled. DDL
and metadata operations pass no consistency level; data read/write
operations pass the actual CL used.

BatchWriteItem and BatchGetItem guard table name collection with
will_log() to avoid unnecessary work when auditing is disabled.
ListStreams audits the input table name rather than collecting
output table names during iteration. UntagResource sets up
auditing after parameter validation. Exception re-throw in
server.cc uses co_return coroutine::exception().

The chosen audit types for the operations:
- CreateTable - DDL
- DescribeTable - QUERY
- DeleteTable - DDL
- UpdateTable - DDL
- PutItem - DML
- UpdateItem - DML
- GetItem - QUERY
- DeleteItem - DML
- ListTables - QUERY
- Scan - QUERY
- DescribeEndpoints - QUERY
- BatchWriteItem - DML
- BatchGetItem - QUERY
- Query - QUERY
- TagResource - DDL
- UntagResource - DDL
- ListTagsOfResource - QUERY
- UpdateTimeToLive - DDL
- DescribeTimeToLive - QUERY
- ListStreams - QUERY
- DescribeStream - QUERY
- GetShardIterator - QUERY
- GetRecords - QUERY
- DescribeContinuousBackups - QUERY
2026-04-15 11:55:42 +02:00
Radosław Cybulski
4b984212ba alternator: improve parsing / generating of StreamArn parameter
Previously Alternator, when emit Amazon's ARN would not stick to the
standard. After our attempt to run KCL with scylla we discovered few
issues.

Amazon's ARN looks like this:

arn:partition:service:region:account-id:resource-type/resource-id

for example:

arn:aws:dynamodb:us-west-2:111122223333:table/TestTable/stream/2015-05-11T21:21:33.291

KCL checks for:
- ARN provided from Alternator calls must fit with basic Amazon's ARN
  pattern shown above,
- region constisting only of lower letter alphabets and `-`, no
  underscore character
- account-id being only digits (exactly 12)
- service being `dynamodb`
- partition starting with `aws`

The patch updates our code handling ARNs to match those findings.

1. Split `stream_arn` object into `stream_arn` - ARN for streams only and
`stream_shard_id` - id value for stream shards. The latter receives original
implementation. The former emits and parses ARN in a Amazon style.
 for example:
2. Update new `stream_arn` class to encode keyspace and table together
separating them by `@`. New ARN looks like this:

arn:aws:dynamodb:us-east-1:000000000000:table/TestKeyspace@TestTable/stream/2015-05-11T21:21:33.291

3. hardcode `dynamodb` as service, `aws` as partition, `us-east-1` as
   region and `000000000000` as account-id (must have 12 digits)
4. Update code handling ARNs for tags manipulation to be able to parse
   Amazon's style ARNs. Emiting code is left intact - the parser is now
   capable of parsing both styles.
5. Added unit tests.

Fixes #28350
Fixes: SCYLLADB-539
Fixes: #28142

Closes scylladb/scylladb#28187
2026-04-14 18:07:05 +03:00
Avi Kivity
0ae22a09d4 LICENSE: Update to version 1.1
Updated terms of non-commercial use (must be a never-customer).
2026-04-12 19:46:33 +03:00
Botond Dénes
035aa90d4b Merge 'Alternator: add per-table batch latency metrics and test coverage' from Amnon Heiman
This series fixes a metrics visibility gap in Alternator and adds regression coverage.

Until now, BatchGetItem and BatchWriteItem updated global latency histograms but did not consistently update per-table latency histograms. As a result, table-level latency dashboards could miss batch traffic.

It updates the batch read/write paths to compute request duration once and record it in both global and per-table latency metrics.

Add the missing tests, including a metric-agnostic helper and a dedicated per-table latency test that verifies latency counters increase for item and batch operations.

This change is metrics-only (no API/behavior change for requests) and improves observability consistency between global and per-table views.

Fixes #28721

**We assume the alternator per-table metrics exist, but the batch ones are not updated**

Closes scylladb/scylladb#28732

* github.com:scylladb/scylladb:
  test(alternator): add per-table latency coverage for item and batch ops
  alternator: track per-table latency for batch get/write operations
2026-03-16 17:18:00 +02:00
Botond Dénes
fcc570c697 Merge 'Exorcise assertions from Alternator, using a new throwing_assert() macro' from Nadav Har'El
assert(), and SCYLLA_ASSERT() are evil (Refs #7871) because they can cause the entire Scylla cluster to crash mysteriously instead of cleanly failing the specific request that encountered a serious problem of failed pre-requisite.

In this two-patch series, in the first patch we introduce a new macro throwing_assert(), a convenient drop-in replacement for SCYLLA_ASSERT() but which has all the benefits of on_internal_error() instead of the dangers of SCYLLA_ASSERT().
In the second patch we use the new function to replace every call to SCYLLA_ASSERT() in Alternator by the new throwing_assert().

Here is an example from the second patch to demonstrate the power of this approach: The Alternator code uses the attrs_column() function to retrieve the ":attrs" column of a schema. Since every Alternator table always has an ":attrs" column in its schema, we felt safe to SCYLLA_ASSERT() that this column exists. However, imagine that one day because of a bug, one Alternator table is missing this column. Or maybe not a bug - maybe a malicious user on a shared cluster found a way to deliberately delete this column (e.g, with a CQL command!) and this check fails. Before this patch, the entire Scylla node will crash. If the same request is sent to all nodes - the entire cluster will crash. The user might not even know which request caused this crash. In contrast, after this patch, the specific operation - e.g., PutItem - will get an exception. Only this operation, and nothing else, will be aborted, and the user who sent this request will even get an "Internal Server Error" with the assertion-failure message, alerting them that this specific query is causing problems, while other queries might work normally.

There's no need to backport this patch - unless it becomes annoying that other branches don't have the throwing_assert() function and we want it to ease other backports.

Fixes #28308.

Closes scylladb/scylladb#28445

* github.com:scylladb/scylladb:
  alternator: replace SCYLLA_ASSERT with throwing_assert
  utils: introduce throwing_assert(), a safe replacement for assert
2026-02-27 15:35:36 +02:00
Amnon Heiman
29e0b4e08c alternator: track per-table latency for batch get/write operations
Batch operations were updating only global latency histograms, which left
table-level latency metrics incomplete.

This change computes request duration once at the end of each operation and
reuses it to update both global and per-table latency stats:

Latencies are stored per table used,

This aligns batch read/write metric behavior with other operations and improves
per-table observability.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2026-02-25 20:51:18 +02:00
Nadav Har'El
2823780557 alternator ttl: move TTL_TAG_KEY to a header file
TTL_TAG_KEY stores the name of the tag in which we store the name of the
table's expiration-time column, for Alternator's TTL feature.

We already need this name in two source files, and soon we'll need it
in more files - as we want to use the same implementation also for for
a new per-row TTL feature in CQL. So it's time to move the declaration
of this variable to a new header file - alternator/ttl_tag.hh.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-02-25 14:59:42 +02:00
Nadav Har'El
b78bb914d7 alternator: replace SCYLLA_ASSERT with throwing_assert
Replace all calls to SCYLLA_ASSSERT() in Alternator by the better and
safer throwing_assert() introduced in the previous patch.

As a result of this patch, if one of the call sites for these asserts
is buggy and ever fails, only the involved operation will be killed
by an exception, instead of crashing the whole server - and often the
entire cluster (as the same buggy request reaches all nodes and
crashes them all).

Additionally, this patch replaces a few existing uses in Alternator
of on_internal_error() with a non-interesting message with a
more-or-less equivalent, but shorter, throwing_assert(). The idea is
to convert the verbose idiom:

       if (!condition) {
          on_internal_error(logger, "some error message")
       }

With the shorter

       throwing_assert(condition)

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2026-02-25 14:58:47 +02:00
Nadav Har'El
f23e796e76 alternator: fix typos in comments and variable names
Copilot found these typos in comments and variable name in alternator/,
so might as well fix them.

There are no functional changes in this patch.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#28447
2026-02-02 19:16:43 +03:00
Michael Litvak
1f7a65904e alternator: don't require rf_rack flag for indexes, validate instead
In 8df61f6d99 we changed the requirements for creating materialized
views and MV-based indexes - instead of requiring the
rf_rack_valid_keyspaces flag to be set, we now require the keyspace to
be RF-rack-valid at the time of creation, and it is enforced to remain
RF-rack-valid while the MV exists. This validation is done in the cql
create view/index statements.

The same should be done also for alternator - when creating a table with
GSI or LSI, or when adding a GSI to an existing table, previously we
required the flag rf_rack_valid_keyspaces to be set. Now we change it to
instead check if the keyspace is RF-rack-valid, and if not the operation
fails with an appropriate error.
2026-01-22 16:11:35 +01:00
Michael Litvak
e7ec87382e Revert "alternator: require rf_rack_valid_keyspaces when creating index"
This reverts commit 4b26a86cb0.

The rf_rack_valid_keyspaces option is now not required for creating MVs.
2026-01-20 09:56:48 +01:00
Avi Kivity
66aee0fb5e alternator: add optional listeners for proxy protocol v2
Following 954f2cbd2f, which added proxy protocol v2 listeners
for CQL, we do the same for alternator. We add two optional ports
for plain and TLS-wrapped HTTP.

We test each new port, that the old ports still work, and that
mixing up a port with no proxy protocol and a connection with proxy
protocol (or the opposite) fails. The latter serves to show
that the testing strategy is valid and doesn't just pass whatever
happens. We also verify that the correct addresses (and TLS mode)
show up in system.clients.

Closes scylladb/scylladb#27889
2026-01-13 09:59:24 +02:00
Radosław Cybulski
df20f178aa alternator: fix invalid rebase
Fix an invalid rebase, that would properly merge code coming
from master, except that code would ignore refactor done in the patch.
2025-12-29 08:33:10 +01:00
Radosław Cybulski
a86b782d3f Add table size to DescribeTable's output
Add a table size to DescribeTable's output.
2025-12-29 08:33:07 +01:00
Radosław Cybulski
1bd855a650 Promote fill_table_description and create_table_on_shard0 to methods
Promote `executor::fill_table_description` and
`executor::create_table_on_shard0` to methods (from static functions).
2025-12-29 08:33:06 +01:00
Radosław Cybulski
e246abec4d Add ref to service::storage_service to executor
Add a reference to `service::storage_service` to executor object.
2025-12-29 08:33:03 +01:00
Michael Litvak
b9ec1180f5 alternator: require rf_rack_valid_keyspaces when creating index
When creating an alternator table with tablets, if it has an index, LSI
or GSI, require the config option rf_rack_valid_keyspaces to be enabled.

The option is required for materialized views in tablets keyspaces to
function properly and avoid consistency issues that could happen due to
cross-rack migrations and pairing switches when RF-rack validity is not
enforced.

Currently the option is validated when creating a materialized view via
the CQL interface, but it's missing from the alternator interface. Since
alternator indexes are based on materialized views, the same check
should be added there as well.

Fixes scylladb/scylladb#27612

Closes scylladb/scylladb#27622
2025-12-15 10:36:57 +02:00
Nadav Har'El
0c64e3be9a Merge 'Unify and fix rjson string and string_view conversions' from Marcin Maliszkiewicz
This patch-set consolidates and corrects rjson string conversion handling.
It removes unnecessary string copies, ensures proper length usage and
replaces ad-hoc conversions with consistent helper functions.

Overall, the changes make rjson string handling safer, faster, and more uniform across the codebase.

Backport: no, it's a refactor

Closes scylladb/scylladb#27394

* github.com:scylladb/scylladb:
  fix rjson::value to bytes conversion with missing GetStringLength call
  alternator: change type from string to string_view in should_add_capacity
  fix rjson::value to string_view conversion with missing GetStringLength call
  use rjson::to_string_view when rjson::value gets converted using GetStringLength
  use rjson::to_sstring and rjson::to_string for various string conversions
  utils: use rjson document wrapper in instance_profile_credentials_provider::parse_creds
  utils: move rjson::to_string_view func to string related place
  utils: add to_sstring and to_string rjson helper
2025-12-11 12:05:41 +02:00
Marcin Maliszkiewicz
be9992cfb3 fix rjson::value to bytes conversion with missing GetStringLength call 2025-12-09 19:27:22 +01:00
Marcin Maliszkiewicz
62962f33bb fix rjson::value to string_view conversion with missing GetStringLength call
In some cases we unnecessarily convert to string which
causes a copy. In other we convert without calling
GetStringLength which causes iteration to dermine length
which is already known. In some cases we do even both.
This commit fixes that.
2025-12-09 19:27:21 +01:00
Marcin Maliszkiewicz
060c2f7c0d use rjson::to_string_view when rjson::value gets converted using GetStringLength
This commit is only cosmetics, changes calls to GetStringLength
into rjson::to_string_view with the same underlying implementation.
2025-12-09 19:27:21 +01:00
Marcin Maliszkiewicz
64149b57c3 use rjson::to_sstring and rjson::to_string for various string conversions
In some cases we ommit size checking which is wrong
as according to rapid json documentation strings may
contain \0 byte in the middle.
2025-12-09 19:27:21 +01:00
Petr Gusev
608eee0357 alternator/executor.cc: eliminate redundant dk copy
A small refactoring/optimization.
2025-12-09 10:21:06 +01:00
Petr Gusev
0bcc2977bb alternator/executor.cc: release cas_shard on the original shard
Before this series, we kept the cas_shard on the original shard to
guard against tablet movements running in parallel with
storage_proxy::cas.

The bug addressed by this PR shows that this approach is flawed:
keeping the cas_shard on the original shard does not guarantee that
a new cas_shard acquired on the target shard won’t require another
jump.

We fixed this in the previous commit by checking cas_shard.this_shard()
on the target shard and continuing to jump to another shard if
necessary. Once cas_shard.this_shard() on the target shard returns
true, the storage_proxy::cas invariants are satisfied, and no other
cas_shard instances need to remain alive except the one passed
into storage_proxy::cas.
2025-12-09 10:21:06 +01:00
Petr Gusev
3a865fe991 alternator/executor.cc: move shard check into cas_write
This change ensures that if cas_shard points to a different shard,
the executor will continue issuing shard jumps until
cas_shard.this_shard() returns true. The commit simply moves the
this_shard() check from the parallel_for_each lambda into cas_write,
with minimal functional changes.

We enable test_alternator_invalid_shard_for_lwt since now it should
pass.

Fixes scylladb/scylladb#27353
2025-12-09 10:21:01 +01:00
Petr Gusev
c6eec4eeef alternator/executor.cc: make cas_write a private method
We will need to access executor::_stats field from cas_write. We could
pass it as a paramter, but it seems simpler to just make cas_write
and instance method too.
2025-12-08 10:29:54 +01:00
Petr Gusev
9bef142328 alternator/executor.cc: make do_batch_write a private method
We will need to access executor::_stats field on other shards.
2025-12-08 10:29:54 +01:00
Petr Gusev
74bf24a4a7 alternator/executor.cc: fix indent 2025-12-08 10:29:28 +01:00
Petr Gusev
e60bcd0011 test_alternator: add test_alternator_invalid_shard_for_lwt
This test reproduces scylladb/scylladb#27353 using two injection
points. First, the test triggers an intra-node tablet migration and
suspends it at the streaming stage using the
intranode_migration_streaming_wait injection. Next, it enables the
alternator_executor_batch_write_wait injection, which suspends a
batch write after its cas_shard has already been created.
The test then issues several batch writes and waits until one of them
hits this injection on the destination shard. At this point, the
cas_shard.erm for that write is still in the streaming state,
meaning the executor would need to jump back to the source shard.
The test then resumes the suspended tablet migration, allowing it to
update the ERM on the source shard to write_both_read_new. After that,
the test releases the suspended batch write and expects it to perform
two shard jumps: first from the destination to the source shard, and
then again back to the source shard.

This commit adds the alternator_executor_batch_write_wait injection to
alternator/executor.cc. Coroutines are intentionally avoided in the
parallel_for_each lambda to prevent unnecessary coroutine-frame
allocations.
2025-12-08 10:29:28 +01:00
Petr Gusev
f00f7976c1 alternator/executor.cc: avoid cross-shard free
This commit is an optimization: avoiding destruction of
foreign objects on the wrong shard. Releasing objects allocated on a
different shard causes their ::free calls to be executed remotely,
which adds unnecessary load to the SMP subsystem.

Before this patch, a std::vector could be moved
to another shard. When the vector was eventually destroyed,
its ::free had to be marshalled back to the shard where the memory had
originally been allocated. This change avoids that overhead by passing
the vector by const reference instead.

The referenced objects lifetime correctness reasoning:
* the put_or_delete_item refs usages in put_or_delete_item_cas_request
are bound to its lifetime
* cas_request lifetime is bound to storage_proxy::cas future
* we don't release put_or_delete_item-s untill all storage_proxy::cas
calls are done.
2025-12-07 16:14:56 +01:00
Petr Gusev
c428645d16 storage_proxy: cas: take cas_request by raw reference
In the next commit we want to add an optimization that relies on
precise control over the lifetime of cas_request. In particular, we
want the implementation of this interface in Alternator to operate on
raw references that are guaranteed to remain valid only until the
cas() future is resolved. We already depend on the same lifetime
assumptions in cas_request when used by modification_statement.
However, these assumptions are not clearly expressed in the current
interface: cas_request is taken by shared_ptr, and nothing prevents
cas() from storing that pointer inside paxos_response_handler, which
may outlive the cas() future.

This commit fixes that by taking cas_request by raw reference. This
makes it explicit that cas() does not assume ownership of the object.
Callers must ensure that the referenced object remains valid until
the returned future is resolved.
2025-12-07 16:14:56 +01:00
Nadav Har'El
350cbd1d66 alternator: fix typo of BatchWriteItem in comments
The DynamoDB API's "BatchWriteItem" operation is spelled like this, in
singular. Some comments incorrectly referred to as BatchWriteItems - in
plural. This patch fixes those mistakes.

There are no functional changes here or changes to user-facing documents -
these mistakes were only in code comments.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#27446
2025-12-05 15:08:58 +02:00
Piotr Dulikowski
44c605e59c Merge 'Fix the types of change events in Alternator Streams' from Piotr Wieczorek
This patch increases the compatibility with DynamoDB Streams by integrating the DynamoDB's event type rules (described in https://github.com/scylladb/scylladb/issues/6918) into Alternator. The main changes are:
- introduce a new flag `alternator_streams_strict_compatibility`, meant as a guard of performance-intensive operations that increase the compatibility with DynamoDB Streams. If enabled, Alternator always performs a RBW before a data-modifying operation, and propagates its result to CDC. Then, the old item is compared to the new one, to determine the mutation type (INSERT vs MODIFY). This option is a no-op for tables with disabled Alternator Streams,
- reduce splitting of simple Alternator mutations,
- correctly distinguish event types described in #6918, except for item deletes. Deleting a missing item with DeleteItem, BatchWriteItem, or a missing field with UpdateItem still emit REMOVEs.

To summarize, the emitted events of the data manipulation operations should be as follows:
- DeleteItem/BatchWriteItem.DeleteItem of existing item: REMOVE (OK)
- DeleteItem of nonexistent item: nothing (OK)
- BatchWriteItem.DeleteItem of nonexistent item: nothing (OK)
- PutItem/UpdateItem/BatchWriteItem.PutItem of existing and not equal item: MODIFY (OK)
- PutItem/UpdateItem/BatchWriteItem.PutItem of existing and equal item: nothing (OK)
- PutItem/UpdateItem/BatchWriteItem.PutItem of nonexistent item: INSERT (OK)

No backport is necessary.

Refs https://github.com/scylladb/scylladb/pull/26149
Refs https://github.com/scylladb/scylladb/pull/26396
Refs https://github.com/scylladb/scylladb/issues/26382
Fixes https://github.com/scylladb/scylladb/issues/6918

Closes scylladb/scylladb#26121

* github.com:scylladb/scylladb:
  test/alternator: Enable the tests failing because of #6918
  alternator, cdc: Don't emit events for no-op removes
  alternator, cdc: Don't emit an event for equal items
  alternator/streams, cdc: Differentiate item replace and item update in CDC
  alternator: Change the return type of rmw_operation_return
  config: Add alternator_streams_strict_compatibility flag
  cdc: Don't split a row marker away from row cells
2025-11-30 07:20:22 +01:00
Radosław Cybulski
b54a9f4613 Fix use-after-free in encode_paging_state in Alternator
Fix unlikely use-after-free in `encode_paging_state`. The function
incorrectly assumes that current position to encode will always have
data for all clustering columns the schema defines. It's possible to
encounter current position having less than all columns specified, for
eample in case of range tombstone. Those don't happen in Alternator
tables as DynamoDB doesn't allow range deletions and clustering key
might be of size at most 1. Alternator api can be used to read
scylla system tables and those do have range tombstones with more
than single clustering column.

The fix is to stop trying to encode columns, that don't have the value -
they are not needed anyway, as there's no possible position with those
values (range tombstone made sure of that).

Fixes #27001
Fixes #27125

Closes scylladb/scylladb#26960
2025-11-28 16:51:15 +03:00
Wojciech Mitros
3c376d1b64 alternator: use storage_proxy from the correct shard in executor::delete_table
When we delete a table in alternator, the schema change is performed on shard 0.
However, we actually use the storage_proxy from the shard that is handling the
delete_table command. This can lead to problems because some information is
stored only on shard 0 and using storage_proxy from another shard may make
us miss it.
In this patch we fix this by using the storage_proxy from shard 0 instead.

Fixes https://github.com/scylladb/scylladb/issues/27223

Closes scylladb/scylladb#27224
2025-11-25 18:56:31 +01:00
Nadav Har'El
64a075533b alternator: fix update of stats from wrong shard
In commit 51186b2 (PR #25457) we introduced new statistics for
authentication errors, and among other places we modified
executor::create_table() to update them when necessary.

This function runs its real work (create_table_on_shard0()) on shard
0, but incorrectly updates "_stats" from the original shard. It doesn't
really matter which shard's stats we update - but it does matter that
code running on shard 0 shouldn't touch some other shard's objects.
Since all we do on these stats is to increment an integer, the risk
of updating it on the wrong shard is minimal to non-existant, but it's
still wrong and can cause bigger trouble in the future as the code
continues to evolve.

The fix is simple - we should pass to create_table_on_shard0() the
_stats object from the acutal shard running it (shard 0).

Fixes #26942

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes scylladb/scylladb#26944
2025-11-21 11:53:06 +02:00
Radosław Cybulski
ce8db6e19e Add table name to tracing in alternator
Add a table name to Alternator's tracing output, as some clients would
like to consistently receive this information.

- add missing `tracing::add_table_name` in `executor::scan`
- add emiting tables' names in `trace_state::build_parameters_map`
- update tests, so when tracing is looked for it is filtered by table's
  name, which confirms table is being outputed.
- change `struct one_session_records` declaration to `class one_session_records`,
  as `one_session_records` is later defined as class.

Refs #26618
Fixes #24031

Closes scylladb/scylladb#26634
2025-11-21 09:33:40 +02:00
Nadav Har'El
c03081eb12 alternator: improve error in tablets_mode_for_new_keyspaces=enforced
When in tablets_mode_for_new_keyspaces=enforced mode, Alternator is
supposed to fail when CreateTable asks explicitly for vnodes. Before
this patch, this error was an ugly "Internal Server Error" (an
exception thrown from deep inside the implementation), this patch
checks for this case in the right place, to generate a proper
ValidationException with a proper error message.

We also enable the test test_tablets_tag_vs_config which should have
caught this error, but didn't because it was marked xfail because
tablets_mode_for_new_keyspaces had not been live-updatable. Now that
it is, we can enable the test. I also improved the test to be slightly
faster (no need to change the configuration so many times) and also
check the ordinary case - where the schema doesn't choose neither
vnodes nor tablets explicitly and we should just use the default.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2025-11-09 12:52:29 +02:00
Nadav Har'El
b34f28dae2 alternator: improve comment about non-hidden system tags
The previous patches added a somewhat misleading comment in front of
system:initial_tablets, which this patch improves.

That tag is NOT where Alternator "stores" table properties like the
existing comment claimed. In fact, the whole point is that it's the
opposite - Alternator never writes to this tag - it's a user-writable
tag which Alternator *reads*, to configure the new table. And this is
why it obviously can't be hidden from the user.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2025-11-09 12:52:29 +02:00
Piotr Szymaniak
63897370cb alternator: Fix tag name to request vnodes
The tag was lately renamed from `experimental:initial_tablets` to
`system::initial_tablets`. This commit fixes both the tests as well as
the exceptions sent to the user instructing how to create table with
vnodes.
2025-11-09 12:52:29 +02:00