In make_rf_change_plan, load balancer schedules necessary migrations,
considering the load of nodes and other pending tablet transitions.
Requests from ongoing_rf_changes are processed concurrently, independently
from one another. In each request racks are processed concurrently.
No tablet replica will be removed until all required replicas are added.
While adding replicas to each rack we always start with base tables
and won't proceed with views until they are done (while removing - the other
way around).
Node availability is checked at two levels for extending actions:
1) In prepare_per_rack_rf_change_plan: the entire RF change request is
aborted if any node in the target dc+rack is down, or if there are
no live (non-excluded) nodes at all. Shrinking is never aborted.
2) In make_rf_change_plan: extending is skipped for a given round if
any normal, non-excluded node in the target dc+rack is missing from
the balanced node set. Shrinking always proceeds regardless.
The resulting behavior per node state combination (extending only):
- all up -> proceed
- some excluded + some up -> proceed (excluded nodes are skipped)
- any down node -> abort
- all excluded (no live) -> abort
When the last step is finished:
- in system_schema.keyspaces:
- next_replication is cleared;
- new keyspace properties are saved (if request succeeded);
- request is removed from ongoing_rf_changes;
- the request is marked as done in system.topology_requests.
Add keyspace_rf_change_plan to migration_plan.
The keyspace_rf_change_plan consists of:
- completion - info about the request for which all migrations are done. Only one
request can be completed at the time, even if more have finished migrations
(the rest will be completed later). Based on it:
- next_replication is cleared;
- new keyspace properties are saved (only if succeeded);
- request is removed from ongoing_rf_changes;
- the request is marked as done in system.topology_requests.
- aborts - info about requests that cannot complete because the required
rf change is impossible (e.g. no available nodes in a required rack).
Multiple requests can be aborted in a single plan. Based on each:
- next_replication is set to current_replication (rolling back);
- the request is marked as aborted with an error in system.topology_requests.
The scheduled rebuilds will be kept in migration_plan::_migrations.
Based on that the canonical_mutations are generated.
Add update_topology_state_with_mixed_change and use it if any schema
changes are required, i.e. if plan contains keyspace_rf_change_plan::completion.
Split update_node_load_on_migration into decrease_node_load and
increase_node_load - in the following changes for rebuilds we will
need only one of those at the time.
In the following changes, keyspace_rf_change handler will also consider
a change of RF by more than one. Rearrange the handler, so that it
first chooses a kind of RF change and then creates relevant updates.
Do not wrap the code in schedule_migration function, as we no longer
need a quick return possibility.
Add a new next_replication column to system_schema.keyspaces table.
While there is an ongoing RF change:
- next_replication keeps the target RF values;
- existing replication_v2 column keeps initial RF values - the ones we
started the RF change with.
DESCRIBE KEYSPACE statement shows replication_v2.
When there is no ongoing RF change for this keyspace, its
next_replication is empty.
In this commit no data is kept in the new column.
Following changes, will allow adding or removing all keyspace
replicas in a DC with a single ALTER KEYSPACE. For such operations,
the tablet load balancer needs to schedule rebuilds. To track
which RF change requests require rebuilds, we maintain a vector
of RF changes along with their ongoing rebuild phases.
Add a new ongoing_rf_changes column to system.topology to keep track
of those requests.
In this commit no data is kept in the new column.
The stream_mutation_fragments RPC handler did not check
is_in_critical_disk_utilization_mode before accepting incoming mutation
fragments. This meant load-and-stream (nodetool refresh --load-and-stream)
could push data onto a node at critical disk utilization, potentially
filling the disk completely.
Add a critical disk utilization check in the get_next_mutation_fragment
lambda, throwing critical_disk_utilization_exception when the node is in
critical mode. This mirrors the existing protection in stream_blob.cc.
Also remove the xfail marker from the corresponding test added in the
previous commit.
This PR removes the power-of-two token constraint from vnodes-to-tablets migrations, allowing clusters with randomly generated tokens to migrate without manual token reassignment.
Previously, migrations required vnode tokens to be a power of two and aligned. In practice, these conditions are not met with Scylla's default random token assignment, so the constraint is a blocker for real-world use. With the introduction of arbitrary tablet boundaries in PR #28459, the tablet layer can now support arbitrary tablet boundaries. This PR builds on that capability to allow arbitrary vnode tokens during migration.
When the highest vnode token does not coincide with the end of the token ring, the vnode wraps around, but tablets do not support that. This is handled by splitting it into two tablets: one covering the tail end of the ring and one covering the beginning.
Testing has been updated accordingly: existing cluster tests now use randomly generated tokens instead of precomputed power-of-two values, and a new Boost test validates the wrap-around tablet boundary logic.
Fixes SCYLLADB-724.
New feature, no backport is needed.
Closesscylladb/scylladb#29319
* github.com:scylladb/scylladb:
test: Use arbitrary tokens in vnodes->tablets migration tests
test: boost: Add test for wrap-around vnodes
storage_service: Support vnodes->tablets migrations w/ arbitrary tokens
storage_service: Hoist migration precondition
With migration to pyest this fixture is useless. Removing and setting
the session to the module for the most of the tests.
Add dynamic_scope function to support running alternator fixtures in
session scope, while Test and TestSuite are not deleted. This is for
migration period, later on this function should be deleted.
With migration of preparation environment and starting 3rd party services
to the pytest, they're output the logs to the terminal. So this PR
binds them their own log file to avoid polluting the terminal.
With the latest changes, there are a lot of code that is redundant in
the test.py. This PR just cleans this code.
Changes in other files are related to cleaning code from the test.py,
especially with redundant parameter --test-py-init and moving
prepare_environment to pytest itself.
The retry loop in `start_docker_service` passes the parse callbacks via `std::move` into `create_handler` on each iteration. After the first iteration, the moved-from `std::function` objects are empty. All subsequent retries skip output parsing entirely and immediately treat the service as successfully started. This defeats the entire purpose of the retry mechanism.
Fix by passing the callbacks by copy instead of move, so the original callbacks remain valid across retries.
Fixes SCYLLADB-1542
This is a CI stability issue and should be backported.
Closesscylladb/scylladb#29504
* github.com:scylladb/scylladb:
test/lib: fix typos in proc_utils, gcs_fixture, and dockerized_service
test: gcs_fixture: rename container from "local-kms" to "fake-gcs-server"
test: fix proc_utils.cc formatting from previous commit
test: lib: use unique container name per retry attempt
test: lib: fix broken retry in start_docker_service
Add a `CHILD_SHARDS` filter to `DescribeStream` command.
When used, user need to pass a parent stream shard id as
json's ShardFilter.ShardId field. DescribeStream will
then return only list of stream shards, that are direct
descendants of passed parent stream shard.
Each stream shard cover a consecutive part of token space.
A stream shard Q is considered to be a child of stream shard W,
when at least one token belongs to token spaces from both streams.
The filtering algorithm itself is somewhat complicated - more details
in comments in streams.cc.
CHILD_SHARDS is a Amazon's functionality and is required by KCL.
Add unit tests.
Fixes: #25160Closesscylladb/scylladb#28189
Increase the non-AWS wait in the TTL streams test to reduce vnode CI flakes caused by delayed expiration visibility.
Fixes SCYLLADB-1556
Closesscylladb/scylladb#29516
This will make debugging of stalled tablet transitions easier. We saw
several issues when topology state machine was blocked by active
tablet migrations, which was not obvious at first glance of the
logs. Now it will be east to tell if tablet transitions are blocking
progress and which transitions are stuck.
Closesscylladb/scylladb#28616
drain() signals the postponed_reevaluation condition variable to terminate
the postponed_compactions_reevaluation() coroutine but does not await its
completion. When enable() is called afterwards, it overwrites
_waiting_reevalution with a new coroutine, orphaning the old one. During
shutdown, really_do_stop() only awaits the latest coroutine via
_waiting_reevalution, leaving the orphaned coroutine still alive. After
sharded::stop() destroys the compaction_manager, the orphaned coroutine
resumes and reads freed memory (is_disabled() accesses _state).
Fix by introducing stop_postponed_compactions(), awaiting the reevaluation coroutine in
both drain() and stop() after signaling it, if postponed_compactions_reevaluation() is running.
It uses an std::optional<future<>> for _waiting_reevalution and std::exchange to leave
_waiting_reevalution disengaged when postponed_compactions_reevaluation() is not running.
This prevents a race between drain() and stop().
While at it, fix typo in _waiting_reevalution -> _waiting_reevaluation.
Fixes: SCYLLADB-1463
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Closesscylladb/scylladb#29443
Add to test/alternator/run the option "-vs" which runs alongside with
Scylla a vector store, to allow running Alternator tests with vector
indexing.
To get the vector store, do
git clone git@github.com:scylladb/vector-store.git
cargo build --release
"run -vs" looks for an executable in ../vector-store/target/*/vector-store
but can also be overridden by the VECTOR_STORE environment variable.
test/alternator/run runs the vector store exactly like it runs Scylla -
in a temporary directory, on a temporary IP address in the localhost
subnet (127.0.0/8), killing it when the test end, and showing the output
of both programs (Scylla and vector store). These transient runs of
Scylla and vector store are configured to be able to communicate to
each other.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
This patch adds a new document, docs/alternator/vector-search.md, on the
new vector search feature in Alternator. It introduces this feature, and
the DynamoDB APIs that we extended to support it.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
The new_dynamodb_session() function had a bug which we never noticed
because we hardly used it, but it became more noticable when the new
test/alternator/test_vector.py started to use it:
By default, boto3 retries a request up to 9 times when it encounters
a retriable error (such as an Internal Server Error). We don't want such
retries in our tests - it makes failures slower, but more importantly
it can hide "flaky" bugs by retrying 9 times until it happens to succeed.
The new_dynamodb_session() had code (copied from the dynamodb fixture)
to set boto3's "max_attempts" configuration to 0, to disable this retry.
But this code had an incorrect "if" to only be done if we're testing on
"localhost". This is wrong: We almost never use "localhost" as the
target of the test; Both test/cqlpy/run and test.py pick an IP address
in the localhost subnet (127/8) and uses that IP address - not the string
"localhost".
This bug only existed in new_dynamodb_session() - the more commonly used
"dynamodb" fixture didn't have this bug.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
One of the tests in the previous patch checked that strange characters
are allowed in attribute names used for vector indexing. It turns out
we never had a test that verifies that regardless of vector indexes -
any character whatsoever is allowed in attribute names. This is
different from table names which are much more limited.
So this patch adds the missing test.
As usual, the new test also passes on DynamoDB, showing that these
stange characters in attribute names are also allowed by DynamoDB.
In this patch we add a large collection of basic functional tests for the
vector index support, covering the CreateTable, UpdateTable, DescribeTable
and Query operations and the various ways in which those are allowed to
work - or expected to fail. These tests were written in parallel with
writing the code so they (hopefully) cover all the corner cases considered
during development, and make sure these corner cases are all handled
correctly and will not regress in the future.
Some of these tests do not involve querying of the index and focus on
the structure of requests and the kind of syntax allowed. But other tests
are end-to-end, requiring the vector store to be running and trying to
index Alternator data and query it. These tests are marked
"needs_vector_store", and are immediately skipped in Scylla is not
configured to connect to a vector store. In a later patch we'll add a
an option to test/alternator/run to be able to run these end-to-end
tests by automatically running both Scylla and the Vector Store.
We'll have additional end-to-end tests in the vector-store repository.
Note that vector search is a new API feature that doesn't exist in DynamoDB,
so we are adding new parameters and outputs to existing operations. The AWS
SDKs don't normally allow doing that, so the test added here begins by
teaching the Python SDK to use the new APIs we added. This piece of code
can also be used by end-users to use vector search (at least in Python...)
before we officially add this support to ScyllaDB's SDK wrappers.
Non-finite numbers (Inf, NaN) don't make sense in vector search, and
also not allowed in the DynamoDB API as numbers. But the parsing code
in Query's QueryVector accepted "Inf" and "NaN" and then failed to
send the request to the vector store, resulting in a strange error
message. Let's fix it in the parsing code.
We have a test (test_query_vectorsearch_queryvector_bad_number_string)
that verifies this fix.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Before this patch, if we attempt a Query with IndexName is a vector index
but forget a "VectorSearch" parameter, the error is misleading: The code
expects a GSI or LSI, and when it can't find a GSI or LSI with that name,
it reports that the index is missing. But this is not helpful. So in this
patch we produce a more helpful message: That the index does exist, and
is a vector index, so a "VectorSearch" parameter is mandatory and is
missing.
The per-table metrics for Query were not incremented for the
vector variant of the Query operations, only the global metrics were
incremented. This patch fixes this oversight, and add a test that
reproduces it (the new test fails before this patch, and passes after).
De-duplicate some code introduced in earlier patches, such a two
nearly-identical loops over the indexes (one to check if there is a
vector index, the second to get its dimensions), and two nearly-
identical chunks of code to get the item contents when there is or
there isn't a clustering key.
There should be no functional changes in this patch.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
In earlier patches, when Query'ing a vector index, we set the default
Select to ALL_ATTRIBUTES. However, according to the DynamoDB documentation
for Query,
"If neither Select nor ProjectionExpression are specified, DynamoDB
defaults to ALL_ATTRIBUTES when accessing a table, and
ALL_PROJECTED_ATTRIBUTES when accessing an index."
This default should also apply to vector index, so this patch fixes this.
The new behavior is not only more compatible with DynamoDB, it is also
much more efficient by default, as ALL_PROJECTED_ATTRIBUTES does not need
to read from the base table - it returns the results that the vector store
returned. Of course, if the user needs the more efficient ALL_ATTRIBUTES
this option is still available - it's just no longer the default.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
This patch continues the effort to split the huge executor.cc (5000
lines before this patch) even more.
In this patch we introduce a new source file, executor_util.cc, for
various utility functions that are used for many different operations
and therefore are useful to have in a header file. These utility
functions will now be in executor_util.cc and executor_util.hh -
instead of executor.cc and executor.hh.
Various source files, including executor.cc, the executor_read.cc
introduced in the previous patch, as well as older source files like
as streams.cc, ttl.cc and serialization.cc, use the new header file.
This patch removes over 700 lines of code from executor.cc, and
also removes a large amount of utility functions declerations from
executor.hh. Originally, executor.hh was meant to be about the
interface that the Alternator server needs to *execute* the different
DynamoDB API operations - and after this patch it returns closer to
this original goal.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Already six years ago, in #5783, we noticed that alternator/executor.cc
has grown too large. The previous patches added hundreds of more lines
to it to implement vector search, and it reached a whopping 7,000 lines
of code. This is too much.
This patch splits from executor.cc two major chunks:
1. The implementation of **read** requests - GetItem, BatchGetItem,
Query (base table, GSI/LSI, and vector-search), and Scan - was
moved to a new source file alternator/executor_read.cc.
The new file has 2,000 lines.
2. Moved 250 lines of template functions dealing with attribute paths
and maps of them to a new header file, attribute_path.hh.
These utilities are used for many different operations - various
read operations use them for ProjectionExpression, and UpdateItem
uses them for modifications to nested attributes, so we need the
new header file from both executor.cc and executor_read.cc
The remaining executor.cc is still pretty big, 5,000 lines, and
contains write operations (PutItem, UpdateItem, DeleteItem,
BatchWriteItem) as well as various table and other operations, and
also many utility functions used by many types of operations, so
we can later continue this refactoring effort.
Refs #5783
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Extend system_info_encryption to encrypt system.raft SSTables.
system.raft contains the Raft log, which may hold sensitive user data
(e.g. batched mutations), so it warrants the same treatment as
system.batchlog and system.paxos.
During upgrade, existing unencrypted system.raft SSTables remain
readable. Existing data is rewritten encrypted via compaction, or
immediately via nodetool upgradesstables -a.
Update the operator-facing system_info_encryption description to
mention system.raft and add a focused test that verifies the schema
extension is present on system.raft.
Fixes: CUSTOMER-268
Backport: 2026.1 - closes an encryption-at-rest coverage gap: system.raft may persist sensitive user-originated data unencrypted; backport to the current LTS.
Closesscylladb/scylladb#29242
They are used only to prevent permission change, but since tables are
unused even if they exists there is no problem changing their
permissions, so no point keeping the definitions just for that.
There's a bunch of db::config options that are used by cql3/statements/ code. For that they use data_dictionary/database as a proxy to get db::config reference. This PR moves most of these accessed options onto cql_config
Options migrated to cql_config:
1. select_internal_page_size
2. strict_allow_filtering
3. enable_parallelized_aggregation
4. batch_size_warn_threshold_in_kb
5. batch_size_fail_threshold_in_kb
6. 7 keyspace replication restriction options
7. 2 TWCS restriction options
8. restrict_future_timestamp
9. strict_is_not_null_in_views (with view_restrictions struct)
10. enable_create_table_with_compact_storage
Some options need special treatment and are still abused via database, namely:
1. enable_logstor
2. cluster_name
3. partitioner
4. endpoint_snitch
Fixing components inter-dependencies, not backporting
Closesscylladb/scylladb#29424
* github.com:scylladb/scylladb:
cql3: Move enable_create_table_with_compact_storage to cql_config
cql3: Move strict_is_not_null_in_views to cql_config
cql3: Move restrict_future_timestamp to cql_config
cql3: Move TWCS restriction options to cql_config
cql3: Move keyspace restriction options to cql_config
cql3: Move batch_size_fail_threshold_in_kb to cql_config
cql3: Move batch_size_warn_threshold_in_kb to cql_config
cql3: Move enable_parallelized_aggregation to cql_config
cql3: Move strict_allow_filtering to cql_config
cql3: Move select_internal_page_size to cql_config
test: Fix cql_test_env to use updateable cql_config from db::config
cql3: Add cql_config parameter to parsed_statement::prepare()
`system.large_partitions`, `system.large_rows`, and `system.large_cells` store records keyed by SSTable name. When SSTables are migrated between shards or nodes (resharding, streaming, decommission), the records are lost because the destination never writes entries for the migrated SSTables.
This patch series moves the source of truth for large data records into the SSTable's scylla metadata component (new `LargeDataRecords` tag 13) and reimplements the three `system.large_*` tables as virtual tables that query live SSTables on demand. A cluster feature flag (`LARGE_DATA_VIRTUAL_TABLES`) gates the transition for safe rolling upgrades.
When the cluster feature is enabled, each node drops the old system large_* tables and starts serving the corresponding tables using virtual tables that represent the large data records now stored on the sstables.
Note that the virtual tables will be empty after upgrade until the sstables that contained large data are rewritten, therefore it is recommended to run upgrade sstables compaction or major compaction to repopulate the sstables scylla-metadata with large data records.
1. **keys: move key_to_str() to keys/keys.hh** — make the helper reusable across large_data_handler, virtual tables, and scylla-sstable
2. **sstables: add LargeDataRecords metadata type (tag 13)** — new struct with binary-serialized key fields, scylla-sstable JSON support, format documentation
3. **large_data_handler: rename partition_above_threshold to above_threshold_result** — generalize the struct for reuse
4. **large_data_handler: return above_threshold_result from maybe_record_large_cells** — separate booleans for cell size vs collection elements thresholds
5. **sstables: populate LargeDataRecords from writer** — bounded min-heaps (one per large_data_type), configurable top-N via `compaction_large_data_records_per_sstable`
6. **test: add LargeDataRecords round-trip unit tests** — verify write/read, top-N bounding, below-threshold behavior
7. **db: call initialize_virtual_tables from shard 0 only** — preparatory refactoring to enable cross-shard coordination
8. **db: implement large_data virtual tables with feature flag gating** — three virtual table classes, feature flag activation, legacy SSTable fallback, dual-threshold dedup, cross-shard collection
Fixes: https://scylladb.atlassian.net/browse/SCYLLADB-1276
* Although this fixes a bug where large data entries are effectively lost when sstables are renamed or migrated, the changes are intrusive and do not warrant a backport
Closesscylladb/scylladb#29257
* github.com:scylladb/scylladb:
db: implement large_data virtual tables with feature flag gating
db: call initialize_virtual_tables from shard 0 only
test: add LargeDataRecords round-trip unit tests
sstables: populate LargeDataRecords from writer
large_data_handler: return above_threshold_result from maybe_record_large_cells
large_data_handler: rename partition_above_threshold to above_threshold_result
sstables: add LargeDataRecords metadata type (tag 13)
sstables: add fmt::formatter for large_data_type
keys: move key_to_str() to keys/keys.hh
The ignore_component_digest_mismatch flag is now initialized at sstable construction
time from sstables_manager::config (which is populated from db::config at boot time).
Remove the flag from sstable_open_config struct and all call sites that were setting
it explicitly.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Initialize the ignore_component_digest_mismatch flag from sstables_manager::config
in the sstable constructor initializer list instead of in load(). This ensures the
flag value is set at construction time when the manager config is available, rather
than at load time. Mark the member const to reflect its immutability after construction.
Fixes the bootstrap path which now correctly reads the flag from manager config
initialized from db::config at boot time, instead of using the default value.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
Copy the ignore_component_digest_mismatch flag from db::config to sstables_manager::config
during database initialization. This makes the flag available early in the boot process,
before SSTables are loaded, enabling later commits to move the flag initialization from
load-time to construction-time.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Co-authored-by: Copilot <223556219+Copilot@users.noreply.github.com>
When a table has a vector index, writes to the indexed attribute
(via PutItem, UpdateItem, or BatchWriteItem) must supply a value that
is a vector of the appropriate length: It must be a list of exactly the
declared number of elements, where each element is a numeric type ("N")
representable as a 32-bit float. Before this patch, invalid values were
silently accepted and the item was simply not indexed (it was skipped
by the vector store when it read this item). Now these writes are
rejected with a ValidationException.
This is analogous to the existing validation of GSI/LSI key attribute
values - in DynamoDB after a certain attribute becomes the key of a
GSI or LSI, the user is no longer allowed to write the same type.
The implementation we add here is also analogous to the implementation
of the GSI/LSI key validation. The GSI/LSI key validation is done
by validate_value_if_index_key / si_key_attributes, and in this
patch we add the vector-index parallels: vector_index_attributes()
collects the attribute name and declared dimensions for every vector
index in the schema, and validate_value_if_vector_index_attribute()
enforces the type limitations.
For efficiency in the common case where a table has no vector indexes
and no GSIs/LSIs, both validation functions are out-of-line and each
call site guards the call with an explicit empty() check, so no
function-call overhead is incurred when there is nothing to validate.
For UpdateItem, the map of vector index attributes is cached in
update_item_operation (alongside the existing _key_attributes cache)
to avoid recomputing it on every call to update_attribute().
Add to DescribeTable's output for VectorIndexes two fields - IndexStatus
and Backfilling - which are intended to exactly mirror these two fields
that exist for GlobalSecondaryIndexes:
When a vector index is added, IndexStatus is "CREATING" before the index
is usable, and "ACTIVE" when it is finally usable for a Query. During
"CREATING" phase, "Backfilling" may be set to true when the index is
currently being backfilled (the table is scaned and an index is built).
A user is expected to call DescribeTable in a loop after creating a
vector index (via either CreateTable and UpdateTable) and only call
Query on the index after the IndexStatus is finally ACTIVE. Calling
Query earlier, while IndexStatus is still CREATING, will result in an
error.
In the current implementation, Alternator does not track the state of the
vector index, so it needs to contact the vector store to inquire about
the state of the index - using a new function introduced in this patch
that uses an existing vector-store API. This makes DescribeTable slower
on tables that have vector indexes, because the vector store is contacted
on every DescribeTable call.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
We introduce to the Query request a new "VectorSearch" parameter, which
take a mandatory "QueryVector" (a value which must be a numeric vector
of the right length) and "Limit".
The "Limit" of a vector search (Query with VectorSearch) determines the
number of nearest neighbors to return, and does not allow pagination
(ExclusiveKeyStart is not allowed). ConsistentRead=True is also not
allowed on a vector search query.
The "Select"/"ProjectionExpression"/"AttributesToGet" parameters are
also supported, requesting which attributes to fetch. Using Select=
ALL_PROJECTED_ATTRIBUTES means read only the attributes found in the
vector index - currently only the key columns - so it is significantly
faster than ALL_ATTRIBUTES because it doesn't require reading the items
from the base table.
The "FilterExpression" parameter is also supported. Like in DynamoDB's
traditional Query, this does post-filtering, i.e., removing some of the
results returned by the vector index that don't match the filter, and
as a result fewer than Limit results may be returned.
Pre-filtering (done on the vector store, and always returns Limit
results) is not yet implemented.
In commit a55c5e9ec7, the function
describe_multi_item() got a new item_callback parameter, that can
be used to calculate the size of the item. This new parameter
has a default, an empty noncopyable_function. But an empty
noncopyable_function shouldn't be called - exactly like std::function,
it throws std::bad_function_call if called when empty.
So describe_multi_item() should only call this item_callback if
it's not empty.
This became a problem in the next patch, implementing vector search
query, which called describe_multi_item with the default item_callback.
But in general, the function should be usable with the default parameter
(or we shouldn't have defined a default value for this parameter!).
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
All the "indexes" we implement in Alternator - GSI, LSI and the new
vector index - share the same IndexName namespace, which we'll use in
Query to refer to the index. In the previous patch we already prevented
adding a vector index with the same name as an existing GSI or LSI.
In this patch we also prevent the reverse - adding a GSI with the name
of an existing vector index.
Additionally, one cannot add a GSI on a key that is already the key of
a vector index: The types conflict: The key of a vector index must be a
vector column, while the key of a GSI must have a standard key type
(string, binary or number).
We have tests for this later, this the big test patch.
After an earlier patch allowed CreateTable to create vector indexes
together with a table, in this patch we add to UpdateTable the ability
to add a new vector index to an existing table, as well as the ability
to delete a vector index from an existing table.
The implementation is inspired by DynamoDB's syntax for GSI - just like
GSI has GlobalSecondaryIndexUpdates with "Create" and "Delete" operations,
for vector indexes we have VectorIndexUpdates supporting Create and
Delete. "Update" is not yet supported - we didn't implement yet any
parameter that can be updated - but we can easily implement it in the
future.
ScyllaDB supports the "vector search" feature in CQL.
In this patch we start the path to adding vector search support also to
Alternator.
In this patch, we implement CreateTable support - allowing the user to enable
vector search in a new table. The following patches will enable additional
operations like UpdateTable (adding a vector index to an existing table or
deleting a vector index to an existing table) and DescribeTable.
Extensive tests for all these features will come at the end of the series.
Those tests were written in parallel with writing this implementation so cover
(hopefully) every nook and cranny of the imlementation.
Alternator has a function validate_attr_name_length() used to validate an
attribute name passed in different operations like PutItem, UpdateItem,
GetItem, etc. It fails the request if the attribute name is longer than
65535 characters.
It turns out that we forgot to check if the attribute name length isn’t 0 -
which should be forbidden as well!
This patch fixes the validation code, and also adds a test that confirms
that after this patch empty attribute names are rejected - just like DynamoDB
does - whereas before this patch they were silently accepted.
We want to fix this issue now, because in a later patch we intend to use
the same validation function also for vector indexes - and want it to be
accurate.
Fixes SCYLLADB-1069.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
The vector-search feature, which is already supported in CQL, introduced
the somewhat confusing feature of enabling CDC without explicitly enabling
CDC: When a vector index is enabled on a table, CDC is "enabled" for it
even if the user didn't ask to enable CDC.
For this, some code in cdc/log.cc began to use cdc_enabled() instead of
checking schema.cdc_options.enabled() directly. This cdc_enabled()
function checks if either this enabled() is true, or has_vector_index()
is true.
But there's another twist to this story: To write with CDC, we also need
to create the CDC log table:
1. In CQL, a vector index can only be added on an existing table (with
CREATE INDEX), so the hook on_before_update_column_family() is the
one that noticed that a vector index was added, and created the CDC
log table.
2. But in Alternator, a vector index can be created up-front with a
brand-new table (in CreateTable), so the hook for a new table -
on_pre_create_column_families(), also needs to create the CDC log
table. It already did, but incorrectly checked just the explicit
CDC-enabled flag instead of the new cdc_enabled() function that
also allows vector index.
So this patch just fixes on_pre_create_column_families to use cdc_enabled().
Before this patch, when a vector index will be created in Alternator with
CreateTable, an attempt to write to the table (PutItem) will fail
because it will try to write to the CDC log, which wasn't created.
After this patch, it works. The reproducing test is
test_putitem_vectorindex_createtable (introduced in a later patch).
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Pin all external GitHub Actions to full commit SHAs and upgrade to
their latest major versions to reduce supply chain attack surface:
- actions/checkout: v3/v4/v5 -> v6.0.2
- actions/github-script: v7 -> v8.0.0
- actions/setup-python: v5 -> v6.2.0
- actions/upload-artifact: v4 -> v7.0.0
- astral-sh/setup-uv: v6 -> v8.0.0
- mheap/github-action-required-labels: v5.5.2 (pinned)
- redhat-plumbers-in-action/differential-shellcheck: v5.5.6 (pinned)
- codespell-project/actions-codespell: v2.2 (pinned, was @master)
Set FORCE_JAVASCRIPT_ACTIONS_TO_NODE24=true in all 21 workflows that
use JavaScript-based actions to opt into the Node.js 24 runtime now.
This resolves the deprecation warning:
"Node.js 20 actions are deprecated. Please check if updated versions
of these actions are available that support Node.js 24. Actions will
be forced to run with Node.js 24 by default starting June 2nd,
2026. Node.js 20 will be removed from the runner on September 16th,
2026."
See: https://github.blog/changelog/2025-09-19-deprecation-of-node-20-on-github-actions-runners/
scylladb/github-automation references are intentionally left at @main
as they are org-internal reusable workflows.
Fixes: SCYLLADB-1410
Backport: Backport is required for live branches that run GH actions:
2026.1, 2025.4, 2025.1 and 2024.1
Closesscylladb/scylladb#29421