scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-05-22 07:42:16 +00:00

Author	SHA1	Message	Date
Nadav Har'El	85c6cafb1d	alternator: add optimized vector type for vector search Today in Alternator vector search, vectors are presented to the API as lists of numbers. I.e., in JSON a vector is sent in requests and responses as: {"L": [{"N": "3.14159"}, {"N": "6.7"}]} This format is verbose and inefficient for long vectors. Even worse, because the "N" number format has precision guarantees in DynamoDB, we cannot optimize the storage of such vectors by, for example, storing the numbers as 32-bit floats. We actually store these vectors as JSON, exactly as shown above. So in this patch we introduce a new DynamoDB type, "FLOAT32VECTOR", for vectors. The above vector will look like this in JSON: {"FLOAT32VECTOR": [3.14159, 6.7]} Note that each number is an unquoted JSON number, not a JSON string. Importantly, the definition of the "FLOAT32VECTOR" type specifies that components of the vector only have 32-bit precision. This means that Scylla may store these vectors internally as lists of 32-bit floats - not as JSON. And indeed, this patch includes this optimization: Top-level vector attributes are now encoded in an optimized way, as a byte 5 (alternator_type::FLOAT32VECTOR) followed by the elements of the vector, just 4 bytes each (the 4-byte big-endian IEEE 754 representation of each floating-point component). This patch also includes documentation, and extensive tests that the new "FLOAT32VECTOR" type works (which also serve as an example of how to use it in the boto3 SDK), that it is indeed encoded internally as 32-bit floats and not wasteful JSON strings, and that vector search on such items works. The last thing requires cooperation from the vector store, of course - it needs to be able to understand the new optimized encoding of vector attributes in addition to the old unoptimized one. Note that the old unoptimized ("list of numbers") vectors are still supported. 
Although not recommended for general use, some users might still want to use the unoptimized type if they have pre-existing data created on DynamoDB or Alternator without vector search in mind, and the vectors already exist as lists of numbers. Although this is less important, the new vector type "FLOAT32VECTOR" is also allowed in a Query's QueryVector. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2026-05-13 11:57:45 +03:00
Nadav Har'El	ea910acdd4	alternator: add SimilarityFunction option to vector index creation Before this patch, vector search always used the COSINE similarity function. In this patch we add the ability to choose a different similarity function when creating a new vector index (with CreateTable or UpdateTable) by using the SimilarityFunction option. We still default to "COSINE" if SimilarityFunction isn't specified. Allowed similarity functions are COSINE, DOT_PRODUCT, and EUCLIDEAN. DescribeTable can also retrieve a vector index's SimilarityFunction. As usual, this patch also includes documentation for the new feature, and tests. Some of the tests can run without a vector store - verifying the API syntax and which similarity function is supported - but we also add tests that require the vector store and check that the different similarity functions actually sort the nearest items in the expected order. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2026-05-13 11:57:45 +03:00
Nadav Har'El	63927e07ea	Merge 'alternator/streams: keep disabled streams usable and purge on re-enable' from Piotr Szymaniak When an Alternator stream is disabled, the data should continue to be accessible so that consumers can finish reading. When the stream is later re-enabled, a new StreamArn is produced and only then the old data is purged. On disable, the existing CDC options (including preimage and postimage) are preserved so that DescribeStream can still report StreamViewType. All stream APIs continue to work on the disabled stream, with all shards reported as closed (EndingSequenceNumber set). No new CDC records are written; existing data expires via TTL after 24 hours. On re-enable, the old CDC log table is dropped as a separate Raft group0 schema change and a fresh one is created with a new UUID, giving a new StreamArn. This is Alternator-specific — CQL CDC keeps reusing the log table. Re-enabling is the only way to immediately purge old stream data. Old stream data is removed immediately upon re-enable (a discrepancy with DynamoDB, which keeps it readable for 24 hours through the old StreamArn). Tests updated to cover the new disable and re-enable behavior. Fixes #7239 Fixes SCYLLADB-523 Closes scylladb/scylladb#29413 * github.com:scylladb/scylladb: alternator/streams: remove dead next_iter in get_records test/alternator: fix stream wait timeouts to use wall-clock time docs/alternator: document stream disable/re-enable behavior alternator/streams: keep disabled streams usable and purge on re-enable	2026-05-10 22:04:35 +03:00
Piotr Szymaniak	38bd068f78	alternator/streams: keep disabled streams usable and purge on re-enable Previously, disabling Alternator Streams would create a blank cdc::options with only enabled=false, which meant losing access also to stored Streams's data (including preimage and postimage). Now, when a stream is disabled: - The existing CDC options are preserved (only 'enabled' is flipped to false), so StreamViewType remains available. - DescribeStream enumerates all shards with EndingSequenceNumber set, indicating they are closed. - GetRecords omits NextShardIterator for disabled streams. - DescribeTable (supplement_table_stream_info) reports the stream ARN and StreamEnabled: false when the CDC log table still exists. - ListStreams uses get_base_table instead of is_log_for_some_table so that disabled streams whose log table still exists are listed. When a stream is re-enabled on an Alternator table that has an existing (disabled) CDC log table, the old log table is dropped and a fresh one is created with a new UUID, producing a new StreamArn. This is Alternator-specific behavior; CQL CDC tables continue to reuse the existing log table. The old stream data is lost immediately upon re-enable. DynamoDB keeps it readable for 24 hours. Tests: - test_streams_closed_read, test_streams_disabled_stream: remove xfail now that disabled streams are usable. - test_streams_reenable: new test verifying that re-enabling produces a new ARN and the old data is still readable via the old ARN (xfail because Scylla currently purges old data on re-enable). Fixes scylladb/scylladb#7239	2026-05-07 14:45:42 +02:00
Nadav Har'El	b70beb3e13	alternator: improve CreateTable/UpdateTable schema agreement timeout CreateTable and UpdateTable call wait_for_schema_agreement() after announcing the schema change, to ensure all live nodes have applied the new schema before returning to the user. This wait has a hard- coded 10 second timeout, and on some overloaded test machines we saw it not completing in time, and causing tests to become flaky. This patch increases this timeout from 10 seconds to 30 seconds. It's still hard-coded and not configurable via alternator_timeout_in_ms because it is unlikely any user will want to change it - it just needs to be long. The patch also improves the behavior of a schema-agreement timeout, when it happens: 1. Provide an InternalServerError with more descriptive text. 2. This InternalServerError tells the user that the result of the operation is unknown; So the user will repeat the CreateTable, and will get a ResourceInUseException because the table exists. In that case too, we need to wait for schema agreement. So we added this missing wait. Fixes SCYLLADB-1804 Refs #5052 (claiming CreateTable shouldn't wait at all) Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2026-05-05 15:41:06 +03:00
Radosław Cybulski	74b523ea20	treewide: fix spelling errors. Fix various spelling errors. Closes scylladb/scylladb#29574	2026-04-21 18:20:26 +03:00
Piotr Szymaniak	4b6937b570	alternator/streams: Block tablet merges when Alternator Streams are enabled DynamoDB Streams API can only convey a single parent per stream shard. Tablet merges produce 2 parents, which is incompatible. When streams are requested on a tablet table, block tablet merges via tablet_merge_blocked (the allocator suppresses new merge decisions and revokes any active merge decision). add_stream_options() sets tablet_merge_blocked=true alongside enabled=true, so CreateTable needs no special handling — the flag is inert on vnode tables and immediately effective on tablet tables. For UpdateTable, CDC enablement is deferred: store the user's intent via enable_requested, and let the topology coordinator finalize enablement once no in-progress merges remain. A new helper, defer_enabling_streams_block_tablet_merges(), amends the CDC options to this deferred state. Disabling streams clears all flags, immediately re-allowing merges. The tablet allocator accesses the merge-blocked flag through a schema::tablet_merges_forbidden() accessor rather than reaching into CDC options directly. Mark test_parent_children_merge as xfail and remove downward (merge) steps from tablet_multipliers in test_parent_filtering and test_get_records_with_alternating_tablets_count.	2026-04-19 03:54:33 +02:00
Radosław Cybulski	6be16cf224	alternator: remove antitablet guards when using Streams Remove the `if` condition that prevented tables with tablets from working with Streams. Remove a test that verifies that Alternator rejects tables with tablets when the Streams feature is enabled on them. Update a few tests that were expected to fail on tablets so that they now run normally.	2026-04-17 18:58:26 +02:00
Radosław Cybulski	d93299b605	alternator: add system_keyspace reference Add a reference to `system_keyspace` object to `executor` object in alternator. The reference is needed, because in future commit we will add there (and use) helper functions that read `cdc_log` tables for tablet based tables similarly to already existing siblings for vnodes living in `system_distributed_keyspace`.	2026-04-17 18:57:43 +02:00
Nadav Har'El	2e274bbdba	alternator: split executor.cc even more This patch continues the effort to split the huge executor.cc (5000 lines before this patch) even more. In this patch we introduce a new source file, executor_util.cc, for various utility functions that are used for many different operations and therefore are useful to have in a header file. These utility functions will now be in executor_util.cc and executor_util.hh - instead of executor.cc and executor.hh. Various source files, including executor.cc, the executor_read.cc introduced in the previous patch, as well as older source files such as streams.cc, ttl.cc and serialization.cc, use the new header file. This patch removes over 700 lines of code from executor.cc, and also removes a large number of utility function declarations from executor.hh. Originally, executor.hh was meant to be about the interface that the Alternator server needs to execute the different DynamoDB API operations - and after this patch it returns closer to this original goal. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2026-04-16 14:30:16 +03:00
Nadav Har'El	751da00692	alternator: split alternator/executor.cc Already six years ago, in #5783, we noticed that alternator/executor.cc has grown too large. The previous patches added hundreds of more lines to it to implement vector search, and it reached a whopping 7,000 lines of code. This is too much. This patch splits from executor.cc two major chunks: 1. The implementation of read requests - GetItem, BatchGetItem, Query (base table, GSI/LSI, and vector-search), and Scan - was moved to a new source file alternator/executor_read.cc. The new file has 2,000 lines. 2. Moved 250 lines of template functions dealing with attribute paths and maps of them to a new header file, attribute_path.hh. These utilities are used for many different operations - various read operations use them for ProjectionExpression, and UpdateItem uses them for modifications to nested attributes, so we need the new header file from both executor.cc and executor_read.cc The remaining executor.cc is still pretty big, 5,000 lines, and contains write operations (PutItem, UpdateItem, DeleteItem, BatchWriteItem) as well as various table and other operations, and also many utility functions used by many types of operations, so we can later continue this refactoring effort. Refs #5783 Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2026-04-16 14:30:10 +03:00
Nadav Har'El	83670d2493	alternator: validate vector index attribute values on write When a table has a vector index, writes to the indexed attribute (via PutItem, UpdateItem, or BatchWriteItem) must supply a value that is a vector of the appropriate length: It must be a list of exactly the declared number of elements, where each element is a numeric type ("N") representable as a 32-bit float. Before this patch, invalid values were silently accepted and the item was simply not indexed (it was skipped by the vector store when it read this item). Now these writes are rejected with a ValidationException. This is analogous to the existing validation of GSI/LSI key attribute values - in DynamoDB after a certain attribute becomes the key of a GSI or LSI, the user is no longer allowed to write the same type. The implementation we add here is also analogous to the implementation of the GSI/LSI key validation. The GSI/LSI key validation is done by validate_value_if_index_key / si_key_attributes, and in this patch we add the vector-index parallels: vector_index_attributes() collects the attribute name and declared dimensions for every vector index in the schema, and validate_value_if_vector_index_attribute() enforces the type limitations. For efficiency in the common case where a table has no vector indexes and no GSIs/LSIs, both validation functions are out-of-line and each call site guards the call with an explicit empty() check, so no function-call overhead is incurred when there is nothing to validate. For UpdateItem, the map of vector index attributes is cached in update_item_operation (alongside the existing _key_attributes cache) to avoid recomputing it on every call to update_attribute().	2026-04-16 13:31:49 +03:00
Nadav Har'El	aea7b6a66b	alternator: DescribeTable for vector index: add IndexStatus and Backfilling Add to DescribeTable's output for VectorIndexes two fields - IndexStatus and Backfilling - which are intended to exactly mirror these two fields that exist for GlobalSecondaryIndexes: When a vector index is added, IndexStatus is "CREATING" before the index is usable, and "ACTIVE" when it is finally usable for a Query. During the "CREATING" phase, "Backfilling" may be set to true when the index is currently being backfilled (the table is scanned and an index is built). A user is expected to call DescribeTable in a loop after creating a vector index (via either CreateTable or UpdateTable) and only call Query on the index after the IndexStatus is finally ACTIVE. Calling Query earlier, while IndexStatus is still CREATING, will result in an error. In the current implementation, Alternator does not track the state of the vector index, so it needs to contact the vector store to inquire about the state of the index - using a new function introduced in this patch that uses an existing vector-store API. This makes DescribeTable slower on tables that have vector indexes, because the vector store is contacted on every DescribeTable call. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2026-04-16 13:31:49 +03:00
Nadav Har'El	e43a2e5086	alternator: implement Query with a vector index We introduce to the Query request a new "VectorSearch" parameter, which take a mandatory "QueryVector" (a value which must be a numeric vector of the right length) and "Limit". The "Limit" of a vector search (Query with VectorSearch) determines the number of nearest neighbors to return, and does not allow pagination (ExclusiveKeyStart is not allowed). ConsistentRead=True is also not allowed on a vector search query. The "Select"/"ProjectionExpression"/"AttributesToGet" parameters are also supported, requesting which attributes to fetch. Using Select= ALL_PROJECTED_ATTRIBUTES means read only the attributes found in the vector index - currently only the key columns - so it is significantly faster than ALL_ATTRIBUTES because it doesn't require reading the items from the base table. The "FilterExpression" parameter is also supported. Like in DynamoDB's traditional Query, this does post-filtering, i.e., removing some of the results returned by the vector index that don't match the filter, and as a result fewer than Limit results may be returned. Pre-filtering (done on the vector store, and always returns Limit results) is not yet implemented.	2026-04-16 13:31:47 +03:00
Nadav Har'El	68e34c57e1	alternator: fix bug in describe_multi_item() In commit `a55c5e9ec7`, the function describe_multi_item() got a new item_callback parameter, that can be used to calculate the size of the item. This new parameter has a default, an empty noncopyable_function. But an empty noncopyable_function shouldn't be called - exactly like std::function, it throws std::bad_function_call if called when empty. So describe_multi_item() should only call this item_callback if it's not empty. This became a problem in the next patch, implementing vector search query, which called describe_multi_item with the default item_callback. But in general, the function should be usable with the default parameter (or we shouldn't have defined a default value for this parameter!). Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2026-04-16 13:30:02 +03:00
Nadav Har'El	ffe1029b7c	alternator: prevent adding GSI conflicting with a vector index All the "indexes" we implement in Alternator - GSI, LSI and the new vector index - share the same IndexName namespace, which we'll use in Query to refer to the index. In the previous patch we already prevented adding a vector index with the same name as an existing GSI or LSI. In this patch we also prevent the reverse - adding a GSI with the name of an existing vector index. Additionally, one cannot add a GSI on a key that is already the key of a vector index, because the types conflict: The key of a vector index must be a vector column, while the key of a GSI must have a standard key type (string, binary or number). We have tests for this later, in the big test patch.	2026-04-16 13:30:02 +03:00
Nadav Har'El	82de16f92c	alternator: implement UpdateTable with a vector index After an earlier patch allowed CreateTable to create vector indexes together with a table, in this patch we add to UpdateTable the ability to add a new vector index to an existing table, as well as the ability to delete a vector index from an existing table. The implementation is inspired by DynamoDB's syntax for GSI - just like GSI has GlobalSecondaryIndexUpdates with "Create" and "Delete" operations, for vector indexes we have VectorIndexUpdates supporting Create and Delete. "Update" is not yet supported - we didn't implement yet any parameter that can be updated - but we can easily implement it in the future.	2026-04-16 13:30:02 +03:00
Nadav Har'El	217090a996	alternator: implement DescribeTable with a vector index In this patch we add to DescribeTable the ability to list the vector indexes enabled on an Alternator table.	2026-04-16 13:30:02 +03:00
Nadav Har'El	e156d67177	alternator: implement CreateTable with a vector index ScyllaDB supports the "vector search" feature in CQL. In this patch we start the path to adding vector search support also to Alternator. In this patch, we implement CreateTable support - allowing the user to enable vector search in a new table. The following patches will enable additional operations like UpdateTable (adding a vector index to an existing table or deleting a vector index from an existing table) and DescribeTable. Extensive tests for all these features will come at the end of the series. Those tests were written in parallel with writing this implementation, so they cover (hopefully) every nook and cranny of the implementation.	2026-04-16 13:29:58 +03:00
Piotr Szymaniak	6913efab5c	audit/alternator: Audit requests Both the successful ones as well as the failed ones are audited. Each Alternator operation sets up audit metadata via an executor::maybe_audit() helper, which checks will_log() and only heap-allocates audit_info_alternator when auditing is enabled. DDL and metadata operations pass no consistency level; data read/write operations pass the actual CL used. BatchWriteItem and BatchGetItem guard table name collection with will_log() to avoid unnecessary work when auditing is disabled. ListStreams audits the input table name rather than collecting output table names during iteration. UntagResource sets up auditing after parameter validation. Exception re-throw in server.cc uses co_return coroutine::exception(). The chosen audit types for the operations: - CreateTable - DDL - DescribeTable - QUERY - DeleteTable - DDL - UpdateTable - DDL - PutItem - DML - UpdateItem - DML - GetItem - QUERY - DeleteItem - DML - ListTables - QUERY - Scan - QUERY - DescribeEndpoints - QUERY - BatchWriteItem - DML - BatchGetItem - QUERY - Query - QUERY - TagResource - DDL - UntagResource - DDL - ListTagsOfResource - QUERY - UpdateTimeToLive - DDL - DescribeTimeToLive - QUERY - ListStreams - QUERY - DescribeStream - QUERY - GetShardIterator - QUERY - GetRecords - QUERY - DescribeContinuousBackups - QUERY	2026-04-15 11:55:42 +02:00
Radosław Cybulski	4b984212ba	alternator: improve parsing / generating of StreamArn parameter Previously Alternator, when emitting Amazon-style ARNs, did not stick to the standard. After our attempt to run KCL with scylla we discovered a few issues. Amazon's ARN looks like this: arn:partition:service:region:account-id:resource-type/resource-id for example: arn:aws:dynamodb:us-west-2:111122223333:table/TestTable/stream/2015-05-11T21:21:33.291 KCL checks for: - ARN provided from Alternator calls must fit the basic Amazon ARN pattern shown above, - region consisting only of lowercase letters and `-`, no underscore character - account-id being only digits (exactly 12) - service being `dynamodb` - partition starting with `aws` The patch updates our code handling ARNs to match those findings. 1. Split `stream_arn` object into `stream_arn` - ARN for streams only and `stream_shard_id` - id value for stream shards. The latter receives the original implementation. The former emits and parses ARNs in Amazon style. for example: 2. Update new `stream_arn` class to encode keyspace and table together, separating them by `@`. New ARN looks like this: arn:aws:dynamodb:us-east-1:000000000000:table/TestKeyspace@TestTable/stream/2015-05-11T21:21:33.291 3. Hardcode `dynamodb` as service, `aws` as partition, `us-east-1` as region and `000000000000` as account-id (must have 12 digits) 4. Update code handling ARNs for tags manipulation to be able to parse Amazon-style ARNs. Emitting code is left intact - the parser is now capable of parsing both styles. 5. Added unit tests. Fixes #28350 Fixes: SCYLLADB-539 Fixes: #28142 Closes scylladb/scylladb#28187	2026-04-14 18:07:05 +03:00
Avi Kivity	0ae22a09d4	LICENSE: Update to version 1.1 Updated terms of non-commercial use (must be a never-customer).	2026-04-12 19:46:33 +03:00
Botond Dénes	035aa90d4b	Merge 'Alternator: add per-table batch latency metrics and test coverage' from Amnon Heiman This series fixes a metrics visibility gap in Alternator and adds regression coverage. Until now, BatchGetItem and BatchWriteItem updated global latency histograms but did not consistently update per-table latency histograms. As a result, table-level latency dashboards could miss batch traffic. It updates the batch read/write paths to compute request duration once and record it in both global and per-table latency metrics. Add the missing tests, including a metric-agnostic helper and a dedicated per-table latency test that verifies latency counters increase for item and batch operations. This change is metrics-only (no API/behavior change for requests) and improves observability consistency between global and per-table views. Fixes #28721 We assume the alternator per-table metrics exist, but the batch ones are not updated Closes scylladb/scylladb#28732 * github.com:scylladb/scylladb: test(alternator): add per-table latency coverage for item and batch ops alternator: track per-table latency for batch get/write operations	2026-03-16 17:18:00 +02:00
Botond Dénes	fcc570c697	Merge 'Exorcise assertions from Alternator, using a new throwing_assert() macro' from Nadav Har'El assert(), and SCYLLA_ASSERT() are evil (Refs #7871) because they can cause the entire Scylla cluster to crash mysteriously instead of cleanly failing the specific request that encountered a serious problem of failed pre-requisite. In this two-patch series, in the first patch we introduce a new macro throwing_assert(), a convenient drop-in replacement for SCYLLA_ASSERT() but which has all the benefits of on_internal_error() instead of the dangers of SCYLLA_ASSERT(). In the second patch we use the new function to replace every call to SCYLLA_ASSERT() in Alternator by the new throwing_assert(). Here is an example from the second patch to demonstrate the power of this approach: The Alternator code uses the attrs_column() function to retrieve the ":attrs" column of a schema. Since every Alternator table always has an ":attrs" column in its schema, we felt safe to SCYLLA_ASSERT() that this column exists. However, imagine that one day because of a bug, one Alternator table is missing this column. Or maybe not a bug - maybe a malicious user on a shared cluster found a way to deliberately delete this column (e.g, with a CQL command!) and this check fails. Before this patch, the entire Scylla node will crash. If the same request is sent to all nodes - the entire cluster will crash. The user might not even know which request caused this crash. In contrast, after this patch, the specific operation - e.g., PutItem - will get an exception. Only this operation, and nothing else, will be aborted, and the user who sent this request will even get an "Internal Server Error" with the assertion-failure message, alerting them that this specific query is causing problems, while other queries might work normally. 
There's no need to backport this patch - unless it becomes annoying that other branches don't have the throwing_assert() function and we want it to ease other backports. Fixes #28308. Closes scylladb/scylladb#28445 * github.com:scylladb/scylladb: alternator: replace SCYLLA_ASSERT with throwing_assert utils: introduce throwing_assert(), a safe replacement for assert	2026-02-27 15:35:36 +02:00
Amnon Heiman	29e0b4e08c	alternator: track per-table latency for batch get/write operations Batch operations were updating only global latency histograms, which left table-level latency metrics incomplete. This change computes request duration once at the end of each operation and reuses it to update both global and per-table latency stats: Latencies are stored per table used, This aligns batch read/write metric behavior with other operations and improves per-table observability. Signed-off-by: Amnon Heiman <amnon@scylladb.com>	2026-02-25 20:51:18 +02:00
Nadav Har'El	2823780557	alternator ttl: move TTL_TAG_KEY to a header file TTL_TAG_KEY stores the name of the tag in which we store the name of the table's expiration-time column, for Alternator's TTL feature. We already need this name in two source files, and soon we'll need it in more files - as we want to use the same implementation also for a new per-row TTL feature in CQL. So it's time to move the declaration of this variable to a new header file - alternator/ttl_tag.hh. Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2026-02-25 14:59:42 +02:00
Nadav Har'El	b78bb914d7	alternator: replace SCYLLA_ASSERT with throwing_assert Replace all calls to SCYLLA_ASSERT() in Alternator by the better and safer throwing_assert() introduced in the previous patch. As a result of this patch, if one of the call sites for these asserts is buggy and ever fails, only the involved operation will be killed by an exception, instead of crashing the whole server - and often the entire cluster (as the same buggy request reaches all nodes and crashes them all). Additionally, this patch replaces a few existing uses in Alternator of on_internal_error() with a non-interesting message with a more-or-less equivalent, but shorter, throwing_assert(). The idea is to convert the verbose idiom: if (!condition) { on_internal_error(logger, "some error message") } into the shorter throwing_assert(condition) Signed-off-by: Nadav Har'El <nyh@scylladb.com>	2026-02-25 14:58:47 +02:00
Nadav Har'El	f23e796e76	alternator: fix typos in comments and variable names Copilot found these typos in comments and variable name in alternator/, so might as well fix them. There are no functional changes in this patch. Signed-off-by: Nadav Har'El <nyh@scylladb.com> Closes scylladb/scylladb#28447	2026-02-02 19:16:43 +03:00
Michael Litvak	1f7a65904e	alternator: don't require rf_rack flag for indexes, validate instead In `8df61f6d99` we changed the requirements for creating materialized views and MV-based indexes - instead of requiring the rf_rack_valid_keyspaces flag to be set, we now require the keyspace to be RF-rack-valid at the time of creation, and it is enforced to remain RF-rack-valid while the MV exists. This validation is done in the cql create view/index statements. The same should be done also for alternator - when creating a table with GSI or LSI, or when adding a GSI to an existing table, previously we required the flag rf_rack_valid_keyspaces to be set. Now we change it to instead check if the keyspace is RF-rack-valid, and if not the operation fails with an appropriate error.	2026-01-22 16:11:35 +01:00
Michael Litvak	e7ec87382e	Revert "alternator: require rf_rack_valid_keyspaces when creating index" This reverts commit `4b26a86cb0`. The rf_rack_valid_keyspaces option is now not required for creating MVs.	2026-01-20 09:56:48 +01:00
Avi Kivity	66aee0fb5e	alternator: add optional listeners for proxy protocol v2 Following `954f2cbd2f`, which added proxy protocol v2 listeners for CQL, we do the same for alternator. We add two optional ports for plain and TLS-wrapped HTTP. We test each new port, that the old ports still work, and that mixing up a port with no proxy protocol and a connection with proxy protocol (or the opposite) fails. The latter serves to show that the testing strategy is valid and doesn't just pass whatever happens. We also verify that the correct addresses (and TLS mode) show up in system.clients. Closes scylladb/scylladb#27889	2026-01-13 09:59:24 +02:00
Radosław Cybulski	df20f178aa	alternator: fix invalid rebase Fix an invalid rebase, that would properly merge code coming from master, except that code would ignore refactor done in the patch.	2025-12-29 08:33:10 +01:00
Radosław Cybulski	a86b782d3f	Add table size to DescribeTable's output Add a table size to DescribeTable's output.	2025-12-29 08:33:07 +01:00
Radosław Cybulski	1bd855a650	Promote fill_table_description and create_table_on_shard0 to methods Promote `executor::fill_table_description` and `executor::create_table_on_shard0` to methods (from static functions).	2025-12-29 08:33:06 +01:00
Radosław Cybulski	e246abec4d	Add ref to service::storage_service to executor Add a reference to `service::storage_service` to executor object.	2025-12-29 08:33:03 +01:00
Michael Litvak	b9ec1180f5	alternator: require rf_rack_valid_keyspaces when creating index When creating an alternator table with tablets, if it has an index, LSI or GSI, require the config option rf_rack_valid_keyspaces to be enabled. The option is required for materialized views in tablets keyspaces to function properly and avoid consistency issues that could happen due to cross-rack migrations and pairing switches when RF-rack validity is not enforced. Currently the option is validated when creating a materialized view via the CQL interface, but it's missing from the alternator interface. Since alternator indexes are based on materialized views, the same check should be added there as well. Fixes scylladb/scylladb#27612 Closes scylladb/scylladb#27622	2025-12-15 10:36:57 +02:00
Nadav Har'El	0c64e3be9a	Merge 'Unify and fix rjson string and string_view conversions' from Marcin Maliszkiewicz This patch-set consolidates and corrects rjson string conversion handling. It removes unnecessary string copies, ensures proper length usage and replaces ad-hoc conversions with consistent helper functions. Overall, the changes make rjson string handling safer, faster, and more uniform across the codebase. Backport: no, it's a refactor Closes scylladb/scylladb#27394 * github.com:scylladb/scylladb: fix rjson::value to bytes conversion with missing GetStringLength call alternator: change type from string to string_view in should_add_capacity fix rjson::value to string_view conversion with missing GetStringLength call use rjson::to_string_view when rjson::value gets converted using GetStringLength use rjson::to_sstring and rjson::to_string for various string conversions utils: use rjson document wrapper in instance_profile_credentials_provider::parse_creds utils: move rjson::to_string_view func to string related place utils: add to_sstring and to_string rjson helper	2025-12-11 12:05:41 +02:00
Marcin Maliszkiewicz	be9992cfb3	fix rjson::value to bytes conversion with missing GetStringLength call	2025-12-09 19:27:22 +01:00
Marcin Maliszkiewicz	62962f33bb	fix rjson::value to string_view conversion with missing GetStringLength call In some cases we unnecessarily convert to string, which causes a copy. In others we convert without calling GetStringLength, which causes iteration to determine a length that is already known. In some cases we even do both. This commit fixes that.	2025-12-09 19:27:21 +01:00
Marcin Maliszkiewicz	060c2f7c0d	use rjson::to_string_view when rjson::value gets converted using GetStringLength This commit is purely cosmetic: it changes calls to GetStringLength into rjson::to_string_view, which has the same underlying implementation.	2025-12-09 19:27:21 +01:00
Marcin Maliszkiewicz	64149b57c3	use rjson::to_sstring and rjson::to_string for various string conversions In some cases we omit size checking, which is wrong because, according to the RapidJSON documentation, strings may contain a \0 byte in the middle.	2025-12-09 19:27:21 +01:00
Petr Gusev	608eee0357	alternator/executor.cc: eliminate redundant dk copy A small refactoring/optimization.	2025-12-09 10:21:06 +01:00
Petr Gusev	0bcc2977bb	alternator/executor.cc: release cas_shard on the original shard Before this series, we kept the cas_shard on the original shard to guard against tablet movements running in parallel with storage_proxy::cas. The bug addressed by this PR shows that this approach is flawed: keeping the cas_shard on the original shard does not guarantee that a new cas_shard acquired on the target shard won’t require another jump. We fixed this in the previous commit by checking cas_shard.this_shard() on the target shard and continuing to jump to another shard if necessary. Once cas_shard.this_shard() on the target shard returns true, the storage_proxy::cas invariants are satisfied, and no other cas_shard instances need to remain alive except the one passed into storage_proxy::cas.	2025-12-09 10:21:06 +01:00
Petr Gusev	3a865fe991	alternator/executor.cc: move shard check into cas_write This change ensures that if cas_shard points to a different shard, the executor will continue issuing shard jumps until cas_shard.this_shard() returns true. The commit simply moves the this_shard() check from the parallel_for_each lambda into cas_write, with minimal functional changes. We enable test_alternator_invalid_shard_for_lwt since now it should pass. Fixes scylladb/scylladb#27353	2025-12-09 10:21:01 +01:00
Petr Gusev	c6eec4eeef	alternator/executor.cc: make cas_write a private method We will need to access the executor::_stats field from cas_write. We could pass it as a parameter, but it seems simpler to just make cas_write an instance method too.	2025-12-08 10:29:54 +01:00
Petr Gusev	9bef142328	alternator/executor.cc: make do_batch_write a private method We will need to access the executor::_stats field on other shards.	2025-12-08 10:29:54 +01:00
Petr Gusev	74bf24a4a7	alternator/executor.cc: fix indent	2025-12-08 10:29:28 +01:00
Petr Gusev	e60bcd0011	test_alternator: add test_alternator_invalid_shard_for_lwt This test reproduces scylladb/scylladb#27353 using two injection points. First, the test triggers an intra-node tablet migration and suspends it at the streaming stage using the intranode_migration_streaming_wait injection. Next, it enables the alternator_executor_batch_write_wait injection, which suspends a batch write after its cas_shard has already been created. The test then issues several batch writes and waits until one of them hits this injection on the destination shard. At this point, the cas_shard.erm for that write is still in the streaming state, meaning the executor would need to jump back to the source shard. The test then resumes the suspended tablet migration, allowing it to update the ERM on the source shard to write_both_read_new. After that, the test releases the suspended batch write and expects it to perform two shard jumps: first from the destination to the source shard, and then again back to the destination shard. This commit adds the alternator_executor_batch_write_wait injection to alternator/executor.cc. Coroutines are intentionally avoided in the parallel_for_each lambda to prevent unnecessary coroutine-frame allocations.	2025-12-08 10:29:28 +01:00
Petr Gusev	f00f7976c1	alternator/executor.cc: avoid cross-shard free This commit is an optimization: avoiding destruction of foreign objects on the wrong shard. Releasing objects allocated on a different shard causes their ::free calls to be executed remotely, which adds unnecessary load to the SMP subsystem. Before this patch, a std::vector could be moved to another shard. When the vector was eventually destroyed, its ::free had to be marshalled back to the shard where the memory had originally been allocated. This change avoids that overhead by passing the vector by const reference instead. Lifetime correctness reasoning for the referenced objects: * the put_or_delete_item references used in put_or_delete_item_cas_request are bound to its lifetime * the cas_request lifetime is bound to the storage_proxy::cas future * we don't release the put_or_delete_item-s until all storage_proxy::cas calls are done.	2025-12-07 16:14:56 +01:00
Petr Gusev	c428645d16	storage_proxy: cas: take cas_request by raw reference In the next commit we want to add an optimization that relies on precise control over the lifetime of cas_request. In particular, we want the implementation of this interface in Alternator to operate on raw references that are guaranteed to remain valid only until the cas() future is resolved. We already depend on the same lifetime assumptions in cas_request when used by modification_statement. However, these assumptions are not clearly expressed in the current interface: cas_request is taken by shared_ptr, and nothing prevents cas() from storing that pointer inside paxos_response_handler, which may outlive the cas() future. This commit fixes that by taking cas_request by raw reference. This makes it explicit that cas() does not assume ownership of the object. Callers must ensure that the referenced object remains valid until the returned future is resolved.	2025-12-07 16:14:56 +01:00


612 Commits