Add a segment_set member to replica::compaction_group that manages the
logstor segments that belong to the compaction group, similarly to how
it manages sstables. Add also a separator buffer in each compaction
group.
When a mutation is written to a compaction group, it goes both to the
active segment and to the compaction group's separator buffer; when the
separator buffer is flushed, the segment is added to the
compaction_group's segment set.
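A minimal sketch of the idea, with hypothetical, simplified types (the real replica::compaction_group is asynchronous and manages far more state):
```
// Hypothetical sketch: a compaction group tracks logstor segments next to its
// sstables. A write goes to the active segment and to the group's separator
// buffer; flushing the buffer moves the segment into the segment set.
#include <memory>
#include <set>
#include <string>
#include <vector>

struct segment {
    std::vector<std::string> records;   // stand-in for on-disk logstor records
};

class compaction_group {
    std::set<std::shared_ptr<segment>> _segments;   // analogous to the sstable set
    std::vector<std::string> _separator_buffer;     // per-group separator buffer
    std::shared_ptr<segment> _active = std::make_shared<segment>();
public:
    void write(const std::string& mutation) {
        _active->records.push_back(mutation);       // write to the active segment
        _separator_buffer.push_back(mutation);      // and to the separator buffer
    }
    void flush_separator_buffer() {
        _separator_buffer.clear();
        _segments.insert(_active);                  // segment joins the segment set
        _active = std::make_shared<segment>();
    }
};
```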
Initial implementation of the logstor storage engine for key-value
tables that supports writes, reads and basic compaction.
Main components:
* logstor: the main interface for users; it supports writing and
reading back mutations, and manages the internal components.
* index: the in-memory primary index that maps a key to a location on
disk.
* write buffer: writes initially go to a write buffer. It accumulates
multiple records and writes them to the segment manager in
4k-sized blocks.
* segment manager: manages the storage - files, segments, compaction. It
handles file and segment allocation, and writes 4k-aligned buffers to
the active segment sequentially. It tracks the used space in each
segment. Compaction finds segments with low space usage, rewrites
them into new segments, and frees the old segments.
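A rough, synchronous sketch of that write path (all names are hypothetical; the real components are asynchronous and far more involved):
```
// Hypothetical sketch: records accumulate in a write buffer and are handed to
// the segment manager in 4k-sized blocks; the in-memory index maps each key to
// its location on disk.
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

static constexpr size_t block_size = 4096;

struct disk_location { uint64_t segment_id; uint64_t offset; };

struct segment_manager {
    uint64_t active_segment = 0;
    uint64_t write_offset = 0;
    // Writes one 4k-aligned block to the active segment, sequentially.
    void write_block(const std::vector<char>&) {
        write_offset += block_size;
    }
};

struct logstor {
    std::unordered_map<std::string, disk_location> index;  // key -> location on disk
    std::vector<char> write_buffer;
    segment_manager segments;

    void write(const std::string& key, const std::string& value) {
        // The record's future position is known when it enters the buffer.
        index[key] = disk_location{segments.active_segment,
                                   segments.write_offset + write_buffer.size()};
        write_buffer.insert(write_buffer.end(), value.begin(), value.end());
        while (write_buffer.size() >= block_size) {
            std::vector<char> block(write_buffer.begin(), write_buffer.begin() + block_size);
            segments.write_block(block);                    // flush a full 4k block
            write_buffer.erase(write_buffer.begin(), write_buffer.begin() + block_size);
        }
    }
};
```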
In this series we add support for forwarding strongly consistent CQL requests to suitable replicas, so that clients can issue reads/writes to any node and have the request executed on an appropriate tablet replica (and, for writes, on the Raft leader). We return the same CQL response as the user would get if they sent the request to the correct replica, and we perform the same logging/stats updates on the request coordinator as if the coordinator were the appropriate replica.
The core mechanism of forwarding a strongly consistent request is sending an RPC containing the user's CQL request frame to the appropriate replica and getting back a ready, serialized `cql_transport::response`. We do this in the CQL server - it is best prepared for handling these types, and forwarding a request containing a CQL frame allows us to reuse near-top-level methods for CQL request handling in the new RPC handler (such as the general `process`).
For sending the RPC, the CQL server needs to obtain the information about whom it should forward the request to. This requires knowledge of the tablet Raft group members and leader. We obtain this information during the execution of a `cql3/strong_consistency` statement and return it to the CQL server using the generalized `bounce_to_shard` `result_message`, which now stores either a shard or a specific replica to which we should forward. Similarly to `bounce_to_shard`, we need to handle this `result_message` in a loop - a replica may move during statement execution, or the Raft leader can change. We also use it for forwarding strongly consistent writes when we're not a member of the affected tablet Raft group - in that case we need to forward the statement twice: first to any replica of the affected tablet, then that replica can find the leader and return this information to the coordinator, which allows the second request to be directed to the leader.
This feature also allows passing through exception messages that occurred on the target replica while executing the statement. For that, many methods of `cql_transport::cql_server::connection` for creating error responses needed to be moved to `cql_transport::cql_server`. And for final exception handling on the coordinator, we added additional error info to the RPC response, so that the handling can be performed without having the `result_message::exception` or `exception_ptr` itself.
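A hand-wavy sketch of the retry loop (all type and function names below are stand-ins, not the actual ScyllaDB classes):
```
// Hypothetical sketch: keep processing until the result is neither a shard
// bounce nor a node bounce, forwarding the serialized CQL frame when needed.
#include <string>
#include <variant>

struct replica_address { std::string host; };
struct bounce_to_shard { unsigned shard; };
struct bounce_to_node { replica_address replica; };
struct cql_response { std::string body; };
using process_result = std::variant<cql_response, bounce_to_shard, bounce_to_node>;

// Stand-ins for the real request-processing and RPC paths.
process_result process_locally(const std::string& frame);
process_result process_on_shard(unsigned shard, const std::string& frame);
cql_response forward_cql_execute(const replica_address&, const std::string& frame);

cql_response process_with_redirects(const std::string& frame) {
    process_result r = process_locally(frame);
    // A replica may move or the Raft leader may change mid-execution,
    // so bounces are handled in a loop rather than once.
    while (!std::holds_alternative<cql_response>(r)) {
        if (auto* s = std::get_if<bounce_to_shard>(&r)) {
            r = process_on_shard(s->shard, frame);
        } else {
            auto& n = std::get<bounce_to_node>(r);
            r = forward_cql_execute(n.replica, frame);  // ready, serialized response
        }
    }
    return std::get<cql_response>(r);
}
```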
Fixes [SCYLLADB-71](https://scylladb.atlassian.net/browse/SCYLLADB-71)
Closes scylladb/scylladb#27517
* github.com:scylladb/scylladb:
test: add tests for CQL forwarding
transport: enable CQL forwarding for strong consistency statements
transport: add remote statement preparation for CQL forwarding
transport: handle redirect responses in CQL forwarding
transport: add exception handling for forwarded CQL requests
transport: add basic CQL request forwarding
idl: add a representation of client_state for forwarding
cql_server: handle query, execute, batch in one case
transport: inline process_on_shard in cql_server::process
transport: extract process() to cql_server
transport: add messaging_service to cql_server
transport: add response reconstruction helpers for forwarding
transport: generalize the bounce result message for bouncing to other nodes
strong consistency: redirect requests to live replicas from the same rack
transport: pass foreign_ptr into sleep_until_timeout_passes and move it to cql_server
transport: extract the error handling from process_request_one
transport: move error response helpers from connection to cql_server
Add the infrastructure for forwarding CQL requests to other nodes.
When a process() call results in a node bounce (as opposed to a shard
bounce), the coordinator serializes the request and sends it via the
FORWARD_CQL_EXECUTE RPC verb to the target node.
In this patch we omit several features needed to handle additional
scenarios that can arise when forwarding a CQL request, but the RPC
request and response are already prepared for them. They will be
handled in the following commits.
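A sketch of what the receiving side of that verb can look like (hypothetical names; the real handler goes through the CQL server's `process` path):
```
// Hypothetical sketch of the FORWARD_CQL_EXECUTE handler on the target node:
// take the raw CQL frame, run it through regular request processing, and
// return a fully serialized response plus optional error info.
#include <exception>
#include <optional>
#include <string>

struct forwarded_request {
    std::string cql_frame;          // the user's original CQL request frame
    std::string client_state;       // serialized client state (see the next commit)
};

struct forwarded_response {
    std::string serialized_response;        // ready cql_transport::response bytes
    std::optional<std::string> error_info;  // for coordinator-side error handling
};

// Stand-in for the near-top-level CQL request processing.
std::string process(const std::string& frame, const std::string& client_state);

forwarded_response handle_forward_cql_execute(const forwarded_request& req) {
    forwarded_response resp;
    try {
        resp.serialized_response = process(req.cql_frame, req.client_state);
    } catch (const std::exception& e) {
        resp.error_info = e.what();   // pass the error back without an exception_ptr
    }
    return resp;
}
```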
In the following patches, when we start allowing CQL requests to be
forwarded to other nodes, we'll need to use the same client state
for executing the request on the destination node as we had on the
source. client_state contains many fields, and we need to create
a new instance of it when we start handling the forwarded request,
so to prepare for the forwarding RPC, we add a serializable format
of the client_state as an IDL struct. The new class is missing some
fields that are not used while executing requests, and some whose
value is determined by the fact that the client state is used for
a forwarded request.
These include:
- driver name, driver version, client options - not used for executing
requests. Instead, we use these as data sources for the virtual
"clients" system table.
- auth_state - must be READY - we reached a bounce message, so we were
able to try executing the request locally
- _control_connection - used for altering a cql_server::connection, which
we don't have on the target node
- _default_timeout_config - used when updating service levels, also only
per-connection
- workload_type - used for deciding whether to allow shedding at the
start of processing the request, and for getting per-connection service
level params (for an API)
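As a sketch only (field names are illustrative, not the actual IDL definition), the serialized form carries the fields that matter for executing the forwarded request and drops the ones listed above:
```
// Hypothetical sketch of a serializable client_state subset for forwarding.
#include <optional>
#include <string>

struct client_state_for_forwarding {
    std::optional<std::string> user_name;   // authenticated user, if any
    std::string keyspace;                   // currently USEd keyspace
    std::string remote_address;             // original client address, for logging/stats
    // Deliberately absent (per the list above): driver name/version, client
    // options, auth_state (implied READY), _control_connection,
    // _default_timeout_config, workload_type.
};
```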
This series adds a global read barrier to raft_group0_client, ensuring that Raft group0 mutations are applied on all live nodes before returning to the caller.
Currently, after a group0_batch::commit, the mutations are only guaranteed to be applied on the leader. Other nodes may still be catching up, leading to stale reads. This patch introduces a broadcast read barrier mechanism. Calling send_group0_read_barrier_to_live_members after committing causes the coordinator to send a read barrier RPC to all live nodes (discovered via the gossiper) and wait for them to complete. This is a best-effort attempt to get cluster-wide visibility of the committed state before the response is returned to the user.
Auth and service levels write paths are switched to use this new mechanism.
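A simplified, synchronous sketch of the broadcast (hypothetical helper names; the real code sends the RPCs concurrently and uses Seastar futures):
```
// Hypothetical sketch: after committing, send a read barrier RPC to every live
// member known to the gossiper and wait; failures are tolerated (best effort).
#include <exception>
#include <string>
#include <vector>

struct host_id { std::string id; };

std::vector<host_id> live_members();         // stand-in for the gossiper's live set
void send_read_barrier_rpc(const host_id&);  // stand-in for the read barrier RPC verb

void send_group0_read_barrier_to_live_members() {
    for (const auto& node : live_members()) {
        try {
            // The node returns once it has applied the latest committed group0 state.
            send_read_barrier_rpc(node);
        } catch (const std::exception&) {
            // Best effort: an unreachable node must not fail the caller.
        }
    }
}
```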
Fixes https://scylladb.atlassian.net/browse/SCYLLADB-650
Backport: no, new feature
Closes scylladb/scylladb#28731
* https://github.com/scylladb/scylladb:
test: add tests for global group0_batch barrier feature
qos: switch service levels write paths to use global group0_batch barrier
auth: switch write paths to use global group0_batch barrier
raft: add function to broadcast read barrier request
raft: add gossiper dependency to raft_group0_client
raft: add read barrier RPC
This patch series removes creation of default 'cassandra:cassandra' superuser on system start.
Disable creation of a superuser with default 'cassandra:cassandra' credentials to improve security. The current flow requires clients to create another superuser and then drop the default `cassandra:cassandra` role. For those who do, there is a time window during which the default credentials exist. For those who do not, that role stays. We want to improve security by forcing the client to either use the config to specify the default superuser name and password, or use cqlsh over the maintenance socket connection to explicitly create/alter a superuser role.
The patch series:
- Enable role modification over the maintenance socket
- Stop using default 'cassandra' value for default superuser, skipping creation instead
Design document: https://scylladb.atlassian.net/wiki/spaces/RND/pages/165773327/Drop+default+cassandra+superuser
Fixes scylladb/scylla-enterprise#5657
This is an improvement. It does not need a backport.
Closes scylladb/scylladb#27215
* github.com:scylladb/scylladb:
config: enable maintenance socket in workdir by default
docs: auth: do not specify password with -p option
docs: update documentation related to default superuser
test: maintenance socket role management
test: cluster: add logs to test_maintenance_socket.py
test: pylib: fix connect_driver handling when adding and starting server
auth: do not create default 'cassandra:cassandra' superuser
auth: remove redundant DEFAULT_USER_NAME from password authenticator
auth: enable role management operations via maintenance socket
client_state: add has_superuser method
client_state: add _bypass_auth_checks flag
auth: let maintenance_socket_role_manager know if node is in maintenance mode
auth: remove class registrator usage
auth: instantiate auth service with factory functors
auth: add service constructor with factory functors
auth: add transitional.hh file
service: qos: handle special scheduling group case for maintenance socket
service: qos: use _auth_integration as condition for using _auth_integration
In this series we introduce new system tables and use them for storing the raft metadata
for strongly consistent tables. In contrast to the previously used raft group0 tables, the
new tables can store data on any shard. The tables also allow specifying the shard where
each partition should reside, which enables the tablets of strongly consistent tables to have
their raft group metadata co-located on the same shard as the tablet replica.
The new tables have almost the same schemas as the raft group0 tables. However, they
have an additional column in their partition keys. The additional column is the shard
that specifies where the data should be located. While a tablet and its corresponding
raft group server reside on some shard, all reads and writes to the metadata tables
now use that shard in addition to the group_id.
The extra partition key column is used by the new partitioner and sharder which allow
this special shard routing. The partitioner encodes the shard in the token and the
sharder decodes the shard from the token. This approach for routing avoids any
additional lookups (for the tablet mapping) during operations on the new tables
and it also doesn't require keeping any state. It also doesn't interact negatively
with resharding - as long as tablets (and their corresponding raft metadata) occupy
some shard, we do not allow starting the node with a shard count lower than the
id of this shard. When increasing the shard count, the routing does not change,
similarly to how tablet allocation doesn't change.
To use the new tables, a new implementation of `raft::persistence` is added. Currently,
it's almost an exact copy of the `raft_sys_table_storage` which just uses the new tables,
but in the future we can modify it with changes specific to metadata (or mutation)
storage for strongly consistent tables. The new storage is used in the `groups_manager`,
which combined with the removal of some `this_shard_id() == 0` checks, allows strongly
consistent tables to be used on all shards.
This approach for making sure that reads/writes to the new tables end up on the correct shards
won the complexity/usability/performance trade-off against a few other approaches we've considered.
They include:
1. Making the Raft server read/write directly to the database, skipping the sharder, on its shard, while using
the default partitioner/sharder. This approach could let us avoid changing the schema and there should be
no problems for reads and writes performed by the Raft server. However, in this approach we would place
data in the tables in a way that conflicts with the placement determined by the sharder. As a result, any read going through
the sharder could miss the rows it was supposed to read. Even when reading all shards to find a specific value,
there is a risk of polluting the cache - the rows loaded on incorrect shards may persist in the cache for an unknown
amount of time. The cache may also mistakenly remember that a row is missing, even though it's actually present,
just on an incorrect shard.
Some of the issues with this approach could be worked around using another sharder which always returns
this_shard_id() when asked about a shard. It's not clear how such a sharder would implement a method like
`token_for_next_shard`, and how much simpler it would be compared to the current "identity" sharder.
2. Using a sharder depending on the current allocation of tablets on the node. This approach relies on the
knowledge of group_id -> shard mapping at any point in time in the cluster. For this approach we'd also
need to either add a custom partitioner which encodes the group_id in the token, or we'd need to track the
token(group_id) -> shard mapping. Compared to the approach used in this series, it has the benefit of keeping
the partition key as just group_id. However, it requires more logic and access to the live state of the node
in the sharder, and it's not static - the same token may be sharded differently depending on the state of the
node. That shouldn't occur in practice, but if we changed the state of the node before adjusting the table data,
we would be unable to access/fix the stale data without artificially also changing the state of the node.
3. Using metadata tables co-located with the strongly consistent tables. This approach could simplify the
metadata migrations in the future, however it would require additional schema management of all co-located
metadata tables, and it's not even obvious what could be used as the partition key in these tables - some
metadata is per-raft-group, so we couldn't reuse the partition key of the strongly consistent table for it. And
finding and remembering a partition key that is routed to a specific shard is not a simple task. Finally, splits
and merges will most likely need special handling for metadata anyway, so we wouldn't even make use of
the co-located tables' splits and merges.
Fixes [SCYLLADB-361](https://scylladb.atlassian.net/browse/SCYLLADB-361)
Closes scylladb/scylladb#28509
* github.com:scylladb/scylladb:
docs: add strong consistency doc
test/cluster: add tests for strongly-consistent tables' metadata persistence
raft: enable multi-shard raft groups for strongly consistent tablets
test/raft: add unit tests for raft_groups_storage
raft: add raft_groups_storage persistence class
db: add system tables for strongly consistent tables' raft groups
dht: add fixed_shard_partitioner and fixed_shard_sharder
raft: add group_id -> shard mapping to raft_group_registry
schema: add with_sharder overload accepting static_sharder reference
Introduce maintenance_socket_authenticator and rework
maintenance_socket_role_manager to support role management operations.
Maintenance auth service uses allow_all_authenticator. To allow
role modification statements over the maintenance socket connections,
we need to treat the maintenance socket connections as superusers and
give them proper access rights.
Possible approaches are:
1. Modify allow_all_authenticator with the conditional logic that
password_authenticator already has
2. Modify password_authenticator with conditional logic specific
for the maintenance socket connections
3. Extend password_authenticator, overriding the methods that differ
Option 3 is chosen: maintenance_socket_authenticator extends
password_authenticator with authentication disabled.
The maintenance_socket_role_manager is reworked to lazily create a
standard_role_manager once the node joins the cluster, delegating role
operations to it. In maintenance mode role operations remain disabled.
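A bare-bones sketch of option 3 (the method names are illustrative, not the real authenticator interface):
```
// Hypothetical sketch: reuse password_authenticator's behaviour but skip the
// authentication step for maintenance socket connections.
#include <string>

class password_authenticator {
public:
    virtual ~password_authenticator() = default;
    virtual bool require_authentication() const { return true; }
    virtual bool authenticate(const std::string&, const std::string&) const {
        return false;   // the real implementation checks credentials in system_auth
    }
    // ... role/permission-related methods are reused as-is ...
};

class maintenance_socket_authenticator : public password_authenticator {
public:
    // Connections over the maintenance socket are trusted as superusers,
    // so no credentials are requested or checked.
    bool require_authentication() const override { return false; }
    bool authenticate(const std::string&, const std::string&) const override {
        return true;
    }
};
```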
Refs SCYLLADB-409
The PR removes most of the code that assumes that group0 and raft topology are not enabled. It also makes sure that joining a cluster in no-raft mode, or upgrading a node in a cluster that does not yet use raft topology to this version, will fail.
Refs #15422
No backport needed since this removes functionality.
Closes scylladb/scylladb#28514
* https://github.com/scylladb/scylladb:
group0: fix indentation after previous patch
raft_group0: simplify get_group0_upgrade_state function since no upgrade can happen any more
raft_group0: move service::group0_upgrade_state to use fmt::formatter instead of iostream
raft_group0: remove unused code from raft_group0
node_ops: remove topology over node ops code
topology: fix indentation after the previous patch
topology: drop topology_change_enabled parameter from raft_group0 code
storage_service: remove unused handle_state_* functions
gossiper: drop wait_for_gossip_to_settle and deprecate correspondent option
storage_service: fix indentation after the last patch
storage_service: remove gossiper bootstrapping code
storage_service: drop get_group_server_if_raft_topolgy_enabled
storage_service: drop is_topology_coordinator_enabled and its uses
storage_service: drop run_with_api_lock_in_gossiper_mode_only
topology: remove code that assumes raft_topology_change_enabled() may return false
test: schema_change_test: make test_schema_digest_does_not_change_with_disabled_features tests run in raft mode
test: schema_change_test: drop schema tests relevant for no raft mode only
topology: remove upgrade to raft topology code
group0: remove upgrade to group0 code
group0: refuse to boot if a cluster is still is not in a raft topology mode
storage_service: refuse to join a cluster in legacy mode
Add raft_groups_storage, a raft::persistence implementation for
strongly consistent tablet groups.
Currently, it's almost an exact copy of the raft_sys_table_storage that
uses the new raft tables for strongly consistent tables (raft_groups,
raft_groups_snapshots, raft_groups_snapshot_config) which have
a (shard, group_id) partition key.
In the future, the mutation, term and commit_idx data will be stored
differently for strongly consistent tables than for group0, which
will differentiate this class from the original raft_sys_table_storage.
The storage is created for each raft group server and it takes a shard
parameter at construction time to ensure all queries target the correct
partition (and thus shard).
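A sketch of the core idea (illustrative only - the real class builds mutations against the new tables rather than CQL strings, and the column names below are made up):
```
// Hypothetical sketch: the storage is constructed with the shard it serves and
// keys every query by (shard, group_id), so the data lands on that shard.
#include <cstdint>
#include <string>

struct group_id { int64_t id; };

class raft_groups_storage /* : public raft::persistence */ {
    unsigned _shard;     // fixed at construction time
    group_id _group;
public:
    raft_groups_storage(unsigned shard, group_id gid) : _shard(shard), _group(gid) {}

    std::string store_term_query() const {
        // Illustrative query shape; the partition key is (shard, group_id).
        return "UPDATE raft_groups SET vote_term = ? WHERE shard = " +
               std::to_string(_shard) + " AND group_id = " + std::to_string(_group.id);
    }
};
```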
Add a custom partitioner and sharder that will be used for Raft tables
for strongly consistent tables. These tables will have partition keys
of the form (shard, group_id) and the partitioner creates tokens that
encode the target shard in the high 16 bits.
Token layout:
[shard: 16 bits][partition key hash: 48 bits]
This encoding guarantees that raft group data will be located on the
same shard as the tablet replica corresponding to that raft group, as long as
we use the tablet replica's shard as the value in the partition key.
Storing the shard directly in the partition key avoids additional lookups
for request routing to the incoming new raft tables.
For even more simplicity, we avoid having to bias between uint64_t and int64_t
by limiting the acceptable shard ids to 32767 (leaving the top bit 0),
which results in the same value of the token whether it is interpreted as
uint64_t or int64_t.
The sharder decodes the shard by extracting the high bits, which is
shard-count independent. This allows the partition key:shard mapping
to remain the same even during smp changes (only increases are allowed,
the same limitation as for tablets).
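A small self-contained sketch of the encoding and decoding described above (simplified; the real partitioner hashes the full key and plugs into the dht interfaces):
```
// Token layout: [shard: 16 bits][partition key hash: 48 bits].
// Shard ids are capped at 32767 so the top bit stays 0 and the token has the
// same value whether interpreted as uint64_t or int64_t.
#include <cassert>
#include <cstdint>

constexpr unsigned max_fixed_shard = 32767;

int64_t make_fixed_shard_token(unsigned shard, uint64_t key_hash) {
    assert(shard <= max_fixed_shard);
    uint64_t token = (uint64_t(shard) << 48) | (key_hash & ((uint64_t(1) << 48) - 1));
    return int64_t(token);
}

unsigned shard_of_fixed_shard_token(int64_t token) {
    // Decoding only looks at the high bits, so it is shard-count independent.
    return unsigned(uint64_t(token) >> 48);
}
```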
Remove the hand-rolled error handling from the object storage client
and replace it with the common machinery that supports exception
handling and retrying when appropriate.
This patchset replaces the permissions cache based on loading_cache with a new unified (permissions and roles), full, coherent auth cache.
The reason for the change is that we want to improve behavior under stress and simplify operation manuals. The new cache doesn't require any tweaking, and it behaves particularly better in scenarios with lots of schema entities (e.g. tables) combined with unprepared queries. The old cache can generate a few thousand extra internal TPS due to cache refresh.
A benchmark of unprepared statements (just to populate the cache) with 1000 tables shows a reduction of 3k TPS of internal reads and a 9.1% reduction in median instructions per op. That many tables were used to show the resource impact; the cache could be filled with other resource types to show the same improvement.
Backport: no, it's a new feature.
Fixes https://github.com/scylladb/scylladb/issues/7397
Fixes https://github.com/scylladb/scylladb/issues/3693
Fixes https://github.com/scylladb/scylladb/issues/2589
Fixes https://scylladb.atlassian.net/browse/SCYLLADB-147
Closes scylladb/scylladb#28078
* github.com:scylladb/scylladb:
test: boost: add auth cache tests
auth: add cache size metrics
docs: conf: update permissions cache documentation
auth: remove old permissions cache
auth: use unified cache for permissions
auth: ldap: add permissions reload to unified cache
auth: add permissions cache to auth/cache
auth: add service::revoke_all as main entry point
auth: explicitly life-extend resource in auth_migration_listener
The cache is already covered by the general auth
dtests, but some cases are trickier and easier
to express directly as calls to the cache class.
For such tests a boost test file was added.
This reverts commit bcd1758911, reversing
changes made to b2c2a99741.
There is a design decision not to introduce an additional test
orchestration tool for scylladb.git (see comments for #27499). One
commit has already been reverted in 55c7bc7. The last CI runs made the validator
test flaky, so it is time to remove all remaining validator tests.
It needs a backport to 2026.1 to remove the remaining validator tests from there.
Fixes: VECTOR-497
Closes scylladb/scylladb#28568
Only two tests use it now -- the limit-data-source-test itself and a test
that validates the continuous_data_consumer template.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
We use decoration instead of inheritance, since inheritance already
serves to differentiate statement types (modification_statement has
update_statement and delete_statement as descendants). A better
solution would likely involve refactoring modification_statement and
extracting the mutation-generation logic into a reusable component
shared by both eventual and strongly consistent statements.
Introduce two helper methods that will be used for strongly consistent
select_statement and modification_statement.
redirect_statement() forwards the request to another shard or node.
Currently, only shard forwarding is implemented; node-level proxying
will be added in follow-up PRs.
is_strongly_consistent() will be used in the prepare() method of raw
statements to determine whether a strongly consistent statement should
be created for the given CQL statement.
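A schematic sketch of the decoration (hypothetical shapes, not the real statement classes):
```
// Hypothetical sketch: the strongly consistent statement wraps an existing
// modification_statement, reuses its mutation generation, and submits the
// result through the strongly consistent path instead of the eventual one.
#include <memory>
#include <vector>

struct mutation {};                                     // placeholder

struct modification_statement {                         // stand-in for the real class
    virtual ~modification_statement() = default;
    virtual std::vector<mutation> get_mutations() = 0;  // mutation-generation logic
};

void submit_through_raft(std::vector<mutation>);        // stand-in for the coordinator

class strongly_consistent_modification_statement {
    std::unique_ptr<modification_statement> _inner;     // the decorated statement
public:
    explicit strongly_consistent_modification_statement(
            std::unique_ptr<modification_statement> inner)
        : _inner(std::move(inner)) {}
    void execute() {
        submit_through_raft(_inner->get_mutations());   // same mutations, different path
    }
};
```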
Add the `coordinator` class, which will be responsible for coordinating
reads and writes to strongly consistent tables. This commit includes
only the boilerplate; the methods will be implemented in separate
commits.
These commands will be used by strongly consistent tablets to submit
mutations to Raft. A simple state_machine implementation is introduced
to apply these commands.
We apply commands in batches to reduce commitlog I/O overhead. The
batched variant of database::apply has known atomicity issues. For
example, it does not guarantee atomicity under memory pressure: some
mutations may be published to the memtable while others are blocked in
run_when_memory_available. We will address these issues later.
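As a sketch of the batching (hypothetical names; the real state machine is asynchronous and decodes Raft commands into schema-aware mutations):
```
// Hypothetical sketch: committed Raft commands are decoded into mutations and
// applied with one batched database::apply call per batch to cut commitlog I/O.
#include <string>
#include <vector>

struct mutation {};                                    // placeholder
mutation decode_command(const std::string& command);   // stand-in for command decoding
void apply_to_database(const std::vector<mutation>&);  // stand-in for database::apply

struct tablet_state_machine /* : raft::state_machine */ {
    void apply(const std::vector<std::string>& committed_commands) {
        std::vector<mutation> batch;
        batch.reserve(committed_commands.size());
        for (const auto& cmd : committed_commands) {
            batch.push_back(decode_command(cmd));
        }
        apply_to_database(batch);   // one apply per batch, not per command
    }
};
```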
In preparation for upcoming work on strongly consistent queries in
Scylla, this commit renames the existing `strongly_consistent`
statements to `broadcast_statements` to avoid confusion.
The old code paths are kept temporarily, as they may be useful for
reference or for copying parts during the implementation of the new
strongly consistent statements.
Since the Vector Store service filtering API has been implemented (scylladb/vector-store#334), the Scylla-side part needs to be implemented.
This patch implements parsing of `statement_restrictions` into Vector Store filtering API-compatible JSON objects.
Those objects are added to ANN query vector POST requests as the `filter` object.
After this patch, the happy path for the subset of operations from [Vector Search Filtering Milestone 1](https://scylladb.atlassian.net/wiki/spaces/RND/pages/156729450/Vector+Search+Filtering+Design+Document#Milestone-1) should be complete, allowing users to filter on primary key columns with single-column `=` and `IN` or multi-column `()=()` and `() IN ()`.
The restrictions for other operations should be implemented in a PR on the Vector Store service side.
---
This PR implements parsing of the `statement_restrictions` into Vector Store filtering API-compatible JSON objects.
The JSON objects are created and used in ANN vector queries with filtering.
It closes the Scylla-side implementation of Vector Search filtering milestone 1.
Unit tests for `statement_restrictions` parsing are added. Integration tests will be added in the Vector Store service-side PR.
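Purely for illustration (the actual JSON shape is defined by the Vector Store filtering API, and the real code works on `statement_restrictions`, not this toy struct), a restriction such as `pk IN (1, 2)` could be rendered like this:
```
// Hypothetical sketch: turn a parsed key restriction into a JSON "filter"
// object attached to the ANN query POST request.
#include <cstddef>
#include <string>
#include <vector>

struct key_restriction {
    std::string column;
    std::vector<std::string> values;   // one value for '=', several for 'IN'
};

std::string restriction_to_filter_json(const key_restriction& r) {
    std::string json = "{\"" + r.column + "\": [";
    for (size_t i = 0; i < r.values.size(); ++i) {
        if (i) {
            json += ", ";
        }
        json += r.values[i];
    }
    json += "]}";
    return json;   // e.g. {"pk": [1, 2]}, sent as the "filter" object
}
```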
---
Fixes: SCYLLADB-249
New feature, should land into 2026.1
Closes scylladb/scylladb#28109
* github.com:scylladb/scylladb:
docs: update documentation on filtering with vector queries
test/vector_search: add test for filtered ANN with VS mock
test/vector_search: add restriction to JSON conversion unit tests
vector_search: cql: construct and use filter in ANN vector queries
select_statement: do not require post query ordering for vector queries
vector_search: add `statement_restrictions` to JSON parsing
Problem
-------
Secondary indexes are implemented via materialized views under the
hood. The way an index behaves is determined by the configuration
of the view. Currently, it can be modified by performing the CQL
statement `ALTER MATERIALIZED VIEW` on it. However, that raises some
concerns.
Consider, for instance, the following scenario:
1. The user creates a secondary index on a table.
2. In parallel, the user performs writes to the base table.
3. The user modifies the underlying materialized view, e.g. by setting
the `synchronous_updates` to `true` [1].
Some of the writes that happened before step 3 used the default value
of the property (which is `false`). That had an actual consequence
on what happened later on: the view updates were performed
asynchronously. Only after step 3 had finished did it change.
Unfortunately, as of now, there is no way to avoid a situation like
that. Whenever the user wants to configure a secondary index they're
creating, they need to do it in another schema change. Since it's
not always possible to control how the database is manipulated in
the meantime, it leads to problems like the one described.
That's not all, though. The fact that it's not possible to configure
secondary indexes is inconsistent with other schema entities. When
it comes to tables or materialized views, the user always has a means
to set some or even all of the properties during their creation.
Solution
--------
The solution to this problem is extending the `CREATE INDEX` CQL
statement by view properties. The syntax is of the form:
```
> CREATE INDEX <index name>
> .. ON <keyspace>.<table> (<columns>)
> .. WITH <properties>
```
where `<properties>` corresponds to both index-specific and view
properties [2, 3]. View properties can only be used with indexes
implemented with materialized views; for example, it will be impossible
to create a vector index when specifying any view property (see
examples below).
When a view property is provided, it will be applied when creating the
underlying materialized view. The behavior should be similar to how
other CQL statements responsible for creating schema entities work.
High-level implementation strategy
----------------------------------
1. Make auxiliary changes.
2. Introduce data structures representing the new set of index
properties: both index-specific and those corresponding to the
underlying view.
3. Extend `CREATE INDEX` to accept view properties.
4. Extend `DESCRIBE INDEX` and other `DESCRIBE` statements to include
view properties in their output.
User documentation is also updated at the steps to reflect the
corresponding changes.
Implementation considerations
-----------------------------
There are a number of schema properties that are now obsolete. They're
accepted by other CQL statements, but they have no effect. They
include:
* `index_interval`
* `replicate_on_write`
* `populate_io_cache_on_flush`
* `read_repair_chance`
* `dclocal_read_repair_chance`
If the user tries to create a secondary index specifying any of those
keywords, the statement will fail with an appropriate error (see
examples below).
Unlike materialized views, we forbid specifying the clustering order
when creating a secondary index [4]. This limitation may be lifted
later on, but it's a detail that may or may not prove troublesome. It's
better to postpone covering it to when we have a better perspective on
the consequences it would bring.
Examples
--------
Good examples
```
> CREATE INDEX idx ON ks.t (v);
> CREATE INDEX idx ON ks.t (v) WITH comment = 'ok view property';
> CREATE INDEX idx ON ks.t (v)
.. WITH comment = 'multiple view properties are ok'
.. AND synchronous_updates = true;
> CREATE INDEX idx ON ks.t (v)
.. WITH comment = 'default value ok'
.. AND synchronous_updates = false;
```
Bad examples
```
> CREATE INDEX idx ON ks.t (v) WITH replicate_on_write = true;
SyntaxException: Unknown property 'replicate_on_write'
> CREATE INDEX idx ON ks.t (v)
.. WITH OPTIONS = {'option1': 'value1'}
.. AND comment = 'some text';
InvalidRequest: Error from server: code=2200 [Invalid query]
message="Cannot specify options for a non-CUSTOM index"
> CREATE CUSTOM INDEX idx ON ks.t (v)
.. WITH OPTIONS = {'option1': 'value1'}
.. AND comment = 'some text';
InvalidRequest: Error from server: code=2200 [Invalid query]
message="CUSTOM index requires specifying the index class"
> CREATE CUSTOM INDEX idx ON ks.t (v)
.. USING 'vector_index'
.. WITH OPTIONS = {'option1': 'value1'}
.. AND comment = 'some text';
InvalidRequest: Error from server: code=2200 [Invalid query]
message="You cannot use view properties with a vector index"
> CREATE INDEX idx ON ks.t (v) WITH CLUSTERING ORDER BY (v ASC);
InvalidRequest: Error from server: code=2200 [Invalid query]
message="Indexes do not allow for specifying the clustering order"
```
and so on. For more examples, see the relevant tests.
References:
[1] https://docs.scylladb.com/manual/branch-2025.4/cql/cql-extensions.html#synchronous-materialized-views
[2] https://docs.scylladb.com/manual/branch-2025.4/cql/secondary-indexes.html#create-index
[3] https://docs.scylladb.com/manual/branch-2025.4/cql/mv.html#mv-options
[4] https://docs.scylladb.com/manual/branch-2025.4/cql/dml/select.html#ordering-clause
Fixes scylladb/scylladb#16454
Backport: not needed. This is an enhancement.
Closes scylladb/scylladb#24977
* github.com:scylladb/scylladb:
cql3: Extend DESC INDEX by view properties
cql3: Forbid using CLUSTERING ORDER BY when creating index
cql3: Extend CREATE INDEX by MV properties
cql3/statements/create_index_statement: Allow for view options
cql3/statements/create_index_statement: Rename member
cql3/statements/index_prop_defs: Re-introduce index_prop_defs
cql3/statements/property_definitions: Add extract_property()
cql3/statements/index_prop_defs.cc: Add namespace
cql3/statements/index_prop_defs.hh: Rename type
cql3/statements/view_prop_defs.cc: Move validation logic into file
cql3/statements: Introduce view_prop_defs.{hh,cc}
cql3/statements/create_view_statement.cc: Move validation of ID
schema/schema.hh: Do not include index_prop_defs.hh
Use f-strings instead; they are just as convenient, with the added bonus
of editors providing syntax highlighting for them.
Additionally, this silences the CodeQL complaint about "Suspicious unused
loop iteration variable" in loops where the loop variable was passed to
format indirectly via **locals().
In 12dcf79c60, we avoid the ccache masquerade directory
when choosing sccache, as that would give us a double-caching
effect: first sccache is called, then clang++ is looked up,
finding ccache masquerading as clang++. We solved that by
converting the name clang++ to the absolute path /usr/bin/clang++
(or whatever), skipping over the masquerade directory in $PATH.
It turns out that we need to do the same for ccache. That commit
changed the compile command to 'ccache clang++', and ccache will
look up clang++ in $PATH, finding itself in the masquerade directory.
Fix that by avoiding the masquerade directory if a compiler cache is
specified explicitly or is found with --compiler-cache=auto.
Closes scylladb/scylladb#27996
This pull request introduces HTTP response compression to Alternator, allowing responses (both string and chunked) to be compressed using `gzip` or `deflate` when requested by clients and when the response size exceeds configurable thresholds.
* Added new source files `http_compression.cc` and `http_compression.hh` implementing the compression logic, including parsing the client's `Accept-Encoding` header, selecting a compression algorithm, and compressing response bodies using zlib (see the sketch after this list).
* Added two new configuration options to `db::config` (`alternator_response_gzip_compression_level` and `alternator_response_gzip_compression_threshold_in_bytes`) to control the compression level (level 0 disables compression) and the minimum response size for compression.
* Added tests showing compliance with DynamoDB behavior.
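For reference, a minimal single-shot zlib sketch of such a compression helper (names and structure are illustrative, not the actual http_compression.cc code):
```
// Hypothetical sketch: compress a string response body with zlib, producing
// either the gzip or the zlib ("deflate") wrapper.
#include <zlib.h>
#include <stdexcept>
#include <string>

std::string compress_body(const std::string& body, int level, bool gzip_format) {
    z_stream strm{};
    int window_bits = 15 + (gzip_format ? 16 : 0);   // 15+16 selects the gzip format
    if (deflateInit2(&strm, level, Z_DEFLATED, window_bits, 8, Z_DEFAULT_STRATEGY) != Z_OK) {
        throw std::runtime_error("deflateInit2 failed");
    }
    std::string out;
    out.resize(deflateBound(&strm, body.size()));    // upper bound, so one pass suffices
    strm.next_in = reinterpret_cast<Bytef*>(const_cast<char*>(body.data()));
    strm.avail_in = static_cast<uInt>(body.size());
    strm.next_out = reinterpret_cast<Bytef*>(out.data());
    strm.avail_out = static_cast<uInt>(out.size());
    int rc = deflate(&strm, Z_FINISH);
    deflateEnd(&strm);
    if (rc != Z_STREAM_END) {
        throw std::runtime_error("deflate failed");
    }
    out.resize(strm.total_out);
    return out;
}
```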
Fixes #27246
New feature - no backporting
Closes scylladb/scylladb#27454
* github.com:scylladb/scylladb:
alternator/http_compression: Add compression of streamed response
alternator/http_compression: Add implementation od gzip/deflate of string response
alternator/http_compression: Add handling of Accept-Encoding header
test/alternator: add tests for compressed responses
It should be possible to return the similarity of vectors in CQL statements following the [Cassandra compatible syntax](https://cassandra.apache.org/doc/latest/cassandra/getting-started/vector-search-quickstart.html#query-vector-data-with-cql):
```
SELECT comment, similarity_cosine(comment_vector, [0.1, 0.15, 0.3, 0.12, 0.05])
FROM cycling.comments_vs;
```
Although the calculations are slow, and we already have calculated results returned via the Vector Store API,
we need this functionality as it allows us to calculate the similarity of vectors not stored in vector indexes.
It will be needed for [quantization and rescoring](https://scylladb.atlassian.net/wiki/spaces/RND/pages/195985800/Quantization+and+Rescoring).
The feature is also a nice-to-have in testing, as requested many times by the testing and CX teams.
The optimized version utilizing already calculated distances from the Vector Store, without the need for rescoring, will come soon after via https://github.com/scylladb/scylladb/pull/27991.
---
The patch adds functions:
- `similarity_cosine(<vector>, <vector>)`,
- `similarity_euclidean(<vector>, <vector>)`,
- `similarity_dot_product(<vector>, <vector>)`
Where `<vector>` is either a column of type `VECTOR<FLOAT, N>` or a literal vector of floats.
These functions can be called in every `SELECT` query, not only in ANN vector queries, as opposed to https://github.com/scylladb/scylladb/pull/25993.
The similarity calculations are inspired by [USearch's implementation](
a2f1759910/include/usearch/index_plugins.hpp (L1304-L1385)) and made compatible with [Cassandra's documentation](https://cassandra.apache.org/doc/5.0/cassandra/developing/cql/functions.html#vector-similarity-functions).
This guarantees that the results in ScyllaDB are calculated using the exact same algorithms as used in Vector Store indexes.
---
Fixes: SCYLLADB-88
Fixes: SCYLLADB-89
New feature, should land into 2026.1
Closes scylladb/scylladb#27524
* github.com:scylladb/scylladb:
docs: add vector similarity functions documentation
test/cqlpy: add similarity functions correctness tests
test/cqlpy: add similarity functions invalid call tests
cql3: introduce similarity functions syntax
vector_similarity_fcts: introduce similarity functions
vector_similarity_fcts: retrieve similarity function argument types
vector_similarity_fcts: add calculating similarity between vectors
This is an initial patch to add support for Alternator's compressed responses.
The actual compression (gzip, deflate) will be added in the following commits.
The main functionality added in this commit is parsing of the Accept-Encoding header,
which indicates the compression algorithms supported by the client.
In this commit we also add configuration parameters for response gzip/deflate compression.
They allow enabling/disabling compression, setting the level, and setting a size threshold below which a response is not compressed.
With the current implementation it is possible to decide compression per response, but this is not used yet.
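A simplified sketch of that decision (illustrative; the real parser also honours q-values and matches codings more carefully):
```
// Hypothetical sketch: pick a response encoding based on the Accept-Encoding
// header, the configured compression level and the size threshold.
#include <cstddef>
#include <optional>
#include <string>

enum class content_encoding { gzip, deflate };

std::optional<content_encoding> choose_encoding(const std::string& accept_encoding,
                                                size_t body_size,
                                                size_t threshold,
                                                int level) {
    if (level == 0 || body_size < threshold) {
        return std::nullopt;   // compression disabled, or the body is too small
    }
    if (accept_encoding.find("gzip") != std::string::npos) {
        return content_encoding::gzip;      // prefer gzip when the client accepts it
    }
    if (accept_encoding.find("deflate") != std::string::npos) {
        return content_encoding::deflate;
    }
    return std::nullopt;
}
```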
This commit introduces `compute_cosine_similarity`, `compute_euclidean_similarity`, and
`compute_dot_product_similarity` functions to calculate the similarity of vectors
in the respective metric.
The similarity is a float value in the range [0, 1] indicating how similar the vectors are.
Values closer to 1 indicate greater similarity.
The `dot_product` similarity requires L2-normalized vectors as arguments.
The similarity is calculated based on the jVector implementation used by Cassandra.
f967f1c924/jvector-base/src/main/java/io/github/jbellis/jvector/vector/VectorSimilarityFunction.java (L36-L69)
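For the cosine metric, the calculation boils down to something like the sketch below (assuming equal-length, non-zero vectors; the [-1, 1] cosine is mapped into the [0, 1] similarity range described above):
```
// Hypothetical sketch of compute_cosine_similarity.
#include <cmath>
#include <cstddef>
#include <vector>

float compute_cosine_similarity(const std::vector<float>& a, const std::vector<float>& b) {
    float dot = 0, norm_a = 0, norm_b = 0;
    for (size_t i = 0; i < a.size(); ++i) {
        dot += a[i] * b[i];
        norm_a += a[i] * a[i];
        norm_b += b[i] * b[i];
    }
    float cosine = dot / (std::sqrt(norm_a) * std::sqrt(norm_b));
    return (1.0f + cosine) / 2.0f;   // map [-1, 1] onto [0, 1]
}
```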
Add a `simple_value_with_expiry` utility class, which functions like
a `std::optional` with an added timeout. When emplacing a value, the user
needs to provide a timeout, after which the value expires (in which case
the `simple_value_with_expiry` object behaves as if it was never set
at all).
Add boost tests for the new class.
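A rough sketch of the described semantics (the actual class may differ in API details):
```
// Hypothetical sketch: an optional-like holder whose value is only observable
// until the provided deadline passes.
#include <chrono>
#include <optional>
#include <utility>

template <typename T, typename Clock = std::chrono::steady_clock>
class simple_value_with_expiry {
    std::optional<T> _value;
    typename Clock::time_point _expires_at{};
public:
    template <typename... Args>
    void emplace(typename Clock::time_point expires_at, Args&&... args) {
        _value.emplace(std::forward<Args>(args)...);
        _expires_at = expires_at;   // the caller must provide the deadline
    }
    // Behaves as if never set once the deadline has passed.
    explicit operator bool() const { return _value && Clock::now() < _expires_at; }
    const T& operator*() const { return *_value; }
};
```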
Currently, we support ccache as the compiler cache. Since it is transparent, nothing
much is needed to support it.
This series adds support for sccache[1] and prefers it over ccache when it is installed.
sccache brings the following benefits over ccache:
1. Integrated distributed build support similar to distcc, but with automatic toolchain packaging and a scheduler
2. Rust support
3. C++20 modules (upcoming[2])
It is the C++20 modules support that motivates the series. C++20 modules have the potential to reduce
build times, but without a compiler cache and distributed build support, they come with too large
a penalty. This removes the penalty.
The series detects that sccache is installed, selects it if so (and if not overridden
by a new option), enables it for C++ and Rust, and disables ccache transparent
caching if sccache is selected.
Note: this series doesn't add sccache to the frozen toolchain or add dbuild support. That
is left for later.
[1] https://github.com/mozilla/sccache
[2] https://github.com/mozilla/sccache/pull/2516
Toolchain improvement, won't be backported.
Closes scylladb/scylladb#27834
* github.com:scylladb/scylladb:
build: apply sccache to rust builds too
build: prevent double caching by compiler cache
build: allow selecting compiler cache, including sccache
* seastar 7ec14e83...f0298e40 (8):
> Merge 'coroutine/try_future: call set_current_task() when resuming the coroutine' from Botond Dénes
coroutine/try_future: call set_current_task() when resuming the coroutine
core: move set_current_task() out-of-line
> stop_signal: stop including reactor.hh
> cmake: Mark hwloc headers as system includes to suppress warnings
> build: explicitly enable vptr sanitizer
> httpd: Add API to set tcp keepalive params
> Merge 'Make datagram_channel::send() use temporary_buffer-s' from Pavel Emelyanov
net: Remove no longer used to_iovec() helpers
net,code: Update callers to use new datagram_channel::send()
net: Introduce datagram_channel::send(span<temporary_buffer>) method
posix-stack: Make UDP socket implementation use wrapped_iovec
posix-stack: Introduce wrapped_iovec
> code: Move pollable_fd_state::write_all(const char*) from API level 9
> thread: Remove unused sched_group() helper
configure.py: added -lubsan to DEBUG sanitizer flags
Closes scylladb/scylladb#27511