Commit Graph

124 Commits

Author SHA1 Message Date
Petr Gusev
167a3c9c50 cql: fix missing TABLETS_ROUTING_V1 payload after CAS shard bounce
After an internal CAS shard bounce, check_locality() was evaluating
against this_shard_id() of the post-bounce shard — which is the correct
tablet shard — so it returned nullopt, and LWT/SERIAL responses omitted
the tablets-routing-v1 custom payload. The client never learned the
correct tablet map.

Fix by recording the original entry shard in client_state (initialized
to this_shard_id() at construction, preserved across shard bounces via
client_state_for_another_shard) and passing it to check_locality() so
it compares against the client's actual routing decision.

No host_id tracking or forwarded_client_state IDL changes are needed
because CAS shard bounces are always intra-node.

Fixes SCYLLADB-2041
2026-05-15 11:56:14 +02:00
Avi Kivity
0ae22a09d4 LICENSE: Update to version 1.1
Updated terms of non-commercial use (must be a never-customer).
2026-04-12 19:46:33 +03:00
Dawid Mędrek
4a87bdc778 service: client_state: Extend with abort_source
We make `client_state` store a pointer to an `abort_source`. This will
be useful in the following commit that will implement aborting ongoing
requests to strongly consistent tables upon connection shutdowns.
It might also be useful in some other places in the code in the future.

We set the abort source for client states in relevant places.
2026-04-09 11:35:35 +02:00
Piotr Dulikowski
d8b283e1fb Merge 'Add CQL forwarding for strongly consistent tables' from Wojciech Mitros
In this series we add support for forwarding strongly consistent CQL requests to suitable replicas, so that clients can issue reads/writes to any node and have the request executed on an appropriate tablet replica (and, for writes, on the Raft leader). We return the same CQL response as what the user would get while sending the request to the correct replica and we perform the same logging/stats updates on the request coordinator as if the coordinator was the appropriate replica.

The core mechanism of forwarding a strongly consistent request is sending an RPC containing the user's cql request frame to the appropriate replica and returning back a ready, serialized `cql_transport::response`. We do this in the CQL server - it is most prepared for handling these types and forwarding a request containing a CQL frame allows us to reuse near-top-level methods for CQL request handling in the new RPC handler (such as the general `process`)

For sending the RPC, the CQL server needs to obtain the information about who should it forward the request to. This requires knowledge about the tablet raft group members and leader. We obtain this information during the execution of a `cql3/strong_consistency` statement, and we return this information back to the CQL server using the generalized `bounce_to_shard` `response_message`, where we now store the information about either a shard, or a specific replica to which we should forward to. Similarly to `bounce_to_shard`, we need to handle this `result_message` in a loop - a replica may move during statement execution, or the Raft leader can change. We also use it for forwarding strongly consistent writes when we're not a member of the affected tablet raft group - in that case we need to forward the statement twice - once to any replica of the affected tablet, then that replica can find the leader and return this information to the coordinator, which allows the second request to be directed to the leader.

This feature also allows passing through exception messages which happened on the target replica while executing the statement. For that, many methods of the `cql_transport::cql_server::connection` for creating error responses needed to be moved to `cql_transport::cql_server`. And for final exception handling on the coordinator, we added additional error info to the RPC response, so that the handling can be performed without having the `result_message::exception` or `exception_ptr` itself.

Fixes [SCYLLADB-71](https://scylladb.atlassian.net/browse/SCYLLADB-71)

[SCYLLADB-71]: https://scylladb.atlassian.net/browse/SCYLLADB-71?atlOrigin=eyJpIjoiNWRkNTljNzYxNjVmNDY3MDlhMDU5Y2ZhYzA5YTRkZjUiLCJwIjoiZ2l0aHViLWNvbS1KU1cifQ

Closes scylladb/scylladb#27517

* github.com:scylladb/scylladb:
  test: add tests for CQL forwarding
  transport: enable CQL forwarding for strong consistency statements
  transport: add remote statement preparation for CQL forwarding
  transport: handle redirect responses in CQL forwarding
  transport: add exception handling for forwarded CQL requests
  transport: add basic CQL request forwarding
  idl: add a representation of client_state for forwarding
  cql_server: handle query, execute, batch in one case
  transport: inline process_on_shard in cql_server::process
  transport: extract process() to cql_server
  transport: add messaging_service to cql_server
  transport: add response reconstruction helpers for forwarding
  transport: generalize the bounce result message for bouncing to other nodes
  strong consistency: redirect requests to live replicas from the same rack
  transport: pass foreign_ptr into sleep_until_timeout_passes and move it to cql_server
  transport: extract the error handling from process_request_one
  transport: move error response helpers from connection to cql_server
2026-03-13 15:03:10 +01:00
Wojciech Mitros
170b82ddca idl: add a representation of client_state for forwarding
In the following patches, when we start allowing to forward CQL
requests to other nodes, we'll need to use the same client state
for executing the request on the destination node as we had on the
source. client_state contains many fields and we need to create
a new instance of it when we start handling the forwarded request,
so to prepare for the forwarding RPC, we add a serializable format
of the client_state as an IDL struct. The new class is missing some
fields that are not used while executing requests, and some whose
value is determined by the fact that the client state is used for
a forwarded request.
These include:
- driver name, driver version, client options - not used for executing
requests. Instead, we use these as data sources for the virtual
"clients" system table.
- auth_state - must be READY - we reached a bounce message, so we were
able to try executing the request locally
- _control_connection - used for altering a cql_server::connection, which
we don't have on the target node
- _default_timeout_config - used when updating service levels, also only
per-connection
- workload_type - used for deciding whether to allow shedding at the
start of processing the request, and for getting per-connection service
level params (for an API)
2026-03-12 17:48:58 +01:00
Gleb Natapov
c67f876893 service level: make maybe_update_per_service_level_params synchronous
It does not call async functions any more.
2026-03-12 15:53:08 +02:00
Dario Mirovic
6a1edab2ac client_state: add has_superuser method
Encapsulate the superuser check in client_state so that it
respects _bypass_auth_checks. Connections that bypass auth
(internal callers and the maintenance socket) are always
considered superusers.

Migrate existing call sites from auth::has_superuser(service, user)
to client_state.has_superuser(). Also add _bypass_auth_checks
handling to ensure_not_anonymous().

Refs SCYLLADB-409
2026-03-03 22:31:35 +01:00
Dario Mirovic
d765b5b309 client_state: add _bypass_auth_checks flag
Authorization checks were previously skipped based on the
_is_internal flag. This couples two concerns: marking client
state as internal and bypassing authorization.

Introduce _bypass_auth_checks to handle only the authorization
bypass. Internal client state sets it to true, preserving current
behavior. External client state accepts it as a constructor
parameter, defaulting to false.

This will allow maintenance socket connections to skip
authorization without being marked as internal.

Refs SCYLLADB-409
2026-03-03 22:31:35 +01:00
Marcin Maliszkiewicz
931a38de6e service: remove unused has_schema_access
It became unused after we dropped support for thrift
in ad649be1bf

Closes scylladb/scylladb#28341
2026-01-28 10:18:26 +02:00
Vlad Zolotarov
28cbaef110 service/client_state and alternator/server: use cached values for driver_name and driver_version fields
Optimize memory usage changing types of driver_name and driver_version be
a reference to a cached value instead of an sstring.

These fields very often have the same value among different connections hence
it makes sense to cache these values and use references to them instead of duplicating
such strings in each connection state.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2025-12-20 12:26:22 -05:00
Vlad Zolotarov
85adf6bdb1 system.clients: add a client_options column
This new column is going to contain all OPTIONS sent in the
STARTUP frame of the corresponding CQL session.

The new column has a `frozen<map<text, text>>` type, and
we are also optimizing the amount of required memory for storing
corresponding keys and values by caching them on each shard level.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2025-12-20 12:26:15 -05:00
Nadav Har'El
eb06ace944 Merge 'auth: implement vector store authorization' from Michał Hudobski
This patch implements the changes required by the Vector Store authorization, as described in https://scylladb.atlassian.net/wiki/spaces/RND/pages/107085899/Vector+Store+Authentication+And+Authorization+To+ScyllaDB, that is:

- adding a new permission VECTOR_SEARCH_INDEXING, grantable only on ALL KEYSPACES
- allowing users with that permission to perform SELECT queries, but only on tables with a vector index
- increasing the number of scheduling groups by one to allow users to create a service level for a vector store user
- adjusting the tests and documentation

These changes are needed, as the vector indexes are managed by the external service, Vector Store, which needs to read the tables to create the indexes in its memory. We would like to limit the privileges of that service to a minimum to maintain the principle of least privilege, therefore a new permission, one that allows the SELECTs conditional on the existence of a vector_index on the table.

Fixes: VECTOR-201

Backport reasoning:
Backport to 2025.4 required as this can make upgrading clusters more difficult if we add it in 2026.1. As for now Scylla Cloud requires version 2025.4 to enable vector search and permission is set by orchestrator so there is no chance that someone will try to add this permission during upgrade. In 2026.1 it will be more difficult.

Closes scylladb/scylladb#25976

* github.com:scylladb/scylladb:
  docs: adjust docs for VS auth changes
  test: add tests for VECTOR_SEARCH_INDEXING permission
  cql: allow VECTOR_SEARCH_INDEXING users to select
  auth: add possibilty to check for any permission in set
  auth: add a new permission VECTOR_SEARCH_INDEXING
2025-10-20 17:32:00 +03:00
Andrzej Jackowski
f99b8c4a55 transport: use sl:driver to handle driver's control connections
Before `sl:driver` was introduced, service levels were assigned as
follows:
 1. New connections were processed in `main`.
 2. After user authentication was completed, the connection's SL was
    changed to the user's SL (or `sl:default` if the user had no SL).

This commit introduces `service_level_state` to `client_state` and
implements the following logic in `transport/server`:

 1. If `sl:driver` is not present in the system (for example, it was
    removed), service levels behave as described above.
 2. If `sl:driver` is present, the flow is:
   I.   New connections use `sl:driver`.
   II.  After user authentication is completed, the connection's SL is
        changed to the user's SL (or `sl:default`).
   III. If a REGISTER (to events) request is handled, the client is
        processing the control connection. We mark the client_state
        to permanently use `sl:driver`.

The aforementioned state `2.III` is represented by
`_control_connection` flag in `client_state`.

Fixes: scylladb/scylladb#24411
2025-10-08 08:25:28 +02:00
Michał Hudobski
6a69bd770a cql: allow VECTOR_SEARCH_INDEXING users to select
This patch allows users with the VECTOR_SEARCH_INDEXING permission
to perform SELECT queries on tables that have a vector index.
This is needed for the Vector Store service, which
reads the vector-indexed tables, but does not require
the full SELECT permission.
2025-10-03 16:55:57 +02:00
Michał Hudobski
3025a35aa6 auth: add possibilty to check for any permission in set
This commit adds a new version of command_desc struct
that contains a set of permissions instead of a singular
permission. When this struct is passed to ensure/check_has_permission,
we check if the user has any of the included permission on the resource.
2025-10-03 16:55:57 +02:00
Ernest Zaslavsky
5ba5aec1f8 treewide: Move mutation related files to a mutation directory
As requested in #22104, moved the files and fixed other includes and build system.

Moved files:
 - combine.hh
 - collection_mutation.hh
 - collection_mutation.cc
 - converting_mutation_partition_applier.hh
 - converting_mutation_partition_applier.cc
 - counters.hh
 - counters.cc
 - timestamp.hh

Fixes: #22104

This is a cleanup, no need to backport

Closes scylladb/scylladb#25085
2025-09-24 13:23:38 +03:00
Avi Kivity
1258e7c165 Revert "Merge 'transport: service_level_controller: create and use driver service level' from Andrzej Jackowski"
This reverts commit fe7e63f109, reversing
changes made to b5f3f2f4c5. It is causing
test.py failures around cqlpy.

Fixes #26163

Closes scylladb/scylladb#26174
2025-09-22 09:32:46 +03:00
Andrzej Jackowski
c02535635e transport: use sl:driver to handle driver's control connections
Before `sl:driver` was introduced, service levels were assigned as
follows:
 1. New connections were processed in `main`.
 2. After user authentication was completed, the connection's SL was
    changed to the user's SL (or `sl:default` if the user had no SL).

This commit introduces `service_level_state` to `client_state` and
implements the following logic in `transport/server`:

 1. If `sl:driver` is not present in the system (for example, it was
    removed), service levels behave as described above.
 2. If `sl:driver` is present, the flow is:
   I.   New connections use `sl:driver`.
   II.  After user authentication is completed, the connection's SL is
        changed to the user's SL (or `sl:default`).
   III. If a REGISTER (to events) request is handled, the client is
        processing the control connection. We mark the client_state
        to permanently use `sl:driver`.

The aforementioned state `2.III` is represented by
`_control_connection` flag in `client_state`.

Fixes: scylladb/scylladb#24411
2025-09-18 09:29:37 +02:00
Petr Gusev
78aa36b257 check_internal_table_permissions: handle Paxos state tables
CDC and $paxos tables are managed internally by Scylla. Users are
already prohibited from running ALTER and DROP commands on CDC tables.
In this commit, we extend the same restrictions to $paxos tables to
prevent users from shooting themselves in the foot.

Other commands are generally allowed for CDC and $paxos tables. An
important distinction is that CDC tables are meant to be accessed
directly by users, so appropriate permissions must be set for
non-superusers. In contrast, $paxos tables are not intended for direct
access by users. Therefore, this commit explicitly disallows
non-superusers from accessing them. Superusers are still allowed
access for debugging and troubleshooting purposes.

Note that these restrictions apply even if explicit permissions have
been granted. For example, a non-superuser may be granted SELECT
permissions on a $paxos table, but the restriction above will
still take precedence. We don't try to restrict users
from giving permissions to $paxos tables for simplicity.
2025-07-24 19:48:08 +02:00
Petr Gusev
ec3c5f4cbc client_state: extract check_internal_table_permissions
This is a refactoring commit — it extracts the CDC permissions handling
logic into a separate function: check_internal_table_permissions.

This is a preparatory step for the next commit, where we'll handle
paxos state tables similarly to CDC tables.
2025-07-24 19:48:08 +02:00
Avi Kivity
f3eade2f62 treewide: relicense to ScyllaDB-Source-Available-1.0
Drop the AGPL license in favor of a source-available license.
See the blog post [1] for details.

[1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/
2024-12-18 17:45:13 +02:00
Michał Jadwiszczak
087bbdc4c8 service/client_state: add synchronous method to update service level params
Similarly to `maybe_update_per_service_level_params`, the method update
connection's params but it gets `service_level_options` as an argument
instead of asking `service_level_controller`.
2024-12-03 10:50:02 +01:00
Kefu Chai
ad649be1bf treewide: drop thrift support
thrift support was deprecated since ScyllaDB 5.2

> Thrift API - legacy ScyllaDB (and Apache Cassandra) API is
> deprecated and will be removed in followup release. Thrift has
> been disabled by default.

so let's drop it. in this change,

* thrift protocol support is dropped
* all references to thrift support in document are dropped
* the "thrift_version" column in system.local table is
  preserved for backward compatibility, as we could load
  from an existing system.local table which still contains
  this clolumn, so we need to write this column as well.
* "/storage_service/rpc_server" is only preserved for
  backward compatibility with java-based nodetool.
* `rpc_port` and `start_rpc` options are preserved, but
  they are marked as "Unused". so that the new release
  of scylladb can consume existing scylla.yaml configurations
  which might contain these settings. by making them
  deprecated, user will be able get warned, and update
  their configurations before we actually remove them
  in the next major release.

Fixes #3811
Fixes #18416
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2024-06-07 06:44:59 +08:00
Marcin Maliszkiewicz
6f654675c6 auth: add a way to announce mutations having only client_state ref
Statements code have only access to client_state from
which it takes auth::service. It doesn't have abort_source
nor group0_client so we need to add them to auth::service.

Additionally since abort_source can't be const the whole
announce_mutations method needs non const auth::service
so we need to remove const from the getter function.
2024-06-04 15:43:04 +02:00
Marcin Maliszkiewicz
5607aa590e service: don't loose service_level_controller when bouncing client_state
When bounce_to_shard happens we need to fill client_state with
sl_controller appropriate for destination shard.

Before the patch sl_controller was set to null after the bounce.
It was fine becauase looks like it was never used in such scenario.
With auth-v2 we need to bounce attach/detach service level statements
because they modify things via auth subsystem which needs to be called
on shard 0.
2024-03-01 16:25:14 +01:00
Kefu Chai
ece2bd2f6e service: do not include unused headers
these unused includes were identified by clangd. see
https://clangd.llvm.org/guides/include-cleaner#unused-include-warning
for more details on the "Unused include" warning.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes scylladb/scylladb#16764
2024-01-15 13:29:33 +02:00
Yaniv Kaul
c658bdb150 Typos: fix typos in comments
Fixes some typos as found by codespell run on the code.
In this commit, I was hoping to fix only comments, not user-visible alerts, output, etc.
Follow-up commits will take care of them.

Refs: https://github.com/scylladb/scylladb/issues/16255
Signed-off-by: Yaniv Kaul <yaniv.kaul@scylladb.com>
2023-12-02 22:37:22 +02:00
Gleb Natapov
4bad482e4b cql3: move validation::validate_column_family from client_state::has_column_family_access
Checking keyspace/table presence should not be part of authorization code
and it is not done consistently today.  For instance keyspace presence
is not checked in "alter keyspace" during authorization, but during
statement execution. Make it consistent.
2023-06-22 13:57:36 +03:00
Gleb Natapov
31bddb65c7 client_state: drop unneeded argument from has.*access functions
After previous patch we can drop db argument to most of has.*access
functions in the client_state.
2023-06-22 13:57:36 +03:00
Kefu Chai
98b9cbbc92 client_state: split the param list of ctor into multi lines
it is 215-chars long, so let's breaks it into multiple lines for
better readability.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
2023-03-29 20:17:45 +08:00
Piotr Sarna
d10799a834 client_state: add functions for checking function permissions
The helper functions will be later used to enforce permissions
for user-defined functions.
2023-03-09 17:50:56 +01:00
Petr Gusev
bd80a449d5 transport server: log client errors with debug level
Ideally, these errors should be transparently delivered
to the client, but in practice, due to various
flaws/bugs in scylla and/or the driver,
they can be lost, which enormously complicates troubleshooting.

const socket_address& get_remote_address() is needed for its
convenient conversion to string, which includes ip and port.
2023-02-07 13:53:38 +04:00
Piotr Sarna
66f7ab666f client_state: add internal constructor with auth_service
The constructor can be used as backdoor from frontends other than
CQL to create a session with an authenticated user, with access
to its attached service level information.
2022-09-05 10:03:00 +02:00
Avi Kivity
5937b1fa23 treewide: remove empty comments in top-of-files
After fcb8d040 ("treewide: use Software Package Data Exchange
(SPDX) license identifiers"), many dual-licensed files were
left with empty comments on top. Remove them to avoid visual
noise.

Closes #10562
2022-05-13 07:11:58 +02:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Pavel Emelyanov
71c3a7525b client_state: Make has_access use data_dictionary::database
This db argument is only needed to be pushed into
cdc::is_log_for_some_table() helper. All callers already have
the d._d.::database at hands and convert it into .real_database()
call-time, so this patch effectively generalizes those calls to
the .real_database().

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-01-14 12:59:35 +03:00
Pavel Emelyanov
f22eb22b8b client_state: Make has_schema_access use data_dictionary::database
It's now called with d._d.::database converted to .real_database()
right in the argument passing, so this change can be treated as
the generalization of that .real_database() call.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-01-14 12:55:53 +03:00
Pavel Emelyanov
b6bc7a9b29 client_state: Make has_column_family_access use data_dictionary::database
Straightforward replacement. Internals of the has_column_family_access()
temporarily get .real_database(), but it will be changed soon.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-01-14 12:55:15 +03:00
Pavel Emelyanov
1ed237120a client_state: Make has_keyspace_access use data_dictionary::database
Straightforward replacement. Internals of the has_keyspace_access()
temporarily get .real_database(), but it will be changed soon.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2022-01-14 12:54:01 +03:00
Gleb Natapov
bcfdcc51d6 thrift: authenticate a statement before verifying in system_update_column_family()
Otherwise it is possible to infer if a table exist without having proper
credentials.
2022-01-12 16:33:16 +02:00
Avi Kivity
bbad8f4677 replica: move ::database, ::keyspace, and ::table to replica namespace
Move replica-oriented classes to the replica namespace. The main
classes moved are ::database, ::keyspace, and ::table, but a few
ancillary classes are also moved. There are certainly classes that
should be moved but aren't (like distributed_loader) but we have
to start somewhere.

References are adjusted treewide. In many cases, it is obvious that
a call site should not access the replica (but the data_dictionary
instead), but that is left for separate work.

scylla-gdb.py is adjusted to look for both the new and old names.
2022-01-07 12:04:38 +02:00
Avi Kivity
ae3a360725 database: Move database, keyspace, table classes to replica/ directory
The database, keyspace, and table classes represent the replica-only
part of the objects after which they are named. Reading from a table
doesn't give you the full data, just the replica's view, and it is not
consistent since reconciliation is applied on the coordinator.

As a first step in acknowledging this, move the related files to
a replica/ subdirectory.
2022-01-06 17:07:30 +02:00
Pavel Emelyanov
2701a1ee28 client_state: Pass database into has_access()
All callers of it already have it, so just pass it along

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-08-27 14:07:26 +03:00
Pavel Emelyanov
de7761985c client_state: Add database argument to has_schema_access
The only caller is thrift that has database reference on board

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-08-27 14:07:26 +03:00
Pavel Emelyanov
36a4c1ddc1 client_state: Add database argument to has_keyspace_access()
Callers are cql3, that has database via proxy, and thrift that
has one by reference.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-08-27 14:07:18 +03:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Piotr Sarna
409c67b1b4 client_state: hook workload type from service levels
The client state is now aware of its workload type derived
from its attached service level.
2021-05-27 13:02:22 +02:00
Piotr Sarna
7ee5686d6c client_state: allow updating per service level params
Per-service-level params can now be updated with a helper function.
2021-05-10 12:39:41 +02:00
Piotr Sarna
e257ec11c0 treewide: remove service level controller from query state
... since it's accessible through its member, client state.
2021-05-10 11:48:14 +02:00
Piotr Sarna
d1f2e8b469 treewide: propagate service level to client state
... since it's going to be used to set up per-service-level
timeouts.
2021-05-10 11:48:14 +02:00