Because keyspace is part of the query when we
migrate from v1 to v2 query should change otherwise
code would operate on old keyspace if those statics
were initialized.
Likewise keyspace name can no longer be class
field initialized in constructor as it can change
during class lifetime.
Currently we hold group0_guard only during DDL statement's execute()
function, but unfortunately some statements access underlying schema
state also during check_access() and validate() calls which are called
by the query_processor before it calls execute. We need to cover those
calls with group0_guard as well and also move retry loop up. This patch
does it by introducing new function to cql_statement class take_guard().
Schema altering statements return group0 guard while others do not
return any guard. Query processor takes this guard at the beginning of a
statement execution and retries if service::group0_concurrent_modification
is thrown. The guard is passed to the execute in query_state structure.
Fixes: #13942
Message-ID: <ZNsynXayKim2XAFr@scylladb.com>
This reverts commit 70b5360a73. It generates
a failure in group0_test .test_concurrent_group0_modifications in debug
mode with about 4% probability.
Fixes#15050
Currently we hold group0_guard only during DDL statement's execute()
function, but unfortunately some statements access underlying schema
state also during check_access() and validate() calls which are called
by the query_processor before it calls execute. We need to cover those
calls with group0_guard as well and also move retry loop up. This patch
does it by introducing new function to cql_statement class take_guard().
Schema altering statements return group0 guard while others do not
return any guard. Query processor takes this guard at the beginning of a
statement execution and retries if service::group0_concurrent_modification
is thrown. The guard is passed to the execute in query_state structure.
Fixes: #13942
Message-ID: <ZNSWF/cHuvcd+g1t@scylladb.com>
This series aims to allow users to set permissions on user-defined functions.
The implementation is based on Cassandra's documentation and should be fully compatible: https://cassandra.apache.org/doc/latest/cassandra/cql/security.html#cql-permissionsFixes: #5572Fixes: #10633Closes#12869
* github.com:scylladb/scylladb:
cql3: allow UDTs in permissions on UDFs
cql3: add type_parser::parse() method taking user_types_metadata
schema_change_test: stop using non-existent keyspace
cql3: fix parameter names in function resource constructors
cql3: handle complex types as when decoding function permissions
cql3: enforce permissions for ALTER FUNCTION
cql-pytest: add a (failing) test case for UDT in UDF
cql-pytest: add a test case for user-defined aggregate permissions
cql-pytest: add tests for function permissions
cql3: enforce permissions on function calls
selection: add a getter for used functions
abstract_function_selector: expose underlying function
cql3: enforce permissions on DROP FUNCTION
cql3: enforce permissions for CREATE FUNCTION
client_state: add functions for checking function permissions
cql-pytest: add a case for serializing function permissions
cql3: allow specifying function permissions in CQL
auth: add functions_resource to resources
Currently, when preparing an authorization statement on a specific
function, we're trying to "prepare" all cql types that appear in
the function signature while parsing the statement. We cannot
do that for UDTs, because we don't know the UDTs that are present
in the databse at parsing time. As a result, such authorization
statements fail.
To work around this problem, we postpone the "preparation" of cql
types until the actual statement validation and execution time.
Until then, we store all type strings in the resource object.
The "preparation" happens in the `maybe_correct_resource` method,
which is called before every `execute` during a `check_access` call.
At that point, we have access to the `query_processor`, and as a
result, to `user_types_metadata` which allows us to prepare the
argument types even for UDTs.
these warnings are found by Clang-17 after removing
`-Wno-unused-lambda-capture` and '-Wno-unused-variable' from
the list of disabled warnings in `configure.py`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
After fcb8d040 ("treewide: use Software Package Data Exchange
(SPDX) license identifiers"), many dual-licensed files were
left with empty comments on top. Remove them to avoid visual
noise.
Closes#10562
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes we applied mechanically with a script, except to
licenses/README.md.
Closes#9937
This is mostly a sed script that replaces methods' first argument
plus fixes of compiler-generated errors.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Stop using database (and including database.hh) for schema related
purposes and use data_dictionary instead.
data_dictionary::database::real_database() is called from several
places, for these reasons:
- calling yet-to-be-converted code
- callers with a legitimate need to access data (e.g. system_keyspace)
but with the ::database accessor removed from query_processor.
We'll need to find another way to supply system_keyspace with
data access.
- to gain access to the wasm engine for testing whether used
defined functions compile. We'll have to find another way to
do this as well.
The change is a straightforward replacement. One case in
modification_statement had to change a capture, but everything else
was just a search-and-replace.
Some files that lost "database.hh" gained "mutation.hh", which they
previously had access to through "database.hh".
This allows us to forward-declare raw_selector, which in turn reduces
indirect inclusions of expression.hh from 147 to 58, reducing rebuilds
when anything in that area changes.
Includes that were lost due to the change are restored in individual
translation units.
Closes#9434
Currently the statement's execute() method accepts storage
proxy as the first argument. This is enough for all of them
but schema altering ones, because the latter need to call
migration manager's announce.
To provide the migration manager to those who need it it's
needed to have some higher-level service that the proxy. The
query processor seems to be good candidate for it.
Said that -- all the .execute()s now accept the querty
processor instead of the proxy and get the proxy itself from
the query processor.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Change CQL parsing routine to return std::unique_ptr
instead of seastar::shared_ptr.
This can help reduce redundant shared_ptr copies even further.
Make some supplementary changes necessary for this transition:
* Remove enabled_shared_from_this base class from the following
classes: truncate_statement, authorization_statement,
authentication_statement: these were previously constructing
prepared_statement instance in `prepare` method using
`shared_from_this`.
Make `prepare` methods implementation of inheriting classes
mirror implementation from other statements (i.e.
create a shallow copy of the object when prepairing into
`prepared_statement`; this could be further refactored
to avoid copies as much as possible).
* Remove unused fields in create_role_statement which led to
error while using compiler-generated copy ctor (copying
uninitialied bool values via ctor).
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
cql_statement is a class representing a prepared statement in Scylla.
It is used concurrently during execution, so it is important that its
change is not changed by execution.
Add const qualifier to the execution methods family, throghout the
cql hierarchy.
Mark a few places which do mutate prepared statement state during
execution as mutable. While these are not affecting production today,
as code ages, they may become a source of latent bugs and should be
moved out of the prepared state or evaluated at prepare eventually:
cf_property_defs::_compaction_strategy_class
list_permissions_statement::_resource
permission_altering_statement::_resource
property_definitions::_properties
select_statement::_opts
sprint() recently became more strict, throwing on sprint("%s", 5). Replace
with the more modern format().
Mechanically converted with https://github.com/avikivity/unsprint.
So far the only way of returing a result of a CQL query was to build a
result_set. An alternative lazy result generator is going to be
introduced for the simple cases when no transformations at CQL layer are
needed. To do that we need to hide the fact that there are going to be
multiple representations of a cql results from the users.
The storage_proxy represents the entire cluster, so there's never a need
to access it on a remote shard; the local shard instance will contact
remote shard or remote nodes as needed.
Simplify the API by passing storage_proxy references instead of
seastar::sharded<storage_proxy> references. query_processor and
other callers are adjusted to call seastar::sharded::local() first.
Message-Id: <20180415142656.25370-2-avi@scylladb.com>
Instead of some functions in `allow_all_authorizer` throwing exceptions
and others being silently pass-through, we consistently return exception
futures with `auth::unsupported_authorization_operation`. These errors
are converted to `invalid_request_exception` in the CQL error and
ignored where appropriate in the auth subsystem.
auth: Decouple authorization and role management
Access control in Scylla consists of three main modules: authentication,
authorization, and role-management.
Each of these modules is intended to be interchangeable with alternative
implementations. The `auth::service` class composes these modules
together to perform all access-control functionality, including caching.
This architecture implies two main properties of the individual
access-control modules:
- Independence of modules. An implementation of authentication should
have no dependence or knowledge of authorization or role-management,
for example.
- Simplicity of implementing the interface. Functionality that is common
to all implementations should not have to be duplicated in each
implementation. The abstract interface for a module should capture
only the differences between particular implementations.
Previously, the authorization interface depended on an instance of
`auth::service` for certain operations, since it required aggregation
over all the roles granted to a particular role or required checking if
a given role had superuser.
This change decouples authorization entirely from role-management: the
authorizer now manages only permissions granted directly to a role, and
not those inherited through other roles.
When a query needs to be authorized, `auth::service::get_permissions`
first uses the role manager to check if the role has superuser. Then, it
aggregates calls to `auth::authorizer::authorize` for each role granted
to the role (again, from the role-manager) to determine the sum-total
permission set. This information is cached for future queries.
This structure allows for easier error handling and
management (something I hope to improve in the future for both the
authorizer and authenticator interfaces), easier system testing, easier
implementation of the abstract interfaces, and clearer system
boundaries (so the code is easier to grok).
Some authorizers, like the "TransitionalAuthorizer", grant permissions
to anonymous users. Therefore, we could not unconditionally authorize an
empty permission set in `auth::service` for anonymous users. To account
for this, the interface of the authorizer has changed to accept an
optional name in `authorize`.
One additional notable change to the authorizer is the
`auth::authorizer::list`: previously, the filtering happened at the CQL
query layer and depended on the roles granted to the role in question.
I've changed the function to simply query for all roles and I do the
filtering in `auth::system` in-memory with the STL. This was necessary
to allow the authorizer to be decoupled from role-management. This
function is only called for LIST PERMISSIONS (so performance is not a
concern), and it significantly reduces demand on the implementation.
Finally, we unconditionally create a user in `cql_test_env` since
authorization requires its existence.
the value for the `role` column is equal to the value for the `username`
column.
This change makes LIST PERMISSIONS backwards compatible with clients
that expect the `username` column to exist. This functionality also
exists in Apache Cassandra.
This patch replaces duplicated code for checking the existence of a user
with the same mechanism for doing so as elsewhere: by checking for
`auth::nonexistent_role` being thrown during the course of checking
access-control.
This patch also ensures that exceptions thrown while querying the list
of permissions on a resource get handled correctly.
Previously, a "data" auth. resource knew how to check it's own existence by
accessing a global variable.
This patch accomplishes two things: it adds existence checking to all
kinds of resources, and moves these checks outside of `auth::resource`
itself and into `auth::service` (so that global variables are no longer
accessed).
This has the dual benefit of not enforcing copying on implementations of
the abstract interface and also limiting unnecessary copies.
As usual with Seastar, we follow the convention that a reference
parameter to a function is assumed valid for the duration of the
`future` that is returned. `do_with` helps here.
By adding some constants for root resources, we can avoid using
`seastar::do_with` at some call-sites involving `resource` instances.
All authorization checking lives in the CQL layer. The individual
authenticator, authorizer, and role-manager enforce no access-checks.
It may be a good idea to move these checks a level downward in the
future for ease of testing, but for now we aim for consistency.
All we require are value semantics.
`client_state` still stores `authenticated_user` in a `shared_ptr`, but
the behavior of that class is complex enough to warrant its own
discussion/design/refactor.
This is a large change, but it's a necessary evil.
This change brings us to a minimally-functional implementation of roles.
There are many additional changes that are necessary, including refined
grammar, bug fixes, code hygiene, and internal code structure changes.
In the interest of keeping this patch somewhat read-able, those changes
will come in subsequent patches. Until that time, roles are still marked
"unimplemented".
IMPORTANT: This code does not include any mechanism for transitioning a
cluster from user-based access-control to role-based access control. All
existing access-control metadata will be ignored (though not deleted).
Specific changes:
- All user-specific CQL statements now delegate to their roles
equivalent. The statements are effectively the same, but CREATE USER
will include LOGIN automatically. Also, LIST USERS only lists roles
with LOGIN.
- A call to LIST PERMISSIONS will now also list permissions of roles
that have been granted to the caller, in addition to permissions which
have been granted directly.
- Much of the logic of creating, altering, and deleting roles has been
moved to `auth::service`, since these operations require cooperation
between the authenticator, authorizer, and role-manager.
- LIST USERS actually works as expected now (fixes#2968).
This change generalizes the implementation of a `resource` to many
different kinds of resources, though there is still only one
kind (`data`). In the future, we also expect resource kinds for roles,
user-defined functions (UDFs), and possibly on particular REST
end-points.
I considered several approaches to generalizing to different kinds of
resources.
One approach is to have a base class that is inherited from by different
resource kinds. The common functionality would be accessed through
virtual member functions and kind-specific functions would exist in
sub-classes. I rejected this approach because dealing with different
kinds of resources uniformly requires storage and life-time management
through something like `std::unique_ptr<auth::resource>`, which means
that we lose value semantics (including comparison) and must deal with
complications around ownership.
Another option was to use `boost::variant` (or, in future,
`std::variant`). This is closer to what we want, since there a static
set of resource kinds that we support. I rejected this approach for two
reasons. The first is that all resource kinds share the same data (a
list of segments and a root identifier), which would be duplicated in
each type that composed the variant. The second is that the complexity
and source-code overhead of `boost::variant` didn't seem warranted.
The solution I ended up with is home-grown variant. All resources are
described in the same `final` class: `auth::resource`. This class has
value semantics, supports equality comparison, and has a strict
ordering. All resources have in common a tag ("kind") and a list of
parts. Most operations on resources don't care about the kind of
resource (like getting its name, parsing a name, querying for the
parent, etc). These are just member functions of the class.
When we care about a kind-specific interpretation of a resource, we can
produce a "view" of the resource. For example, `data_resource_view`
allows for accessing the (optional) keyspace and table names.
I anticipate in the future to add functions for creating role
resources (`auth::resource::role`) and also `role_resource_view`.
The functional behaviour of the system should be unchanged with this
patch.
I've added new unit tests in `auth_resource_test.cc` and removed the old
test from `auth_test.cc`.
Fixes#3027.
This change appears quite large, but is logically fairly simple.
Previously, the `auth` module was structured around global state in a
number of ways:
- There existed global instances for the authenticator and the
authorizer, which were accessed pervasively throughout the system
through `auth::authenticator::get()` and `auth::authorizer::get()`,
respectively. These instances needed to be initialized before they
could be used with `auth::authenticator::setup(sstring type_name)`
and `auth::authorizer::setup(sstring type_name)`.
- The implementation of the `auth::auth` functions and the authenticator
and authorizer depended on resources accessed globally through
`cql3::get_local_query_processor()` and
`service::get_local_migration_manager()`.
- CQL statements would check for access and manage users through static
functions in `auth::auth`. These functions would access the global
authenticator and authorizer instances and depended on the necessary
systems being started before they were used.
This change eliminates global state from all of these.
The specific changes are:
- Move out `allow_all_authenticator` and `allow_all_authorizer` into
their own files so that they're constructed like any other
authenticator or authorizer.
- Delete `auth.hh` and `auth.cc`. Constants and helper functions useful
for implementing functionality in the `auth` module have moved to
`common.hh`.
- Remove silent global dependency in
`auth::authenticated_user::is_super()` on the auth* service in favour
of a new function `auth::is_super_user()` with an explicit auth*
service argument.
- Remove global authenticator and authorizer instances, as well as the
`setup()` functions.
- Expose dependency on the auth* service in
`auth::authorizer::authorize()` and `auth::authorizer::list()`, which
is necessary to check for superuser status.
- Add an explicit `service::migration_manager` argument to the
authenticators and authorizers so they can announce metadata tables.
- The permissions cache now requires an auth* service reference instead
of just an authorizer since authorizing also requires this.
- The permissions cache configuration can now easily be created from the
DB configuration.
- Move the static functions in `auth::auth` to the new `auth::service`.
Where possible, previously static resources like the `delayed_tasks`
are now members.
- Validating `cql3::user_options` requires an authenticator, which was
previously accessed globally.
- Instances of the auth* service are accessed through `external`
instances of `client_state` instead of globally. This includes several
CQL statements including `alter_user_statement`,
`create_user_statement`, `drop_user_statement`, `grant_statement`,
`list_permissions_statement`, `permissions_altering_statement`, and
`revoke_statement`. For `internal` `client_state`, this is `nullptr`.
- Since the `cql_server` is responsible for instantiating connections
and each connection gets a new `client_state`, the `cql_server` is
instantiated with a reference to the auth* service.
- Similarly, the Thrift server is now also instantiated with a reference
to the auth* service.
- Since the storage service is responsible for instantiating and
starting the sharded servers, it is instantiated with the sharded
auth* service which it threads through. All relevant factory functions
have been updated.
- The storage service is still responsible for starting the auth*
service it has been provided, and shutting it down.
- The `cql_test_env` is now instantiated with an instance of the auth*
service, and can be accessed through a member function.
- All unit tests have been updated and pass.
Fixes#2929.
This change is motivated partly be aesthetics, but more significantly
due to the future work to refactor `auth` into a sharded service. Since
doing so will require writing `auth::auth` from scratch, these
constants (and other common functionality) need a new home.
- introcduced "seastarx.hh" header, which does a "using namespace seastar";
- 'net' namespace conflicts with seastar::net, renamed to 'netw'.
- 'transport' namespace conflicts with seastar::transport, renamed to
cql_transport.
- "logger" global variables now conflict with logger global type, renamed
to xlogger.
- other minor changes