Materialized Views and Secondary Indexes are yet another features that
keyspaces with tablets do not support, but these were not listed in a
warning message returned to the user on CREATE KEYSPACE statement. This
commit adds the 2 missing features.
Fixes: #24006Closesscylladb/scylladb#23902
(cherry picked from commit f740f9f0e1)
Closesscylladb/scylladb#24084
All tablets configuration was moved into its own "with tablets" section,
this option name cannot be met among replication factors.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#23555
In this commit, we refuse to create or alter a keyspace when that operation
would make it RF-rack-invalid if the option `rf_rack_valid_keyspaces` is
enabled.
We provide two tests verifying that the changes work as intended.
Fixesscylladb/scylladb#23276
Use the keyspace initial_tablets for min_tablet_count, if the latter
isn't set, then take the maximum of the option-based tablet counts:
- min_tablet_count
- and expected_data_size_in_gb / target_tablet_size
- min_per_shard_tablet_count (via
calculate_initial_tablets_from_topology)
If none of the hints produce a positive tablet_count,
fall back to calculate_initial_tablets_from_topology * initial_scale.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Per-table hints should be used instead.
Note: the warning is produced by check_against_restricted_replication_strategies
which is called also from alter_keyspace_statement.
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Integrates audit functionality into CQL statement processing to enable tracking of database operations. Key changes:
- Add audit_info and statement_category to all CQL statements
- Implement audit categories for different statement types:
- DDL: Schema altering statements (CREATE/ALTER/DROP)
- DML: Data manipulation (INSERT/UPDATE/DELETE/TRUNCATE/USE)
- DCL: Access control (GRANT/REVOKE/CREATE ROLE)
- QUERY: SELECT statements
- ADMIN: Service level operations
- Add audit inspection points in query processing:
- Before statement execution
- After access checks
- After statement completion
- On execution failures
- Add password sanitization for role management statements
- Mask plaintext passwords in audit logs
- Handle both direct password parameters and options maps
- Preserve query structure while hiding sensitive data
- Modify prepared statement lifecycle to carry audit context
- Pass audit info during statement preparation
- Track audit info through statement execution
- Support batch statement auditing
This change enables comprehensive auditing of CQL operations while ensuring sensitive data is properly masked in audit logs.
With the tablets feature always enabled (Unless gossip toopology
changes are forced), the enable_tablets option now controls only
the default for newly created keyspaces.
Even when set to `false`, tablets are still enabled as a
feature and the user may explicitly enable tablets
using `CREATE KEYSPACE <name> WITH tablets = {'enabled': true}`
Note: best viewed with `git show -w`
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
This reverts commit c286434e4c, reversing
changes made to 6712fcc316.
The commit causes memtable_test to be very flaky in debug mode.
Specifically, subtests test_exceptions_in_flush_on_sstable_open
and test_exceptions_in_flush_on_sstable_write).
With the tablets feature always enabled (Unless gossip toopology
changes are forced), the enable_tablets option now controls only
the default for newly created keyspaces.
Even when set to `false`, tablets are still enabled as a
feature and the user may explicitly enable tablets
using `CREATE KEYSPACE <name> WITH tablets = {'enabled': true}`
Note: best viewed with `git show -w`
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
In case multiple clients issue concurrently CREATE KEYSPACE IF NOT EXISTS
and later USE KEYSPACE it can happen that schema in driver's session is
out of sync because it synces when it receives special message from
CREATE KEYSPACE response.
Similar situation occurs with other schema change statements.
In this patch we fix only create keyspace/table/type/view statements
by always sending created event. Behavior of any other schema altering
statements remains unchanged.
This is done to achieve single transaction semantics.
The change includes auto-grant feature. In particular
for schema related auto-grant we don't use normal
mutation collector announce path but follow migration manager,
this may be unified in the future.
The CDC feature is not supported on a table that uses tablets
(Refs #16317), so if a user creates a keyspace with tablets enabled
they may be surprised later (perhaps much later) when they try to enable
CDC on the table and can't.
The LWT feature always had issue Refs #5251, but it has become potentially
more common with tablets.
So it was proposed that as long as we have missing features (like CDC or
LWT), every time a keyspace is created with tablets it should output a
warning (a bona-fide CQL warning, not a log message) that some features
are missing, and if you need them you should consider re-creating the
keyspace without tablets.
This patch does this. It was surprisingly hard and ugly to find a place
in the code that can check the tablet-ness of a keyspace while it is
still being created, but I think I found a reasonable solution.
The warning text in this patch is the following (obviously, it can
be improved later, as we perhaps find more missing features):
"Tables in this keyspace will be replicated using tablets, and will
not support the CDC feature (issue #16317) and LWT may suffer from
issue #5251 more often. If you want to use CDC or LWT, please drop
this keyspace and re-create it without tablets, by adding AND TABLETS
= {'enabled': false} to the CREATE KEYSPACE statement."
This patch also includes a test - that checks that this warning is is
indeed generated when a keyspace is created with tablets (either by default
or explicitly), and not generated if the keyspace is created without
tablets.
Obviously, this entire patch - the warning and its test - can be reverted
as soon as we support CDC (and all other features) on tablets.
Fixes#16807
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Now all the callers have it at hands (spoiler: not yet initialized, but
still) so the params can also have it.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Previous patch added params to r.s. classes' constructors, but callers
don't construct those directly, instead they use the create_r.s.()
wrapper. This patch adds params to the wrapper too.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
When checking replication strategy options the code assumes (and it's
stated in the preceeding code comment) that all options are replication
factors. Nowadays it's no longer so, the initial_tablets one is not
replication factor and should be skipped
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#16335
Currently the cql statement .validate() callback is responsible for
checking if the non-local storage options are allowed with the
respective feature. Next patch will need to extend this check to also
validate the details of the provided storage options, but doing it at
cql level doesn't seem correct -- it's "too far" from query processor
down to sstables manager.
Good news is that there's a lower-level validation of the new keyspace,
namely the database::validate_new_keyspace() call. Move the storage
options validation into sstables manager, while at it, reimplement it
as a visitor to facilitate further extentions and plug the new
validation to the aforementioned database::validate_new_keyspace().
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Replacing `restrict_replication_simplestrategy` config option with
2 config options: `replication_strategy_{warn,fail}_list`, which
allow us to impose soft limits (issue a warning) and hard limits (not
execute CQL) on replication strategy when creating/altering a keyspace.
The reason to rather replace than extend `restrict_replication_simplestrategy` config
option is that it was not used and we wanted to generalize it.
Only soft guardrail is enabled by default and it is set to SimpleStrategy,
which means that we'll generate a CQL warning whenever replication strategy
is set to SimpleStrategy. For new cloud deployments we'll move
SimpleStrategy from warn to the fail list.
Guardrails violations will be tracked by metrics.
Resolves#5224
Refs #8892 (the replication strategy part, not the RF part)
Closesscylladb/scylladb#15399
Replacing `minimum_keyspace_rf` config option with 4 config options:
`{minimum,maximum}_replication_factor_{warn,fail}_threshold`, which
allow us to impose soft limits (issue a warning) and hard limits (not
execute CQL) on RF when creating/altering a keyspace.
The reason to rather replace than extend `minimum_keyspace_rf` config
option is to be aligned with Cassandra, which did the same, and has the
same parameters' names.
Only min soft limit is enabled by default and it is set to 3, which means
that we'll generate a CQL warning whenever RF is set to either 1 or 2.
RF's value of 0 is always allowed and means that there will not be any
replicas on a given DC. This was agreed with PM.
Because we don't allow to change guardrails' values when scylla is
running (per PM), there're no tests provided with this PR, and dtests will be
provided separately.
Exceeding guardrails' thresholds will be tracked by metrics.
Resolves#8619
Refs #8892 (the RF part, not the replication-strategy part)
Closes#14262
Currently we hold group0_guard only during DDL statement's execute()
function, but unfortunately some statements access underlying schema
state also during check_access() and validate() calls which are called
by the query_processor before it calls execute. We need to cover those
calls with group0_guard as well and also move retry loop up. This patch
does it by introducing new function to cql_statement class take_guard().
Schema altering statements return group0 guard while others do not
return any guard. Query processor takes this guard at the beginning of a
statement execution and retries if service::group0_concurrent_modification
is thrown. The guard is passed to the execute in query_state structure.
Fixes: #13942
Message-ID: <ZNsynXayKim2XAFr@scylladb.com>
This reverts commit 70b5360a73. It generates
a failure in group0_test .test_concurrent_group0_modifications in debug
mode with about 4% probability.
Fixes#15050
Currently we hold group0_guard only during DDL statement's execute()
function, but unfortunately some statements access underlying schema
state also during check_access() and validate() calls which are called
by the query_processor before it calls execute. We need to cover those
calls with group0_guard as well and also move retry loop up. This patch
does it by introducing new function to cql_statement class take_guard().
Schema altering statements return group0 guard while others do not
return any guard. Query processor takes this guard at the beginning of a
statement execution and retries if service::group0_concurrent_modification
is thrown. The guard is passed to the execute in query_state structure.
Fixes: #13942
Message-ID: <ZNSWF/cHuvcd+g1t@scylladb.com>
After changing the prepare_ methods of migration_manager to
functions, the migration_manager& parameter of
schema_altering_statement::prepare_schema_mutations has been
unused by all classes inheriting from schema_altering_statement.
The migration_manager service is responsible for schema convergence
in the cluster - pushing schema changes to other nodes and pulling
schema when a version mismatch is observed. However, there is also
a part of migration_manager that doesn't really belong there -
creating mutations for schema updates. These are the functions with
prepare_ prefix. They don't modify any state and don't exchange any
messages. They only need to read the local database.
We take these functions out of migration_manager and make them
separate functions to reduce the dependency of other modules
(especially query_processor and CQL statements) on
migration_manager. Since all of these functions only need access
to storage_proxy (or even only replica::database), doing such a
refactor is not complicated. We just have to add one parameter,
either storage_proxy or database and both of them are easily
accessible in the places where these functions are called.
We want to stop relying on `qp.get_migration_manager()`, so we can make
the function private in the future. This in turn is a prerequisite for
splitting `query_processor` initialization into two phases, where the
first phase will only allow local queries (and won't require
`migration_manager`).
statement_restrictions: forbid IS NOT NULL on columns outside the primary key
IS NOT NULL is currently allowed only when creating materialized views.
It's used to convey that the view will not include any rows that would make the view's primary key columns NULL.
Generally materialized views allow to place restrictions on the primary key columns, but restrictions on the regular columns are forbidden. The exception was IS NOT NULL - it was allowed to write regular_col IS NOT NULL. The problem is that this restriction isn't respected, it's just silently ignored (see #10365).
Supporting IS NOT NULL on regular columns seems to be as hard as supporting any other restrictions on regular columns.
It would be a big effort, and there are some reasons why we don't support them.
For now let's forbid such restrictions, it's better to fail than be wrong silently.
Throwing a hard error would be a breaking change.
To avoid breaking existing code the reaction to an invalid IS NOT NULL restrictions is controlled by the `strict_is_not_null_in_views` flag.
This flag can have the following values:
* `true` - strict checking. Having an `IS NOT NULL` restriction on a column that doesn't belong to the view's primary key causes an error to be thrown.
* `warn` - allow invalid `IS NOT NULL` restrictions, but throw a warning. The invalid restrictions are silently ignored.
* `false` - allow invalid `IS NOT NULL` restricitons, without any warnings or errors. The invalid restrictions are silently ignored.
The default values for this flag are `warn` in `db::config` and `true` in scylla.yaml.
This way the existing clusters will have `warn` by default, so they'll get a warning if they try to create such an invalid view.
New clusters with fresh scylla.yaml will have the flag set to `true`, as scylla.yaml overwrites the default value in `db::config`.
New clusters will throw a hard error for invalid views, but in older existing clusters it will just be a warning.
This way we can maintain backwards compatibility, but still move forward by rejecting invalid queries on new clusters.
Fixes: #10365Closes#13013
* github.com:scylladb/scylladb:
boost/restriction_test: test the strict_is_not_null_in_views flag
docs/cql/mv: columns outside of view's primary key can't be restricted
cql-pytest: enable test_is_not_null_forbidden_in_filter
statement_restrictions: forbid IS NOT NULL on columns outside the primary key
schema_altering_statement: return warnings from prepare_schema_mutations()
db/config: add strict_is_not_null_in_views config option
statement_restrictions: add get_not_null_columns()
test: remove invalid IS NOT NULL restrictions from tests
Validation of a CREATE MATERIALIZED VIEW statement takes place inside
the prepare_schema_mutations() method.
I would like to generate warnings during this validation, but there's
currently no way to pass them.
Let's add one more return value - a vector of CQL warnings generated
during the execution of this statement.
A new alias is added to make it clear what the function is returning:
```c++
// A vector of CQL warnings generated during execution of a statement.
using cql_warnings_vec = std::vector<sstring>;
```
Later the warnings will be sent to the user by the function
schema_altering_statment::execute(), which is the only caller
of prepare_schema_mutations().
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
This reverts commit 52e4edfd5e, reversing
changes made to d2d53fc1db. The associated test
fails with about 10% probablity, which blocks other work.
Fixes#13919Reopens#13747
When a user creates a function, they should have all permissions on
this function.
Similarly, when a user creates a keyspace, they should have all
permissions on functions in the keyspace.
This patch introduces GRANTs on the missing permissions.
In the following patch, the grant_permissions_to_creator method is going
to be also used to grant permissions on a newly created function. The
function resource may contain user-defined types which need the
query processor to be prepared, so we add a reference to it in advance
in this patch for easier review.
We have seen users unintentionally use RF=1 or RF=2 for a keyspace.
We would like to have an option for a minimal RF that is allowed.
Cassandra recently added, in Cassandra 4.1 (see apache/cassandra@5fdadb2
and https://issues.apache.org/jira/browse/CASSANDRA-14557), exactly such
a option, called "minimum_keyspace_rf" - so we chose to use the same option
name in Scylla too. This means that unlike the previous "safe mode"
options, the name of this option doesn't start with "restrict_".
The value of the minimum_keyspace_rf option is a number, and lower
replication factors are rejected with an error like:
cqlsh> CREATE KEYSPACE x WITH REPLICATION = { 'class' : 'SimpleStrategy',
'replication_factor': 2 };
ConfigurationException: Replication factor replication_factor=2 is
forbidden by the current configuration setting of minimum_keyspace_rf=3.
Please increase replication factor, or lower minimum_keyspace_rf set in
the configuration.
This restriction applies to both CREATE KEYSPACE and ALTER KEYSPACE
operations. It applies to both SimpleStrategy and NetworkTopologyStrategy,
for all DCs or a specific DC. However, a replication factor of zero (0)
is *not* forbidden - this is the way to explicitly request not to
replicate (at all, or in a specific DC).
For the time being, minimum_keyspace_rf=0 is still the default, which
means that any replication factor is allowed, as before. We can easily
change this default in a followup patch.
Note that in the current implementation, trying to use RF below
minimum_keyspace_rf is always an error - we don't have a syntax
to make into just a warning. In any case the error message explains
exactly which configuration option is responsible for this restriction.
Fixes#8891.
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Closes#9830
these warnings are found by Clang-17 after removing
`-Wno-unused-lambda-capture` and '-Wno-unused-variable' from
the list of disabled warnings in `configure.py`.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Move mutation-related files to a new mutation/ directory. The names
are kept in the global namespace to reduce churn; the names are
unambiguous in any case.
mutation_reader remains in the readers/ module.
mutation_partition_v2.cc was missing from CMakeLists.txt; it's added in this
patch.
This is a step forward towards librarization or modularization of the
source base.
Closes#12788
After fcb8d040 ("treewide: use Software Package Data Exchange
(SPDX) license identifiers"), many dual-licensed files were
left with empty comments on top. Remove them to avoid visual
noise.
Closes#10562
Each feature has a private variable and a public accessor. Since the
accessor effectively makes the variable public, avoid the intermediary
and make the variable public directly.
To ease mechanical translation, the variable name is chosen as
the function name (without the cluster_supports_ prefix).
References throughout the codebase are adjusted.
The STORAGE option is designed to hold a map of options
used for customizing storage for given keyspace.
The option is kept in a system_schema.scylla_keyspaces table.
The option is only available if the whole cluster is aware
of it - guarded by a cluster feature.
Example of the table contents:
```
cassandra@cqlsh> select * from system_schema.scylla_keyspaces;
keyspace_name | storage_options | storage_type
---------------+------------------------------------------------+--------------
ksx | {'bucket': '/tmp/xx', 'endpoint': 'localhost'} | S3
```
The functions which prepare schema change mutations (such as
`prepare_new_column_family_announcement`) would use internally
generated timestamps for these mutations. When schema changes are
managed by group 0 we want to ensure that timestamps of mutations
applied through Raft are monotonic. We will generate these timestamps at
call sites and pass them into the `prepare_` functions. This commit
prepares the APIs.
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes we applied mechanically with a script, except to
licenses/README.md.
Closes#9937
This was needed to fix issue #2129 which was only manifest itself with
auto_bootstrap set to false. The option is ignored now and we always
wait for schema to synch during boot.
After previous patches there's a whole bunch of places that do
qp.proxy().data_dictionary()
while the data_dictionary is present on the query processor itself
and there's a public method to get one. So use it everywhere.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The prepare_schema_mutations is not sleeping method, so there's no
point in getting call-local shared pointer on proxy. Plain reference
is more than enough.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>