Auto-exands numeric RF in CREATE/ALTER KEYSPACE statements for
new DCs specified in the statement.
Doesn't auto-expand existing options, as the rack choice may not be in
line with current replica placement. This requires co-locating tablet
replicas, and tracking of co-location state, which is not implemented yet.
Signed-off-by: Tomasz Grabiec <tgrabiec@scylladb.com>
Tests expect this failure in some scenarios, but later changes make us
fail ealier due to topology constraints.
As a rule, more general validation should come before more specific
validation. So syntax validation before topology validation.
We want to add strongly consistent tables as an option. We will have
two kind of strongly consistent tables: globally consistent and locally
consistent. The former means that requests from all DCs will be globally
linearisable while the later - only requests to the same DCs will be
linearisable. To allow configuring all the possibilities the patch
adds new parameter to a keyspace definition "consistency" that can be
configured to be `eventual`, `global` or `local`. Non eventual setting
is supported for tablets enabled keyspaces only. Since we want to start
with implementing local consistency configuring global consistency will
result in an error for now.
There are several problems with how ALTER execution works with tablets.
1) Currently, option processing bypasses
ks_prop_defs::prepare_options(), and will pass them directly to
keyspace_metadata. This deviates from the vnode path, causing
discrepancy in logic. But also there will be some non-trivial options
post-processing added there - numeric RF will be replaced with a
rack list. We should preserve it in the tablet path which alters
the keyspace, otherwise it will fail when trying to construct
network_topology_strategy.
2) Option merging happens on the flat version of the map, which won't
work correctly with extended map which contains lists. We want
the new list to replace the old list or numeric RF, not get its items
merged. For example:
We want:
{'dc1': 3} + {'dc1': ['rack1', 'rack2']} = {'dc1': ['rack1', 'rack2']}
If we merge flattened options, we would get incorrect flattened options:
{'dc1': 3,
'dc1:0', 'rack1'
'dc1:1', 'rack2'}
3) We lose atomicity of update. Validation and merging which happens on the CQL
coordinator is done in a different group0 transaction context than mutation
generation inside topology coordinator later.
Fixes https://github.com/scylladb/scylladb/issues/25269
Allows per-DC replication factor to be either a string, holding a
numerical value, or a list of strings, holding a list of rack names.
The rack list is not respected yet by the tablet allocator, this is
achieved in subsequent commit.
This changes the format of options stored in the flattened map
in system_schema.keyspaces#replication. Values which are rack lists,
are converted into multiple entries, with the list index appended to
the key with ':' as the separator:
For example, this extended map:
{
'dc1': '3',
'dc2': ['rack1', 'rack2']
}
is stored as a flattened map:
{
'dc1': '3',
'dc2:0': 'rack1',
'dc2:1': 'rack2'
}
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Signed-off-by: Tomasz Grabiec <tgrabiec@scylladb.com>
It will become more complex when options will contain rack lists.
It's a good change regardless, as it reduces duplication and makes
parsing uniform. We already diverged to use stoi / stol / stoul.
The change in create_keyspace_statement.cc to add a catch clause is
needed because get_replication_factor() now throws
configuration_exception on parsing errors instead of
std::invalid_argument, so the existing catch clause in the outer scope
is not effective. That loop is trying to interpret all options as RF
to run some validations. Not all options are RF, and those are
supposed to be ignored.
In preparation for changing their structure.
1) std::map<sstring, sstring> -> replication_strategy_config_options
Parsed options. Values will become std::variant<sstring, rack_list>
2) std::map<sstring, sstring> -> property_definitions::map_type
Flattened map of options, as stored system tables.
Although RF-rack-valid keyspaces are not universally enforced
yet (they're governed by the configuration option
`rf_rack_valid_keyspaces`), we'd like to encourage the user to
abide by the restriction.
To that end, we're introducing a warning when creating or
altering a keyspace. If the configuration option is disabled,
but the user is trying to create an RF-rack-invalid keyspace,
they'll receive a warning.
We provide a validation test.
Old nodes do not expect global topology request names to be in
request_type field, so set it only if a cluster is fully upgraded
already.
Closesscylladb/scylladb#24731
Now that we have a global request queue do not check that there is
global request before adding another one. Amend truncation test that
expects it explicitly and add another one that checks that two truncates
can be submitted in parallel.
Requests, together with their parameters, are added to the
topology_request tables and the queue of active global requests is
kept in topology state. Thy are processed one by one by the topology
state machine.
Fixes: #16822
Currently parameters to alter table global topology command are stored
in static column in the topology table, but this way there can be only one
outstanding alter table request. This patch moves the parameters to
the topology_request table where parameters are stored per request.
In this commit, we refuse to create or alter a keyspace when that operation
would make it RF-rack-invalid if the option `rf_rack_valid_keyspaces` is
enabled.
We provide two tests verifying that the changes work as intended.
Fixesscylladb/scylladb#23276
Altering a keyspace (that has tablets enabled) without changing
tablets attributes, i.e. no `AND tablets = {...}` results in incorrect
"Update Keyspace..." log message being printed. The printed log
contains "tablets={"enabled":false}".
Refs https://github.com/scylladb/scylladb/issues/22261Closesscylladb/scylladb#22324
Fixes#22688
If we set a dc rf to zero, the options map will still retain a dc=0 entry.
If this dc is decommissioned, any further alters of keyspace will fail,
because the union of new/old options will now contained an unknown keyword.
Change alter ks options processing to simply remove any dc with rf=0 on
alter, and treat this as an implicit dc=0 in nw-topo strategy.
This means we change the reallocate_tablets routine to not rely on
the strategy objects dc mapping, but the full replica topology info
for dc:s to consider for reallocation. Since we verify the input
on attribute processing, the amount of rf/tablets moved should still
be legal.
v2:
* Update docs as well.
v3:
* Simplify dc processing
* Reintroduce options empty check, but do early in ks_prop_defs
* Clean up unit test some
Closesscylladb/scylladb#22693
Integrates audit functionality into CQL statement processing to enable tracking of database operations. Key changes:
- Add audit_info and statement_category to all CQL statements
- Implement audit categories for different statement types:
- DDL: Schema altering statements (CREATE/ALTER/DROP)
- DML: Data manipulation (INSERT/UPDATE/DELETE/TRUNCATE/USE)
- DCL: Access control (GRANT/REVOKE/CREATE ROLE)
- QUERY: SELECT statements
- ADMIN: Service level operations
- Add audit inspection points in query processing:
- Before statement execution
- After access checks
- After statement completion
- On execution failures
- Add password sanitization for role management statements
- Mask plaintext passwords in audit logs
- Handle both direct password parameters and options maps
- Preserve query structure while hiding sensitive data
- Modify prepared statement lifecycle to carry audit context
- Pass audit info during statement preparation
- Track audit info through statement execution
- Support batch statement auditing
This change enables comprehensive auditing of CQL operations while ensuring sensitive data is properly masked in audit logs.
now that we are allowed to use C++23. we now have the luxury of using
`std::ranges::transform`.
in this change, we:
- replace `boost::transform` with `std::ranges::transform`
- update affected code to work with `std::ranges::transform`
to reduce the dependency to boost for better maintainability, and
leverage standard library features for better long-term support.
this change is part of our ongoing effort to modernize our codebase
and reduce external dependencies where possible.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closesscylladb/scylladb#21318
Tablets load balancer is unable to process more than a single pending
replica, thus ALTER tablets KS cannot accept an ALTER statement which
would result in creating 2+ pending replicas, hence it has to validate
if the sum of absoulte differences of RFs specified in the statement is
not greter than 1.
A bug has been discovered while trying to ALTER tablets KS and
specifying only 1 out of 2 DCs - the not specified DC's RF has been
zeroed. This is because ALTER tablets KS updated the KS only with the
RF-per-DC mapping specified in the ALTER tablets KS statement, so if a
DC was ommitted, it was assigned a value of RF=0.
This commit fixes that plus additionally passes all the KS options, not
only the replication options, to the topology coordinator, where the KS
update is performed.
`initial_tablets` is a special case, which requires a special handling
in the source code, as we cannot simply update old initial_tablet's
settings with the new ones, because if only ` and TABLETS = {'enabled':
true}` is specified in the ALTER tablets KS statement, we should not zero the `initial_tablets`, but
rather keep the old value - this is tested by the
`test_alter_preserves_tablets_if_initial_tablets_skipped` testcase.
Other than that, the above mentioned testcase started to fail with
these changes, and it appeared to be an issue with the test not waiting
until ALTER is completed, and thus reading the old value, hence the
test's body has been modified to wait for ALTER to complete before
performing validation.
The validation has been corrected with:
1. Checking if a DC specified in ALTER exists.
2. Removing `REPLICATION_STRATEGY_CLASS_KEY` key from a map of RFs that
needs their RFs to be validated.
This function assumed that strings passed as arguments will be of
integer types, but that wasn't the case, and we missed that because this
function didn't have any validation, so this change adds proper
validation and error logging.
Arguments passed to this function were forwarded from a call to
`ks_prop_defs::get_replication_options`, which, among rf-per-dc mapping, returns also
`class:replication_strategy` pair. Second pair's member has been casted
into an `int` type and somehow the code was still running fine, but only
extra testing added later discovered a bug in here.
ALTER tablets KS validated if RF is not changed by more than 1 for DCs
that already had replicas, but not for DCs that didn't have them yet, so
specifying an RF jump from 0 to 2 was possible when listing a new DC in
ALTER tablets KS statement, which violated internal invariants of
tablets load balancer.
This PR fixes that bug and adds a multi-dc testcases to check if adding
replicas to a new DC and removing replicas from a DC is honoring the RF
change constraints.
Refs: #20039
before this change, we rely on `using namespace seastar` to use
`seastar::format()` without qualifying the `format()` with its
namespace. this works fine until we changed the parameter type
of format string `seastar::format()` from `const char*` to
`fmt::format_string<...>`. this change practically invited
`seastar::format()` to the club of `std::format()` and `fmt::format()`,
where all members accept a templated parameter as its `fmt`
parameter. and `seastar::format()` is not the best candidate anymore.
despite that argument-dependent lookup (ADT for short) favors the
function which is in the same namespace as its parameter, but
`using namespace` makes `seastar::format()` more competitive,
so both `std::format()` and `seastar::format()` are considered
as the condidates.
that is what is happening scylladb in quite a few caller sites of
`format()`, hence ADT is not able to tell which function the winner
in the name lookup:
```
/__w/scylladb/scylladb/mutation/mutation_fragment_stream_validator.cc:265:12: error: call to 'format' is ambiguous
265 | return format("{} ({}.{} {})", _name_view, s.ks_name(), s.cf_name(), s.id());
| ^~~~~~
/usr/bin/../lib/gcc/x86_64-redhat-linux/14/../../../../include/c++/14/format:4290:5: note: candidate function [with _Args = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>]
4290 | format(format_string<_Args...> __fmt, _Args&&... __args)
| ^
/__w/scylladb/scylladb/seastar/include/seastar/core/print.hh:143:1: note: candidate function [with A = <const std::basic_string_view<char> &, const seastar::basic_sstring<char, unsigned int, 15> &, const seastar::basic_sstring<char, unsigned int, 15> &, const utils::tagged_uuid<table_id_tag> &>]
143 | format(fmt::format_string<A...> fmt, A&&... a) {
| ^
```
in this change, we
change all `format()` to either `fmt::format()` or `seastar::format()`
with following rules:
- if the caller expects an `sstring` or `std::string_view`, change to
`seastar::format()`
- if the caller expects an `std::string`, change to `fmt::format()`.
because, `sstring::operator std::basic_string` would incur a deep
copy.
we will need another change to enable scylladb to compile with the
latest seastar. namely, to pass the format string as a templated
parameter down to helper functions which format their parameters.
to miminize the scope of this change, let's include that change when
bumping up the seastar submodule. as that change will depend on
the seastar change.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Now when function context creation is encapsulated in lang::manager,
some .cc files can stop using wasm-specific headers and just go with the
lang/manager.hh one.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
This is done to achieve single transaction semantics.
The change includes auto-grant feature. In particular
for schema related auto-grant we don't use normal
mutation collector announce path but follow migration manager,
this may be unified in the future.
We want to ensure that when the replication factor
of a keyspace changes, it changes by at most 1 per DC
if it uses tablets. The rationale for that is to make
sure that the old and new quorums overlap by at least
one node.
After these changes, attempts to change the RF of
a keyspace in any DC by more than 1 will fail.
This patch removes the support for the "wildcard" replication_factor
option for ALTER KEYSPACE when the keyspace supports tablets.
It will still be supported for CREATE KEYSPACE so that a user doesn't
have to know all datacenter names when creating the keyspace,
but ALTER KEYSPACE will require that and the user will have to
specify the exact change in replication factors they wish to make by
explicitly specifying the datacenter names.
Expanding the replication_factor option in the ALTER case is
unintuitive and it's a trap many users fell into.
See #8881, #15391, #16115
This commit adds support for executing ALTER KS for keyspaces with
tablets and utilizes all the previous commits.
The ALTER KS is handled in alter_keyspace_statement, where a global
topology request in generated with data attached to system.topology
table. Then, once topology state machine is ready, it starts to handle
this global topology event, which results in producing mutations
required to change the schema of the keyspace, delete the
system.topology's global req, produce tablets mutations and additional
mutations for a table tracking the lifetime of the whole req. Tracking
the lifetime is necessary to not return the control to the user too
early, so the query processor only returns the response while the
mutations are sent.
This kills three birds with one stone
1. fixes broken indentation
2. re-uses new_options local variable
3. stops using string literal to check storage type
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Closesscylladb/scylladb#17111
Previous patch added params to r.s. classes' constructors, but callers
don't construct those directly, instead they use the create_r.s.()
wrapper. This patch adds params to the wrapper too.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
When ALTER-ing a keyspace one may as well change its vnode/tablet
flavor, which is not currently supported, so prohibit this change
explicitly
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Replacing `restrict_replication_simplestrategy` config option with
2 config options: `replication_strategy_{warn,fail}_list`, which
allow us to impose soft limits (issue a warning) and hard limits (not
execute CQL) on replication strategy when creating/altering a keyspace.
The reason to rather replace than extend `restrict_replication_simplestrategy` config
option is that it was not used and we wanted to generalize it.
Only soft guardrail is enabled by default and it is set to SimpleStrategy,
which means that we'll generate a CQL warning whenever replication strategy
is set to SimpleStrategy. For new cloud deployments we'll move
SimpleStrategy from warn to the fail list.
Guardrails violations will be tracked by metrics.
Resolves#5224
Refs #8892 (the replication strategy part, not the RF part)
Closesscylladb/scylladb#15399
Replacing `minimum_keyspace_rf` config option with 4 config options:
`{minimum,maximum}_replication_factor_{warn,fail}_threshold`, which
allow us to impose soft limits (issue a warning) and hard limits (not
execute CQL) on RF when creating/altering a keyspace.
The reason to rather replace than extend `minimum_keyspace_rf` config
option is to be aligned with Cassandra, which did the same, and has the
same parameters' names.
Only min soft limit is enabled by default and it is set to 3, which means
that we'll generate a CQL warning whenever RF is set to either 1 or 2.
RF's value of 0 is always allowed and means that there will not be any
replicas on a given DC. This was agreed with PM.
Because we don't allow to change guardrails' values when scylla is
running (per PM), there're no tests provided with this PR, and dtests will be
provided separately.
Exceeding guardrails' thresholds will be tracked by metrics.
Resolves#8619
Refs #8892 (the RF part, not the replication-strategy part)
Closes#14262
Currently we hold group0_guard only during DDL statement's execute()
function, but unfortunately some statements access underlying schema
state also during check_access() and validate() calls which are called
by the query_processor before it calls execute. We need to cover those
calls with group0_guard as well and also move retry loop up. This patch
does it by introducing new function to cql_statement class take_guard().
Schema altering statements return group0 guard while others do not
return any guard. Query processor takes this guard at the beginning of a
statement execution and retries if service::group0_concurrent_modification
is thrown. The guard is passed to the execute in query_state structure.
Fixes: #13942
Message-ID: <ZNsynXayKim2XAFr@scylladb.com>
This reverts commit 70b5360a73. It generates
a failure in group0_test .test_concurrent_group0_modifications in debug
mode with about 4% probability.
Fixes#15050
Currently we hold group0_guard only during DDL statement's execute()
function, but unfortunately some statements access underlying schema
state also during check_access() and validate() calls which are called
by the query_processor before it calls execute. We need to cover those
calls with group0_guard as well and also move retry loop up. This patch
does it by introducing new function to cql_statement class take_guard().
Schema altering statements return group0 guard while others do not
return any guard. Query processor takes this guard at the beginning of a
statement execution and retries if service::group0_concurrent_modification
is thrown. The guard is passed to the execute in query_state structure.
Fixes: #13942
Message-ID: <ZNSWF/cHuvcd+g1t@scylladb.com>
After changing the prepare_ methods of migration_manager to
functions, the migration_manager& parameter of
schema_altering_statement::prepare_schema_mutations has been
unused by all classes inheriting from schema_altering_statement.
The migration_manager service is responsible for schema convergence
in the cluster - pushing schema changes to other nodes and pulling
schema when a version mismatch is observed. However, there is also
a part of migration_manager that doesn't really belong there -
creating mutations for schema updates. These are the functions with
prepare_ prefix. They don't modify any state and don't exchange any
messages. They only need to read the local database.
We take these functions out of migration_manager and make them
separate functions to reduce the dependency of other modules
(especially query_processor and CQL statements) on
migration_manager. Since all of these functions only need access
to storage_proxy (or even only replica::database), doing such a
refactor is not complicated. We just have to add one parameter,
either storage_proxy or database and both of them are easily
accessible in the places where these functions are called.