The `system.group0_history` table provides useful descriptions for each
command committed to Raft group 0. One way of applying a command to
group 0 is by calling `migration_manager::announce`. This function has
the `description` parameter set to empty string by default. Some calls
to `announce` use this default value which causes `null` values in
`system.group0_history`. We want `system.group0_history` to have an
actual description for every command, so we change all default
descriptions to reasonable ones.
Going further, we remove the default value for the `description`
parameter of `migration_manager::announce` so that it cannot be relied
on in the future. Thanks to this, all commands in `system.group0_history`
will have a non-null description.
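The effect of dropping the default can be illustrated with a minimal standalone sketch. This is not ScyllaDB code: `announce_sketch`, `history_entry`, and the in-memory `history` vector are hypothetical stand-ins for `migration_manager::announce` and `system.group0_history`.

```cpp
#include <cassert>
#include <string>
#include <string_view>
#include <utility>
#include <vector>

// Hypothetical stand-in for one row of system.group0_history.
struct history_entry {
    std::string command;
    std::string description;
};

// Hypothetical stand-in for migration_manager::announce. The description
// parameter has no default value, so a call that omits it fails to compile.
inline void announce_sketch(std::vector<history_entry>& history,
                            std::string command,
                            std::string_view description) {
    history.push_back(history_entry{std::move(command), std::string(description)});
}

// Usage: every call site has to spell out a description.
inline void usage_example() {
    std::vector<history_entry> history;
    announce_sketch(history, "create keyspace ks", "Create keyspace ks");
    assert(history.back().description == "Create keyspace ks");
    // announce_sketch(history, "drop keyspace ks");  // error: too few arguments
}
```

With a defaulted parameter the commented-out call would compile and silently record an empty description; without the default, the omission becomes a compile-time error.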
Fixes #13370
Closes #14979
* github.com:scylladb/scylladb:
migration_manager: announce: remove the default value of description
test: always pass empty description to migration_manager::announce
migration_manager: announce: provide descriptions for all calls
We can't provide a reasonable description to announce in
query_processor::execute_thrift_schema_command because this
function is called in multiple situations. To solve this issue,
we add the description parameter to this function and to
handler::execute_schema_command that calls it.
While in SQL DISTINCT applies to the result set, in CQL it applies
to the table being selected, and doesn't allow GROUP BY with clustering
keys. So reject the combination like Cassandra does.
While this is not an important issue to fix, it blocks un-xfailing
other issues, so I'm clearing it ahead of fixing those issues.
An issue is unmarked as xfail, and other xfails lose this issue
as a blocker.
Fixes #12479
Closes #14970
Add a constructor that builds the context from a const manager reference.
The existing one needs to get the engine and instance cache and does so
via query_processor. This change makes it possible to remove those exports
and, finally, to drop the wasm::manager -> cql3::query_processor friendship.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
When the q.p. stops it also "stops" the wasm manager. Move this call
into main. The cql test env doesn't need this change: it stops the whole
sharded service, which stops instances on its own.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
The wasm::manager is just cql3::wasm_context renamed. It now sits in
lang/wasm* and is started as a sharded service in main (and cql test
env). This move also needs some header shuffling, but it's not severe.
This change is required to make it possible for the wasm::manager to be
shared (by reference) between q.p. and replica::database later on.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
There are three wasm-only fields on q.p. -- engine, cache and runner.
This patch groups them in a single wasm_context structure to make it
easier to manipulate them in the next patches.
The 'friend' declaration is temporary and will go away soon.
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Currently, when one tries to access a column that an untyped_result_set
does not contain, a `std::bad_variant_access` exception is thrown. This
exception's message provides very little context and it can be difficult
to even figure out where this message is coming from.
In order to improve the situation, a new exception `missing_column` is
introduced which includes the missing column's name in its error
message. The exception derives from `std::bad_variant_access` for
compatibility with existing code that may want to catch it.
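A minimal sketch of the idea follows. Only the name `missing_column` and its base class come from the description above; the message wording, `row_sketch`, and `get_column` are illustrative assumptions, not the actual `untyped_result_set` code.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <variant>

// Carries the column name in what(), yet can still be caught as
// std::bad_variant_access by existing handlers.
class missing_column : public std::bad_variant_access {
    std::string _msg;
public:
    explicit missing_column(const std::string& column_name)
        : _msg("missing column: " + column_name) {}
    const char* what() const noexcept override { return _msg.c_str(); }
};

// Hypothetical row type, used only for this illustration.
using row_sketch = std::map<std::string, int>;

inline int get_column(const row_sketch& row, const std::string& name) {
    auto it = row.find(name);
    if (it == row.end()) {
        throw missing_column(name);
    }
    return it->second;
}

// Usage: an old-style handler still works and now sees the column name.
inline void usage_example() {
    row_sketch row{{"pk", 1}};
    try {
        get_column(row, "ck");
        assert(false && "expected an exception");
    } catch (const std::bad_variant_access& e) {
        assert(std::string(e.what()) == "missing column: ck");
    }
}
```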
The `migration_manager` service is responsible for schema convergence in
the cluster - pushing schema changes to other nodes and pulling schema
when a version mismatch is observed. However, there is also a part of
`migration_manager` that doesn't really belong there - creating
mutations for schema updates. These are the functions with `prepare_`
prefix. They don't modify any state and don't exchange any messages.
They only need to read the local database.
We take these functions out of `migration_manager` and make them
separate functions to reduce the dependency of other modules (especially
`query_processor` and CQL statements) on `migration_manager`. Since all
of these functions only need access to `storage_proxy` (or even only
`replica::database`), doing such a refactor is not complicated. We just
have to add one parameter, either `storage_proxy` or `database` and both
of them are easily accessible in the places where these functions are
called.
This refactor makes `migration_manager` unneeded in a few functions:
- `alternator::executor::create_keyspace`,
- `cql3::statements::alter_type_statement::prepare_announcement_mutations`,
- `cql3::statements::schema_altering_statement::prepare_schema_mutations`,
- `cql3::query_processor::execute_thrift_schema_command`,
- `thrift::handler::execute_schema_command`.
We remove the `migration_manager&` parameter from all these functions.
Fixes #14339
Closes #14875
* github.com:scylladb/scylladb:
cql3: query_processor::execute_thrift_schema_command: remove an unused parameter
cql3: schema_altering_statement::prepare_schema_mutations: remove an unused parameter
cql3: alter_type_statement::prepare_announcement_mutations: change parameters
alternator: executor::create_keyspace: remove an unused parameter
service: migration_manager: change the prepare_ methods to functions
After changing the prepare_ methods of migration_manager to
functions, the migration_manager& parameter of
query_processor::execute_thrift_schema_command and
thrift::handler::execute_schema_command (which calls
query_processor::execute_thrift_schema_command) is no longer used.
After changing the prepare_ methods of migration_manager to
functions, the migration_manager& parameter of
schema_altering_statement::prepare_schema_mutations is no longer
used by any class inheriting from schema_altering_statement.
After changing the prepare_ methods of migration_manager to
functions, the migration_manager& parameter of
alter_type_statement::prepare_announcement_mutations has become
unneeded. However, the function needs access to
service::storage_proxy and data_dictionary::database. Passing
storage_proxy& to it is enough.
We were missing support in the "CAST(x AS type)" function for the counter
type. This patch adds this support, as well as extensive testing that it
works in Scylla the same as Cassandra.
We also un-xfail an existing test translated from Cassandra's unit
test. But note that this old test did not cover all the edge-cases that
the new test checks - some missing cases in the implementation were
not caught by the old test.
Fixes #14501
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
Code in functions.cc creates the different TYPEasblob() and blobasTYPE()
functions for all type names TYPE. The functions for the "counter" type
were skipped, supposedly because "counters are not supported yet". But
counters are supported, so let's add the missing functions.
The code fix is trivial, the tests that verify that the result behaves
like Cassandra took more work.
After this patch, unimplemented::cause::COUNTERS is no longer used
anywhere in the code. I wanted to remove it, but noticed that
unimplemented::cause is a graveyard of unused causes, so decided not
to remove this one either. We should clean it up in a separate patch.
Fixes #14742
Also includes tests for tangentially related issues:
Refs #12607
Refs #14319
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
When we convert a timestamp to a string, it must look like '2017-12-27T11:57:42.500Z'.
This applies to every conversion except the JSON timestamp format.
A JSON string uses a space as the time separator and must look like '2017-12-27 11:57:42.500Z'.
Both formats always contain milliseconds and a timezone specification.
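The two formats can be sketched with a small standalone formatter. `format_timestamp` is a hypothetical helper written for this illustration (it is not the function changed by the patch); it takes milliseconds since the Unix epoch and the separator to place between date and time.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>
#include <ctime>
#include <string>

// Formats a UTC timestamp as YYYY-MM-DD<sep>hh:mm:ss.mmmZ.
// Milliseconds and the 'Z' timezone suffix are always printed.
inline std::string format_timestamp(int64_t millis_since_epoch, char separator) {
    int64_t secs = millis_since_epoch / 1000;
    int64_t millis = millis_since_epoch % 1000;
    if (millis < 0) {  // keep the millisecond part in [0, 999] before 1970
        millis += 1000;
        --secs;
    }
    std::time_t t = static_cast<std::time_t>(secs);
    std::tm tm = *std::gmtime(&t);  // not thread-safe; fine for a sketch
    char buf[64];
    std::snprintf(buf, sizeof(buf), "%04d-%02d-%02d%c%02d:%02d:%02d.%03dZ",
                  tm.tm_year + 1900, tm.tm_mon + 1, tm.tm_mday, separator,
                  tm.tm_hour, tm.tm_min, tm.tm_sec, static_cast<int>(millis));
    return buf;
}

// Usage: 'T' for the general format, ' ' for the JSON format.
inline void usage_example() {
    const int64_t ts = 1514375862500;  // 2017-12-27 11:57:42.500 UTC
    assert(format_timestamp(ts, 'T') == "2017-12-27T11:57:42.500Z");
    assert(format_timestamp(ts, ' ') == "2017-12-27 11:57:42.500Z");
}
```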
Fixes #14518
Fixes #7997
Closes #14726
Before choosing a function, we prepare the arguments that can be
prepared without a receiver. Preparing an argument makes
its type known, which allows choosing the best overload
among many possible functions.
The function that prepared the argument passes the unprepared
argument by mistake. Let's fix it so that it actually uses
the prepared argument.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Closes #14786
We allow inserting column values using a JSON value, e.g.:
```cql
INSERT INTO mytable JSON '{ "\"myKey\"": 0, "value": 0}';
```
When no JSON value is specified, the query should be rejected.
Scylla used to crash in such cases. A recent change fixed the crash
(https://github.com/scylladb/scylladb/pull/14706), so the query now fails
on unwrapping an uninitialized value. However, it should really
be rejected at the parsing stage, so let's fix the grammar so that
it doesn't allow JSON queries without JSON values.
A unit test is added to prevent regressions.
Refs: https://github.com/scylladb/scylladb/pull/14707
Fixes: https://github.com/scylladb/scylladb/issues/14709
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Closes #14785
The grammar mistakenly allows nothing to be parsed as an
intValue (itself accepted in LIMIT and similar clauses).
Easily fixed by removing the empty alternative. A unit test is
added.
Fixes #14705
Closes #14707
`expression`'s default constructor is dangerous, as a default-constructed
value can leak into computations and generate surprising results. Fix that
by removing the default constructor.
This is made somewhat difficult by the parser generator's reliance
on default construction, and we need to expand our workaround
(`uninitialized<>`) capabilities to do so.
We also remove some incidental uses of default-constructed expressions.
Closes #14706
* github.com:scylladb/scylladb:
cql3: expr: make expression non-default-constructible
cql3: grammar: don't default-construct expressions
cql3: grammar: improve uninitialized<> flexibility
cql3: grammar: adjust uninitialized<> wrapper
test: expr_test: don't invoke expression's default constructor
cql3: statement_restrictions: explicitly initialize expressions in index match code
cql3: statement_restrictions: explicitly intitialize some expression fields
cql3: statement_restrictions: avoid expression's default constructor when classifying restrictions
cql3: expr: prepare_expression: avoid default-constructed expression
cql3: broadcast_tables: prepare new_value without relying on expression default constructor
SELECT MUTATION FRAGMENTS is a new select statement sub-type, which allows dumping the underlying mutations making up the data of a given table. The output of this statement is mutation-fragments presented as CQL rows. Each row corresponds to a mutation-fragment. Consequently, the output of this statement has a schema that is different from that of the underlying table. The output schema is derived from the table's schema, as follows:
* The table's partition key is copied over as-is
* The clustering key is formed from the following columns:
- mutation_source (text): the kind of the mutation source, one of: memtable, row-cache or sstable; and the identifier of the individual mutation source.
- partition_region (int): represents the enum with the same name.
- the copy of the table's clustering columns
- position_weight (int): -1, 0 or 1, has the same meaning as that in position_in_partition, used to disambiguate range tombstone changes with the same clustering key, from rows and from each other.
* The following regular columns:
- metadata (text): the JSON representation of the mutation-fragment's metadata.
- value (text): the JSON representation of the mutation-fragment's value.
Data is always read from the local replica, on which the query is executed. Migrating queries between coordinators is forbidden.
More details in the documentation commit (last commit).
Example:
```cql
cqlsh> CREATE TABLE ks.tbl (pk int, ck int, v int, PRIMARY KEY (pk, ck));
cqlsh> DELETE FROM ks.tbl WHERE pk = 0;
cqlsh> DELETE FROM ks.tbl WHERE pk = 0 AND ck > 0 AND ck < 2;
cqlsh> INSERT INTO ks.tbl (pk, ck, v) VALUES (0, 0, 0);
cqlsh> INSERT INTO ks.tbl (pk, ck, v) VALUES (0, 1, 0);
cqlsh> INSERT INTO ks.tbl (pk, ck, v) VALUES (0, 2, 0);
cqlsh> INSERT INTO ks.tbl (pk, ck, v) VALUES (1, 0, 0);
cqlsh> SELECT * FROM ks.tbl;
pk | ck | v
----+----+---
1 | 0 | 0
0 | 0 | 0
0 | 1 | 0
0 | 2 | 0
(4 rows)
cqlsh> SELECT * FROM MUTATION_FRAGMENTS(ks.tbl);
pk | mutation_source | partition_region | ck | position_weight | metadata | mutation_fragment_kind | value
----+-----------------+------------------+----+-----------------+--------------------------------------------------------------------------------------------------------------------------+------------------------+-----------
1 | memtable:0 | 0 | | | {"tombstone":{}} | partition start | null
1 | memtable:0 | 2 | 0 | 0 | {"marker":{"timestamp":1688122873341627},"columns":{"v":{"is_live":true,"type":"regular","timestamp":1688122873341627}}} | clustering row | {"v":"0"}
1 | memtable:0 | 3 | | | null | partition end | null
0 | memtable:0 | 0 | | | {"tombstone":{"timestamp":1688122848686316,"deletion_time":"2023-06-30 11:00:48z"}} | partition start | null
0 | memtable:0 | 2 | 0 | 0 | {"marker":{"timestamp":1688122860037077},"columns":{"v":{"is_live":true,"type":"regular","timestamp":1688122860037077}}} | clustering row | {"v":"0"}
0 | memtable:0 | 2 | 0 | 1 | {"tombstone":{"timestamp":1688122853571709,"deletion_time":"2023-06-30 11:00:53z"}} | range tombstone change | null
0 | memtable:0 | 2 | 1 | 0 | {"marker":{"timestamp":1688122864641920},"columns":{"v":{"is_live":true,"type":"regular","timestamp":1688122864641920}}} | clustering row | {"v":"0"}
0 | memtable:0 | 2 | 2 | -1 | {"tombstone":{}} | range tombstone change | null
0 | memtable:0 | 2 | 2 | 0 | {"marker":{"timestamp":1688122868706989},"columns":{"v":{"is_live":true,"type":"regular","timestamp":1688122868706989}}} | clustering row | {"v":"0"}
0 | memtable:0 | 3 | | | null | partition end | null
(10 rows)
```
Perf simple query:
```
/build/release/scylla perf-simple-query -c1 -m2G --duration=60
```
Before:
```
median 141596.39 tps ( 62.1 allocs/op, 13.1 tasks/op, 43688 insns/op, 0 errors)
median absolute deviation: 137.15
maximum: 142173.32
minimum: 140492.37
```
After:
```
median 141889.95 tps ( 62.1 allocs/op, 13.1 tasks/op, 43692 insns/op, 0 errors)
median absolute deviation: 167.04
maximum: 142380.26
minimum: 141025.51
```
Fixes: https://github.com/scylladb/scylladb/issues/11130
Closes #14347
* github.com:scylladb/scylladb:
docs/operating-scylla/admin-tools: add documentation for the SELECT * FROM MUTATION_FRAGMENTS() statement
test/topology_custom: add test_select_from_mutation_fragments.py
test/boost/database_test: add test for mutation_dump/generate_output_schema_from_underlying_schema
test/cql-pytest: add test_select_mutation_fragments.py
test/cql-pytest: move scylla_data_dir fixture to conftest.py
cql3/statements: wire-in mutation_fragments_select_statement
cql3/restrictions/statement_restrictions: fix indentation
cql3/restrictions/statement_restrictions: add check_indexes flag
cql3/statments/select_statement: add mutation_fragments_select_statement
cql3: add SELECT MUTATION FRAGMENTS select statement sub-type
service/pager: allow passing a query functor override
service/storage_proxy: un-embed coordinator_query_options
replica: add mutation_dump
replica: extract query_state into own header
replica/table: add make_nonpopulating_cache_reader()
replica/table: add select_memtables_as_mutation_sources()
tools,mutation: extract the low-level json utilities into mutation/json.hh
tools/json_writer: fold SstableKey() overloads into callers
tools/json_writer: allow writing metadata and value separately
tools/json_writer: split mutation_fragment_json_writer in two classes
tools/json_writer: allow passing custom std::ostream to json_writer
Since ec77172b4b ("Merge 'cql3: convert the SELECT clause evaluation
phase to expressions' from Avi Kivity"),
we rewrite non-aggregating selectors to include an aggregation, in order
to have the rest of the code either deal with no aggregation, or
all selectors aggregating, with nothing in between. This is done
by wrapping column selectors with "first" function calls: col ->
first(col).
This broke non-aggregating selectors that included the ttl() or
writetime() pseudo functions. This is because we rewrote them as
writetime(first(col)), and writetime() isn't a function that operates
on any values; it operates on mutations and so must have access to
a column, not an expression.
Fix by detecting this scenario and rewriting the expression as
first(writetime(col)).
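The rewrite can be sketched on a toy expression tree. Everything here is a simplified assumption made for illustration: `expr_sketch` models only column references and single-argument function calls, and `wrap_with_first` is a hypothetical name, not the function touched by the patch.

```cpp
#include <cassert>
#include <memory>
#include <string>
#include <utility>

// Toy expression: a column reference (arg == nullptr) or a one-argument call.
struct expr_sketch {
    std::string name;
    std::unique_ptr<expr_sketch> arg;
};
using expr_ptr = std::unique_ptr<expr_sketch>;

inline expr_ptr column(std::string name) {
    return std::make_unique<expr_sketch>(expr_sketch{std::move(name), nullptr});
}
inline expr_ptr call(std::string fn, expr_ptr arg) {
    return std::make_unique<expr_sketch>(expr_sketch{std::move(fn), std::move(arg)});
}

// writetime(col) and ttl(col) read mutation metadata, so they need the
// column itself as their argument, never first(col).
inline bool is_mutation_attribute(const expr_sketch& e) {
    return e.arg && !e.arg->arg && (e.name == "writetime" || e.name == "ttl");
}

// col -> first(col); writetime(col) -> first(writetime(col));
// any other call is rewritten recursively through its argument.
inline expr_ptr wrap_with_first(expr_ptr e) {
    if (!e->arg || is_mutation_attribute(*e)) {
        return call("first", std::move(e));
    }
    e->arg = wrap_with_first(std::move(e->arg));
    return e;
}

inline std::string to_string(const expr_sketch& e) {
    return e.arg ? e.name + "(" + to_string(*e.arg) + ")" : e.name;
}

inline void usage_example() {
    assert(to_string(*wrap_with_first(call("writetime", column("v")))) == "first(writetime(v))");
}
```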
Unit and integration tests are added.
Fixes #14715
Closes #14716
Allow the caller to turn off checking for indexes. Useful if the
restrictions are applied on a pseudo-table, which has no corresponding
table object, and therefore no index manager (or indexes for that
matter).
Not wired in yet. SELECT * FROM MUTATION_FRAGMENTS($table) is a new
select statement sub-type, which allows dumping the underlying mutations
making up the data of a given table. The output of this statement is
mutation-fragments presented as CQL rows. Each row corresponds to a
mutation-fragment. Consequently, the output of this statement has a
schema that is different from that of the underlying table.
Data is always read from the local replica, on which the query is
executed. Migrating queries between coordinators is not allowed.
SELECT * FROM MUTATION_FRAGMENTS($table) is a new select statement
sub-type. More information will be provided in the patch which introduces
it. This patch adds only the Cql.g changes and what is further strictly
necessary.
prepare_expression() already validates the types and computes
the index of the field; no need to redo that work when
evaluating the expression.
The tests are adjusted to also prepare the expression.
Closes #14562
There is no obvious default expression, so better not to allow
default construction of expressions to prevent unintended values
from leaking in. Resolves a FIXME.
Use uninitialized<expression> for that. Since it's heavily used,
alias it as "uexpression".
To prevent uninitialized<> from leaking into the rest of the
system, change do_with_parser() to unwrap it. We add an
unwrap_uninitialized_t template type alias for that.
Lots of std::move()s are sprinkled around to make things compile,
as uninitialized<T> refuses to convert to T without them.
uninitialized<> is used to work around the parser generator's propensity
to default-construct return values by supplying a default constructor
to otherwise non-default-constructible types. Make it easier to initialize
it not only from the wrapped type, but also from types convertible to
the wrapped type.
This is useful to initialize an uninitialized<expression> from an
expression element (say a binary_operator), without an explicit
conversion.
The grammar generator relies on everything having a default
constructor, and to accommodate it we have an uninitialized<>
template that fakes a default constructor where one doesn't
exist. For convenience we have implicit conversion operators
from uninitialized<T> to T. Currently, we have them for both
rvalue-reference and normal reference wrappers.
It turns out that C++ isn't clever enough to deal with both
of them when templates are involved. When it needs a T but
as an uninitialized_wrapper<T>&&, it sees both conversion
operators and can't pick one.
Aid it by removing the non-rvalue conversion operator. The
rvalue conversion operator is more efficient, and is all that
is needed, since we don't use values more than once in the grammar.
Sprinkle std::move()s on the rest of the grammar to keep it
compiling. In a few places the odd "$production" syntax
is changed to the more common "var=production ... { var }".
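A minimal sketch of such a wrapper, assuming a `std::optional`-based implementation (the real `uninitialized<>` in the grammar may differ; `uninitialized_sketch`, `expression_sketch`, and `binary_operator_sketch` are illustrative names). It is default-constructible even when `T` is not, accepts anything convertible to `T`, and exposes only the rvalue conversion operator, so callers must `std::move()` it.

```cpp
#include <cassert>
#include <optional>
#include <type_traits>
#include <utility>

template <typename T>
class uninitialized_sketch {
    std::optional<T> _val;
public:
    // Fakes a default constructor for the parser generator's benefit.
    uninitialized_sketch() = default;

    // Initialize from T or from any type convertible to T.
    template <typename U,
              typename = std::enable_if_t<
                  std::is_convertible_v<U&&, T> &&
                  !std::is_same_v<std::decay_t<U>, uninitialized_sketch>>>
    uninitialized_sketch(U&& v) : _val(std::in_place, std::forward<U>(v)) {}

    // Only the rvalue conversion exists, so there is no ambiguity between
    // two conversion operators. Throws std::bad_optional_access if unset.
    operator T&&() && { return std::move(_val).value(); }
};

// Stand-ins for an expression element and a non-default-constructible expression.
struct binary_operator_sketch { int lhs; int rhs; };
struct expression_sketch {
    int lhs;
    int rhs;
    expression_sketch() = delete;
    expression_sketch(binary_operator_sketch op) : lhs(op.lhs), rhs(op.rhs) {}
};

inline void usage_example() {
    uninitialized_sketch<expression_sketch> u;  // what generated code declares
    u = binary_operator_sketch{1, 2};           // no explicit conversion needed
    expression_sketch e = std::move(u);         // std::move() is required
    assert(e.lhs == 1 && e.rhs == 2);
}
```

Without the `std::move()`, the last conversion would not compile, which mirrors why the patch has to sprinkle `std::move()` across the grammar.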
This is the last step of the deprecation dance for DTCS.
In Scylla 5.1, users were warned that DTCS was deprecated.
In 5.2, altering or creating tables with DTCS was forbidden.
The 5.3 branch was already created, so this is targeting 5.4.
For users that did not move away from DTCS, Scylla will
fall back to the default strategy, either STCS or ICS.
See:
WARN 2023-07-14 09:49:11,857 [shard 0] schema_tables - Falling back to size-tiered compaction strategy after the problem: Unable to find compaction strategy class 'DateTieredCompactionStrategy
The user can later switch to a supported strategy with
ALTER TABLE.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Closes #14559
The index match code has some default-initialized expressions. These won't
compile when we remove expression's default constructor, so replace them
with the current default value, an empty conjunction.
An empty conjunction doesn't make any special sense here; the code
should be refactored not to rely on this random initial value. But this
is delicate code and the refactoring shouldn't be done in the middle of
an unrelated series.
_partition_key_restrictions, _clustering_columns_restrictions, and
_nonprimary_key_restrictions are currently default-initialized. As
we're about to remove expression's default constructor, we need
to initialize them with something.
Use conjunction({}). Not only is this what the default constructor does,
that's what those fields' manipulators assume - they adjust field x
using make_conjunction(y, x). This dates to expression's roots as
a replacement for restrictions.
We have some gnarly code that classifies restrictions by the column
they restrict. This uses std::unordered_map::operator[], which uses
the value's default constructor. This happens to be "expression", and
as we're about to remove the default constructor, this won't do.
Fix by using try_emplace(), which makes the code nicer and more
efficient. It could be further improved, but it's better to demolish it
instead.
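The difference between the two map accessors can be shown in isolation. The types below are illustrative stand-ins (`expression_sketch` is not `cql3::expr::expression`, and `add_restriction` is a hypothetical helper); only the use of `try_emplace()` mirrors the change described above.

```cpp
#include <cassert>
#include <string>
#include <unordered_map>
#include <utility>
#include <vector>

// Stand-in for an expression type that has no default constructor.
struct expression_sketch {
    std::vector<std::string> children;
    expression_sketch() = delete;
    explicit expression_sketch(std::vector<std::string> c) : children(std::move(c)) {}
};

using restrictions_by_column = std::unordered_map<std::string, expression_sketch>;

inline void add_restriction(restrictions_by_column& by_column,
                            const std::string& column,
                            std::string restriction) {
    // by_column[column] would not compile here: operator[] has to
    // default-construct the value when the key is missing.
    // try_emplace() constructs the value from the given arguments only if
    // the key is absent, and returns an iterator to the entry either way.
    auto it = by_column.try_emplace(column, std::vector<std::string>{}).first;
    it->second.children.push_back(std::move(restriction));
}

inline void usage_example() {
    restrictions_by_column by_column;
    add_restriction(by_column, "ck", "ck > 0");
    add_restriction(by_column, "ck", "ck < 2");
    assert(by_column.size() == 1);
    assert(by_column.at("ck").children.size() == 2);
}
```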
We're about to remove expression's default constructor, so adjust
the usertype_constructor code that checks whether a field has an
initializer or whether we must supply a NULL to not rely on it.
A broadcast_table modification query consists of the key, the new value,
and the condition. When preparing it, we construct the query with
a default new_value expression, and pass it to
operation::prepare_for_broadcast_tables() to fill .new_value.
Since we're removing expression's default constructor, this won't work.
So instead pass nothing to a (renamed)
operation::prepare_new_value_for_broadcast_tables(), and use the return
value to fill the query.
fmtlib uses `{}` as the placeholder for the formatted argument, not
`{}}`, so let's correct it.
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>
Closes #14586
We have plenty of code marked with #if 0. Once it was an indication
of missing functionality, but the code has evolved so much it's
useless as an indication and only a distraction.
Delete it.
Closes #14511
Currently we hold group0_guard only during DDL statement's execute()
function, but unfortunately some statements access underlying schema
state also during check_access() and validate() calls which are called
by the query_processor before it calls execute. We need to cover those
calls with group0_guard as well and also move the retry loop up. This patch
does it by adding a new function, take_guard(), to the cql_statement class.
Schema altering statements return group0 guard while others do not
return any guard. Query processor takes this guard at the beginning of a
statement execution and retries if service::group0_concurrent_modification
is thrown. The guard is passed to the execute in query_state structure.
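The control flow can be sketched synchronously. This is an assumption-heavy illustration: the real code is asynchronous, and `guard_sketch`, `concurrent_modification_sketch`, `run_statement`, and the `max_attempts` limit are hypothetical names and choices made only for this example.

```cpp
#include <cassert>
#include <optional>
#include <stdexcept>
#include <utility>

// Stand-in for service::group0_concurrent_modification.
struct concurrent_modification_sketch : std::runtime_error {
    concurrent_modification_sketch()
        : std::runtime_error("group 0 changed concurrently") {}
};

// Stand-in for the group 0 guard.
struct guard_sketch { int generation; };

// take_guard() returns a guard for schema-altering statements and
// std::nullopt for everything else. run() stands for check_access(),
// validate() and execute() together, all covered by the same guard.
template <typename TakeGuard, typename Run>
auto run_statement(TakeGuard take_guard, Run run, int max_attempts) {
    assert(max_attempts > 0);
    for (int attempt = 1;; ++attempt) {
        std::optional<guard_sketch> guard = take_guard();
        try {
            return run(std::move(guard));
        } catch (const concurrent_modification_sketch&) {
            if (attempt == max_attempts) {
                throw;  // give up; the attempt limit is an assumption of this sketch
            }
            // Otherwise loop: take a fresh guard and redo all three phases.
        }
    }
}
```

Because the guard is taken before `run()` starts, a retry re-reads schema state in every phase rather than only in the final execution step.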
Fixes: #13942
Message-Id: <ZJ2aeNIBQCtnTaE2@scylladb.com>
SELECT JSON uses selector_factories to obtain the names of the
fields to insert into the json object, and we want to drop
selector_factories entirely. Switch instead to the ":metadata" mode
of printing expressions, which does what we want.
Unfortunately, the switch changes how system functions are converted
into field names. A function such as unixtimestampof() is now rendered
as "system.unixtimestampof()"; before it did not have the keyspace
prefix.
This is a compatibility problem, albeit an obscure one. Since the new
behavior matches Cassandra, and the odds of hitting this are very low,
I think we can allow the change.
The replica needs to know which columns we're interested in. Iterate
and recurse into all selector expressions to collect all mentioned columns.
We use the same algorithm that create_factories_and_collect_column_definitions()
uses, even though it is quadratic, to avoid causing surprises.
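The traversal can be sketched on a toy tree. `selector_sketch` and `collect_columns` are hypothetical names; the sketch only illustrates the recursive walk with a linear duplicate check, which is what makes the overall cost quadratic in the number of columns.

```cpp
#include <algorithm>
#include <cassert>
#include <string>
#include <vector>

// Toy selector expression: a column reference when `column` is non-empty,
// otherwise a function call whose arguments are in `children`.
struct selector_sketch {
    std::string column;
    std::vector<selector_sketch> children;
};

// Appends every column mentioned under `e` to `out`, in first-seen order.
// The std::find() over `out` is the linear duplicate check.
inline void collect_columns(const selector_sketch& e, std::vector<std::string>& out) {
    if (!e.column.empty() &&
        std::find(out.begin(), out.end(), e.column) == out.end()) {
        out.push_back(e.column);
    }
    for (const auto& child : e.children) {
        collect_columns(child, out);
    }
}

inline void usage_example() {
    selector_sketch a{"a", {}};
    selector_sketch b{"b", {}};
    selector_sketch f{"", {a, b, a}};  // models f(a, b, a)
    std::vector<std::string> columns;
    collect_columns(f, columns);
    assert(columns.size() == 2);
}
```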
When constructing a selection_with_processing, split the
selectors into an inner loop and an outer loop with split_aggregation().
We can then reimplement add_input_row() and get_output_row() as follows:
- add_input_row(): evaluate the inner loop expressions and store
the results in temporaries
- get_output_row(): evaluate the outer loop expressions, pulling in
values from those temporaries.
reset(), which is called between groups, simply copies the initial
values gathered by split_aggregation() into the temporaries.
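The inner/outer split can be illustrated with a hard-coded average. This is only a sketch under strong assumptions: the real code evaluates arbitrary expressions produced by split_aggregation(), whereas `avg_selection_sketch` is a hypothetical class that wires in `sum` and `count` as its two temporaries.

```cpp
#include <cassert>
#include <vector>

class avg_selection_sketch {
    // Initial values of the temporaries: {sum, count}.
    std::vector<double> _initial{0.0, 0.0};
    std::vector<double> _temporaries = _initial;
public:
    // Inner loop: runs once per input row and updates the temporaries.
    void add_input_row(double v) {
        _temporaries[0] += v;
        _temporaries[1] += 1;
    }
    // Outer loop: runs once per group and reads only the temporaries.
    double get_output_row() const {
        assert(_temporaries[1] > 0);
        return _temporaries[0] / _temporaries[1];
    }
    // Called between groups: copy the initial values back.
    void reset() { _temporaries = _initial; }
};

inline void usage_example() {
    avg_selection_sketch sel;
    sel.add_input_row(1);
    sel.add_input_row(2);
    sel.add_input_row(3);
    assert(sel.get_output_row() == 2.0);
    sel.reset();  // next group starts from the initial values
    sel.add_input_row(10);
    assert(sel.get_output_row() == 10.0);
}
```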
The only complexity comes from add_column_for_post_query_processing(),
which essentially re-does the work of split_aggregation(). It would
be much better if we added the column before split_aggregation() was
called, but some refactoring has to take place before that happens.