Commit Graph

27701 Commits

Author SHA1 Message Date
Avi Kivity
40fdbf9558 db: schema_tables: coroutinize drop_column_mapping() 2021-08-01 20:13:15 +03:00
Avi Kivity
7d46300af2 db: schema_tables: coroutinize column_mapping_exists() 2021-08-01 20:13:15 +03:00
Avi Kivity
74b2200f4d db: schema_tables: coroutinize get_column_mapping() 2021-08-01 20:13:15 +03:00
Avi Kivity
f19ca7aaaa db: schema_tables: coroutinize read_table_mutations() 2021-08-01 20:13:15 +03:00
Avi Kivity
81a2be17b6 db: schema_tables: coroutinize create_views_from_schema_partition() 2021-08-01 20:13:15 +03:00
Avi Kivity
15f2fd2a23 db: schema_tables: coroutinize create_views_from_table_row() 2021-08-01 20:13:15 +03:00
Avi Kivity
0843d441ff db: schema_tables: unpeel lw_shared_ptr in create_tables_from_tables_partition()
The tables local is a lw_shared_ptr which is created and then dereferenced
before returning. It can be unpeeled to the pointed-to type, resulting in
one less allocation.
2021-08-01 20:13:15 +03:00
Avi Kivity
66054d24c4 db: schema_tables: coroutinize create_tables_from_tables_partition() 2021-08-01 20:13:15 +03:00
Avi Kivity
82ba3c5f4a db: schema_tables: coroutinize create_table_from_name() 2021-08-01 20:13:15 +03:00
Avi Kivity
862f491605 db: schema_tables: coroutinize read_table_mutations() 2021-08-01 20:13:15 +03:00
Avi Kivity
91c1a29808 db: schema_tables: coroutinize merge_keyspaces() 2021-08-01 20:13:15 +03:00
Avi Kivity
78fc05922b db: schema_tables: coroutinize do_merge_schema()
It is now using an internal thread, so unpeel it and replace
future::get() with co_await.
2021-08-01 20:13:15 +03:00
Avi Kivity
9680d9e76c db: schema_tables: futurize and coroutinize merge_functions()
Right now, merge_functions() expects to be called in a thread.
Remove that requirement by converting it into a coroutine and returning
a future.

De-threading helps reduce errors where something expects to be called
in a thread, but isn't.
2021-08-01 20:13:15 +03:00
Avi Kivity
9cbae212bf db: schema_tables: futurize and coroutinize user_types_to_drop::drop
user_types_to_drop::drop is a function object returning void, and expecting
to be called in a thread. Make it return a future and convert the
only value it is initialized with into a coroutine.

De-threading helps reduce errors where something expects to be called
in a thread, but isn't.
2021-08-01 20:13:15 +03:00
Avi Kivity
e5f28fc746 db: schema_tables: futurize and coroutinize merge_types()
Right now, merge_types() expects to be called in a thread.
Remove that requirement by converting it into a coroutine and returning
a future.

The [[nodiscard]] attribute is moved from the function to the
return type, since the function now returns a future which is
nodiscard anyway.

The lambda returned is not coroutinized (yet) since it's part
of the user_types_to_drop inner function that still returns void
and expects to be called in a thread.

De-threading helps reduce errors where something expects to be called
in a thread, but isn't.
2021-08-01 20:13:15 +03:00
Avi Kivity
c9584d50ee db: schema_tables: futurize and coroutinize merge_tables_and_views()
Right now, merge_tables_and_views() expects to be called in a thread.
Remove that requirement by converting it into a coroutine and returning
a future.

De-threading helps reduce errors where something expects to be called
in a thread, but isn't.
2021-08-01 20:13:15 +03:00
Avi Kivity
80fe158387 db: schema_tables: coroutinize store_column_mapping() 2021-08-01 20:13:15 +03:00
Avi Kivity
ee8b02f437 db: schema_tables: futurize and coroutinize read_tables_for_keyspaces()
Right now, read_tables_for_keyspaces() expects to be called in a thread.
Remove that requirement by converting it into a coroutine and returning
a future.

De-threading helps reduce errors where something expects to be called
in a thread, but isn't.
2021-08-01 20:13:15 +03:00
Avi Kivity
cd1003daad db: schema_tables: coroutinize read_table_names_of_keyspace() 2021-08-01 20:13:15 +03:00
Avi Kivity
000f7eabd5 db: schema_tables: coroutinize recalculate_schema_version() 2021-08-01 20:13:15 +03:00
Avi Kivity
95d33e9e86 db: schema_tables: coroutinize merge_schema() 2021-08-01 20:13:15 +03:00
Avi Kivity
25548f46dd db: schema_tables: introduce and use with_merge_lock()
Rather than open-coding merge_lock()/merge_unlock() pairs, introduce
and use a helper. This helps in coroutinization, since coroutines
don't support RAII with destructors that wait.
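
A minimal sketch of the helper pattern, not the actual Scylla code. It
assumes hypothetical synchronous `merge_lock()`/`merge_unlock()` functions
and a plain callable, whereas the real helper deals with futures. It only
shows how one wrapper keeps the unlock on every exit path without a
destructor:

```cpp
#include <cassert>

// Hypothetical stand-ins for merge_lock()/merge_unlock(); they only track
// whether the lock is currently held.
int merge_lock_depth = 0;

void merge_lock() { ++merge_lock_depth; }

void merge_unlock() {
    assert(merge_lock_depth > 0);  // unlocking without a matching lock is a bug
    --merge_lock_depth;
}

// Run func() with the lock held and release it on every exit path,
// so callers never pair lock and unlock by hand.
template <typename Func>
void with_merge_lock(Func func) {
    merge_lock();
    try {
        func();
    } catch (...) {
        merge_unlock();
        throw;
    }
    merge_unlock();
}
```

A caller would write `with_merge_lock([&] { /* merge work */ });` instead of
an explicit lock/unlock pair.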
2021-08-01 20:13:15 +03:00
Avi Kivity
7b731ae2c6 db: schema_tables: coroutinize update_schema_version_and_announce() 2021-08-01 20:13:15 +03:00
Avi Kivity
385e0dcc2e db: schema_tables: coroutinize read_keyspace_mutation() 2021-08-01 20:13:15 +03:00
Avi Kivity
ef5df86b1f db: schema_tables: coroutinize read_schema_partition_for_table() 2021-08-01 20:13:15 +03:00
Avi Kivity
8841c2ba10 db: schema_tables: coroutinize read_schema_partition_for_keyspace()
Two reference parameters are copied rather than changing the signature,
to avoid a compile-the-world. It can be cleaned up post-merge.
2021-08-01 20:09:00 +03:00
Avi Kivity
d1876488f7 db: schema_tables: coroutinize query_partition_mutation() 2021-08-01 19:17:13 +03:00
Avi Kivity
35f9caf6a9 db: schema_tables: coroutinize read_schema_for_keyspaces() 2021-08-01 19:17:09 +03:00
Avi Kivity
7c0476251a db: schema_tables: coroutinize convert_schema_to_mutations() 2021-08-01 19:16:55 +03:00
Avi Kivity
921216e8e6 db: schema_tables: coroutinize calculate_schema_digest() 2021-08-01 19:16:50 +03:00
Avi Kivity
3dab308ddf db: schema_tables: coroutinize save_system_schema() 2021-08-01 19:16:40 +03:00
Avi Kivity
3089558f8d tools: toolchain: update to Fedora 34 with clang 12 and libstdc++ 11.2 2021-07-31 15:25:13 +03:00
Piotr Sarna
1c7af8d46f cql-pytest: adjust a test case for Cassandra 4
One of the test cases stopped working against Cassandra 4, but that's
just because it returns a slightly different error type.
The test case is adjusted to work on both Scylla and new Cassandra.

Message-Id: <222a7f63a3e9739c6fc646173306fcdb3da25890.1627655555.git.sarna@scylladb.com>
2021-07-30 17:36:23 +03:00
Avi Kivity
0876248c2b Merge "cql3: cache function calls evaluation for non-deterministic functions" from Pavel S
"
`function_call` AST nodes are created for each function
with side effects in a CQL query, i.e. non-deterministic
functions (`uuid()`, `now()` and some other timeuuid-related ones).

These nodes are evaluated either when a query itself is executed
or query restrictions are computed (e.g. partition/clustering
key ranges for LWT requests).

We need to cache the calls since otherwise when handling a
`bounce_to_shard` request for an LWT query, we can possibly
enter an infinite bouncing loop (in case a function is used
to calculate partition key ranges for a query), since the
results can be different each time.

Furthermore, we don't support bouncing more than one time.
Returning a `bounce_to_shard` message more than one time
will result in a crash.

Caching works only for LWT statements and only for the function
calls that affect partition key range computation for the query.

`variable_specifications` class is renamed to `prepare_context`
and generalized to record information about each `function_call`
AST node and modify them, as needed:
* Check whether a given function call is a part of partition key
  statement restriction.
* Assign ids for caching if the above is true and the call is a part
  of an LWT statement.

There is no need to include any kind of statement identifier
in the cache key since `query_options` (which holds the cache)
is limited to a single statement, anyway.

Function calls are indexed by the order in which they appear
within a statement while parsing.
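
The caching idea can be sketched as follows. This is an illustration under
assumptions, not the patchset's code: `cached_fn_calls` is a hypothetical
stand-in for the cache held by `query_options`, values are plain strings
rather than byte buffers, and ids are the parse-order indexes described
above.

```cpp
#include <cstddef>
#include <functional>
#include <string>
#include <unordered_map>

// Hypothetical per-statement cache. The key is the index of the function
// call within the statement, so no statement identifier is needed.
class cached_fn_calls {
    std::unordered_map<std::size_t, std::string> _values;
public:
    // Evaluate fn the first time an id is seen; afterwards return the
    // stored value, e.g. when the statement is re-executed on another shard.
    const std::string& get_or_evaluate(std::size_t id,
                                       const std::function<std::string()>& fn) {
        auto it = _values.find(id);
        if (it == _values.end()) {
            it = _values.emplace(id, fn()).first;
        }
        return it->second;
    }
};
```

With this shape, a second evaluation of call 0 returns the value computed
the first time, so the partition key derived from it stays the same.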

Note that `function_call::raw` AST nodes are not created
for selection clauses of a SELECT statement hence they
can accept only one of the following things as parameters:
* Other function calls.
* Literal values.
* Parameter markers.

In other words, only parameters that can be immediately reduced
to a byte buffer are allowed and we don't need to handle
database inputs to non-pure functions separately since they
are not possible in this context. Anyhow, we don't even have
a single non-pure function that accepts arguments, so precautions
are not needed at the moment.

Add a test written in `cql-pytest` framework to verify
that both prepared and unprepared LWT statements handle
`bounce_to_shard` messages correctly in such a scenario.

Fixes: #8604

Tests: unit(dev, debug)

NOTE: the patchset uses `query_options` as a container for
cached values. This doesn't look clean and `service::query_state`
seems to be a better place to store them. But it's not
forwarded to most of the CQL code and would mean that a huge number
of places would have to be amended.
The series presents a trade-off to avoid forwarding `query_state`
everywhere (but maybe it's the thing that needs to be done, nonetheless).
"

* 'lwt_bounce_to_shard_cached_fn_v6' of https://github.com/ManManson/scylla:
  cql-pytest: add a test for non-pure CQL functions
  cql3: cache function calls evaluation for non-deterministic functions
  cql3: rename `variable_specifications` to `prepare_context`
2021-07-30 14:21:11 +03:00
Pekka Enberg
21cfd090f7 Update tools/python3 submodule
* tools/python3 afe2e7f...279aae1 (1):
  > Drop filename start with '..' in pip modules
2021-07-30 13:58:45 +03:00
Avi Kivity
c3c82415c3 cql3: term: make term::raw, term::multi_column_raw forward declarable
As preparation for converting term::raw to an expression, make it
forward declarable so that we can have a term::raw that is an
expression, and an expression that is a term::raw, without driving
the compiler insane.

Closes #9101
2021-07-30 13:50:28 +03:00
Pavel Emelyanov
4f4b863e6a test.py: Always disable boost colored output
Tests' output is always redirected to a log file. Enabling colored
output makes it very hard to read.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20210730083731.17813-1-xemul@scylladb.com>
2021-07-30 12:22:31 +03:00
Piotr Sarna
60072045db Merge 'cql3: replace cql3::selection::selectable::raw hierarchy with expressions' from Avi Kivity

Currently, the grammar has two parallel hierarchies. One hierarchy is
used in the WHERE clause, and is based on a combination of `term`
and expressions. The other is used in the SELECT clause, and is
using the cql3::selection::selectable hierarchy. There is some overlap
between the hierarchies: both can name columns. Logically, however,
they overlap completely - in SQL anything you can select you can
filter on, and vice versa. So merging the two hierarchies is important if
we want to enrich CQL. This series does that, partially (see below),
converting the SELECT clause to expressions.

There is another hierarchy split: between the "raw", pre-prepare object
hierarchy, and post-prepare non-raw. This series limits itself to converting
the raw hierarchy and leaves the non-raw hierarchy alone.

An important design choice is not to have this raw/non-raw split in expressions.
Note that most of the hierarchy is completely parallel: addition is addition
both before prepare and after prepare (but see [1]). The main difference
is around identifiers - before preparation they are unresolved, and after
preparation they become `column_definition` objects. We resolve that by
having two separate types: `unresolved_identifier` for the pre-prepare phase,
and the existing `column_value` for post-prepare phase.

Alternative choices would be to keep a separate expression::raw variant, or
to template the expression variant on whether it is raw or not. I think it would
cause undue bloat and confusion.

Note the series introduces many on_internal_error() calls. This is because
there is not a lot of overlap in the hierarchies today; you can't have a cast in
the WHERE clause, for example. These on_internal_error() calls cannot be
triggered since the grammar does not yet allow such expressions to be
expressed. As we expand the grammar, they will have to be replaced with
working implementations.

Lastly, field selection is expressible in both hierarchies. This series does not yet
merge the two representations (`column_value.sub` vs `field_selection`), but it
should be easy to do so later.

[1] the `+` operator can also be translated to list concatenation, which we may
  choose to represent by yet another type.

Test: unit(dev)

Closes #9087

* github.com:scylladb/scylla:
  cql3: expression: update find_atom, count_if for function_call, cast, field_selection
  cql3: expressions: fix printing of nested expressions
  cql3: selection: replace selectable::raw with expression
  cql3: expression: convert selectable::with_field_selection::raw to expression
  cql3: expression: convert selectable::with_cast::raw to expression
  cql3: expression: convert selectable::with_anonymous_function::raw to expression
  cql3: expression: convert selectable::with_function_call::raw to expressions
  cql3: selectable: make selectable::raw forward-declarable
  cql3: expressions: convert writetime_or_ttl::raw to expression
  cql3: expression: add convenience constructor from expression element to nested expression
  utils: introduce variant_element.hh
  cql3: expression: use nested_expression in binary_operator
  cql3: expression: introduce nested_expression class
  Convert column_identifier_raw's use as selectable to expressions
  make column_identifier::raw forward declarable
  cql3: introduce selectable::with_expression::raw
2021-07-30 09:57:39 +02:00
Pavel Solodovnikov
eaf70df203 cql-pytest: add a test for non-pure CQL functions
Introduce a test using `cql-pytest` framework to assert that
both prepared and unprepared LWT statements (insert with
`IF NOT EXISTS`) with a non-deterministic function call
work correctly in case its evaluation affects partition
key range computation (hence the choice of `cas_shard()`
for the LWT query).

Tests: cql-pytest/test_non_deterministic_functions.py

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-07-30 01:22:50 +03:00
Pavel Solodovnikov
3b6adf3a62 cql3: cache function calls evaluation for non-deterministic functions
And reuse these values when handling `bounce_to_shard` messages.

Otherwise such a function (e.g. `uuid()`) can yield a different
value when a statement is re-executed on the other shard.

It can lead to an infinite number of `bounce_to_shard` messages
being sent in case the function value is used to calculate partition
key ranges for the query. That, in turn, will cause a crash,
since we don't support bouncing more than one time.

Caching works only for LWT statements and only for the function
calls that affect partition key range computation for the query.

`variable_specifications` class is renamed to `prepare_context`
and generalized to record information about each `function_call`
AST node and modify them, as needed:
* Check whether a given function call is a part of partition key
  statement restriction.
* Assign ids for caching if the above is true and the call is a part
  of an LWT statement.

There is no need to include any kind of statement identifier
in the cache key since `query_options` (which holds the cache)
is limited to a single statement, anyway.

Note that `function_call::raw` AST nodes are not created
for selection clauses of a SELECT statement hence they
can accept only one of the following things as parameters:
* Other function calls.
* Literal values.
* Parameter markers.

In other words, only parameters that can be immediately reduced
to a byte buffer are allowed and we don't need to handle
database inputs to non-pure functions separately since they
are not possible in this context. Anyhow, we don't even have
a single non-pure function that accepts arguments, so precautions
are not needed at the moment.

Tests: unit(dev, debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-07-30 01:22:39 +03:00
Tomasz Grabiec
7c28f77412 Merge 'Convert all remaining int tri-compares to std::strong_ordering' from Avi Kivity
Convert all known tri-compares that return an int to return std::strong_ordering.
Returning an int is dangerous since the caller can treat it as a bool, and indeed
this series uncovered a minor bug (#9103).
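
A short sketch of the hazard, using made-up function names rather than
anything from the series. It shows how an int-returning tri-compare
silently converts to bool; the comment about `std::strong_ordering`
(C++20) describes the standard type, which has no conversion to bool.

```cpp
// Hypothetical int-returning tri-compare: negative, zero or positive.
int tri_compare(int a, int b) {
    return a < b ? -1 : (a > b ? 1 : 0);
}

// Bug: reads like "a is less than b", but the int converts to bool,
// so this is true whenever a != b. If tri_compare() returned
// std::strong_ordering instead, this line would fail to compile.
bool less_buggy(int a, int b) {
    return tri_compare(a, b);
}

// Correct: compare the result against zero explicitly.
bool less_fixed(int a, int b) {
    return tri_compare(a, b) < 0;
}
```

Here `less_buggy(2, 1)` returns true even though 2 is not less than 1,
while `less_fixed(2, 1)` returns false.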

Test: unit (dev)

Fixes #1449

Closes #9106

* github.com:scylladb/scylla:
  treewide: remove redundant "x <=> 0" compares
  test: mutation_test: convert internal tri-compare to std::strong_ordering
  utils: int_range: change to std::strong_ordering
  test: change some internal comparators to std::strong_ordering
  utils: big_decimal: change to std::strong_ordering
  utils: fragment_range: change to std::strong_ordering
  atomic_cell: change compare_atomic_cell_for_merge() to std::strong_ordering
  types: drop scaffolding erected around lexicographical_tri_compare
  sstables: keys: change to std::strong_ordering internally
  bytes: compare_unsigned(): change to std::strong_ordering
  uuid: change comparators to std::strong_ordering
  types: convert abstract_type::compare and related to std::strong_ordering
  types: reduce boilerplate when comparing empty value
  serialized_tri_compare: change to std::strong_ordering
  compound_compat: change to std::strong_ordering
  types: change lexicographical_tri_compare, prefix_equality_tri_compare to std::strong_ordering
2021-07-29 21:43:54 +02:00
Takuya ASADA
3ecdd15777 dist/debian: keep sysconfdir.conf for scylla-housekeeping on 'remove'
Same as 4309785: dpkg does not re-install conffiles when they are removed by
the user, so we are missing sysconfdir.conf for scylla-housekeeping on rollback.
To prevent this, we need to stop removing drop-in file directory on
'remove'.

Fixes #9109

Closes #9110
2021-07-29 12:32:21 +03:00
Avi Kivity
e44d3cc0ea Merge "Remove global storage service instance" from Pavel E
"
There are a few places that call the global storage service, but all
are easily fixable without significant changes.

1. alternator -- needs token metadata, switch to using proxy
2. api -- calls methods from storage service, all handlers are
   registered in main and can capture storage service from there
3. thrift -- calls methods from storage service, can carry the
   reference via controller
4. view -- needs tokens, switch to using (global) proxy
5. storage_service -- (surprisingly) can use "this"

tests: unit(dev), dtest(simple_boot_shutdown, dev)
"

* 'br-unglobal-storage-service' of https://github.com/xemul/scylla:
  storage_service: Make it local
  storage_service: Remove (de)?init_storage_service()
  storage_service: Use container() in run_with(out)_api_lock
  storage_service: Unmark update_topology static
  storage_service: Capture this when appropriate
  view: Use proxy to get token metadata from
  thrift: Use local storage service in handlers
  thrift: Carry sharded<storage_service>& down to handler
  api: Capture and use sharded<storage_service>& in handlers
  api: Carry sharded<storage_service>& down to some handlers
  alternator: Take token metadata from server's storage_proxy
  alternator: Keep storage_proxy on server
2021-07-29 11:47:16 +03:00
Avi Kivity
8d2255d82c Merge "Parallelize multishard_combining_reader_as_mutation_source test" from Pavel E
"
This is the 3rd slowest test in the set. There are 3 cases out
there that are hard-coded to be sequential. However, splitting
them into boost test cases helps running this test faster in
--parallel-cases mode. Timings for debug mode:

         Total before the patch: 25 min
     Sequential after the patch: 25 min
                     Basic case:  5 min
      Evict-paused-readers case:  5 min
    Single-mutation-buffer case: 15 min

tests: unit.multishard_combining_reader_as_mutation_source(debug)
"

* 'br-parallel-mcr-test' of https://github.com/xemul/scylla:
  test: Split test_multishard_combining_reader_as_mutation_source into 3
  test: Fix indentation after previous patch
  test: Move out internals of test_multishard_combining_reader_as_mutation_source
2021-07-29 11:39:02 +03:00
Raphael S. Carvalho
c399601833 table: kill move_sstables_from_staging()
not used anywhere.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20210728175403.86867-1-raphaelsc@scylladb.com>
2021-07-29 10:42:36 +03:00
Raphael S. Carvalho
eb16268768 table: Guarantee serialization of every sstable set update
Continuing the work from e4eb7df1a1, let's guarantee
serialization of sstable set updates by making all sites acquire
the mutation permit. The table then no longer relies on the serialization
mechanism of row cache's update functions.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20210728174740.78826-1-raphaelsc@scylladb.com>
2021-07-29 10:42:18 +03:00
Pavel Emelyanov
f9132b582b storage_service: Make it local
There are 3 places that can now declare local instance:

- main
- cql_test_env
- boost gossiper test

The global pointer is saved in debug namespace for debugging.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-29 05:12:36 +03:00
Pavel Emelyanov
055025eaa9 storage_service: Remove (de)?init_storage_service()
One of them just re-wraps arguments in std::ref and calls for
global storage service. The other one is dead code which also
calls the global s._s. Remove both and fix the only caller.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-29 05:12:36 +03:00
Pavel Emelyanov
2ffbe894b9 storage_service: Use container() in run_with(out)_api_lock
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-29 05:12:36 +03:00
Pavel Emelyanov
cd44a808be storage_service: Unmark update_topology static
And use container() to reshard to shard 0. This removes one
more call for global storage service instance.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-29 05:12:36 +03:00