Commit Graph

163 Commits

Author SHA1 Message Date
Avi Kivity
ae3a360725 database: Move database, keyspace, table classes to replica/ directory
The database, keyspace, and table classes represent the replica-only
part of the objects after which they are named. Reading from a table
doesn't give you the full data, just the replica's view, and it is not
consistent since reconciliation is applied on the coordinator.

As a first step in acknowledging this, move the related files to
a replica/ subdirectory.
2022-01-06 17:07:30 +02:00
Avi Kivity
d768e9fac5 cql3, related: switch to data_dictionary
Stop using database (and including database.hh) for schema related
purposes and use data_dictionary instead.

data_dictionary::database::real_database() is called from several
places, for these reasons:

 - calling yet-to-be-converted code
 - callers with a legitimate need to access data (e.g. system_keyspace)
   but with the ::database accessor removed from query_processor.
   We'll need to find another way to supply system_keyspace with
   data access.
 - to gain access to the wasm engine for testing whether
   user-defined functions compile. We'll have to find another way to
   do this as well.

The change is a straightforward replacement. One case in
modification_statement had to change a capture, but everything else
was just a search-and-replace.

Some files that lost "database.hh" gained "mutation.hh", which they
previously had access to through "database.hh".
2021-12-15 13:54:23 +02:00
Piotr Sarna
feea7cb920 Merge 'cql3: disentangle column_identifier from selectable' from Avi Kivity
column_identifier serves two purposes: a value type used to denote an
identifier (which may or may not map to a table column), and a `selectable`
implementation used for selecting table columns. This stands in the way
of further refactoring - the unification of the WHERE clause prepare path
(prepare_expression()) and the SELECT clause prepare path
(prepare_selectable()).

Reduce the entanglement by moving the selectable-specific parts to a new
type, selectable_column, and leaving column_identifier as a pure value type.

Closes #9729

* github.com:scylladb/scylla:
  cql3: move selectable_column to selectable.cc
  cql3: column_identifier: split selectable functionality off from column_identifier
2021-12-14 10:37:32 +01:00
Avi Kivity
3f862f9ece cql3: move selectable_column to selectable.cc
Move selectable_column to selectable.cc (and to the cql3::selection
namespace). This cleans up column_identifier.hh so it is now a pure
vocabulary header.
2021-12-10 19:51:57 +02:00
Nadav Har'El
c6f2afb93d Merge 'cql3: Allow to skip EQ restricted columns in ORDER BY' from Jan Ciołek
In queries like:
```cql
SELECT * FROM t WHERE p = 0 AND c1 = 0 ORDER BY (c1 ASC, c2 ASC)
```
we can skip the requirement to specify ordering for `c1` column.

The `c1` column is restricted by an `EQ` restriction, so it can have
at most one value anyway and there is no need to sort by it.

This commit makes it possible to write just:
```cql
SELECT * FROM t WHERE p = 0 AND c1 = 0 ORDER BY (c2 ASC)
```
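
The rule can be sketched as a small validation routine. This is a minimal illustration and not the code from select_statement.cc: columns are modelled as plain strings, `ordering_is_valid` is a hypothetical name, and the ASC/DESC direction checks are left out.

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Hypothetical, simplified model: columns are plain names.
// Returns true when `order_by` lists clustering columns in clustering order,
// where columns that carry an EQ restriction may be omitted.
bool ordering_is_valid(const std::vector<std::string>& clustering_columns,
                       const std::set<std::string>& eq_restricted,
                       const std::vector<std::string>& order_by) {
    std::size_t next = 0; // index of the next clustering column to match
    for (const std::string& col : order_by) {
        // Skip EQ-restricted clustering columns that ORDER BY did not mention.
        while (next < clustering_columns.size()
               && clustering_columns[next] != col
               && eq_restricted.count(clustering_columns[next]) != 0) {
            ++next;
        }
        if (next == clustering_columns.size() || clustering_columns[next] != col) {
            return false; // out of order, or a column without EQ was skipped
        }
        ++next;
    }
    return true;
}
```

With clustering columns `c1, c2` and an EQ restriction on `c1`, both `ORDER BY (c2)` and `ORDER BY (c1, c2)` pass this check, while `ORDER BY (c2)` without the restriction does not.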

I reorganized the ordering code; I feel that it's now clearer and easier to understand.
It's possible to introduce only a small change to the existing code, but I feel that it becomes a bit too messy.
I tried it out on the [`orderby_disorder_small`](https://github.com/cvybhu/scylla/commits/orderby_disorder_small) branch.

The diff is a bit messy because I moved all ordering functions to one place;
it's better to read [select_statement.cc](https://github.com/cvybhu/scylla/blob/orderby_disorder/cql3/statements/select_statement.cc#L1495-L1658) lines 1495-1658 directly.

In the new code it would also be trivial to allow specifying columns in any order; we would just have to sort them.
For now I commented out the code needed to do that, because the point of this PR was to fix #2247.
Allowing this would require some more work to change the existing tests.

Fixes: #2247

Closes #9518

* github.com:scylladb/scylla:
  cql-pytest: Enable test for skipping eq restricted columns in order by
  cql3: Allow to skip EQ restricted columns in ORDER BY
  cql3: Add has_eq_restriction_on_column function
  cql3: Reorganize orderings code
2021-12-09 21:11:56 +03:00
Jan Ciolek
7bbfa48bc5 cql3: Add has_eq_restriction_on_column function
Adds a function that checks whether a given expression has an EQ restriction
on the specified column.

It finds restrictions like
col = ...
or
(col, col2) = ...

IN restrictions don't count; they aren't EQ restrictions.
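
A minimal sketch of such a check, assuming a flattened WHERE clause instead of the real expression tree. The `restriction` struct and `oper_t` enum are hypothetical stand-ins for illustration only.

```cpp
#include <string>
#include <vector>

enum class oper_t { EQ, IN, LT, GT };

// Hypothetical stand-in for one restriction: `lhs` holds one column for
// `col = ...` or several columns for `(col, col2) = ...`.
struct restriction {
    std::vector<std::string> lhs;
    oper_t op;
};

// True when some EQ restriction mentions `column` on its left-hand side.
bool has_eq_restriction_on_column(const std::vector<restriction>& where_clause,
                                  const std::string& column) {
    for (const restriction& r : where_clause) {
        if (r.op != oper_t::EQ) {
            continue; // IN, LT, GT and so on do not count
        }
        for (const std::string& c : r.lhs) {
            if (c == column) {
                return true;
            }
        }
    }
    return false;
}
```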

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-12-09 12:06:43 +01:00
Avi Kivity
edaa0c468d cql3: expr: standardize on struct tag for expression components
Expression components are pure data, so emphasize this by using
the struct tag consistently. This is just a cosmetic change.

Closes #9740
2021-12-07 15:46:25 +02:00
Piotr Sarna
0bd139e81c Merge 'cql3: expr: detemplate and deinline find_in_expression() and count_if()' from Avi Kivity

The expression code provides some utilities to examine and manipulate
expressions at prepare time. These are not (or should not be) in the fast
path and so should be optimized for compile time and code footprint
rather than run time.

This series does so by detemplating and deinlining find_in_expression()
and count_if().

Closes #9712

* github.com:scylladb/scylla:
  cql3: expr: adjust indentation in recurse_until()
  cql3: expr: detemplate count_if()
  cql3: expr: detemplate count_if()
  cql3: expr: rewrite count_if() in terms of recurse_until()
  cql3: expr: deinline recurse_until()
  cql3: expr: detemplate find_in_expression
2021-12-03 15:41:07 +01:00
Jan Ciolek
be14904416 cql3: Don't allow unset values inside UDT
Scylla doesn't support unset values inside UDT.
The old code used to convert unset to null, which seems incorrect.

There is an extra space in the error message to retain compatibility with Cassandra.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-12-03 14:46:21 +01:00
Avi Kivity
2c613b027d cql3: expr: adjust indentation in recurse_until()
Whitespace changes only.
2021-11-30 17:57:53 +02:00
Avi Kivity
f7f77df143 cql3: expr: detemplate count_if()
No functional changes. This prepare-path function does not need to
be inlined.
2021-11-30 17:52:15 +02:00
Avi Kivity
3a96b74e49 cql3: expr: detemplate count_if()
count_if() is a prepare-path function and does not need to be
a template. Type-erase it with noncopyable_function.
2021-11-30 17:50:34 +02:00
Avi Kivity
6f9e56e678 cql3: expr: rewrite count_if() in terms of recurse_until()
Counting is just recursing without early termination, and counting
as a side effect.
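
The idea can be sketched on a toy tree. This is an illustration under assumptions, not the Scylla code: `node` is a hypothetical stand-in for an expression, and `std::function` is used for type erasure where the series uses `noncopyable_function`.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical stand-in for an expression tree node.
struct node {
    int value;
    std::vector<node> children;
};

// Depth-first walk that stops at the first node for which `stop` returns
// true and returns it, or returns nullptr when no node matches.
// Taking a type-erased callable keeps this a plain, non-template function.
const node* recurse_until(const node& n, const std::function<bool(const node&)>& stop) {
    if (stop(n)) {
        return &n;
    }
    for (const node& child : n.children) {
        if (const node* found = recurse_until(child, stop)) {
            return found;
        }
    }
    return nullptr;
}

// Counting is the same walk with early termination disabled: the callback
// increments a counter as a side effect and always returns false.
std::size_t count_if(const node& root, const std::function<bool(const node&)>& pred) {
    std::size_t count = 0;
    recurse_until(root, [&](const node& n) {
        if (pred(n)) {
            ++count;
        }
        return false;
    });
    return count;
}
```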
2021-11-30 17:49:00 +02:00
Avi Kivity
c01188c414 cql3: expr: deinline recurse_until()
As a prepare-path function, it has no business being inline.
2021-11-30 17:41:16 +02:00
Avi Kivity
d0177d4b85 cql3: expr: detemplate find_in_expression
find_in_expression() is not in a fast path but is quite large
and inlined due to being a template. Detemplate it into a
recurse_until() utility function, and keep only the minimal
code in a template.

The recurse_until is still inline to simplify review, but
will be deinlined in the next patch.
2021-11-30 17:37:24 +02:00
Avi Kivity
595cc328b1 Merge 'cql3: Remove term, replace with expression' from Jan Ciołek
This PR finally removes the `term` class and replaces it with `expression`.

* There was some trouble with `lwt_cache_id` in `expr::function_call`.
  The current code works the following way:
  * for each `function_call` inside a `term` that describes a pk restriction, `prepare_context::add_pk_function_call` is called.
  * `add_pk_function_call` takes a `::shared_ptr<cql3::functions::function_call>`, sets its `cache_id` and pushes this shared pointer onto a vector of all collected function calls
  * Later when some condition is met we want to clear cache ids of all those collected function calls. To do this we iterate through shared pointers collected in `prepare_context` and clear cache id for each of them.

  This doesn't work with `expr::function_call` because it isn't kept inside a shared pointer.
  To solve this I put the `lwt_cache_id` inside a shared pointer and then `prepare_context` collects these shared pointers to cache ids.

  I also experimented with doing this without any shared pointers, maybe we could just walk through the expression and clear the cache ids ourselves. But the problem is that expressions are copied all the time, we could clear the cache in one place, but forget about a copy. Doing it using shared pointers more closely matches the original behaviour.
  The experiment is on the [term2-pr3-backup-altcache](https://github.com/cvybhu/scylla/tree/term2-pr3-backup-altcache) branch.
* `shared_ptr<term>` being `nullptr` could mean:
  * It represents a cql value `null`
  * That there is no value, like `std::nullopt` (for example in `attributes.hh`)
  * That it's a mistake, it shouldn't be possible

  A good way to distinguish between optional and mistake is to look for `my_term->bind_and_get()`, we then know that it's not an optional value.

* On the other hand `raw_value` cast to bool means:
   * `false` - null or unset
   * `true` - some value, maybe empty

I ran a simple benchmark on my laptop to see how performance is affected:
```
build/release/test/perf/perf_simple_query --smp 1 -m 1G --operations-per-shard 1000000 --task-quota-ms 10
```
* On master (a21b1fbb2f) I get:
  ```
  176506.60 tps ( 77.0 allocs/op,  12.0 tasks/op,   45831 insns/op)

  median 176506.60 tps ( 77.0 allocs/op,  12.0 tasks/op,   45831 insns/op)
  median absolute deviation: 0.00
  maximum: 176506.60
  minimum: 176506.60
  ```
* On this branch I get:
  ```
  172225.30 tps ( 75.1 allocs/op,  12.1 tasks/op,   46106 insns/op)

  median 172225.30 tps ( 75.1 allocs/op,  12.1 tasks/op,   46106 insns/op)
  median absolute deviation: 0.00
  maximum: 172225.30
  minimum: 172225.30
  ```

Closes #9481

* github.com:scylladb/scylla:
  cql3: Remove remaining mentions of term
  cql3: Remove term
  cql3: Rename prepare_term to prepare_expression
  cql3: Make prepare_term return an expression instead of term
  cql3: expr: Add size check to evaluate_set
  cql3: expr: Add expr::contains_bind_marker
  cql3: expr: Rename find_atom to find_binop
  cql3: expr: Add find_in_expression
  cql3: Remove term in operations
  cql3: Remove term in relations
  cql3: Remove term in multi_column_restrictions
  cql3: Remove term in term_slice, rename to bounds_slice
  cql3: expr: Remove term in expression
  cql3: expr: Add evaluate_IN_list(expression, options)
  cql3: Remove term in column_condition
  cql3: Remove term in select_statement
  cql3: Remove term in update_statement
  cql3: Use internal cql format in insert_prepared_json_statement cache
  types: Add map_type_impl::serialize(range of <bytes, bytes>)
  cql3: Remove term in cql3/attributes
  cql3: expr: Add constant::view() method
  cql3: expr: Implement fill_prepare_context(expression)
  cql3: expr: add expr::visit that takes a mutable expression
  cql3: expr: Add receiver to expr::bind_variable
2021-11-30 16:39:39 +02:00
Jan Ciolek
51a8a1f89b cql3: Remove remaining mentions of term
There were a few places where term was still mentioned.
Removed/replaced term with expression.

search_and_replace is still done only on LHS of binary_operator
because the existing code would break otherwise.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-11-04 15:57:00 +01:00
Jan Ciolek
e458340821 cql3: Remove term
term isn't used anywhere now. We can remove it and all classes that derive from it.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-11-04 15:56:45 +01:00
Jan Ciolek
dcd3199037 cql3: Rename prepare_term to prepare_expression
prepare_term now takes an expression and returns a prepared expression.
It should be renamed to prepare_expression.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-11-04 15:56:45 +01:00
Jan Ciolek
219f1a4359 cql3: Make prepare_term return an expression instead of term
prepare_term is now the only function that uses terms.
Change it so that it returns expression instead of term
and remove all occurrences of expr::to_expression(prepare_term(...))

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-11-04 15:56:45 +01:00
Jan Ciolek
c84e941df9 cql3: expr: Add size check to evaluate_set
In the old code, sets::delayed_value::bind() contained a check that each serialized value is smaller than a certain size.
I missed this when implementing evaluate(), so it's brought back to ensure identical behaviour.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-11-04 15:56:45 +01:00
Jan Ciolek
7bc65868eb cql3: expr: Add expr::contains_bind_marker
Add a function that checks whether there is a bind marker somewhere inside an expression.
It's important to note that even when there are no bind markers, there can be other things that prevent immediate evaluation of an expression.
For example an expression can contain calls to nonpure functions.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-11-04 15:56:45 +01:00
Jan Ciolek
080286cb96 cql3: expr: Rename find_atom to find_binop
Soon there will be other functions that
also search in expression, find_atom would be confusing then.
find_binop is a more descriptive name.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-11-04 15:56:45 +01:00
Jan Ciolek
7cabed9ebf cql3: expr: Add find_in_expression
find_in_expression is a function that looks into the expression
and finds the given expression variant for which the predicate function returns true.
If nothing is found returns nullptr.

For example:
find_in_expression<binary_operator>(e, [](const binary_operator&) {return true;})
Will return the first binary operator found in the expression.

It is now used in find_atom, and soon will be used in other similar functions.
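
A rough sketch of the lookup, assuming a flattened model in which each node stores its own kind in a `std::variant` plus a list of child nodes. The real expression type is organised differently; the struct layout here is hypothetical.

```cpp
#include <string>
#include <variant>
#include <vector>

struct column_value { std::string name; };
struct constant { int value; };
struct binary_operator { std::string op; };

// Hypothetical flattened model: a node kind plus its child nodes.
struct expression {
    std::variant<column_value, constant, binary_operator> node;
    std::vector<expression> children;
};

// Returns the first node of type T (depth-first) for which `pred` returns
// true, or nullptr when there is none.
template <typename T, typename Pred>
const T* find_in_expression(const expression& e, Pred pred) {
    if (const T* candidate = std::get_if<T>(&e.node); candidate && pred(*candidate)) {
        return candidate;
    }
    for (const expression& child : e.children) {
        if (const T* found = find_in_expression<T>(child, pred)) {
            return found;
        }
    }
    return nullptr;
}
```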

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-11-04 15:56:45 +01:00
Jan Ciolek
e37906ae34 cql3: expr: Remove term in expression
Some structs inside the expression variant still contained term.
Replace those terms with expression.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-11-04 15:56:44 +01:00
Jan Ciolek
fd1596171e cql3: expr: Add evaluate_IN_list(expression, options)
evaluate_IN_list was only defined for a term,
but now we are removing term so it should be also defined for an expression.
The internal code is the same - this function used to convert the term to expression
and then did all operations on expression.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-10-28 20:55:09 +02:00
Jan Ciolek
e5391f1eed types: Add map_type_impl::serialize(range of <bytes, bytes>)
Adds two functions that take a range over pairs of serialized values
and return a serialized map value.

There are 2 functions - one operating on bytes and one operating on managed_bytes.
The managed_bytes version is used in expression.cc, where it used to be a local static function.
The bytes version will be used in type_json.cc in the next commit.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-10-28 15:14:52 +02:00
Jan Ciolek
a82351dc79 cql3: expr: Add constant::view() method
Add a method that returns raw_value_view to expr::constant.

It's added for convenience - without it in many places
we would have to write my_value.value.to_view().

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-10-28 15:14:52 +02:00
Jan Ciolek
c2eb3a58b8 cql3: expr: Implement fill_prepare_context(expression)
Adds a new function - expr::fill_prepare_context.
This function has the same functionality as term::fill_prepare_context, which will be removed soon.

fill_prepare_context used to take its argument with a const qualifier, but it turns out that the argume>
It sets the cache ids of function calls corresponding to partition key restrictions.
New function doesn't have const to make this clear and avoid surprises.

Added expr::visit that takes an argument without const qualifier.

There were some problems with cache_ids in function_call.
prepare_context used to collect ::shared_ptr<functions::function_call>
of some function call, and then this allowed it to clear
cache ids of all involved functions on demand.

To replicate this prepare_context now collects
shared pointers to expr::function_call cache ids.

It currently collects both, but functions::function_call will be removed soon.
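
The shared-pointer arrangement can be sketched as follows. This is a simplified illustration: the `cache_id` alias, the struct layout, and the name `clear_pk_function_calls_cache` are assumptions made for the example; only `add_pk_function_call` is named in the series.

```cpp
#include <cstdint>
#include <memory>
#include <optional>
#include <vector>

using cache_id = std::optional<std::uint8_t>;

// Simplified stand-in for expr::function_call. The id lives behind a shared
// pointer, so every copy of the expression refers to the same id.
struct function_call {
    std::shared_ptr<cache_id> lwt_cache_id = std::make_shared<cache_id>();
};

class prepare_context {
    std::vector<std::shared_ptr<cache_id>> _pk_function_call_ids;
public:
    // Assigns the next id and remembers where it is stored.
    void add_pk_function_call(function_call& fn) {
        *fn.lwt_cache_id = static_cast<std::uint8_t>(_pk_function_call_ids.size());
        _pk_function_call_ids.push_back(fn.lwt_cache_id);
    }

    // Hypothetical name: clears every collected id, which also affects all
    // copies of the function calls that share those ids.
    void clear_pk_function_calls_cache() {
        for (const std::shared_ptr<cache_id>& id : _pk_function_call_ids) {
            *id = std::nullopt;
        }
    }
};
```

Because a copy of a `function_call` shares the pointer, clearing through the context cannot miss a copy, which is the problem the pointer-free experiment ran into.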

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-10-28 15:14:52 +02:00
Jan Ciolek
edaa3b5dc2 cql3: expr: add expr::visit that takes a mutable expression
Currently expr::visit can only take a const expression as an argument.
For cases where we want to visit the expression and modify it a new function is needed.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-10-28 15:14:52 +02:00
Jan Ciolek
9c40516071 cql3: expr: Add receiver to expr::bind_variable
bind_variable used to have only the type of bound value.
Now this type is replaced with receiver, which describes information about column corresponding to this value.
A receiver contains type, column name, etc.

Receiver is needed in order to implement fill_prepare_context in the next commit.
It's an argument of prepare_context::add_variable_specification.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-10-28 15:14:52 +02:00
Avi Kivity
9424f6e12f cql3: replace seastar::sprint() with fmt::format()
sprint() is obsolete. Note some calls were to helper functions that
use sprint(), not to sprint() directly, so both the helpers and
the callers were modified.
2021-10-27 17:02:00 +03:00
Avi Kivity
fd8beeaea9 treewide: handle switch statements that return
A switch statement where every case returns triggers a gcc
warning if the surrounding function doesn't return/abort.

Fix by adding an abort(). The abort() will never trigger since we
have a warning on unhandled switch cases.
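
A minimal sketch of the pattern, using a hypothetical enum and function name rather than code from the tree:

```cpp
#include <cstdlib>
#include <string_view>

enum class oper_t { EQ, NEQ, LT };

// Every case returns, but gcc still sees a path that falls off the end of
// the function (an out-of-range enum value), so it warns without the abort().
std::string_view operator_text(oper_t op) {
    switch (op) {
    case oper_t::EQ:
        return "=";
    case oper_t::NEQ:
        return "!=";
    case oper_t::LT:
        return "<";
    }
    // Not reached for valid enumerators. Leaving out `default:` keeps the
    // unhandled-case warning useful when a new enumerator is added.
    std::abort();
}
```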
2021-10-10 18:16:50 +03:00
Avi Kivity
b08c299713 cql3: expr: correct type of captured map value_type
A map's value_type has a const key, but in two places we omitted
the const. This causes construction of a new value, plus gcc
complaining that we're referring to a temporary.

Fix by using the correct type.
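
The pitfall can be illustrated with a plain `std::map`. This is a standalone sketch; `map_t` and `sum_values` are hypothetical names.

```cpp
#include <map>
#include <string>
#include <type_traits>
#include <utility>

using map_t = std::map<std::string, int>;

// The element type has a const key.
static_assert(std::is_same_v<map_t::value_type, std::pair<const std::string, int>>);

int sum_values(const map_t& m) {
    int total = 0;
    // Writing `const std::pair<std::string, int>&` here would still compile,
    // but each iteration would copy the element into a temporary pair and
    // bind the reference to that temporary.
    for (const map_t::value_type& entry : m) {
        total += entry.second;
    }
    return total;
}
```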
2021-10-06 14:57:43 +03:00
Avi Kivity
c72906a2ee cql3: expr: drop nested_expression
Now that expression can be nested in its component types
directly, we can remove nested_expression. Most of the patch
adjusts uses to drop the dereference that was needed for
nested_expression.
2021-09-28 23:49:21 +03:00
Avi Kivity
448c06f150 cql3: expr: make expression forward declarable, easier to use
Make expression a class, holding a unique_ptr to a variant,
instead of just a variant.

This has some advantages:
 - the constructor can be properly constrained
 - the type can be forward-declared
 - the type name is just "expression", rather than
   a huge variant. This makes compiler error messages easier
   to read.
 - the internal indirection allows removal of nested_expression
   (later in the series)
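
A minimal sketch of the shape this describes, with two made-up component types. The names `negation`, `impl`, and `is_expression_element` are hypothetical, and the constraint uses `enable_if` so the example builds as C++17.

```cpp
#include <memory>
#include <type_traits>
#include <utility>
#include <variant>

class expression;        // forward declaration works: expression is a class

struct constant { int value; };
struct negation;         // nests an expression, so it is defined after the class

// Constraint for the converting constructor: accept component types only.
template <typename T>
inline constexpr bool is_expression_element =
    std::is_same_v<T, constant> || std::is_same_v<T, negation>;

class expression {
    struct impl;                     // wraps the variant, defined further down
    std::unique_ptr<impl> _v;        // the internal indirection
public:
    template <typename T, typename = std::enable_if_t<is_expression_element<T>>>
    expression(T element);
    expression(const expression& other);
    expression(expression&& other) noexcept;
    expression& operator=(expression other) noexcept;
    ~expression();

    template <typename T>
    const T* as_if() const;
};

// Components can hold an expression directly, without a wrapper type.
struct negation { expression arg; };

struct expression::impl {
    std::variant<constant, negation> v;
};

template <typename T, typename>
expression::expression(T element)
    : _v(std::make_unique<impl>(
          impl{std::variant<constant, negation>(std::in_place_type<T>, std::move(element))})) {}

// Deep copy; a moved-from expression must not be copied in this sketch.
inline expression::expression(const expression& other)
    : _v(std::make_unique<impl>(*other._v)) {}
inline expression::expression(expression&& other) noexcept = default;
inline expression& expression::operator=(expression other) noexcept {
    _v = std::move(other._v);
    return *this;
}
inline expression::~expression() = default;

template <typename T>
const T* expression::as_if() const {
    return std::get_if<T>(&_v->v);
}
```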
2021-09-28 23:49:21 +03:00
Avi Kivity
be44b579a1 cql3: expr: introduce as/as_if/is
Simple wrappers for std::get, std::get_if, std::holds_alternative.

The new names are shorter and IMO more readable.

Call sites are updated.

We will later replace the implementation.
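
Such wrappers might look roughly like this while expression is still a plain `std::variant`. The two component structs are simplified stand-ins, and the exact signatures in the tree may differ.

```cpp
#include <string>
#include <variant>

struct constant { int value; };
struct column_value { std::string name; };
using expression = std::variant<constant, column_value>;

// Wrapper for std::holds_alternative.
template <typename T>
bool is(const expression& e) {
    return std::holds_alternative<T>(e);
}

// Wrapper for std::get; throws std::bad_variant_access on a type mismatch.
template <typename T>
const T& as(const expression& e) {
    return std::get<T>(e);
}

// Wrapper for std::get_if; returns nullptr on a type mismatch.
template <typename T>
const T* as_if(const expression* e) {
    return std::get_if<T>(e);
}
```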
2021-09-28 23:49:11 +03:00
Avi Kivity
e7db3def4f cql3: expr: introduce expr::visit, replacing std::visit
The new expr::visit() is just a wrapper around std::visit(),
but has better constraints. A call to expr::visit() with a
visitor that misses an overload will produce an error message
that points at the missing type. This is done using the new
invocable_on_expression concept. Note it lists the expression
types one by one rather than using template magic, since
otherwise we won't get the nice messages.

Later, we will change the implementation when expression becomes
our own type rather than std::variant.

Call sites are updated.
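
The effect can be approximated as follows. This sketch swaps the invocable_on_expression concept for per-type `static_assert` checks so that it builds as C++17; the component types and the `describe` visitor are made up for the example.

```cpp
#include <string>
#include <type_traits>
#include <utility>
#include <variant>

namespace expr {

struct constant { int value; };
struct column_value { std::string name; };
using expression = std::variant<constant, column_value>;

// Thin wrapper around std::visit. The component types are checked one by
// one, so a visitor with a missing overload fails on the line that names it.
template <typename Visitor>
decltype(auto) visit(Visitor&& visitor, const expression& e) {
    static_assert(std::is_invocable_v<Visitor, const constant&>,
                  "visitor does not handle constant");
    static_assert(std::is_invocable_v<Visitor, const column_value&>,
                  "visitor does not handle column_value");
    return std::visit(std::forward<Visitor>(visitor), e);
}

// Example visitor with one overload per component type.
struct describe {
    std::string operator()(const constant& c) const {
        return "constant " + std::to_string(c.value);
    }
    std::string operator()(const column_value& c) const {
        return "column " + c.name;
    }
};

} // namespace expr
```

Callers should use the qualified name `expr::visit(...)`: an unqualified `visit(...)` on a `std::variant` argument would also consider `std::visit` through argument-dependent lookup.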
2021-09-28 23:48:42 +03:00
Jan Ciolek
c672c0b42d cql3: expr: Convert evaluate_IN_list to use evaluate(expression)
evaluate_IN_list used term::bind(), but now it's possible
to make it use term::to_expression() and then evaluate(expression)

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-09-24 11:05:53 +02:00
Jan Ciolek
7ab14ca9c1 cql3: expr: Use only evaluate(expression) to evaluate term
Finally we don't need term::bind() to evaluate a term.
We can just convert the term to expression and call evaluate(expression).

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-09-24 11:05:53 +02:00
Jan Ciolek
ea02fd82bc cql3: expr: Implement evaluate(expr::function_call)
function_call can be evaluated now.
The code matches the one from functions::function_call::bind.

I needed to add cache id to function_call in order for it to work properly.
See the blurb in struct function_call for more information.

New code corresponds to bind() in cql3/functions/functions.cc.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-09-24 11:05:53 +02:00
Jan Ciolek
4a035b07d3 cql3: expr: Implement evaluate(expr::usertype_constructor)
usertype_constructor can now be evaluated.

To evaluate a usertype_constructor we need to know the type,
because the fields have to be in the correct order.
Type has been added to usertype_constructor.

New code corresponds to old bind() of user_types::delayed_value in cql3/user_types.cc.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-09-24 11:05:53 +02:00
Jan Ciolek
f7ee40aa01 cql3: expr: Implement evaluate(expr::collection_constructor)
collection_constructor can now be evaluated.
There is a bit of a problem, because we don't know the type of an empty collection_constructor,
but luckily empty collection constructors get converted to constants during preparation.

For some reason in the original code when a collection contains unset_value,
the whole collection is automatically evaluated to unset_value. I didn't change this behaviour.

New code corresponds to old bind() of lists::delayed_value in cql3/lists.cc, sets::delayed_value etc.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-09-24 11:05:53 +02:00
Jan Ciolek
0f20d301d8 cql3: expr: Implement evaluate(expr::tuple_constructor)
Tuple constructors can now be evaluated.
New code corresponds to old bind() of tuples::delayed_value::marker in cql3/tuples.cc

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-09-24 11:05:53 +02:00
Jan Ciolek
5589f348e7 cql3: expr: Implement evaluate(expr::bind_variable)
Implement evaluating a bind_variable.
To be able to evaluate a bind_variable we need to know the type of the bound value.
This is why a data_type has been added to the bind_variable struct.

There are some quirks when evaluating a bind_variable.
The first problem occurs when the variable has been sent with an older cql serialization format and contains collections.
In that case the value has to be reserialized to use the newest cql serialization format.

The second problem occurs when there is a set or a map in the value.
The set value sent by the driver might not have the elements in the correct order, contain duplicates etc.
When a set or map is detected in the value it is reserialized as well.

collection_type_impl::reserialize doesn't work for this purpose, because it uses data_value which does not perform sorting or removal.
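
The normalization step for a set can be sketched like this. It is an illustration only: elements are modelled as already-serialized strings and compared bytewise, whereas the real code would have to order them with the element type's own comparison. `normalize_set_elements` is a hypothetical name.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Sorts the elements received from the driver and drops duplicates, so the
// value can be reserialized in canonical form.
// Assumption: bytewise order stands in for the element type's ordering.
std::vector<std::string> normalize_set_elements(std::vector<std::string> elements) {
    std::sort(elements.begin(), elements.end());
    elements.erase(std::unique(elements.begin(), elements.end()), elements.end());
    return elements;
}
```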

New code corresponds to old bind() of lists::marker in cql3/lists.cc, sets::marker etc.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-09-24 11:05:53 +02:00
Jan Ciolek
f0e238f0a6 cql3: expr: Add evaluate(expression, query_options)
Add a function that takes an expression and evaluates it to a constant.
Evaluating specific expression variants will be implemented in the following commits.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-09-24 11:05:53 +02:00
Jan Ciolek
c40f227c14 cql3: Implement term::to_expression for marker classes
Implement to_expression for non terminals that represent a bind marker.
For now each bind marker has a shape describing where it is used, but hopefully this can be removed in the future.

In order to evaluate a bind_variable we need to know its type.
The type is needed to pass to constant and to validate the value.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-09-24 11:05:53 +02:00
Jan Ciolek
499c9235fc cql3: expr: Add data_type to *_constructor structs
It is useful to have a data_type in *_constructor structs when evaluating.
The resulting constant has a data_type, so we have to find it somehow.

For tuple_constructor we don't have to create a separate tuple_type_impl instance.
For collection_constructor we know what the type is even in case of an empty collection.
For usertype_constructor we know the name, type and order of fields in the user type.

Additionally without a data_type we wouldn't know whether the type is reversed or not.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-09-24 11:05:53 +02:00
Jan Ciolek
f86a1270b0 cql3: Add term::to_expression method
Add a method that converts given term to the matching expression.
It will be used as an intermediate step when implementing evaluate(expression).
evaluate(term) will convert the term to the expression and then call evaluate(expression).

For terminals this is simply calling get() to serialize the value.
For non-terminals the implementation is more complicated and will be implemented in the following commits.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-09-24 11:05:53 +02:00
Jan Ciolek
746e9c620f cql3: Reorganize term and expression includes
Make term.hh include expression.hh instead of the other way around.
expression can't be forward declared.
expression is needed in term.hh to declare term::to_expression().

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-09-24 11:05:53 +02:00