column_value::sub has been replaced by the subscript struct
everywhere, so we can finally remove it.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
All handlers for subscript have finally been implemented
and subscript can now be added to expression without
any trouble.
All the commented out code that waited for this moment
can now be uncommented.
Every such piece of code had a `TODO(subscript)` note
and by grepping this phrase we can make sure that
we didn't forget any of them.
Right now there are two ways to express a subscripted
column - either by a column_value with a sub field
or by using a subscript struct.
The grammar still uses the old column_value way,
but column_value.sub will be removed soon
and everything will move to the subscript struct.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Even though the new subscript allows for subscripting anything,
the only thing that is really allowed to be subscripted is a column.
Add a utility function that extracts the column_value
from an expression which is a column_value or subscript.
It will come in handy in the following commits.
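A minimal sketch of what such a helper could look like, assuming a simplified expression type built on std::variant. The function name `get_subscripted_column` and the struct layouts are hypothetical stand-ins, not the actual definitions:

```cpp
#include <memory>
#include <stdexcept>
#include <string>
#include <variant>

// Hypothetical, simplified AST nodes.
struct column_value { std::string name; };
struct constant { int value; };
struct subscript;

struct expression {
    // subscript is stored behind a pointer because it contains expressions itself.
    std::variant<column_value, constant, std::shared_ptr<subscript>> v;
};

struct subscript {
    expression val;  // the subscripted value, e.g. `col` in col[x]
    expression sub;  // the index, e.g. `x` in col[x]
};

// Returns the column behind `col` or `col[x]`; anything else is rejected.
const column_value& get_subscripted_column(const expression& e) {
    if (const auto* col = std::get_if<column_value>(&e.v)) {
        return *col;
    }
    if (const auto* sub = std::get_if<std::shared_ptr<subscript>>(&e.v)) {
        // Only a column may be subscripted, so the subscripted value must be one.
        if (const auto* col = std::get_if<column_value>(&(*sub)->val.v)) {
            return *col;
        }
    }
    throw std::invalid_argument("expected a column or a subscripted column");
}
```

Callers that only care about which column is involved can then treat `col` and `col[x]` uniformly.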
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Add a struct called subscript, which will be used in expression
variant to represent subscripted values e.g col[x], val[sub].
It will replace the sub field of column_value.
Having a separate struct in the AST for this purpose
is cleaner and allows subscripting values other
than column_value to be expressed.
It is not added to the expression variant yet, because
that would require immediately implementing all visitors.
The following commits will implement individual visitors
and then subscript will finally be added to expression.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
expr::visit was missing std::forward on the visitor.
In cases where the visitor was passed as an rvalue it wouldn't
be properly forwarded to std::visit.
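The effect can be observed with a visitor whose call operator is overloaded on its own value category. This is only an illustrative sketch: `expression` is a stand-in variant, and `visit_unforwarded` / `visit_forwarded` are hypothetical names for the behavior before and after the fix:

```cpp
#include <string>
#include <utility>
#include <variant>

using expression = std::variant<int, std::string>;  // stand-in, not the real type

// Reports whether it was invoked as an lvalue or as an rvalue.
struct category_probe {
    template <typename T>
    const char* operator()(const T&) & { return "lvalue"; }
    template <typename T>
    const char* operator()(const T&) && { return "rvalue"; }
};

// Without std::forward: `v` is a named parameter, so it is always passed as an lvalue.
template <typename Visitor>
decltype(auto) visit_unforwarded(Visitor&& v, const expression& e) {
    return std::visit(v, e);
}

// With std::forward: the visitor keeps the value category the caller used.
template <typename Visitor>
decltype(auto) visit_forwarded(Visitor&& v, const expression& e) {
    return std::visit(std::forward<Visitor>(v), e);
}
```

Passing `category_probe{}` to `visit_unforwarded` selects the `&` overload even though the caller supplied a temporary, while `visit_forwarded` selects the `&&` overload.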
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
expr::visit had a bug where if we wanted to return
a reference in the visitor, the reference would be
to a temporary stack location instead of the passed
argument.
So trying to do something like this:
```cpp
const bind_variable& ref = visit(overloaded_functor {
    [](const bind_variable& bv) -> const bind_variable& { return bv; },
    [](const auto&) -> const bind_variable& { ... }
}, e);
std::cout << ref << std::endl;
```
would actually print a random location on the stack instead
of the valid value inside of e.
Additionally, trying to return a non-const reference
doesn't even compile.
The problem was that the return type of expr::visit
was defined as `auto`, which can be `int`, but not `int&`.
This has been changed to `decltype(auto)`, which can be both `int` and `int&`.
New version of `expr::visit` works for `const expression&` and `expression&`
no matter what the visitor returns.
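The difference between the two deduction rules can be shown in isolation. In this sketch the types and the names `visit_by_value`, `visit_preserving` and `get_bind_variable` are hypothetical; only the `auto` versus `decltype(auto)` distinction is taken from the description above:

```cpp
#include <type_traits>
#include <utility>
#include <variant>

// Hypothetical stand-ins for the real expression types.
struct bind_variable { int index; };
struct constant { int value; };
using expression = std::variant<bind_variable, constant>;

// `auto` strips references: the caller receives a copy.
template <typename Visitor>
auto visit_by_value(Visitor&& v, const expression& e) {
    return std::visit(std::forward<Visitor>(v), e);
}

// `decltype(auto)` keeps the exact type std::visit returns, references included.
template <typename Visitor>
decltype(auto) visit_preserving(Visitor&& v, const expression& e) {
    return std::visit(std::forward<Visitor>(v), e);
}

// A visitor that returns a reference into the visited expression.
struct get_bind_variable {
    const bind_variable& operator()(const bind_variable& bv) const { return bv; }
    const bind_variable& operator()(const constant&) const {
        static const bind_variable none{-1};  // fallback for non-matching alternatives
        return none;
    }
};
```

With `visit_preserving`, the returned reference points at the object stored inside `e`, so later changes to `e` stay visible through it.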
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes were applied mechanically with a script, except for
licenses/README.md.
Closes #9937
The database, keyspace, and table classes represent the replica-only
part of the objects after which they are named. Reading from a table
doesn't give you the full data, just the replica's view, and it is not
consistent since reconciliation is applied on the coordinator.
As a first step in acknowledging this, move the related files to
a replica/ subdirectory.
Stop using database (and including database.hh) for schema related
purposes and use data_dictionary instead.
data_dictionary::database::real_database() is called from several
places, for these reasons:
- calling yet-to-be-converted code
- callers with a legitimate need to access data (e.g. system_keyspace)
but with the ::database accessor removed from query_processor.
We'll need to find another way to supply system_keyspace with
data access.
- to gain access to the wasm engine for testing whether user
defined functions compile. We'll have to find another way to
do this as well.
The change is a straightforward replacement. One case in
modification_statement had to change a capture, but everything else
was just a search-and-replace.
Some files that lost "database.hh" gained "mutation.hh", which they
previously had access to through "database.hh".
column_identifier serves two purposes: a value type used to denote an
identifier (which may or may not map to a table column), and `selectable`
implementation used for selecting table columns. This stands in the way
of further refactoring - the unification of the WHERE clause prepare path
(prepare_expression()) and the SELECT clause prepare path
(prepare_selectable()).
Reduce the entanglement by moving the selectable-specific parts to a new
type, selectable_column, and leaving column_identifier as a pure value type.
Closes #9729
* github.com:scylladb/scylla:
cql3: move selectable_column to selectable.cc
cql3: column_identifier: split selectable functionality off from column_identifier
Move selectable_column to selectable.cc (and to the cql3::selection
namespace). This cleans up column_identifier.hh so it is now a pure
vocabulary header.
In queries like:
```cql
SELECT * FROM t WHERE p = 0 AND c1 = 0 ORDER BY (c1 ASC, c2 ASC)
```
we can skip the requirement to specify ordering for `c1` column.
The `c1` column is restricted by an `EQ` restriction, so it can have
at most one value anyway, there is no need to sort.
This commit makes it possible to write just:
```cql
SELECT * FROM t WHERE p = 0 AND c1 = 0 ORDER BY (c2 ASC)
```
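The rule can be sketched as a small check over column names. This is an illustration only: the function name and signature are hypothetical, and ASC/DESC directions are ignored for brevity:

```cpp
#include <cstddef>
#include <set>
#include <string>
#include <vector>

// Returns true if `order_by` lists the clustering columns in clustering order,
// where columns with an EQ restriction may be left out.
bool order_by_matches_clustering(const std::vector<std::string>& clustering_columns,
                                 const std::set<std::string>& eq_restricted,
                                 const std::vector<std::string>& order_by) {
    std::size_t next = 0;  // index of the next ORDER BY column to match
    for (const std::string& column : clustering_columns) {
        if (next == order_by.size()) {
            break;  // every ORDER BY column has been matched
        }
        if (order_by[next] == column) {
            ++next;
        } else if (eq_restricted.count(column) == 0) {
            return false;  // an unrestricted clustering column was skipped
        }
        // Otherwise the column has a single value and contributes nothing to the order.
    }
    return next == order_by.size();
}
```

With clustering columns `(c1, c2)` and `c1 = 0` in the WHERE clause, both `ORDER BY (c1, c2)` and `ORDER BY (c2)` pass this check, while `ORDER BY (c2)` without the restriction on `c1` does not.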
I reorganized the ordering code; I feel that it's now clearer and easier to understand.
It's possible to only introduce a small change to the existing code, but I feel like it becomes a bit too messy.
I tried it out on the [`orderby_disorder_small`](https://github.com/cvybhu/scylla/commits/orderby_disorder_small) branch.
The diff is a bit messy because I moved all ordering functions to one place;
it's better to read [select_statement.cc](https://github.com/cvybhu/scylla/blob/orderby_disorder/cql3/statements/select_statement.cc#L1495-L1658) lines 1495-1658 directly.
In the new code it would also be trivial to allow specifying columns in any order; we would just have to sort them.
For now I commented out the code needed to do that, because the point of this PR was to fix #2247.
Allowing this would require some more work changing the existing tests.
Fixes: #2247
Closes #9518
* github.com:scylladb/scylla:
cql-pytest: Enable test for skipping eq restricted columns in order by
cql3: Allow to skip EQ restricted columns in ORDER BY
cql3: Add has_eq_restriction_on_column function
cql3: Reorganize orderings code
Adds a function that checks whether a given expression has an EQ restriction
on the specified column.
It finds restrictions like
    col = ...
or
    (col, col2) = ...
IN restrictions don't count; they aren't EQ restrictions.
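A simplified sketch of the check, assuming the WHERE clause has already been flattened into a list of binary operators. The types and the signature below are hypothetical; the function described above works on an expression instead:

```cpp
#include <algorithm>
#include <string>
#include <vector>

enum class oper { EQ, IN, LT };

// Hypothetical flattened restriction: one or more columns on the left-hand side.
struct binary_operator {
    std::vector<std::string> lhs_columns;  // {"col"} or {"col", "col2"}
    oper op;
};

bool has_eq_restriction_on_column(const std::string& column,
                                  const std::vector<binary_operator>& where) {
    for (const binary_operator& restriction : where) {
        if (restriction.op != oper::EQ) {
            continue;  // IN and inequalities are not EQ restrictions
        }
        const auto& cols = restriction.lhs_columns;
        if (std::find(cols.begin(), cols.end(), column) != cols.end()) {
            return true;
        }
    }
    return false;
}
```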
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
find_in_expression() is not in a fast path but is quite large
and inlined due to being a template. Detemplate it into a
recurse_until() utility function, and keep only the minimal
code in a template.
The recurse_until is still inline to simplify review, but
will be deinlined in the next patch.
prepare_term now takes an expression and returns a prepared expression.
It should be renamed to prepare_expression.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
prepare_term is now the only function that uses terms.
Change it so that it returns expression instead of term
and remove all occurrences of expr::to_expression(prepare_term(...)).
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Add a function that checks whether there is a bind marker somewhere inside an expression.
It's important to note that even when there are no bind markers, there can be other things that prevent immediate evaluation of an expression.
For example, an expression can contain calls to non-pure functions.
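A minimal sketch of such a check on a simplified expression type. All names and struct layouts here are hypothetical:

```cpp
#include <algorithm>
#include <string>
#include <variant>
#include <vector>

struct expression;
struct constant { int value; };
struct bind_variable { int index; };
struct function_call { std::string name; std::vector<expression> args; };
struct expression { std::variant<constant, bind_variable, function_call> v; };

// True if a bind marker appears anywhere inside the expression.
bool contains_bind_marker(const expression& e) {
    if (std::holds_alternative<bind_variable>(e.v)) {
        return true;
    }
    if (const auto* call = std::get_if<function_call>(&e.v)) {
        // Recurse into the arguments of a function call.
        return std::any_of(call->args.begin(), call->args.end(), contains_bind_marker);
    }
    return false;
}
```

A call such as `now()` contains no bind markers and yields `false` here, yet it still cannot be evaluated ahead of time, which is the caveat mentioned above.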
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Soon there will be other functions that also search in
an expression, so the name find_atom would be confusing.
find_binop is a more descriptive name.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
find_in_expression is a function that looks into the expression
and finds the given expression variant for which the predicate function returns true.
If nothing is found, it returns nullptr.
For example:
find_in_expression<binary_operator>(e, [](const binary_operator&) {return true;})
Will return the first binary operator found in the expression.
It is now used in find_atom, and soon will be used in other similar functions.
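The idea can be sketched on a small variant-based tree. The node types below are simplified, hypothetical stand-ins; only the call shape `find_in_expression<T>(e, predicate)` and the nullptr result follow the description above:

```cpp
#include <memory>
#include <variant>
#include <vector>

struct expression;
struct constant { int value; };
struct binary_operator { std::shared_ptr<expression> lhs, rhs; char op; };
struct conjunction { std::vector<expression> children; };
struct expression { std::variant<constant, binary_operator, conjunction> v; };

// Depth-first search for the first node of type T accepted by `pred`.
template <typename T, typename Pred>
const T* find_in_expression(const expression& e, Pred pred) {
    if (const T* hit = std::get_if<T>(&e.v); hit && pred(*hit)) {
        return hit;
    }
    if (const auto* op = std::get_if<binary_operator>(&e.v)) {
        if (op->lhs) {
            if (const T* found = find_in_expression<T>(*op->lhs, pred)) return found;
        }
        if (op->rhs) {
            if (const T* found = find_in_expression<T>(*op->rhs, pred)) return found;
        }
    } else if (const auto* conj = std::get_if<conjunction>(&e.v)) {
        for (const expression& child : conj->children) {
            if (const T* found = find_in_expression<T>(child, pred)) return found;
        }
    }
    return nullptr;  // nothing matched
}
```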
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Some structs inside the expression variant still contained term.
Replace those terms with expression.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
evaluate_IN_list was only defined for a term,
but now we are removing term so it should be also defined for an expression.
The internal code is the same - this function used to convert the term to expression
and then did all operations on expression.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Add a method that returns raw_value_view to expr::constant.
It's added for convenience - without it in many places
we would have to write my_value.value.to_view().
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Adds a new function - expr::fill_prepare_context.
This function has the same functionality as term::fill_prepare_context, which will be removed soon.
fill_prepare_context used to take its argument with a const qualifier, but it turns out that the argume>
It sets the cache ids of function calls corresponding to partition key restrictions.
New function doesn't have const to make this clear and avoid surprises.
Added expr::visit that takes an argument without const qualifier.
There were some problems with cache_ids in function_call.
prepare_context used to collect ::shared_ptr<functions::function_call>
of some function call, and then this allowed it to clear
cache ids of all involved functions on demand.
To replicate this prepare_context now collects
shared pointers to expr::function_call cache ids.
It currently collects both, but functions::function_call will be removed soon.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Currently expr::visit can only take a const expression as an argument.
For cases where we want to visit the expression and modify it, a new function is needed.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
bind_variable used to have only the type of bound value.
Now this type is replaced with receiver, which describes information about the column corresponding to this value.
A receiver contains type, column name, etc.
Receiver is needed in order to implement fill_prepare_context in the next commit.
It's an argument of prepare_context::add_variable_specification.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Now that expression can be nested in its component types
directly, we can remove nested_expression. Most of the patch
adjusts uses to drop the dereference that was needed for
nested_expression.
Make expression a class, holding a unique_ptr to a variant,
instead of just a variant.
This has some advantages:
- the constructor can be properly constrained
- the type can be forward-declared
- the type name is just "expression", rather than
a huge variant. This makes compiler error messages easier
to read.
- the internal indirection allows removal of nested_expression
(later in the series)
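A rough sketch of the shape such a class can take, written as C++17. The element types, the `is_expression_element` trait and the `get_if` member are hypothetical and only illustrate the points listed above:

```cpp
#include <memory>
#include <type_traits>
#include <utility>
#include <variant>

class expression;  // a forward declaration is enough for code that only passes it around

struct constant { int value; };
struct negation;   // holds an expression by value, defined below

// True only for the types that may be stored in an expression.
template <typename T>
inline constexpr bool is_expression_element =
    std::is_same_v<T, constant> || std::is_same_v<T, negation>;

class expression {
    struct impl;                  // wraps the variant; defined after negation
    std::unique_ptr<impl> _v;
public:
    // Constrained: only element types are accepted, not arbitrary values.
    template <typename T, typename = std::enable_if_t<is_expression_element<T>>>
    expression(T element);
    expression(const expression& other);
    expression(expression&&) noexcept;
    ~expression();

    template <typename T>
    const T* get_if() const;
};

// The indirection inside expression lets element types nest it directly.
struct negation { expression arg; };

struct expression::impl {
    std::variant<constant, negation> v;
};

template <typename T, typename>
expression::expression(T element)
    : _v(std::make_unique<impl>(impl{std::move(element)})) {}

expression::expression(const expression& other)
    : _v(other._v ? std::make_unique<impl>(*other._v) : nullptr) {}

expression::expression(expression&&) noexcept = default;
expression::~expression() = default;

template <typename T>
const T* expression::get_if() const {
    return _v ? std::get_if<T>(&_v->v) : nullptr;
}
```

Because the variant lives behind a pointer, `negation` can hold an `expression` member by value, which is what makes a separate wrapper type unnecessary.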
Simple wrappers for std::get, std::get_if, std::holds_alternative.
The new names are shorter and IMO more readable.
Call sites are updated.
We will later replace the implementation.
The new expr::visit() is just a wrapper around std::visit(),
but has better constraints. A call to expr::visit() with a
visitor that misses an overload will produce an error message
that points at the missing type. This is done using the new
invocable_on_expression concept. Note it lists the expression
types one by one rather than using template magic, since
otherwise we won't get the nice messages.
Later, we will change the implementation when expression becomes
our own type rather than std::variant.
Call sites are updated.
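The commit uses a concept for this; the sketch below approximates the same idea with `static_assert` so that it builds as C++17. The element types and both visitors are hypothetical, and the checks are listed one type at a time so that a failure names the missing type:

```cpp
#include <string>
#include <type_traits>
#include <utility>
#include <variant>

namespace expr {

struct constant { int value; };
struct bind_variable { int index; };
struct column_value { int id; };
using expression = std::variant<constant, bind_variable, column_value>;

// Approximation of the concept: true if the visitor accepts every element type.
template <typename Visitor>
inline constexpr bool invocable_on_expression =
    std::is_invocable_v<Visitor, const constant&> &&
    std::is_invocable_v<Visitor, const bind_variable&> &&
    std::is_invocable_v<Visitor, const column_value&>;

template <typename Visitor>
decltype(auto) visit(Visitor&& visitor, const expression& e) {
    // One assertion per type, so the diagnostic points at the missing overload.
    static_assert(std::is_invocable_v<Visitor, const constant&>,
                  "visitor does not handle constant");
    static_assert(std::is_invocable_v<Visitor, const bind_variable&>,
                  "visitor does not handle bind_variable");
    static_assert(std::is_invocable_v<Visitor, const column_value&>,
                  "visitor does not handle column_value");
    return std::visit(std::forward<Visitor>(visitor), e);
}

// A complete visitor.
struct describe {
    std::string operator()(const constant&) const { return "constant"; }
    std::string operator()(const bind_variable&) const { return "bind_variable"; }
    std::string operator()(const column_value&) const { return "column_value"; }
};

// An incomplete visitor: passing it to visit() fails at compile time.
struct constants_only {
    std::string operator()(const constant&) const { return "constant"; }
};

}  // namespace expr
```

Calling `expr::visit(expr::constants_only{}, e)` would stop at the `bind_variable` assertion instead of producing a long std::visit instantiation error.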
function_call can be evaluated now.
The code matches the one from functions::function_call::bind.
I needed to add cache id to function_call in order for it to work properly.
See the blurb in struct function_call for more information.
New code corresponds to bind() in cql3/functions/functions.cc.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Add a function that takes an expression and evaluates it to a constant.
Evaluating specific expression variants will be implemented in the following commits.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Implement to_expression for non terminals that represent a bind marker.
For now each bind marker has a shape describing where it is used, but hopefully this can be removed in the future.
In order to evaluate a bind_variable we need to know its type.
The type is needed to pass to constant and to validate the value.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
It is useful to have a data_type in *_constructor structs when evaluating.
The resulting constant has a data_type, so we have to find it somehow.
For tuple_constructor we don't have to create a separate tuple_type_impl instance.
For collection_constructor we know what the type is even in case of an empty collection.
For usertype_constructor we know the name, type and order of fields in the user type.
Additionally without a data_type we wouldn't know whether the type is reversed or not.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Add a method that converts given term to the matching expression.
It will be used as an intermediate step when implementing evaluate(expression).
evaluate(term) will convert the term to the expression and then call evaluate(expression).
For terminals this is simply calling get() to serialize the value.
For non-terminals the implementation is more complicated and will be implemented in the following commits.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Make term.hh include expression.hh instead of the other way around.
expression can't be forward declared.
expression is needed in term.hh to declare term::to_expression().
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
We need to be able to access elements of a constant.
Adds functions to easily do it.
Those functions check all preconditions required to access elements
and then use partially_deserialize_* or similar.
It's much more convenient than using partially_deserialize directly.
get_list_of_tuples_elements is useful with IN restrictions like
(a, b) IN [(1, 2), (3, 4)].
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
A list representing IN values might contain NULLs before evaluation.
We can remove them during evaluation, because nothing equals NULL.
If we don't remove them, there are gonna be errors, because a list can't contain NULLs.
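The filtering step itself is small. A sketch, assuming list elements are modelled as `std::optional<int>` where an empty optional stands for NULL (the function name and element type are hypothetical):

```cpp
#include <optional>
#include <vector>

// Drops NULL elements from an IN list; `x IN (1, NULL, 3)` matches the same
// rows as `x IN (1, 3)` because no value compares equal to NULL.
std::vector<int> evaluate_in_list(const std::vector<std::optional<int>>& raw) {
    std::vector<int> result;
    result.reserve(raw.size());
    for (const std::optional<int>& element : raw) {
        if (element.has_value()) {
            result.push_back(*element);
        }
    }
    return result;
}
```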
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
To convert a terminal to expr::constant we need to know the value type.
Implement getting value type for terminals in user_types.hh.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Adds the functions:
    constant evaluate(term*, const query_options&);
    raw_value_view evaluate(term*, const query_options&);
These functions take a term, bind it and convert the terminal
to constant or raw_value_view.
In the future these functions will take expression instead of term.
For that to happen bind() has to be implemented on expression,
this will be done later.
Also introduces terminal::get_value_type().
In order to construct a constant from terminal we need to know the type.
It will be implemented in the following commits.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Adds constant to the expression variant:
    struct constant {
        raw_value value;
        data_type type;
    };
This struct will be used to represent constant values with known bytes and type.
This corresponds to the terminal from current design.
bool is removed from expression, now constant is used instead.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
This reverts commit e9343fd382, reversing
changes made to 27138b215b. It causes a
regression in v2 serialization_format support:
collection_serialization_with_protocol_v2_test fails with: marshaling error: read_simple_bytes - not enough bytes (requested 1627390306, got 3)
Fixes #9360
We need to be able to access elements of a constant.
Adds functions to easily do it.
Those functions check all preconditions required to access elements
and then use partially_deserialize_* or similar.
It's much more convenient than using partially_deserialize directly.
get_list_of_tuples_elements is useful with IN restrictions like
(a, b) IN [(1, 2), (3, 4)].
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
A list representing IN values might contain NULLs before evaluation.
We can remove them during evaluation, because nothing equals NULL.
If we don't remove them, there are gonna be errors, because a list can't contain NULLs.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>