Restrictions like
col IN (1)
get converted to
col = 1
as an optimization/simplification.
This used to be done in prepare_binary_operator,
but it fits way better inside of
validate_and_prepare_new_restriction.
When it was being done in prepare_binary_operator
the conversion happened before validation checks
and the error messages would describe an equality
restriction despite the user making an IN restriction.
Now the conversion happens after all validation
is finished, which ensures that all checks are
being done on the original expression.
Fixes: #10631
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
Currently expr::to_restriction is the only place where
prepare_binary_operator is called.
In case of a single-value IN restriction like:
mycol IN (1)
this expression is converted to
mycol = 1
by expr::to_restriction.
Once restriction is removed expr::to_restriction
will be removed as well so its functionality has to
be moved somewhere else.
Move handling single value INs inside
prepare_binary_operator.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
An expr::constant is an expression that happens to represent a constant,
so it's too heavyweight to be used for evaluation. Right now the extra
weight is just a type (which causes extra work by having to maintain
the shared_ptr reference count), but it will grow in the future to include
source location (for error reporting) and maybe other things.
Prior to e9b6171b5 ("Merge 'cql3: expr: unify left-hand-side and
right-hand-side of binary_operator prepares' from Avi Kivity"), we had
to use expr::constant since there was not enough type infomation in
expressions. But now every expression carries its type (in programming
language terms, expressions are now statically typed), so carrying types
in values is not needed.
So change evaluate() to return cql3::raw_value. The majority of the
patch just changes that. The rest deals with some fallout:
- cql3::raw_value gains a view() helper to convert to a raw_value_view,
and is_null_or_unset() to match with expr::constant and reduce further
churn.
- some helpers that worked on expr::constant and now receive a
raw_value now need the type passed via an additional argument. The
type is computed from the expression by the caller.
- many type checks during expression evaluation were dropped. This is
a consequence of static typing - we must trust the expression prepare
phase to perform full type checking since values no longer carry type
information.
Closes#10797
This unifies the left-hand-side and right-hand-side of expression preparation.
The contents of the visitor in prepare_binop_lhs() is moved to the
visitor in try_prepare_expression(). This usually replaces an
on_internal_error() branch.
An exception is tuple_constructor, which is valid in both the left-hand-side
and right-hand-side (e.g. WHERE (x, y) IN (?, ?, ?)). We previously
enhanced this case to support not having a a column_specification, so
we just delete the branch from prepare_binop_lhs.
When encountering a subscript as the left-hand-side of a binary operator,
we assume the subscripted value is a column and process it directly.
As a step towards de-specializing the left-hand-side of binary operators,
use recursive descent into prepare_binop_lhs() instead. This requires
generating a column_specification for arbitrary expressions, so we
add a column_specification_of() function for that. Currently it will
return a good representation for columns (the only input allowed by
the grammar) and a bad representation (the text representation of the
expression) for other expressions. We'll have to improve that when we
relax the grammar.
Currently the only expression form that can appear on both the left
hand side of an expression and the right hand side is a tuple constructor,
so consequently it must support both modes of type processing - either
deriving the type from the expression, or imposing a type on the expression.
As an example, in
WHERE (A, B) = (:a, :b)
the first tuple derives its type from the column types, while the
second tuple has the type of the first tuple imposed on it.
So, we adjust tuple_constructor_prepare_nontuple to support both forms.
This means allowing the receiver not to be present, and calculating the
tuple type if that is the case.
resolve_column() is part of the prepare stage, and tries to
resolve a column name in a query against the table's columns.
If it fails, it prints the containing binary_expression as
context. However, that's unnecessary - the unresolved
column name is sufficient context. So print that.
The motivation is to unify preparation of binary_operator
left-hand-side and right-hand-side - prepare_expression()
doesn't have the extra parameter and it wouldn't make sense
to add it, as expressions might not be children of binary_operators.
Currently prepare_expression is never used where a schema is needed -
it is called for the right-hand-side of binary operators (where we
don't accept columns) or for attributes like WRITETIME or TTL. But
when we unify expression preparation it will need to handle columns
too, and these need the schema to look up the column.
So pass the schema as a parameter. It is optional (a pointer) since
not all contexts will have a schema (for example CREATE AGGREGATE).
In CQL (and SQL) types flow in different directions in expression
components. In an expression
A[:x] = :y
The type of A is known, the type of :x is derived from the type of A,
and the type of :y is derived from the type of A[:x].
Currently prepare_expression() only supports the second mode - an
expression's type is dictated by its caller via the column_specification
parameter. But this means it can only be used to evaluate the
right-hand-side of binary expressions, since the left-hand-side uses
the first mode, where the type is derived from the column, not
imposed by the caller.
To support both modes, make the column_specification parameter optional
(it is already a pointer so just accept null) and also make the returned
expression optional, to indicate failure to infer the type if the
column_specification was not given.
This patch only arranges for the new calling convention (as a new
try_prepare_expression call), it does not actually implement anything.
Currently, preparing a cast drops the cast completely (as the
types are verified to be binary compatibile). This means we lose
the casted-to type. Since we wish to keep type infomation, keep the
cast in the prepared expression tree (and therefore the casted-to
type).
Once we do that, we must extend evaluate() to support cast
expressions.
Infer the type of a list index as int32_type.
The error message when a non-subscriptable type is provided is
changed, so the corresponding test is changed too.
We used to allow nulls in lists of IN values,
i.e. a query like this would be valid:
SELECT * FROM tab WHERE pk IN (1, null, 2);
This is an old feature that isn't really used
and is already forbidden in Cassandra.
Additionally the current implementation
doesn't allow for nulls inside the list
if it's sent as a bound value.
So something like:
SELECT * FROM tab WHERE pk IN ?;
would throw an error if ? was (1, null, 2).
This is inconsistent.
Allowing it made writing code cumbersome because
this was the only case where having a null
inside of a collection was allowed.
Because of it there needed to be
separate code paths to handle regular lists
and lists of NULL values.
Forbidding it makes the code nicer and consistent
at the cost of a feature that isn't really
important.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
shape_type was used in prepare_expression to differentiate
between a few cases and create the correct receivers.
This was used by the relation class.
Now creating the correct receiver has been delegated to the caller
of prepare_expression and all bind_variables can be handled
in the same simple way.
shape_type is not needed anymore.
Not having it is better because it simplifies things.
Signed-off-by: cvybhu <jan.ciolek@scylladb.com>
This function was used by multi_column_relation.hh,
but now it isn't needed anymore.
The only way to prepare a bind_variable is now the standard prepare_expression.
Signed-off-by: cvybhu <jan.ciolek@scylladb.com>
Add a function that allows to prepare
a binary_operator received from the parser.
It resolves columns on the LHS, calculates type of LHS,
and prepares RHS with the correct type.
It will be used by expr::to_restriction.
Some basic type checks are performed, but more throughout
checks will be required in expr::to_restriction to fully
validate a relation.
Signed-off-by: cvybhu <jan.ciolek@scylladb.com>
The situation with preparing bind_variable is a bit strange,
there are four shapes of bind variables and receiver behaviour
is not in line with other types.
To prepare a bind_variable for a list of IN values for an int column
the current code requires us to pass a receiver of type int.
This is counterintuitive, to prepare a string we pass
a receiver with string type, so to prepare list<int> we should
pass a receiver of type list<int>, not just int.
This commit changes the behaviour in two ways:
- Shape of bind_variable doesn't matter anymore
- The bind_variable gets the receiver passed to prepare_expression,
no more list<receiver> magic.
Other variants of bind_variable_x_prepare_expression are not removed yet
because they are needed by prepare_expression_mutlti_column.
They will be removed later, along with bind_variable::shape_type.
Signed-off-by: cvybhu <jan.ciolek@scylladb.com>
The standard CQL list type doesn't allow for nulls inside the collection.
However lists of IN values are the exception where bind nullsare allowed,
for example in restrictions like: p IN (1, 2, null)
To be able to use list_prepare_expression with lists of IN values
a flag is added to specify whether nulls should be allowed.
Signed-off-by: cvybhu <jan.ciolek@scylladb.com>
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.
Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.
The changes we applied mechanically with a script, except to
licenses/README.md.
Closes#9937
Stop using database (and including database.hh) for schema related
purposes and use data_dictionary instead.
data_dictionary::database::real_database() is called from several
places, for these reasons:
- calling yet-to-be-converted code
- callers with a legitimate need to access data (e.g. system_keyspace)
but with the ::database accessor removed from query_processor.
We'll need to find another way to supply system_keyspace with
data access.
- to gain access to the wasm engine for testing whether used
defined functions compile. We'll have to find another way to
do this as well.
The change is a straightforward replacement. One case in
modification_statement had to change a capture, but everything else
was just a search-and-replace.
Some files that lost "database.hh" gained "mutation.hh", which they
previously had access to through "database.hh".
There were a few places where term was still mentioned.
Removed/replaced term with expression.
search_and_replace is still done only on LHS of binary_operator
because the existing code would break otherwise.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
prepare_term now takes an expression and returns a prepared expression.
It should be renamed to prepare_expression.
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>