Commit Graph

276 Commits

Author SHA1 Message Date
Jan Ciolek
52bbc1065c cql3: allow lists of IN elements to be NULL
Requests like `col IN NULL` used to cause
an error - Invalid null value for colum col.

We would like to allow NULLs everywhere.
When a NULL occurs on either side
of a binary operator, the whole operation
should just evaluate to NULL.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>

Closes #11775
2022-10-13 15:11:32 +02:00
Jan Ciolek
a2c359a741 cql3: Make CONTAINS KEY NULL return false
A binary operator like this:
{1: 2, 3: 4} CONTAINS KEY NULL
used to evaluate to `true`.

This is wrong, any operation involving null
on either side of the operator should evaluate
to NULL, which is interpreted as false.

This change is not backwards compatible.
Some existing code might break.

partially fixes: #10359

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-10-05 18:15:44 +02:00
Jan Ciolek
bbfef4b510 cql3: Make CONTAINS NULL return false
A binary operator like this:
[1, 2, 3] CONTAINS NULL
used to evaluate to `true`.

This is wrong, any operation involving null
on either side of the operator should evaluate
to NULL, which is interpreted as false.

This change is not backwards compatible.
Some existing code might break.

partially fixes: #10359

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-10-05 18:15:15 +02:00
Jan Ciolek
ac152af88c expression: Add for_each_boolean factor
boolean_factors is a function that takes an expression
and extracts all children of the top level conjunction.

The problem is that it returns a vector<expression>,
which is inefficent.

Sometimes we would like to iterate over all boolean
factors without allocations. for_each_boolean_factor
is implemented for this purpose.

boolean_factors() can be implemented using
for_each_boolean_factor, so it's done to
reduce code duplication.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-09-25 16:34:22 +03:00
Michał Radwański
10e241988e cql/expr/expression, index/secondary_index_manager: needs_filtering and
index_supports_expression rewrite to accomodate for indexes over
collections
2022-08-14 10:29:52 +03:00
Karol Baryła
ac97086855 cql3, index: Use entries() indexes on collections for queries
Previous commit added the ability to use GSI over non-frozen collections in queries,
but only the keys() and values() indexes. This commit adds support for the missing
index type - entries() index.

Signed-off-by: Karol Baryła <karol.baryla@scylladb.com>
Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-08-14 10:29:52 +03:00
Karol Baryła
7966841d37 cql3, index: Use keys() and values() indexes on collections for queries.
Previous commits added the possibility of creating GSI on non-frozen collections.
This (and next) commit allow those indexes to actually be used by queries.
This commit enables both keys() and values() indexes, as they are pretty similar.
2022-08-14 10:29:52 +03:00
Avi Kivity
8085b9f57a cql3: expr: add boolean_factors() function to factorize an expression
When analyzing a WHERE clause, we want to separate individual
factors (usually relations), and later partition them into
partition key, clustering key, and regular column relations. The
first step is separation, for which this helper is added.

Currently, it is not required since the grammar supplies the
expression in separated form, but this will not work once it is
relaxed to allow any expression in the WHERE clause.

A unit test is added.
2022-07-22 20:14:48 +03:00
Avi Kivity
1efb2fecbe cql3: expression: define operator==() for expressions
This is useful for tests, to check that expression manipulations
yield the expected results.
2022-07-22 20:14:48 +03:00
Avi Kivity
13a64d8ab2 Merge 'Remove all remaining restrictions classes' from Jan Ciołek
This PR removes all code that used classes `restriction`, `restrictions` and their children.

There were two fields in `statement_restrictions` that needed to be dealt with: `_clustering_columns_restrictions` and `_nonprimary_key_restrictions`.

Each function was reimplemented to operate on the new expression representaiion and eventually these fields weren't needed anymore.

After that the restriction classes weren't used anymore and could be deleted as well.

Now all of the code responsible for analyzing WHERE clause and planning a query works on expressions.

Closes #11069

* github.com:scylladb/scylla:
  cql3: Remove all remaining restrictions code
  cql3: Move a function from restrictions class to the test
  cql3: Remove initial_key_restrictions
  cql3: expr: Remove convert_to_restriction
  cql3: Remove _new from _new_nonprimary_key_restrictions
  cql3: Remove _nonprimary_key_restrictions field
  cql3: Reimplement uses of _nonprimary_key_restrictions using expression
  cql3: Keep a map of single column nonprimary key restrictions
  cql3: Remove _new from _new_clustering_columns_restrictions
  cql3: Remove _clustering_columns_restrictions from statement_restrictions
  cql3: Use a variable instead of dynamic cast
  cql3: Use the new map of single column clustering restrictions
  cql3: Keep a map of single column clustering key restrictions
  cql3: Return an expression in get_clustering_columns_restrctions()
  cql3: Reimplement _clustering_columns_restrictions->has_supporting_index()
  cql3: Don't create single element conjunction
  cql3: Add expr::index_supports_some_column
  cql3: Reimplement has_unrestricted_components()
  cql3: Reimplement _clustering_columns_restrictions->need_filtering()
  cql3: Reimplement num_prefix_columns_that_need_not_be_filtered
  cql3: Use the new clustering restrictions field instead of ->expression
  cql3: Reimplement _clustering_columns_restrictions->size() using expressions
  cql3: Reimplement _clustering_columns_restrictions->get_column_defs() using expressions
  cql3: Reimplement _clustering_columns_restrictions->is_all_eq() using expressions
  cql3: expr: Add has_only_eq_binops function
  cql3: Reimplement _clustering_columns_restrictions->empty() using expressions
2022-07-20 18:01:15 +03:00
Jan Ciolek
599bcd6ea7 cql3: Remove all remaining restrictions code
The classes restriction, restrictions and its children
aren't used anywhere now and can be safely removed.

Some includes need to be modified for the code to compile.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-20 09:10:31 +02:00
Jan Ciolek
4f92c64e1b cql3: expr: Remove convert_to_restriction
This function isn't used anywhere anymore
and can be removed.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-20 09:10:31 +02:00
Jan Ciolek
c7495fa59e cql3: Add expr::index_supports_some_column
Add a function that checks if there is an index
which supports one of the columns present in
the given expression.

This functionality will soon be needed for
clustering and nonprimary columns so it's
good to separate into a reusable function.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-19 15:38:20 +02:00
Jan Ciolek
6cf0981aa6 cql3: expr: Add has_only_eq_binops function
Add a function which checks that an expression
contains only binary operators with '='.

Right now this check is done only in a single place,
but soon the same check will have to be done
for clustering columns as well, so the code
is moved to a separate function to prevent duplication.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-18 17:45:06 +02:00
Jadw1
59498caeca db,cql3: Move part of cql3's function into db
Moving `function`, `function_name` and `aggregate_function` into
db namespace to avoid including cql3 namespace into query-request.
For now, only minimal subset of cql3 function was moved to db.
2022-07-18 15:25:41 +02:00
Jan Ciolek
38e115edf7 cql3: Move single element IN restrictions handling
Restrictions like
col IN (1)
get converted to
col = 1
as an optimization/simplification.

This used to be done in prepare_binary_operator,
but it fits way better inside of
validate_and_prepare_new_restriction.

When it was being done in prepare_binary_operator
the conversion happened before validation checks
and the error messages would describe an equality
restriction despite the user making an IN restriction.

Now the conversion happens after all validation
is finished, which ensures that all checks are
being done on the original expression.

Fixes: #10631

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-11 15:47:16 +02:00
Jan Ciolek
cb504b2d6e cql3: Check for disallowed operators early
Move checking for disallowed operators
earlier in the code flow.
This is needed to pass some tests that
expect one error message instead of the other.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-11 15:47:16 +02:00
Jan Ciolek
debd7399fd cql3: Reorganize to_restriction code
expr::to_restriction is currently used to
take a restriction from the WHERE clause,
prepare it, perform some validation checks
and finally convert it to an instance of
the restriction class.

Soon we will get rid of the restriction class.

In preparation for that expr::to_restriction
is split into two independent parts:
* The part that prepares and validates a binary_operator
* The part that converts a binary_operator to restriction

Thanks to this split getting rid of restriction class
will be painless, we will just stop using the
second part.

This commit splits expr::to_restriction into two functions;
* validate_and_prepare_new_restriction
* convert_to_restriction
that handle each of those parts.

All helper validation methods in the anonymous namespace
are copied from the to_restriction.cc file.

to_restriction.cc isn't the best filename for the new functionality,
so it has been renamed to restrictions.hh/cc.
In the future all the code regarding restrictions could be
put there to reduce clutter in expression.hh/cc

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-11 15:47:16 +02:00
Jan Ciolek
5be574fe51 cql3: Fix IS NOT NULL check in to_restriction
expr::to_restriction performs a check to see if
the restriction is of form: `col IS NOT NULL`

There is a mistake in this check.
It uses is<null>(prepared_binop.rhs)
to determine if the right hand side of binary operator
is a null, but the binary operator is already prepared.

During preparation expr::null is converted to expr::constant
and that wouldn't be detected by this check.

The check has been changed to check for null constant instead
of expr::null.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-11 15:47:15 +02:00
cvybhu
80dda2bb97 cql3: expr: Fix handling reversed types in limits()
There was a bug which caused incorrect results of limits()
for columns with reversed clustering order.
Such columns have reversed_type as their type and this
needs to be taken into account when comparing them.

It was introduced in 6d943e6cd0.
This commit replaced uses of get_value_comparator
with type_of. The difference between them is that
get_value_comparator applied ->without_reversed()
on the result type.

Because the type was reversed, comparisons like
1 < 2 evaluated to false.

This caused the test testIndexOnKeyWithReverseClustering
to fail, but sadly it wasn't caught by CI because
the CI itself has a bug that makes it skip some tests.
The test passes now, although it has to be run manually
to check that.

Fixes: #10918

Signed-off-by: cvybhu <jan.ciolek@scylladb.com>

Closes #10994
2022-07-10 09:24:06 +03:00
Jan Ciolek
83f27fc8c1 cql3: expr: Add contains_multi_column_restriction
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-01 16:29:11 +02:00
Jan Ciolek
fd0798c8a2 cql3: Add expr::value_for
value_for is a method from the restriction class
which finds the value for a given column.

Under the hood it makes use of possible_lhs_values.
It will be needed to implement some functionality
that was implemented using restrictions before.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-01 16:29:11 +02:00
Jan Ciolek
9b6b1f69aa cql3: Handle single value INs inside prepare_binary_operator
Currently expr::to_restriction is the only place where
prepare_binary_operator is called.

In case of a single-value IN restriction like:
mycol IN (1)
this expression is converted to
mycol = 1
by expr::to_restriction.

Once restriction is removed expr::to_restriction
will be removed as well so its functionality has to
be moved somewhere else.

Move handling single value INs inside
prepare_binary_operator.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-01 16:29:09 +02:00
Jan Ciolek
24b0a61d51 cql3: Add get_columns_in_commons
Add a function that finds common columns
between two expressions.

It's used in error messages in the original
restrictions code so it must be included
in the new code as well for compatibility.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-01 16:29:09 +02:00
Jan Ciolek
177ba9b9db cql3: expr: Add is_empty_restriction
Add a function to check whether
an expression restricts anything at all.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-01 16:29:09 +02:00
Jan Ciolek
228b344d9c cql3: Replicate column sorting functionality using expressions
Restrictions code keeps restrictions for each column
in a map sorted by their position in the schema.
Then there are methods that allow to access
the restricted column in the correct order.

To replicate this in upcoming code
we need functions that implement this functionality.

The original comparator can be found in:
cql3/restrictions/single_column_restrictions.hh
For primary key columns this comparator compares their
positions in the schema.
For non-primary columns the position is assumed to
be clustering_key_size(), which seems pretty random.

To avoid passing the schema to the comparator
for nonprimary columns I just assume the
position is u32::max(). This seems to be
as good of a choice as clustering_key_size().
Orignally Cassandra used -1:
bc8a260471/src/java/org/apache/cassandra/config/ColumnDefinition.java (L79-L86)

We never end up comparing columns of different kind using this comparator anyway.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-01 16:28:41 +02:00
Jan Ciolek
e37ddd5b89 cql3: Remove single_column_restriction class
Now that all uses of this class have been
replaced by the generic restriction
the class is not used anywhere and can be removed.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-01 15:53:19 +02:00
Jan Ciolek
3e3d2f939c cql3: Replace uses of single_column_restriction with restriction
single_column_restriction is a class used to represent
restrictions in a single column.

The class is very simple - it's basically an expression
with some additional information.

As a step towards removing all restriction classes
all uses of this class are replaced by uses of
the generic restriction class.

All functionality of this class has been implemented
using free standing functions operating on expressions.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-01 15:52:10 +02:00
Jan Ciolek
afc482f0a5 cql3: expr: Add get_the_only_column
Add a function that gets the only column
from a single column restriction expression.

The code would be very similiar to
is_single_column_restriction, so a new
function is introducted to reduce duplication.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-01 15:50:48 +02:00
Jan Ciolek
9c3b0299a1 cql3: expr: Add is_single_column_restriction
Add a function that checks whether an expression
contains restrictions on exactly one column.

This a "single_column_restriction"
in the same way that instances of
"class single_column_restriction" are.

It will be used later to distinguish cases
later once this class is removed

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-07-01 15:49:37 +02:00
Jan Ciolek
cb3b179945 cql3: expr: Add for_each_expression
for_each_expression is a function that
can be used to iterate over all expressions
inside an expression recursively and perform
some operation on each of them.

For example:
for_each_expression<column_vaue>(e, [](const column_value& cval) {std::cout << cval << '\n';});
Will print all column values in an expression

It's awkward to do this using recurse_until or find_in_expression
because these functions are meant for slightly different purposes.
Having a dedicated function for this purpose will make the code
cleaner and easier to understand.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-06-30 10:03:53 +02:00
Avi Kivity
4d587e0c3d cql3: raw_value: deduplicate view() and to_view()
Commit e739f2b779 ("cql3: expr: make evaluate() return a
cql3::raw_value rather than an expr::constant") introduced
raw_value::view() as a synonym to raw_value::to_view() to reduce
churn. To fix this duplication, we now remove raw_value::to_view().

raw_value::to_view() was picked for removal because is has fewer
call sites, reducing churn again.

Closes #10819
2022-06-17 09:32:58 +02:00
Avi Kivity
e739f2b779 cql3: expr: make evaluate() return a cql3::raw_value rather than an expr::constant
An expr::constant is an expression that happens to represent a constant,
so it's too heavyweight to be used for evaluation. Right now the extra
weight is just a type (which causes extra work by having to maintain
the shared_ptr reference count), but it will grow in the future to include
source location (for error reporting) and maybe other things.

Prior to e9b6171b5 ("Merge 'cql3: expr: unify left-hand-side and
right-hand-side of binary_operator prepares' from Avi Kivity"), we had
to use expr::constant since there was not enough type infomation in
expressions. But now every expression carries its type (in programming
language terms, expressions are now statically typed), so carrying types
in values is not needed.

So change evaluate() to return cql3::raw_value. The majority of the
patch just changes that. The rest deals with some fallout:

 - cql3::raw_value gains a view() helper to convert to a raw_value_view,
   and is_null_or_unset() to match with expr::constant and reduce further
   churn.
 - some helpers that worked on expr::constant and now receive a
   raw_value now need the type passed via an additional argument. The
   type is computed from the expression by the caller.
 - many type checks during expression evaluation were dropped. This is
   a consequence of static typing - we must trust the expression prepare
   phase to perform full type checking since values no longer carry type
   information.

Closes #10797
2022-06-15 08:47:24 +02:00
Avi Kivity
6d943e6cd0 cql3: expr: drop column_maybe_subscripted
column_maybe_subscripted is a variant<column_value*, subscript*> that
existed for two reasons:
  1. evaluation of subscripts and of columns took different paths.
  2. calculation of the type of column or column[sub] took different paths.

Now that all evaluations go through evaluate(), and the types are
present in the expression itself, there is no need for column_maybe_subscripted
and it is replaced with plain expressions.
2022-06-12 19:21:28 +03:00
Avi Kivity
2aa9199e9a cql3: expr: possible_lhs_values(): open-code get_value_comparator()
get_value_comparator() is going away soon, so open-code it here. It's
not doing much anyway.
2022-06-12 19:14:50 +03:00
Avi Kivity
b1c12073b1 cql3: expr: rationalize lhs/rhs argument order
Some functions accept the right-hand-side as the first argument
and the left-hand-side as the second argument. This is now confusing,
but at least safe-ish, as the arguments have different types. It's
going to become dangerous when we switch to expressions for both sides,
so let's rationalize it by always starting with lhs.

Some parameters were annotated with _lhs/_rhs when it was not clear.
2022-06-12 18:55:24 +03:00
Avi Kivity
9beac1df53 cql3: expr: don't rely on grammar when comparing tuples
The grammar only allows comparing tuples of clustering columns, which
are non-null, but let's not rely on that deep in expression evaluation
as it can be relaxed.
2022-06-12 18:41:03 +03:00
Avi Kivity
9a4f2a8cc3 cql3: expr: wire column_value and subscript to evaluate()
With everything standardized on evaluation_inputs(), it's a matter of calling
get_value().
2022-06-12 18:21:04 +03:00
Avi Kivity
30721fdc4a cql3: get_value(subscript): remove gratuitous pointer
While extracting get_value(subscript) we inherited a pointer due
to the calling convention, we can now remove it.
2022-06-12 18:18:59 +03:00
Avi Kivity
dd2fec9cb1 cql3: expr: reindent get_value(subscript)
Whitespace only change.
2022-06-12 18:04:12 +03:00
Avi Kivity
31b9e2a565 cql3: expr: extract get_value(subscript) from get_value(column_maybe_subscripted)
We wish to wire get_value(subscript) into evaluate (and get rid of
column_maybe_subscripted).
2022-06-12 18:03:03 +03:00
Avi Kivity
248433d7e0 cql3: prepare_expr: prepare subscript type
The type of a subscript expression is the value comparator
of the expression (column) being subscripted, according to out
wierd naming.
2022-06-12 17:39:08 +03:00
Avi Kivity
b5287db8ea cql3: expr: drop internal 'column_value_eval_bag'
is_satisfied_by() used an internal column_value_eval_bag type that
was more awkwardly named (and more awkward to use due to more nesting)
than evaluation_inputs. Drop it and use evaluation_inputs throughout.

The thunk is_satisified_by(evaluation_inputs) that just called
is_satisified_by(column_value_eval_bag) is dropped.
2022-06-12 17:12:41 +03:00
Avi Kivity
55085906ca cql3: expr: change evalute() to accept evaluation_inputs
Currently, evaluate() accepts only query_options, which makes
it not useful to evaluate columns. As a result some callers
(column_condition) have to call it directly on the right-hand-side
of binary expressions instead of evaluating the binary expression
itself.

Change it to accept evaluation_input as a parameter, but keep
the old signature too, since it is called from many places that
don't have rows.
2022-06-12 16:51:42 +03:00
Avi Kivity
2ecdb219fb cql3: expr: make evaluate(<expression subtype>) static
They aren't called from anywhere outside expression.cc, and
we're playing with the signatures, so hide them to avoid
rebuilds.
2022-06-12 16:13:20 +03:00
Avi Kivity
c80999fab4 cql3: expr: push is_satisfied_by regular and static column extraction to callers
is_satisfied_by() rearranges the static and regular columns from
query::result_row_view form (which is a use-once iterator) to
std::vector<managed_bytes_opt> (which uses the standard value
representation, and allows random access which expression
evaluation needs). Doing it in is_saitisfied_by() means that it is
done every time an expression is evaluated, which is wasteful. It's
also done even if the expression doesn't need it at all.

Push it out to callers, which already eliminates some calls.

We still pass cql3::expr::selection, which is a layering violation,
but that is left to another time.

Note that in view.cc's check_if_matches(), we should have been
able to move static_and_regular_columns calculation outside the
loop. However, we get crashes if we do. This is likely due to a
preexisting bug (which the zero iterations loop avoids). However,
in selection.cc, we are able to avoid the computation when the code
claims it is only handling partition keys or clustering keys.
2022-06-12 16:12:41 +03:00
Avi Kivity
4b715226fe cql3: expr: convert is_satisfied_by() signature to evaluation_inputs
Callers are converted, but the internals are kept using the old
conventions until more APIs are converted.

Although the new API allows passing no query_options, the view code
keeps passing dummy query_options and improvement is left as a FIXME.
2022-06-12 12:53:44 +03:00
Avi Kivity
7a9b645d64 cql3: expr: introduce evaluation_inputs
An expression may refer to values provided externally: the partition
and clusterinng keys, the static and regular row (all providing
column values), and the query options (providing values for bind
variables). Currently, different evaluation functions
(evaluate(), get_value(), and is_satisfied_by()) receive different
subsets of these values.

As a first step towards unifying the various ways to evaluate an
expression, collect the parameters in a single structure. Since
different evaluation contexts have different subsets, make everything
optional (via a pointer). Note that callers are expected to verify
using the grammar or prepare phase that they don't refer to values
that are not provided.

The cql3::selection::selection parameter is provided to translate
from query::result_row_view to schema column indexes. This is pretty
bad since it means the translation needs to be done for every
evaluation and is therefore a candidate for removal, but is kept here
since that's how it's currently done.
2022-06-12 12:47:23 +03:00
Avi Kivity
7debf6780c cql3: expr: drop prepare_binop_lhs()
It is now just a thin wrapper around try_prepare_expression(), so
replace it with that.
2022-06-01 18:58:14 +03:00
Avi Kivity
76e0dc66e5 cql3: expr: move implementation of prepare_binop_lhs() to try_prepare_expression()
This unifies the left-hand-side and right-hand-side of expression preparation.
The contents of the visitor in prepare_binop_lhs() is moved to the
visitor in try_prepare_expression(). This usually replaces an
on_internal_error() branch.

An exception is tuple_constructor, which is valid in both the left-hand-side
and right-hand-side (e.g. WHERE (x, y) IN (?, ?, ?)). We previously
enhanced this case to support not having a a column_specification, so
we just delete the branch from prepare_binop_lhs.
2022-06-01 18:58:14 +03:00