196 Commits

Author SHA1 Message Date
Nadav Har'El
fbb2a41246 expressions: don't dereference invalid map subscript in filter
If we have the filter expression "WHERE m[?] = 2", the existing code
simply assumed that the subscript is an object of the right type.
However, while it should indeed be the right type (we already have code
that verifies that), there are two more options: It can also be a NULL,
or an UNSET_VALUE. Either of these cases causes the existing code to
dereference a non-object as an object, leading to bizarre errors (as
in issue #10361) or even crashes (as in issue #10399).

Cassandra returns an invalid request error in these cases: "Unsupported
unset map key for column m" or "Unsupported null map key for column m".
We decided to do things differently:

 * For NULL, we consider m[NULL] to result in NULL - instead of an error.
   This behavior is more consistent with other expressions that contain
   null - for example NULL[2] and NULL<2 both result in NULL as well.
   Moreover, if in the future we allow more complex expressions, such
   as m[a] (where a is a column), we can find the subscript to be null
   for some rows and non-null for other rows - and throwing an "invalid
   query" in the middle of the filtering doesn't make sense.

 * For UNSET_VALUE, we do consider this an error like Cassandra, and use
   the same error message as Cassandra. However, the current implementation
   checks for this error only when the expression is evaluated - not
   before. It means that if the scan is empty before the filtering, the
   error will not be reported and we'll silently return an empty result
   set. We currently consider this ok, but we can also change this in the
   future by binding the expression only once (today we do it on every
   evaluation) and validating it once after this binding.
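
The two rules above can be sketched in a few lines. This is only an illustration under stated assumptions: `null_key`, `unset_key`, `bound_key` and `evaluate_subscript` are hypothetical names, not the types used in the patch, and a map with `int` keys and values stands in for a real map column. Treating a missing key as NULL is also an assumption of the sketch.

```cpp
#include <map>
#include <optional>
#include <stdexcept>
#include <variant>

// Hypothetical stand-ins for a bound subscript value (not Scylla's real types).
struct null_key {};
struct unset_key {};
using bound_key = std::variant<int, null_key, unset_key>;

using int_map = std::map<int, int>;

// Evaluates m[key]; std::nullopt models a CQL NULL result.
std::optional<int> evaluate_subscript(const int_map& m, const bound_key& key) {
    if (std::holds_alternative<unset_key>(key)) {
        // Checked only at evaluation time, so an empty scan never gets here.
        throw std::invalid_argument("Unsupported unset map key for column m");
    }
    if (std::holds_alternative<null_key>(key)) {
        return std::nullopt;  // m[NULL] is NULL, not an error
    }
    auto it = m.find(std::get<int>(key));
    if (it == m.end()) {
        return std::nullopt;  // assumption: a missing key also yields NULL
    }
    return it->second;
}
```

With `int_map m{{1, 2}}`, a NULL key yields an empty optional and an unset key throws, but only if the function is actually called for some row.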

Fixes #10361
Fixes #10399

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-04-24 16:05:34 +03:00
Nadav Har'El
808a93d29b expressions: fix invalid dereference in map subscript evaluation
When we have a filter such as "WHERE m[2] = 3" (where m is a map
column), if a row had a null value for m, our expression evaluation
code incorrectly dereferenced an unset optional, and continued
processing the result of this dereference, which resulted in undefined
behavior - sometimes we were lucky enough to get a "marshaling error"
but other times Scylla crashed.

The fix is trivial - just check before dereferencing the optional value
of the map. We return null in that case, which means that we consider
the result of null[2] to be null. I think this is a reasonable approach
and fits our overall approach of making null dominate expressions (e.g.,
the value of "null < 2" is also null).
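
A minimal sketch of the check, assuming the map column of a row is modelled as a `std::optional` of a `std::map`. The name `subscript_or_null` is hypothetical and the real evaluation code works on serialized values rather than on `int`s.

```cpp
#include <map>
#include <optional>

using int_map = std::map<int, int>;

// Hypothetical helper: value of m[key] for a row whose map column may be null.
std::optional<int> subscript_or_null(const std::optional<int_map>& m, int key) {
    if (!m) {
        // null[key] is null; dereferencing *m here would be undefined behavior.
        return std::nullopt;
    }
    auto it = m->find(key);
    if (it == m->end()) {
        return std::nullopt;
    }
    return it->second;
}
```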

The test test_filtering.py::test_filtering_null_map_with_subscript,
which used to frequently fail with marshaling errors or crashes, now
passes every time so its "xfail" mark is removed.

Fixes #10417

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2022-04-24 14:58:56 +03:00
cvybhu
5c199cad45 cql3: expr: possible_lhs_values: Handle subscript
This commit makes subscript an invalid argument to possible_lhs_values.
Previously this function simply ignored subscripts
and behaved as if it was called on the subscripted column
without a subscript.

This behaviour is unexpected and potentially
dangerous so it would be better to forbid
passing subscript to possible_lhs_values entirely.

Trying to handle subscript correctly is impossible
without refactoring the whole function.
The first argument is a column for which we would
like to know the possible values.
What are possible values of a subscripted column c where c[0] = 1?
All lists that have 1 on 0th position?

If we wanted to handle this nicely we would have to
change the arguments.
Such refactoring is best left until the time
when this functionality is actually needed,
right now it's hard to predict what interface
will be needed then.

Signed-off-by: cvybhu <jan.ciolek@scylladb.com>

Closes #10228
2022-04-11 19:05:09 +03:00
Jan Ciolek
2e7009f427 cql3: expr: is_supported_by: Return false for subscripted values
is_supported_by checks whether a given restriction
can be supported by some index.

Currently when a subscripted value, e.g `m[1]` is encountered,
we ignore the fact that there is a subscript and ask
whether an index can support the `m` itself.

This looks like unintentional behaviour leftover
from the times when column_value had a sub field,
which could be easily forgotten about.

Scylla doesn't support indexes on collection elements at all,
so simply returning false there seems like a good idea.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>

Closes #10227
2022-03-15 20:19:33 +02:00
Jan Ciolek
e086201420 cql3: expr: Remove sub from column_value
column_value::sub has been replaced by the subscript struct
everywhere, so we can finally remove it.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 22:02:39 +01:00
Jan Ciolek
b80f9e6cf8 cql3: Create a subscript in single_column_relation
When `val[sub]` is parsed, it used to be the case
that column_value with a sub field was created.

Now this has been changed to creating a subscript struct.

This is the only place where a subscripted value can be created.

All the code regarding subscripts now operates using only the
subscript struct, so we will be able to remove column_value::sub soon.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 22:02:39 +01:00
Jan Ciolek
cf6e81e731 cql3: expr: Add subscript to expression
All handlers for subscript have finally been implemented
and subscript can now be added to expression without
any trouble.

All the commented out code that waited for this moment
can now be uncommented.
Every such piece of code had a `TODO(subscript)` note
and by grepping this phrase we can make sure that
we didn't forget any of them.

Right now there are two ways to express a subscripted
column - either by a column_value with a sub field
or by using a subscript struct.

The grammar still uses the old column_value way,
but column_value.sub will be removed soon
and everything will move to the subscript struct.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 22:02:29 +01:00
Jan Ciolek
ec6f93d0c7 cql3: expr: Handle subscript in test_assignment
test_assignment can't be passed a column_value,
so a subscript won't work either.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 21:56:41 +01:00
Jan Ciolek
ab89fc316b cql3: expr: Handle subscript in prepare_expression
column_value can't be prepared, so subscript can't be prepared either.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 21:56:41 +01:00
Jan Ciolek
c39498537c cql3: expr: Handle subscript in fill_prepare_context
fill_prepare_context collects useful information about
the expression involved in query restrictions.

We should collect this information from subscript as well,
just like we do from column_value and its sub.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 21:56:41 +01:00
Jan Ciolek
811685ad6a cql3: expr: Handle subscript in evaluate
A column_value can't be evaluated,
so a subscripted column can't be evaluated either.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 21:56:41 +01:00
Jan Ciolek
db8990436a cql3: expr: Handle subscript in extract_single_column_restrictions_for_column
extract_single_column_restrictions_for_column finds all restrictions
for a column and puts them in a vector.

In case we encounter col[sub] we treat it as a restriction on col
and add it to the result.

This seems to make some sense and is in line with the current behaviour
which doesn't check whether a column is subscripted at all.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 21:56:41 +01:00
Jan Ciolek
2eaa39e1c8 cql3: expr: Handle subscript in search_and_replace
Prepare a handler for subscript in search_and_replace.
Some of the code must be commented out for now
because subscript hasn't been added to expression yet.

It will be uncommented later.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 21:56:41 +01:00
Jan Ciolek
e2d983f659 cql3: expr: Handle subscript in recurse_until
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 21:56:41 +01:00
Jan Ciolek
2d4174dc46 cql3: expr: Implement operator<< for subscript
expression can be printed using operator<<.
We need to handle subscript there.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 21:56:41 +01:00
Jan Ciolek
02c3b78e25 cql3: expr: Handle subscript in possible_lhs_values
possible_lhs_values returns set of possible values
for a column given some restrictions.

Current behaviour in case of a subscripted column
is to just ignore the subscript and treat
the restriction as if it were on just the column.

This seems wrong, or at least confusing,
but I won't change it in this patch to preserve the existing behaviour.

Trying to change this to something more reasonable
breaks other code which assumes that possible_lhs_values
returns a list of values.
(See partition_ranges_from_EQs() in cql3/restrictions/statement_restrictions.cc)

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 21:56:41 +01:00
Jan Ciolek
07fbf74a97 cql3: expr: Handle subscript in is_supported_by
is_supported_by checks whether the given expression
is supported by some index.

The current behaviour seems wrong, but I kept
it to avoid making changes in a refactor PR.

Scylla doesn't have indexes on map entries yet,
so for a subscript the answer is always no.
I think we should just return false there.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 21:56:41 +01:00
Jan Ciolek
fb59f488df cql3: expr: Handle subscript in is_satisfied_by
For the most part subscript can be handled
in the same way as column_value.

column_value has a sub argument and all
called functions evaluate lhs value using
get_value() which is prepared to handle
subscripted columns.

These functions now take column_maybe_subscripted
so we can pass &subscript to them without a problem.

The difference is in CONTAINS, CONTAINS_KEY and LIKE.

contains() and contains_key() throw an exception
when the passed column has a subscript, so now
we just throw an exception immediately.

like() doesn't have a check for subscripted value,
but from reading its code it's clear that
it's not ready to handle such values,
so an exception is now thrown as well.
It shouldn't break any tests because when one tries
to perform a query like:

`select * from t where m[0] like '%' allow filtering;`

an exception is thrown somewhere earlier in the code.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 21:56:41 +01:00
Jan Ciolek
1edaa3ef0d cql3: expr: Remove unused attribute
Functions that were previously marked as unused to make the code
compile are now used and we can remove the markings.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 21:56:41 +01:00
Jan Ciolek
cf839807ac cql3: expr: Use column_maybe_subscripted in is_one_of()
is_one_of() used to take column_value which could be subscripted as an argument.
column_value.sub will be removed so this function needs to take column_maybe_subscripted now.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 21:56:41 +01:00
Jan Ciolek
75c8b2ec6c cql3: expr: Use column_maybe_subscripted in limits()
limits() used to take column_value which could be subscripted as an argument.
column_value.sub will be removed so this function needs to take column_maybe_subscripted now.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 21:56:41 +01:00
Jan Ciolek
bc8c298be3 cql3: expr: Use column_maybe_subscripted in equal()
equal() used to take column_value which could be subscripted as an argument.
column_value.sub will be removed so this function needs to take column_maybe_subscripted now.

To get lhs value the code uses get_value() which is ready to handle subscripted columns.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 21:56:41 +01:00
Jan Ciolek
ca423a455e cql3: expr: add get_subscripted_column(column_maybe_subscripted)
Add a function that extracts the column_value
from column_maybe_subscripted.

There were already overloads for expression and subscript,
but this one will be needed as well.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 21:56:41 +01:00
Jan Ciolek
6d42ff580d cql3: expr: Add as_column_maybe_subscripted
Add a convenience function that converts
a reference to expression to column_maybe_subscripted.

It will be useful in a moment.

For now part of it must be commented out
because subscript is not in the expression variant yet.

It will be uncommented once subscript is finally added
to expression.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 21:56:41 +01:00
Jan Ciolek
d577af0f0c cql3: expr: Make get_value_comparator work with column_maybe_subscripted
There is get_value_comparator(column_value) but soon
we will also need get_value_comparator(column_maybe_subscripted).

Implement it by copying code from get_value_comparator(column_value).

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 21:56:41 +01:00
Jan Ciolek
a8287a158a cql3: expr: Make get_value work with column_maybe_subscripted
There is a get_value(column_value), but soon we will also
need get_value(column_maybe_subscripted).

Implement get_value(column_maybe_subscripted) by checking
whether the argument is a column_value or subscript
and calling the right code.

Code for handling the subscript case is copied from
get_value(column_value) where sub has value.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 21:56:41 +01:00
Jan Ciolek
feee6e4ffb cql3: expr: Add column_maybe_subscripted
column_maybe_subscripted is a variant that
can be either a column_value or a subscript.

It will be used as an argument to functions
which used to take column_value.

Right now column_value has a sub field,
but this will be removed soon once
the subscript struct takes over.

Changing the argument type is a smaller change
than rewriting all these functions, although
if they were rewritten the resulting code
would probably be nicer.
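
As a rough illustration of the idea, here is a sketch of such a variant together with a helper that extracts the underlying column. The struct layouts are heavily simplified and hypothetical, and holding the alternatives by pointer is an assumption made for this sketch.

```cpp
#include <string>
#include <variant>

// Simplified, hypothetical shapes; the real structs carry more information.
struct column_value { std::string name; };
struct subscript { column_value col; int key; };

// Either a plain column or a subscripted one (held by pointer in this sketch).
using column_maybe_subscripted = std::variant<const column_value*, const subscript*>;

// Returns the column itself, whether or not it is subscripted.
const column_value& get_subscripted_column(const column_maybe_subscripted& cms) {
    struct visitor {
        const column_value& operator()(const column_value* cv) const { return *cv; }
        const column_value& operator()(const subscript* s) const { return s->col; }
    };
    return std::visit(visitor{}, cms);
}
```

A function that used to take `const column_value&` can take `column_maybe_subscripted` instead and keep its body mostly unchanged, which is the smaller change mentioned above.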

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 21:56:41 +01:00
Jan Ciolek
4d7438d30a cql3: expr: Add get_subscripted_column
Even though the new subscript allows for subscripting anything,
the only thing that is really allowed to be subscripted is a column.

Add a utility function that extracts the column_value
from an expression which is a column_value or subscript.

It will come in handy in the following commits.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 21:56:30 +01:00
Jan Ciolek
a5bcd4f7f2 cql3: expr: Add subscript struct
Add a struct called subscript, which will be used in expression
variant to represent subscripted values e.g col[x], val[sub].
It will replace the sub field of column_value.
Having a separate struct in AST for this purpose
is cleaner and makes it possible to subscript
values other than column_value.

It is not added to the expression variant yet, because
that would require immediately implementing all visitors.

The following commits will implement individual visitors
and then subscript will finally be added to expression.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-27 01:32:59 +01:00
Jan Ciolek
7234cc851c cql3: expr: add std::forward in expr::visit
expr::visit was missing std::forward on the visitor.
In cases where the visitor was passed as an rvalue it wouldn't
be properly forwarded to std::visit.
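
A small sketch of why the forwarding matters, using a visitor whose call operators are rvalue-qualified. `forwarding_visit` and `rvalue_only_visitor` are hypothetical names and the variant is a stand-in for `expression`.

```cpp
#include <utility>
#include <variant>

using expression = std::variant<int, double>;  // stand-in for the real variant

// A visitor that may only be invoked as an rvalue.
struct rvalue_only_visitor {
    int operator()(int) && { return 1; }
    int operator()(double) && { return 2; }
};

// Forwarding keeps the visitor's value category; passing plain `v` would turn
// it into an lvalue and the call below would fail to compile.
template <typename Visitor>
decltype(auto) forwarding_visit(Visitor&& v, const expression& e) {
    return std::visit(std::forward<Visitor>(v), e);
}
```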

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-18 14:19:49 +01:00
Jan Ciolek
8676f60724 cql3: expr: Fix expr::visit so that it works with references
expr::visit had a bug where if we wanted to return
a reference in the visitor, the reference would be
to a temporary stack location instead of the passed
argument.

So trying to do something like this:
```
    const bind_variable& ref = visit(overloaded_functor {
        [](const bind_variable& bv) -> const bind_variable& { return bv; },
        [](const auto&) -> const bind_variable& { ... }
    }, e);
    std::cout << ref << std::endl;
```

would actually print the contents of a random stack location instead
of the valid value inside e.

Additionally trying to return a non-const reference
doesn't even compile.

The problem was that the return type of expr::visit
was defined as `auto`, which can be `int`, but not `int&`.
This has been changed to `decltype(auto)`, which can be both `int` and `int&`.

New version of `expr::visit` works for `const expression&` and `expression&`
no matter what the visitor returns.
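
The difference between the two return types can be shown with a simplified variant. This is a sketch, not the real `expr::visit`: `visit_by_value`, `visit_exact` and `first_int` are hypothetical names and the structs are reduced to a single field.

```cpp
#include <type_traits>
#include <utility>
#include <variant>

struct bind_variable { int index; };
struct constant { int value; };
using expression = std::variant<bind_variable, constant>;  // simplified stand-in

// `auto` drops the reference: the caller receives a copy of the int.
template <typename Visitor>
auto visit_by_value(Visitor&& v, expression& e) {
    return std::visit(std::forward<Visitor>(v), e);
}

// `decltype(auto)` keeps exactly what the visitor returns, here `int&`.
template <typename Visitor>
decltype(auto) visit_exact(Visitor&& v, expression& e) {
    return std::visit(std::forward<Visitor>(v), e);
}

// A visitor returning a reference to an int stored inside the expression.
struct first_int {
    int& operator()(bind_variable& bv) const { return bv.index; }
    int& operator()(constant& c) const { return c.value; }
};

static_assert(std::is_same_v<
    decltype(visit_by_value(first_int{}, std::declval<expression&>())), int>);
static_assert(std::is_same_v<
    decltype(visit_exact(first_int{}, std::declval<expression&>())), int&>);
```

Assigning through `visit_exact(first_int{}, e)` modifies `e` itself, while the result of `visit_by_value` is a detached copy.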

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2022-02-17 17:29:28 +01:00
Piotr Sarna
5a13ff09e9 expression: fix get_value for mismatched column definitions
As observed in #10026, after schema changes it somehow happened
that a column definition that does not match any of the base table
columns was passed to expression verification code.
The function that looks up the index of a column happens to return
-1 when it doesn't find anything, so using this returned index
without checking if it's nonnegative results in accessing invalid
vector data, and a segfault or silent memory corruption.
Therefore, an explicit check is added to see if the column was actually
found. This serves two purposes:
 - avoiding segfaults/memory corruption
 - making it easier to investigate the root cause of #10026
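
The shape of the check can be sketched as follows. All names here are hypothetical, and returning an empty optional is just one way to signal "not found"; the patch itself may report the problem differently.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical lookup that follows the "-1 means not found" convention.
int index_of(const std::vector<std::string>& columns, const std::string& name) {
    for (std::size_t i = 0; i < columns.size(); ++i) {
        if (columns[i] == name) {
            return static_cast<int>(i);
        }
    }
    return -1;
}

// Looks up a column's value in a row, refusing to index with a bad position.
std::optional<int> get_value(const std::vector<std::string>& columns,
                             const std::vector<int>& row,
                             const std::string& name) {
    const int idx = index_of(columns, name);
    if (idx < 0 || static_cast<std::size_t>(idx) >= row.size()) {
        return std::nullopt;  // explicit "not found" instead of reading row[-1]
    }
    return row[static_cast<std::size_t>(idx)];
}
```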

Closes #10039
2022-02-07 18:40:48 +02:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes were applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Avi Kivity
ae3a360725 database: Move database, keyspace, table classes to replica/ directory
The database, keyspace, and table classes represent the replica-only
part of the objects after which they are named. Reading from a table
doesn't give you the full data, just the replica's view, and it is not
consistent since reconciliation is applied on the coordinator.

As a first step in acknowledging this, move the related files to
a replica/ subdirectory.
2022-01-06 17:07:30 +02:00
Avi Kivity
d768e9fac5 cql3, related: switch to data_dictionary
Stop using database (and including database.hh) for schema related
purposes and use data_dictionary instead.

data_dictionary::database::real_database() is called from several
places, for these reasons:

 - calling yet-to-be-converted code
 - callers with a legitimate need to access data (e.g. system_keyspace)
   but with the ::database accessor removed from query_processor.
   We'll need to find another way to supply system_keyspace with
   data access.
 - to gain access to the wasm engine for testing whether user-defined
   functions compile. We'll have to find another way to
   do this as well.

The change is a straightforward replacement. One case in
modification_statement had to change a capture, but everything else
was just a search-and-replace.

Some files that lost "database.hh" gained "mutation.hh", which they
previously had access to through "database.hh".
2021-12-15 13:54:23 +02:00
Piotr Sarna
feea7cb920 Merge 'cql3: disentangle column_identifier from selectable' from Avi Kivity
column_identifier serves two purposes: a value type used to denote an
identifier (which may or may not map to a table column), and `selectable`
implementation used for selecting table columns. This stands in the way
of further refactoring - the unification of the WHERE clause prepare path
(prepare_expression()) and the SELECT clause prepare path
(prepare_selectable()).

Reduce the entanglement by moving the selectable-specific parts to a new
type, selectable_column, and leaving column_identifier as a pure value type.

Closes #9729

* github.com:scylladb/scylla:
  cql3: move selectable_column to selectable.cc
  cql3: column_identifier: split selectable functionality off from column_identifier
2021-12-14 10:37:32 +01:00
Avi Kivity
3f862f9ece cql3: move selectable_column to selectable.cc
Move selectable_column to selectable.cc (and to the cql3::selection
namespace). This cleans up column_identifier.hh so it is now a pure
vocabulary header.
2021-12-10 19:51:57 +02:00
Nadav Har'El
c6f2afb93d Merge 'cql3: Allow to skip EQ restricted columns in ORDER BY' from Jan Ciołek
In queries like:
```cql
SELECT * FROM t WHERE p = 0 AND c1 = 0 ORDER BY (c1 ASC, c2 ASC)
```
we can skip the requirement to specify ordering for `c1` column.

The `c1` column is restricted by an `EQ` restriction, so it can have
at most one value anyway, there is no need to sort.

This commit makes it possible to write just:
```cql
SELECT * FROM t WHERE p = 0 AND c1 = 0 ORDER BY (c2 ASC)
```

I reorganized the ordering code, I feel that it's now clearer and easier to understand.
It's possible to only introduce a small change to the existing code, but I feel like it becomes a bit too messy.
I tried it out on the [`orderby_disorder_small`](https://github.com/cvybhu/scylla/commits/orderby_disorder_small) branch.

The diff is a bit messy because I moved all ordering functions to one place,
it's better to read [select_statement.cc](https://github.com/cvybhu/scylla/blob/orderby_disorder/cql3/statements/select_statement.cc#L1495-L1658) lines 1495-1658 directly.

In the new code it would also be trivial to allow specifying columns in any order, we would just have to sort them.
For now I commented out the code needed to do that, because the point of this PR was to fix #2247.
Allowing this would require some more work changing the existing tests.

Fixes: #2247

Closes #9518

* github.com:scylladb/scylla:
  cql-pytest: Enable test for skipping eq restricted columns in order by
  cql3: Allow to skip EQ restricted columns in ORDER BY
  cql3: Add has_eq_restriction_on_column function
  cql3: Reorganize orderings code
2021-12-09 21:11:56 +03:00
Jan Ciolek
7bbfa48bc5 cql3: Add has_eq_restriction_on_column function
Adds a function that checks whether a given expression has eq restriction
on the specified column.

It finds restrictions like
col = ...
or
(col, col2) = ...

IN restrictions don't count, they aren't EQ restrictions

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-12-09 12:06:43 +01:00
Avi Kivity
edaa0c468d cql3: expr: standardize on struct tag for expression components
Expression components are pure data, so emphasize this by using
the struct tag consistently. This is just a cosmetic change.

Closes #9740
2021-12-07 15:46:25 +02:00
Piotr Sarna
0bd139e81c Merge 'cql3: expr: detemplate and deinline find_in_expression()
... and count_if()' from Avi Kivity

The expression code provides some utilities to examine and manipulate
expressions at prepare time. These are not (or should not be) in the fast
path and so should be optimized for compile time and code footprint
rather than run time.

This series does so by detemplating and deinlining find_in_expression()
and count_if().

Closes #9712

* github.com:scylladb/scylla:
  cql3: expr: adjust indentation in recurse_until()
  cql3: expr: detemplate count_if()
  cql3: expr: detemplate count_if()
  cql3: expr: rewrite count_if() in terms of recurse_until()
  cql3: expr: deinline recurse_until()
  cql3: expr: detemplate find_in_expression
2021-12-03 15:41:07 +01:00
Jan Ciolek
be14904416 cql3: Don't allow unset values inside UDT
Scylla doesn't support unset values inside UDT.
The old code used to convert unset to null, which seems incorrect.

There is an extra space in the error message to retain compatibility with Cassandra.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-12-03 14:46:21 +01:00
Avi Kivity
2c613b027d cql3: expr: adjust indentation in recurse_until()
Whitespace changes only.
2021-11-30 17:57:53 +02:00
Avi Kivity
f7f77df143 cql3: expr: detemplate count_if()
No functional changes. This prepare-path function does not need to
be inlined.
2021-11-30 17:52:15 +02:00
Avi Kivity
3a96b74e49 cql3: expr: detemplate count_if()
count_if() is a prepare-path function and does not need to be
a template. Type-erase it with noncopyable_function.
2021-11-30 17:50:34 +02:00
Avi Kivity
6f9e56e678 cql3: expr: rewrite count_if() in terms of recurse_until()
Counting is just recursing without early termination, and counting
as a side effect.
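
A sketch of that relationship on a toy expression tree. The `expression` struct is hypothetical, and `std::function` is used here as a stand-in for the type erasure, where the series uses `noncopyable_function`.

```cpp
#include <cstddef>
#include <functional>
#include <vector>

namespace sketch {

// Hypothetical tree-shaped expression; the real type is a variant.
struct expression {
    int tag = 0;
    std::vector<expression> children;
};

// Depth-first walk that stops at the first node for which `stop` returns true.
// Returns that node, or nullptr if the whole tree was visited.
const expression* recurse_until(const expression& e,
                                const std::function<bool(const expression&)>& stop) {
    if (stop(e)) {
        return &e;
    }
    for (const expression& child : e.children) {
        if (const expression* found = recurse_until(child, stop)) {
            return found;
        }
    }
    return nullptr;
}

// Counting is recursing without early termination, with the count kept as a
// side effect of the predicate.
std::size_t count_if(const expression& e,
                     const std::function<bool(const expression&)>& pred) {
    std::size_t n = 0;
    recurse_until(e, [&](const expression& node) {
        n += pred(node) ? 1 : 0;
        return false;  // never stop early
    });
    return n;
}

}  // namespace sketch
```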
2021-11-30 17:49:00 +02:00
Avi Kivity
c01188c414 cql3: expr: deinline recurse_until()
As a prepare-path function, it has no business being inline.
2021-11-30 17:41:16 +02:00
Avi Kivity
d0177d4b85 cql3: expr: detemplate find_in_expression
find_in_expression() is not in a fast path but is quite large
and inlined due to being a template. Detemplate it into a
recurse_until() utility function, and keep only the minimal
code in a template.

The recurse_until is still inline to simplify review, but
will be deinlined in the next patch.
2021-11-30 17:37:24 +02:00
Avi Kivity
595cc328b1 Merge 'cql3: Remove term, replace with expression' from Jan Ciołek
This PR finally removes the `term` class and replaces it with `expression`.

* There was some trouble with `lwt_cache_id` in `expr::function_call`.
  The current code works the following way:
  * for each `function_call` inside a `term` that describes a pk restriction, `prepare_context::add_pk_function_call` is called.
  * `add_pk_function_call` takes a `::shared_ptr<cql3::functions::function_call>`, sets its `cache_id` and pushes this shared pointer onto a vector of all collected function calls
  * Later when some condition is met we want to clear cache ids of all those collected function calls. To do this we iterate through shared pointers collected in `prepare_context` and clear cache id for each of them.

  This doesn't work with `expr::function_call` because it isn't kept inside a shared pointer.
  To solve this I put the `lwt_cache_id` inside a shared pointer and then `prepare_context` collects these shared pointers to cache ids.

  I also experimented with doing this without any shared pointers, maybe we could just walk through the expression and clear the cache ids ourselves. But the problem is that expressions are copied all the time, we could clear the cache in one place, but forget about a copy. Doing it using shared pointers more closely matches the original behaviour.
  The experiment is on the [term2-pr3-backup-altcache](https://github.com/cvybhu/scylla/tree/term2-pr3-backup-altcache) branch.
* `shared_ptr<term>` being `nullptr` could mean:
  * It represents a cql value `null`
  * That there is no value, like `std::nullopt` (for example in `attributes.hh`)
  * That it's a mistake, it shouldn't be possible

  A good way to distinguish between optional and mistake is to look for `my_term->bind_and_get()`, we then know that it's not an optional value.

* On the other hand `raw_value` cast to bool means:
   * `false` - null or unset
   * `true` - some value, maybe empty
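
The `lwt_cache_id` approach described in the first bullet can be sketched like this. The types are reduced to the bare minimum, an `int` stands in for the real cache id type, and `clear_pk_function_calls_cache` is a hypothetical name.

```cpp
#include <memory>
#include <optional>
#include <vector>

// Simplified stand-in for expr::function_call.
struct function_call {
    // Shared, so every copy of the expression observes the same cache id.
    std::shared_ptr<std::optional<int>> lwt_cache_id =
        std::make_shared<std::optional<int>>();
};

// Simplified stand-in for prepare_context: it collects the shared cache ids.
struct prepare_context {
    std::vector<std::shared_ptr<std::optional<int>>> pk_function_call_cache_ids;

    void add_pk_function_call(function_call& fc, int id) {
        *fc.lwt_cache_id = id;
        pk_function_call_cache_ids.push_back(fc.lwt_cache_id);
    }

    // Hypothetical name: clears the cache id of every collected function call.
    void clear_pk_function_calls_cache() {
        for (auto& id : pk_function_call_cache_ids) {
            id->reset();
        }
    }
};
```

Because a copied `function_call` shares the pointer, clearing through `prepare_context` also clears the id seen by copies made earlier, which is the property the shared pointer was chosen for.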

I ran a simple benchmark on my laptop to see how performance is affected:
```
build/release/test/perf/perf_simple_query --smp 1 -m 1G --operations-per-shard 1000000 --task-quota-ms 10
```
* On master (a21b1fbb2f) I get:
  ```
  176506.60 tps ( 77.0 allocs/op,  12.0 tasks/op,   45831 insns/op)

  median 176506.60 tps ( 77.0 allocs/op,  12.0 tasks/op,   45831 insns/op)
  median absolute deviation: 0.00
  maximum: 176506.60
  minimum: 176506.60
  ```
* On this branch I get:
  ```
  172225.30 tps ( 75.1 allocs/op,  12.1 tasks/op,   46106 insns/op)

  median 172225.30 tps ( 75.1 allocs/op,  12.1 tasks/op,   46106 insns/op)
  median absolute deviation: 0.00
  maximum: 172225.30
  minimum: 172225.30
  ```

Closes #9481

* github.com:scylladb/scylla:
  cql3: Remove remaining mentions of term
  cql3: Remove term
  cql3: Rename prepare_term to prepare_expression
  cql3: Make prepare_term return an expression instead of term
  cql3: expr: Add size check to evaluate_set
  cql3: expr: Add expr::contains_bind_marker
  cql3: expr: Rename find_atom to find_binop
  cql3: expr: Add find_in_expression
  cql3: Remove term in operations
  cql3: Remove term in relations
  cql3: Remove term in multi_column_restrictions
  cql3: Remove term in term_slice, rename to bounds_slice
  cql3: expr: Remove term in expression
  cql3: expr: Add evaluate_IN_list(expression, options)
  cql3: Remove term in column_condition
  cql3: Remove term in select_statement
  cql3: Remove term in update_statement
  cql3: Use internal cql format in insert_prepared_json_statement cache
  types: Add map_type_impl::serialize(range of <bytes, bytes>)
  cql3: Remove term in cql3/attributes
  cql3: expr: Add constant::view() method
  cql3: expr: Implement fill_prepare_context(expression)
  cql3: expr: add expr::visit that takes a mutable expression
  cql3: expr: Add receiver to expr::bind_variable
2021-11-30 16:39:39 +02:00
Jan Ciolek
51a8a1f89b cql3: Remove remaining mentions of term
There were a few places where term was still mentioned.
Removed/replaced term with expression.

search_and_replace is still done only on LHS of binary_operator
because the existing code would break otherwise.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-11-04 15:57:00 +01:00