Commit Graph

377 Commits

Author SHA1 Message Date
Avi Kivity
7090f4c43b cql3: expr: evaluate() column_mutation_attribute
Enhance evaluation_inputs with timestamps and ttls, and use
them to evaluate writetime/ttl.

The data structure is compatible with the current way of doing
things (see result_set_builder::_timestamps, result_set_builder::_ttls).
We use std::span<> instead of std::vector<> as it is more general
and a tiny bit faster.

The algorithm is taken from writetime_or_ttl_selector::add_input().
2023-06-18 22:41:09 +03:00
Nadav Har'El
97d444bbf7 Merge 'cql3/expression: implement evaluate(field_selection) ' from Jan Ciołek
Implement `expr::evaluate()` for `expr::field_selection`.

`field_selection` is used to represent access to a struct field.
For example, with a UDT value:
```
CREATE TYPE my_type (a int, b int);
```
The expression `my_type_value.a` would be represented as a `field_selection`, which selects the field `a`.

Evaluating such an expression consists of finding the right element's value in a serialized UDT value and returning it.

Note that it's still not possible to use `field_selection` inside the `WHERE` clause. Enabling it would require changes to the grammar, as well as query planning. The current `statement_restrictions` just reacts with `on_internal_error` when it encounters a `field_selection`.
Nonetheless it's a step towards relaxing the grammar, and now it's finally possible to evaluate all kinds of prepared expressions (#12906)

Fixes: https://github.com/scylladb/scylladb/issues/12906

Closes #14235

* github.com:scylladb/scylladb:
  boost/expr_test: test evaluate(field_selection)
  cql3/expr: fix printing of field_selection
  cql3/expression: implement evaluate(field_selection)
  types/user: modify idx_of_field to use bytes_view
  column_identifer: add column_identifier_raw::text()
  types: add read_nth_user_type_field()
  types: add read_nth_tuple_element()
2023-06-18 11:08:25 +03:00
Jan Ciolek
ee660f2d61 cql3/expr: fix printing of field_selection
expression printing has two modes: debug and user.
The user mode should output standard CQL that can be
parsed back to an expression.
In debug mode there can be some additional information
that helps with debugging stuff.

The code for printing `field_selection` didn't distinguish
between user mode and debug mode. It just always printed
in debug mode, with extra parentheses around the field selection.

Let's change it so that it emits valid CQL in user mode.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-06-16 01:21:02 +02:00
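The two printing modes discussed in ee660f2d61 can be pictured with a small sketch. The types `field_selection_sketch` and `print_mode`, the function `to_cql_string`, and the exact debug output format are hypothetical; only the idea that user mode must emit parseable CQL while debug mode may add decoration comes from the commit message.

```cpp
#include <cassert>
#include <string>

// Hypothetical, simplified model of a field selection such as `my_type_value.a`.
struct field_selection_sketch {
    std::string structure;
    std::string field;
};

enum class print_mode { user, debug };

// User mode prints plain CQL; debug mode wraps the selection in parentheses.
std::string to_cql_string(const field_selection_sketch& fs, print_mode mode) {
    const std::string plain = fs.structure + "." + fs.field;
    if (mode == print_mode::debug) {
        return "(" + plain + ")";
    }
    return plain;
}
```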
Jan Ciolek
f79f3ea3ae cql3/expression: implement evaluate(field_selection)
Implement expr::evaluate() for expr::field_selection.

`field_selection` is used to represent access to a struct field.
For example, with a UDT value:
```
CREATE TYPE my_type (a int, b int);
```
The expression `my_type_value.a` would be represented as
a field_selection, which selects the field 'a'.

Evaluating such an expression consists of finding the
right element's value in a serialized UDT value
and returning it.

Fixes: https://github.com/scylladb/scylladb/issues/12906

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-06-16 01:21:00 +02:00
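As a rough illustration of what f79f3ea3ae describes (finding the right element in a serialized UDT value), the sketch below walks a buffer in which every field is stored as a 4-byte big-endian length followed by that many bytes, with a negative length standing for null. That layout and the names `bytes`, `read_be32` and `read_nth_field` are assumptions of this example, not the ScyllaDB implementation.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <optional>
#include <vector>

using bytes = std::vector<uint8_t>;

// Reads a 4-byte big-endian signed integer starting at `pos`.
static int32_t read_be32(const bytes& buf, std::size_t pos) {
    const uint32_t v = (uint32_t(buf[pos]) << 24) | (uint32_t(buf[pos + 1]) << 16) |
                       (uint32_t(buf[pos + 2]) << 8) | uint32_t(buf[pos + 3]);
    return static_cast<int32_t>(v);
}

// Returns the n-th field, or nullopt if the field is null or not present.
std::optional<bytes> read_nth_field(const bytes& serialized, std::size_t n) {
    std::size_t pos = 0;
    for (std::size_t i = 0;; ++i) {
        if (pos + 4 > serialized.size()) {
            return std::nullopt;  // ran out of data before reaching field n
        }
        const int32_t len = read_be32(serialized, pos);
        pos += 4;
        if (i == n) {
            if (len < 0 || pos + static_cast<std::size_t>(len) > serialized.size()) {
                return std::nullopt;  // null field or truncated value
            }
            return bytes(serialized.begin() + pos, serialized.begin() + pos + len);
        }
        if (len > 0) {
            pos += static_cast<std::size_t>(len);  // skip over this field's bytes
        }
    }
}
```

Storing the field index in the prepared expression, as a later commit in this log does, means only this walk remains to be done at evaluation time.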
Avi Kivity
6c55bdc417 cql3: expr: match counter arguments to function parameters expecting bigint
assignment_testable is used to convey type information to function overload
selection. The implementation for `selector` recognizes that counters are
really bigints and special cases them. The equivalent implementation for
expressions doesn't, so bring over that nuance here too.

With this, things like sum(counter_column) match the overload for
sum(bigint) rather than failing.
2023-06-13 21:04:49 +03:00
Avi Kivity
2c1e36d0ac cql3: expr: avoid function constant-folding if a thread is needed
Our prepare phase performs constant-folding: if an expression
is composed of constants, and is pure, it is evaluated during
the preparation phase rather than during query execution.

This however can't work for user-defined functions as these require
running in a thread, and we aren't running in a thread during
preparation time. Skip the optimization in this case.
2023-06-13 21:04:49 +03:00
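The rule from 2c1e36d0ac can be sketched as a guard in front of the folding step. Everything here (`function_sketch`, its `requires_thread` flag, `try_fold`, and integers as the value type) is a hypothetical simplification meant only to show the control flow.

```cpp
#include <cassert>
#include <functional>
#include <optional>
#include <vector>

// Hypothetical function descriptor.
struct function_sketch {
    bool pure;
    bool requires_thread;  // e.g. a user-defined function (assumption of this sketch)
    std::function<int(const std::vector<int>&)> call;
};

// Each argument is either a known constant or unknown until execution
// (for instance a bind marker). Returns the folded value if folding is allowed.
std::optional<int> try_fold(const function_sketch& f,
                            const std::vector<std::optional<int>>& args) {
    if (!f.pure || f.requires_thread) {
        return std::nullopt;  // cannot evaluate during preparation
    }
    std::vector<int> values;
    for (const auto& a : args) {
        if (!a) {
            return std::nullopt;  // not all arguments are constants
        }
        values.push_back(*a);
    }
    return f.call(values);
}
```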
Avi Kivity
8d3d8eeedb cql3: add optional type annotation to assignment_testable
Before this series, function overload resolution peeked
at function arguments to see if they happened to be selectors,
and if so grabbed their type. If they did not happen to be
selectors, we wouldn't know their type, but as it happened
all generic functions are aggregates, and aggregates are only
legal in the SELECT clause, so that didn't matter.

In a previous patch, we changed assignment_testable to carry
an optional type and wired it to selector, so we wouldn't
need to dynamic_cast<selector>.

Now, we wire the optional type to assignment_testable_expression,
so overload resolution of generic functions can happen during
expression preparation.

The code that bridges the function argument expressions to
assignment_testable is extracted into a function, since it's
too complicated to be written as a transform.
2023-06-13 21:04:49 +03:00
Avi Kivity
2cb15d0829 cql3: expr: wire unresolved_identifier to test_assignment() 2023-06-13 21:04:49 +03:00
Avi Kivity
b7bbcdd178 cql3: expr: support preparing column_mutation_attribute
Fairly straightforward. A unit test is added.
2023-06-13 21:04:49 +03:00
Avi Kivity
73b6b6e3d1 cql3: expr: support preparing SQL-style casts
We convert the cast to a function, just like the existing
with_function selectable.
2023-06-13 21:04:49 +03:00
Avi Kivity
521a128a2a cql3: expr: support preparing field_selection expressions
The field_selection structure is augmented with the field
index so that does not need to be done at evaluation time,
similar to the current with_field_selection selectable.
2023-06-13 21:04:49 +03:00
Avi Kivity
ecfe4ad53a cql3: expr: make the two styles of cast expressions explicit
CQL supports two cast styles:

 - C-style: (type) expr, used for casts between binary-compatible types
  and for type hinting of bind variables
 - SQL-style: (expr AS type), used for real type conversions

Currently, the expression system differentiates them by the cast::type
field, which is a data_type for SQL-style casts and a cql3_type::raw
for C-style casts, but that won't work after the prepare phase is applied
to SQL-style casts when the type field will be prepared into a data_type.

Prepare for this by adding a separate enum to distinguish between the
two styles.
2023-06-13 21:04:49 +03:00
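A minimal sketch of the approach in ecfe4ad53a: once both cast styles may hold the same kind of type field, an explicit tag tells them apart. The names `cast_style`, `cast_sketch` and `to_cql` are hypothetical, and types are modelled as plain strings. The SQL-style form is printed the way it is written in CQL, as `CAST(expr AS type)`.

```cpp
#include <cassert>
#include <string>

// Hypothetical tag distinguishing the two cast styles.
enum class cast_style { c, sql };

struct cast_sketch {
    cast_style style;
    std::string arg;   // the expression being cast
    std::string type;  // the target type name
};

// Prints the cast in the syntax matching its style.
std::string to_cql(const cast_sketch& c) {
    if (c.style == cast_style::c) {
        return "(" + c.type + ")" + c.arg;
    }
    return "CAST(" + c.arg + " AS " + c.type + ")";
}
```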
Avi Kivity
c0f59f0789 cql3: eliminate dynamic_cast<selector> from functions::get()
Type inference for function calls is a bit complicated:
 - a function argument can be inferred from the signature: a call to
   my_func(:arg) will infer :arg's type from the function signature
 - a function signature can be inferred from its argument types:
   a call to max(my_column) will select the correct max() signature
   (as max is generic) from my_column's type

Currently, functions::get() implements this by invoking
dynamic_cast<selector*> on the argument. If the caller of
functions::get() is the SELECT clause preparation, then the
cast will succeed and we'll be able to find the type. If not,
we fail (and fall back to inferring the argument types from a
non-generic function signature).

Since we're about to move selectors to expressions, the dynamic_cast
will fail, so we must replace it with a less fragile approach.

The fix is to augment assignment_testable (the interface representing
a function argument) with an intentionally-awkwardly-named
assignment_testable_type_opt(), that sees whether we happen to know
the type for the argument in order to implement signature-from-argument
inference.

A note about assignment_testable: this is a bridge interface
that is the least common denominator of anything that calls functions.
Since we're moving towards expressions, there are fewer implementations of
the interface as the code evolves.
2023-06-13 21:04:49 +03:00
Avi Kivity
5983e9e7b2 cql3: test_assignment: pass optional schema everywhere
test_assignment() and related functions check for type compatibility between
a right-hand-side and a left-hand-side.

It started its life with a limited functionality for INSERT and UPDATE,
but now it's about to be used for cast expression in selectors, which
can cast a column_value. A column_value is still an unresolved_identifier
during the prepare phase, and cannot be resolved without a schema.

To prepare for this, pass an optional schema everywhere.

Ultimately, test_assignment likely needs to be folded into prepare_expr(),
but before that prepare_expr() has to be used everywhere.
2023-06-13 21:04:49 +03:00
Avi Kivity
8dc22293bf cql3: expr: prepare_expr(): allow aggregate functions
prepare_expr() began its life as a replacement for the WHERE clause,
so it shares its restrictions, one of which is not supporting aggregate
functions.

In previous patches, we added an explicit check to all users, so we can
now remove the check here, so that we can later prepare selectors.

In addition to dropping the check, we drop the dynamic_cast<scalar_function>,
as it can now fail. It turns out it's unnecessary since everything is available
from the base class.

Note we don't allow constant folding involving aggregate functions: first,
our evaluator doesn't support it, and second, we don't have the iteration count
at prepare time.
2023-06-13 21:04:49 +03:00
Avi Kivity
b7a90d51d2 cql3: add checks for aggregation functions after prepare
Since we don't yet prepare selectors, all calls to prepare_expr()
are adjusted.

Note that missing a check isn't fatal - it will be trapped at runtime
because evaluate(aggregate) will throw.
2023-06-13 21:04:49 +03:00
Avi Kivity
6db916e5b6 cql3: expr: add verify_no_aggregate_functions() helper
Aggregate functions are only allowed in certain contexts (the
SELECT clause and the HAVING clause, which we don't yet have).

prepare_expr() currently rejects aggregate functions, but that means
we cannot use it to prepare selectors.

To prepare for the use of prepare_expr() in selectors, we'll have to
move the check out of prepare_expr(). This helper is the beginning of
that change.

I considered adding a parameter to prepare_expr(), but that is even
more noisy than adding a call to the helper.
2023-06-13 21:04:49 +03:00
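The helper introduced in 6db916e5b6 can be imagined as a recursive walk over the prepared expression that throws when it meets an aggregate. The tree type `expr_sketch`, the `_sketch` function name, the exception type and the message text are all hypothetical stand-ins.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>
#include <vector>

// Hypothetical expression node: a name, an aggregate flag, and sub-expressions.
struct expr_sketch {
    std::string name;
    bool is_aggregate = false;
    std::vector<expr_sketch> children;
};

// True if the node or any of its descendants is an aggregate function call.
bool contains_aggregate(const expr_sketch& e) {
    if (e.is_aggregate) {
        return true;
    }
    for (const auto& child : e.children) {
        if (contains_aggregate(child)) {
            return true;
        }
    }
    return false;
}

// Called by each user of the prepare step that must not see aggregates.
void verify_no_aggregate_functions_sketch(const expr_sketch& e, const std::string& context) {
    if (contains_aggregate(e)) {
        throw std::invalid_argument("Aggregation functions are not supported in the " + context);
    }
}
```

Keeping the check outside the prepare step lets the SELECT clause skip it while every other caller opts in explicitly.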
Avi Kivity
54f3050225 cql3: expr: extract column_mutation_attribute_type
column_mutation_attribute_type() returns int32_type or long_type
depending on whether TTL or WRITETIME is requested.

Will be used later when we prepare column_mutation_attribute
expressions.
2023-06-13 21:04:49 +03:00
Avi Kivity
d2f4bd8b85 cql3: expr: add fmt formatter for column_mutation_attribute_kind
It's easier to use for logging.
2023-06-13 21:04:49 +03:00
Avi Kivity
79bfe04d2a cql3: remove abstract_marker vestiges
Removed by e458340821 ("cql3: Remove term")

Closes #14192
2023-06-12 10:41:04 +03:00
Avi Kivity
26c8470f65 treewide: use #include <seastar/...> for seastar headers
We treat Seastar as an external library, so fix the few places
that didn't do so to use angle brackets.

Closes #14037
2023-06-06 08:36:09 +03:00
Jan Ciolek
55fb91bf10 exceptions: remove relation field from unrecognized_entity_exception
The exception unrecognized_entity_exception used to have two fields:
* entity - the name that wasn't recognized
* relation_str - part of the WHERE clause that contained this entity

In 4e0a089f3e the places that throw
this exception were modified, the thrower started passing unrecognized
column name to both fields - entity and relation_str. It was easier to
do things this way, accessing the whole WHERE clause can be problematic.

The problem is that this caused error messages to get weird, e.g:
"Undefined name x in where clause ('x')".
x is not the WHERE clause, it's the unrecognized name.

Let's remove the `relation_str` field as it isn't used anymore,
it only causes confusion. After this change the message would be:
"Unrecognized name x"
Which makes much more sense.

Refs #10632

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>

Closes #13944
2023-05-24 19:35:26 +03:00
Jan Ciolek
1bcb4c024c cql3/expr: print expressions in user-friendly way by default
When a CQL expression is printed, it can be done using
either the `debug` mode, or the `user` mode.

`user` mode is basically how you would expect the CQL
to be printed, it can be printed and then parsed back.

`debug` mode is more detailed, for example in `debug`
mode a column name can be displayed as
`unresolved_identifier(my_column)`, which can't
be parsed back to CQL.

The default way of printing is the `debug` mode,
but this requires us to remember to enable the `user`
mode each time we're printing a user-facing message,
for example for an invalid_request_exception.

It's cumbersome and people forget about it,
so let's change the default to `user`.

There were issues about expressions being printed
in a `strange` way, this fixes them.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>

Closes #13916
2023-05-18 20:57:00 +03:00
Jan Ciolek
8a256f63db cql3/prepare_expr: force token() receiver name to be partition key token
Let's say that we have a prepared statement with a token restriction:
```cql
SELECT * FROM some_table WHERE token(p1, p2) = ?
```

After calling `prepare` the driver receives some information
about the prepared statement, including names of values bound
to each bind marker.

In case of a partition token restriction (`token(p1, p2) = ?`)
there's an expectation that the name assigned to this bind marker
will be `"partition key token"`.

In a recent change the code handling `token()` expressions has been
unified with the code that handles generic function calls,
and as a result the name has changed to `token(p1, p2)`.

It turns out that the Java driver relies on the name being
`"partition key token"`, so a change to `token(p1, p2)`
broke some things.

This patch sets the name back to `"partition key token"`.
To achieve this we detect any restrictions that match
the pattern `token(p1, p2, p3) = X` and set the receiver
name for X to `"partition key token"`.

Fixes: #13769

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-05-09 12:32:57 +02:00
Jan Ciolek
be8ef63bf5 cql3: remove expr::token
Let's remove expr::token and replace all of its functionality with expr::function_call.

expr::token is a struct whose job is to represent a partition key token.
The idea is that when the user types in `token(p1, p2) < 1234`,
this will be internally represented as an expression which uses
expr::token to represent the `token(p1, p2)` part.

The situation with expr::token is a bit complicated.
On the one hand it's supposed to represent the partition token,
but sometimes it's also assumed that it can represent a generic
call to the token() function, for example `token(1, 2, 3)` could
be a function_call, but it could also be expr::token.

The query planning code assumes that each occurrence of expr::token
represents the partition token without checking the arguments.
Because of this allowing `token(1, 2, 3)` to be represented
as expr::token is dangerous - the query planning
might think that it is `token(p1, p2, p3)` and plan the query
based on this, which would be wrong.

Currently expr::token is created only in one specific case.
When the parser detects that the user typed in a restriction
which has a call to `token` on the LHS it generates expr::token.
In all other cases it generates an `expr::function_call`.
Even when the `function_call` represents a valid partition token,
it stays a `function_call`. During preparation there is no check
to see if a `function_call` to `token` could be turned into `expr::token`.
This is a bit inconsistent - sometimes `token(p1, p2, p3)` is represented
as `expr::token` and the query planner handles that, but sometimes it might
be represented as `function_call`, which the query planner doesn't handle.

There is also a problem because there's a lot of duplication
between a `function_call` and `expr::token`. All of the evaluation
and preparation is the same for `expr::token` as it's for a `function_call`
to the token function. Currently it's impossible to evaluate `expr::token`
and preparation has some flaws, but implementing it would basically
consist of copy-pasting the corresponding code from token `function_call`.

One more aspect is multi-table queries. With `expr::token` we turn
a call to the `token()` function into a struct that is schema-specific.
What happens when a single expression is used to make queries to multiple
tables? The schema is different, so something that is represented
as `expr::token` for one schema would be represented as `function_call`
in the context of a different schema.
Translating expressions to different tables would require careful
manipulation to convert `expr::token` to `function_call` and vice versa.
This could cause trouble for index queries.

Overall I think it would be best to remove expr::token.

Although having a clear marker for the partition token
is sometimes nice for query planning, in my opinion
the pros are outweighed by the cons.
I'm a big fan of having a single way to represent things,
having two separate representations of the same thing
without clear boundaries between them causes trouble.

Instead of having expr::token and function_call we can
just have the function_call and check if it represents
a partition token when needed.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-04-29 13:11:31 +02:00
Jan Ciolek
16bc1c930f cql3/prepare_expr: make get_lhs_receiver handle any function_call
get_lhs_receiver looks at the prepared LHS of a binary operator
and creates a receiver corresponding to this LHS expression.
This receiver is later used to prepare the RHS of the binary operator.

It's able to handle a few expression types - the ones that are currently
allowed to be on the LHS.
One of those types is `expr::token`, to handle restrictions like `token(p1, p2) = 3`.

Soon token will be replaced by `expr::function_call`, so the function will need
to handle `function_calls` to the token function.

Although we expect there to be only calls to the `token()` function,
as other functions are not allowed on the LHS, it can be made generic
over all function calls, which will help in future grammar extensions.

The function calls that it can currently get are calls to the token function,
but they're not validated yet, so it could also be something like `token(pk, pk, ck)`.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-04-29 13:04:53 +02:00
Jan Ciolek
d3a958490e cql3/expr: properly print token function_call
Printing for function_call is a bit strange.
When printing an unprepared function it prints
the name and then the arguments.

For prepared function it prints <anonymous function>
as the name and then the arguments.
Prepared functions have a name() method, but printing
doesn't use it, maybe not all functions have a valid name(?).

The token() function will soon be represented as a function_call
and it should be printable in a user-readable way.
Let's add an if which prints `token(arg1, arg2)`
instead of `<anonymous function>(arg1, arg2)` when printing
a call to the token function.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-04-29 13:04:53 +02:00
Jan Ciolek
096efc2f38 cql3/expr: split possible_lhs_values into column and token variants
The possible_lhs_values takes an expression and a column
and finds all possible values for the column that make
the expression true.

Apart from finding column values it's also capable of finding
all matching values for the partition key token.
When a nullptr column is passed, possible_lhs_values switches
into token values mode and finds all values for the token.

This interface isn't ideal.
It's confusing to pass a nullptr column when one wants to
find values for the token. It would be better to have a flag,
or just have a separate function.

Additionally in the future expr::token will be removed
and we will use expr::is_partition_token_for_schema
to find all occurrences of the partition token.
expr::is_partition_token_for_schema takes a schema
as an argument, which possible_lhs_values doesn't have,
so it would have to be extended to get the schema from
somewhere.

To fix these two problems let's split possible_lhs_values
into two functions - one that finds possible values for a column,
which doesn't require a schema, and one that finds possible values
for the partition token and requires a schema:

value_set possible_column_values(const column_definition* col, const expression& e, const query_options& options);
value_set possible_partition_token_values(const expression& e, const query_options& options, const schema& table_schema);

This will make the interface cleaner and enable smooth transition
once expr::token is removed.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-04-29 13:04:53 +02:00
Jan Ciolek
f2e5f654f2 cql3/expr: fix error message in possible_lhs_values
In possible_lhs_values there was a message talking
about is_satisfied_by. It looks like a badly
copy-pasted message.

Change it to possible_lhs_values as it should be.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-04-29 13:04:52 +02:00
Avi Kivity
dc3c28516d cql3: expr: reimplement is_satisfied_by() in terms of evaluate()
It calls evaluate() internally anyway.

There's a scary if () in there talking about tokens, but everything
appears to work.
2023-04-29 13:04:52 +02:00
Jan Ciolek
ad5c931102 cql3/expr: add a schema argument to expr::replace_token
Just like has_token, replace_token will use
expr::is_partition_token_for_schema to find all instances
of the partition token to replace.

Let's prepare for this change by adding a schema argument
to the function before making the big change.

It's unused at the moment, but having a separate commit
should make it easier to review.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-04-29 13:04:52 +02:00
Jan Ciolek
d50db32d14 cql3/expr: add a comment for expr::has_partition_token
Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-04-29 13:04:52 +02:00
Jan Ciolek
18879aad6f cql3/expr: add a schema argument to expr::has_token
In the future expr::token will be removed and checking
whether there is a partition token inside an expression
will be done using expr::is_partition_token_for_schema.

This function takes a schema as an argument,
so all functions that will call it also need
to get the schema from somewhere.

Right now it's an unused argument, but in the future
it will be used. Adding it in a separate commit
makes it easier to review.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-04-29 13:04:52 +02:00
Jan Ciolek
7af010095e cql3/expr: add expr::is_partition_token_for_schema
Add a function to check whether the expression
represents a partition token - that is a call
to the token function with consecutive partition
key columns as the arguments.

For example for `token(p1, p2, p3)` this function
would return `true`, but for `token(1, 2, 3)` or `token(p3, p2, p1)`
the result would be `false`.

The function has a schema argument because a schema is required
to get the list of partition columns that should be passed as
arguments to token().

Maybe it would be possible to infer the schema from the information
given earlier during prepare_expression, but it would be complicated
and a bit dangerous to do this. Sometimes we operate on multiple tables
and the schema is needed to differentiate between them - a token() call
can represent the base table's partition token, but for an index table
this is just a normal function call, not the partition token.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-04-29 13:04:51 +02:00
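The check described in 7af010095e reduces to "a call to `token` whose arguments are exactly the partition key columns, in order". The sketch below models columns and arguments as plain strings. `schema_sketch`, `function_call_sketch` and the `_sketch` function names are hypothetical simplifications of the real expression and schema types.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical schema: just the ordered partition key column names.
struct schema_sketch {
    std::vector<std::string> partition_key_columns;
};

// Hypothetical function call: a name plus arguments rendered as text.
struct function_call_sketch {
    std::string name;
    std::vector<std::string> args;
};

// True for any call to token(), whatever its arguments are.
bool is_token_function_sketch(const function_call_sketch& f) {
    return f.name == "token";
}

// True only if the call is token() applied to the partition key columns
// of this particular schema, in declaration order.
bool is_partition_token_for_schema_sketch(const function_call_sketch& f,
                                          const schema_sketch& s) {
    return is_token_function_sketch(f) && f.args == s.partition_key_columns;
}
```

Passing the schema explicitly is what lets the same call count as the partition token for a base table yet as an ordinary function call for an index table.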
Jan Ciolek
694d9298aa cql3/expr: add expr::is_token_function
Add a function that can be used to check
whether a given expression represents a call
to the token() function.

Note that a call to token() doesn't mean
that the expression represents a partition
token - it could be something like token(1, 2, 3),
just a normal function_call.

The code for checking has been taken from functions::get.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-04-29 13:04:51 +02:00
Jan Ciolek
f7cac10fe0 cql3/expr: implement preparing function_call without a receiver
Currently trying to do prepare_expression(function_call)
with a nullptr receiver fails.

It should be possible to prepare function calls without
a known receiver.

When the user types in: `token(1, 2, 3)`
the code should be able to figure out that
they are looking for a function with name `token`,
which takes 3 integers as arguments.

In order to support that we need to prepare
all arguments that can be prepared before
attempting to find a function.

Prepared expressions have a known type,
which helps to find the right function
for the given arguments.

Additionally the current code for finding
a function requires all arguments to be
assignment_testable, which requires to prepare
some expression types, e.g column_values.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-04-29 13:04:51 +02:00
Jan Ciolek
b3d05f3525 cql3/expr: make it possible to prepare expr::constant
try_prepare_expression(constant) used to throw an error
when trying to prepare expr::constant.

It would be useful to be able to do this
and it's not hard to implement.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-04-28 14:34:59 +02:00
Jan Ciolek
bf36cde29a cql3/expr: implement test_assignment for column_value
Make it possible to do test_assignment for column_values.
It's implemented using the generic expression assignment
testing function.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-04-28 14:34:59 +02:00
Jan Ciolek
fd174bda60 cql3/expr: implement test_assignment for expr::constant
test_assignment checks whether a value of some type
can be assigned to a value of different type.

There is no implementation of test_assignment
for expr::constant, but I would like to have one.

Currently there is a custom implementation
of test_assignment for each type of expression,
but generally each of them boils down to checking:
```
type1->is_value_compatible_with(type2)
```

Instead of implementing another type-specific function
I added expresion_test_assignment and used it to
implement test_assignment for constant.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-04-28 14:34:56 +02:00
Kefu Chai
f5b05cf981 treewide: use defaulted operator!=() and operator==()
in C++20, the compiler generates operator!=() if the corresponding
operator==() is already defined; the language now understands
that the comparison is symmetric in the new standard.

fortunately, our operator!=() is always equivalent to
`! operator==()`, this matches the behavior of the default
generated operator!=(). so, in this change, all `operator!=`
are removed.

in addition to the defaulted operator!=, C++20 also brings to us
the defaulted operator==() -- it is able to generate the
operator==() as the member-wise lexicographical comparison.
under some circumstances, this is exactly what we need. so,
in this change, if the operator==() is also implemented as
a lexicographical comparison of all member variables of the
class/struct in question, it is implemented using the default
generated one by removing its body and marking the function as
`default`. moreover, if the class happens to have other comparison
operators which are implemented using lexicographical comparison,
the default generated `operator<=>` is used in place of
the defaulted `operator==`.

sometimes, we fail to mark the operator== with the `const`
specifier, in this change, to fulfil the need of C++ standard,
and to be more correct, the `const` specifier is added.

also, to generate the defaulted operator==, the operand should
be `const class_name&`, but it is not always the case, in the
class of `version`, we use `version` as the parameter type, to
fulfill the need of the C++ standard, the parameter type is
changed to `const version&` instead. this does not change
the semantic of the comparison operator. and is a more idiomatic
way to pass non-trivial struct as function parameters.

please note, because in C++20, both operator== and operator<=> are
symmetric, some of the operators in `multiprecision` are removed.
they are the symmetric form of another variant. if they were
not removed, the compiler would, for instance, find ambiguous
overloaded operator '=='.

this change is a cleanup to modernize the code base with C++20
features.

Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13687
2023-04-27 10:24:46 +03:00
Nadav Har'El
bd09dc308c cql3: fix printing of column_specification::name in some error messages
column_specification::name is a shared pointer, so it should be
dereferenced before printing - because we want to print the name, not
the pointer.

Fix a few instances of this mistake in prepare_expr.cc. Other instances
were already correct.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2023-04-25 10:46:56 +03:00
Avi Kivity
3e0aacc8b5 db, cql3: functions: pass function parameters as a span instead of a vector
Spans are more flexible and can be constructed from any contiguous
container (such as small_vector), or a subrange of such a container.
This can save allocations, so change the signature to accept a span.

Spans cannot be constructed from std::initializer_list, so one such
call site is changed to construct a span directly from the single
argument.
2023-04-19 20:38:55 +03:00
Kefu Chai
c580e30ec7 cql3: expr: return more accurate error message for invalidated token() args
before this change, we just print out the addresses of the elements
in `column_defs`, if the arguments passed to `token()` function are
not valid. this is not quite helpful from the user's perspective. as
user would be more interested in the values. also, we could print
more accurate error message for different error.

in this change, following Cassandra 4.1's behavior, three cases are
identified, and corresponding errors are returned respectively:

* duplicated partition keys
* wrong order of partition key
* missing keys

where, if the partition key order is wrong, instead of printing the
keys specified by user, the correct order is printed in the error
message for helping user to correct the `token()` function.

for better performance, the checks are performed only if the keys
do not match, based on the assumption that the error handling path
is not likely to be executed.

tests are added accordingly. they were also tested with Cassandra 4.1.1.

Fixes #13468
Signed-off-by: Kefu Chai <kefu.chai@scylladb.com>

Closes #13470
2023-04-14 11:46:18 +03:00
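The three error cases listed in c580e30ec7 can be sketched as a classifier that runs only after a cheap equality check fails. The enum `token_args_error`, the function `check_token_args` and the string-based column model are hypothetical. As a simplification, arguments that are not partition key columns at all are not given a case of their own.

```cpp
#include <cassert>
#include <set>
#include <string>
#include <vector>

enum class token_args_error { none, duplicate_key, missing_key, wrong_order };

// Classifies the arguments of token() against the partition key columns.
token_args_error check_token_args(const std::vector<std::string>& partition_key,
                                  const std::vector<std::string>& args) {
    // Fast path: the detailed checks run only when the keys do not match.
    if (args == partition_key) {
        return token_args_error::none;
    }
    std::set<std::string> seen;
    for (const auto& a : args) {
        if (!seen.insert(a).second) {
            return token_args_error::duplicate_key;
        }
    }
    for (const auto& p : partition_key) {
        if (seen.count(p) == 0) {
            return token_args_error::missing_key;
        }
    }
    // Every key is present exactly once, so only the order can be wrong.
    return token_args_error::wrong_order;
}
```

For the wrong-order case, the caller would build its message from `partition_key` rather than `args`, so the user sees the expected order.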
Avi Kivity
41a2856f78 cql3: expr: fix serialize_listlike() reference-to-temporary with gcc
serialize_listlike() is called with a range of either managed_bytes
or managed_bytes_opt. If the former, then iterating and assigning
to a loop induction variable of type managed_bytes_opt& will bind
the reference to a temporary managed_bytes_opt, which gcc dislikes.

Fix by performing the binding in a separate statement, which allows
for lifetime extension.
2023-03-21 13:42:49 +02:00
Nadav Har'El
53c8c43d8a Merge 'cql3: improve support for C-style parenthesis casts' from Jan Ciołek
CQL supports type casting using C-style casts.
For example it's possible to do: `blob_column = (blob)funcReturningInt()`

This functionality is pretty limited, we only allow such casts between types that have a compatible binary representation. Compatible means that the bytes will stay unchanged after the conversion.
This means that it's legal to cast an int to blob (int is just a 4 byte blob), but it's illegal to cast a bigint to int (change 8 bytes -> 4 bytes).
This simplifies things, to cast we can just reinterpret the value as the other type.

Another use of C-style casts are type hints. Sometimes it's impossible to infer the exact type of an expression from the context. In such cases the type can be specified by casting the expression to this type.
For example: `overloadedFunction((int)?)`
Without the cast it would be impossible to guess what should be the bind marker's type. The function is overloaded, so there are many possible argument types. The type hint specifies that the bind marker has type int.

An interesting thing is that such casts don't have to be explicit. CQL allows putting an int value in a place where a blob value is expected, and it will be automatically converted without any explicit casting.

---

I started looking at our implementation of casts because of #12900. In there the author expressed the need to specify a type hint for bind marker used to pass the WASM code. It could be either `(text)?` for text WASM, or `(blob)?` for binary WASM. This specific use of type hints wasn't supported because there was no `receiver` and the implementation of `prepare_expression` didn't handle that. Preparing casts without a receiver should be easy to implement - we can infer the type of the expression by looking at the type to which the expression is cast.

But while reading `prepare_expression` for `expr::cast` I noticed that the code there is a bit strange. The implementation prepared the expression to cast using the original `receiver` instead of a receiver with the cast type. This caused some issues because of which casting didn't work as expected.
For example it was possible to do:
```cql
blob_column = (blob)funcReturningInt()
```
But this didn't work at all:
```cql
blob_column = (blob)(int)12323
```
It tried to prepare `untyped_constant(12323)` with a `blob` receiver, which fails.

This makes `expr::cast` useless for casting. Casting when the representation is compatible is already implicit. I couldn't find a single case where adding a cast would change the behavior in any way.
There was some use for it as a type hint to choose a specific overload of a function, but it was worthless for casting.

Cassandra has the same issue: I created a `cql-pytest` test and it showed that we behave in the same way as Cassandra does.

I decided to improve this. By preparing the expression using a receiver with the cast type, `expr::cast` becomes actually useful for casting values. Things like `(blob)(int)12323` now work without any issues.
This diverges from the behavior in Cassandra, but it's an extension, not a breaking incompatibility.

---

This PR improves `prepare_expression` for `expr::cast` in the following ways:
1) Support for more complex casts by preparing the expression using a different receiver. This makes casts like `(blob)(int)123` possible
2) Support preparing `expr::cast` without a receiver. Type inference chooses the cast type as the type of the expression.
3) Add pytest tests for C-style casts

`2)` is needed for #12900; the other changes are just something I decided to do since I was already working on this piece of code.

Closes #13053

* github.com:scylladb/scylladb:
  expr_test: more tests for preparing bind variables with type hints
  prepare_expr: implement preparing expr::cast with no receiver
  prepare_expr: use :user formatting in cast_prepare_expression
  prepare_expr: remove std::get<> in cast_prepare_expression
  prepare_expr: improve cast_prepare_expression
  prepare_expr: improve readability in cast_prepare_expression
  cql-pytest: test expr::cast in test_cast.py
2023-03-12 15:07:54 +02:00
Jan Ciolek
a08eb5cb76 prepare_expr: implement preparing expr::cast with no receiver
Type inference in cast_prepare_expression was very limited.
Without a receiver it just gave up and said that it can't
infer the type.

It's possible to infer the type - an expression that
casts something to type bigint also has type bigint.

This can be implemented by creating a fake receiver
when the caller didn't specify one.
Type of this fake receiver will be c.type
and c.arg will be prepared using this receiver.

Note that the previous change (changing receiver
to cast_type_receiver in prepare_expression) is required
to keep the behaviour consistent.
Without it we would sometimes prepare c.arg using the
original receiver, and sometimes using a receiver
with type c.type.

Currently it's impossible to test this change
on live code. Every place that uses expr::cast
specifies a receiver.
A unit test is all that can be done at the moment
to ensure correctness.

In the future this functionality will be used in UDFs.
In https://github.com/scylladb/scylladb/pull/12900
it was requested to be able to use a type hint
to specify whether WASM code of the function
will be sent in binary or text form.

The user can convey this by typing
either `(blob)?` or `(text)?`.
In this case there will be no receiver
and type inference would fail.

After this change it will work - it's now possible
to prepare either of those and get an expression
with a known type.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-03-09 18:31:45 +01:00
Jan Ciolek
9f8340d211 prepare_expr: use :user formatting in cast_prepare_expression
By default expressions are printed using the {:debug} formatting,
which is intended for internal use. Error messages should use the
{:user} formatting instead.

cast_prepare_expression uses the default formatting in a few places
that are user facing, so let's change it to use {:user} formatting.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-03-09 18:31:45 +01:00
Jan Ciolek
12560b5745 prepare_expr: remove std::get<> in cast_prepare_expression
A few times throughout cast_prepare_expression there's
a line which uses std::get<> to get the raw type of the cast.
`std::get<shared_ptr<cql3_type::raw>>(c.type)`

This is a dangerous thing to do. It might turn out that the variant
holds a different alternative and then it'll start throwing bad_variant_access.

In this case this would happen if someone called cast_prepare_expression
on an expression that is already prepared.

It's possible to modify the code in a way that avoids doing the std::get
altogether.
It makes the code more resilient and gives me peace of mind.
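
One common way to avoid std::get<> on a variant is std::visit, sketched
below. This is not necessarily how the commit does it: `raw_type`,
`prepared_type`, `cast_type` and `type_name` are hypothetical stand-ins
for the alternatives held by c.type.

```cpp
#include <string>
#include <variant>

// Hypothetical stand-ins for the two alternatives a cast's type can hold:
// a raw (unprepared) type and an already prepared type.
struct raw_type { std::string name; };
struct prepared_type { std::string name; };
using cast_type = std::variant<raw_type, prepared_type>;

// Returns the type name whichever alternative is held. No std::get<> is
// used, so std::bad_variant_access cannot be thrown from here.
std::string type_name(const cast_type& t) {
    return std::visit([](const auto& alt) { return alt.name; }, t);
}
```

Code written this way keeps working if it is handed an expression that
is already prepared.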

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-03-09 18:31:45 +01:00
Jan Ciolek
7c384de476 prepare_expr: improve cast_prepare_expression
Preparing expr::cast had some artificial limitations.
Things like this worked:
`blob_col = (blob)funcReturnsInt()`
But this didn't:
`blob_col = (blob)(int)1234`

This is caused by the line:
`prepare_expression(c.arg, db, keyspace, schema_opt, receiver)`

Here the code prepares the expression to be cast using the original
receiver which was passed to cast_prepare_expression.

In the example above this meant that it tried to prepare
untyped_constant(1234) using a receiver with type blob.
This failed because an integer literal is invalid for a blob column.

To me it looks like a mistake. What it should do instead
is prepare the int literal using the type (int) and then
see if int can be cast to blob, by checking if these types
have compatible binary representation.

This can be achieved by using `cast_type_receiver` instead of `receiver`.
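
The difference can be shown with a toy model of the inner step of
`(blob)(int)1234`, where `(int)1234` is prepared against a blob receiver.
Everything here is a simplified assumption: types are plain strings, and
`prepare_int_literal`, `binary_compatible`, `prepare_cast_old` and
`prepare_cast_new` are hypothetical names, not the real functions.

```cpp
#include <stdexcept>
#include <string>

// Toy model, not ScyllaDB's API: a receiver is just a CQL type name.
struct receiver { std::string type; };

// Preparing an integer literal succeeds only for integer receivers.
std::string prepare_int_literal(const receiver& r) {
    if (r.type != "int" && r.type != "bigint") {
        throw std::invalid_argument("integer literal is invalid for type " + r.type);
    }
    return r.type;
}

// Toy rule: a type is compatible with itself, and int can be read as blob.
bool binary_compatible(const std::string& from, const std::string& to) {
    return from == to || (from == "int" && to == "blob");
}

// Old behaviour: the argument is prepared with the original receiver.
std::string prepare_cast_old(const std::string& /*cast_type*/, const receiver& recv) {
    return prepare_int_literal(recv);
}

// New behaviour: the argument is prepared with a receiver of the cast type,
// and only then is compatibility with the original receiver checked.
std::string prepare_cast_new(const std::string& cast_type, const receiver& recv) {
    const receiver cast_type_receiver{cast_type};
    const std::string arg_type = prepare_int_literal(cast_type_receiver);
    if (!binary_compatible(arg_type, recv.type)) {
        throw std::invalid_argument("cannot cast " + arg_type + " to " + recv.type);
    }
    return cast_type;
}
```

In this model `prepare_cast_old("int", receiver{"blob"})` throws because
the literal is prepared as a blob, while `prepare_cast_new` prepares it
as an int and then accepts the int to blob conversion.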

Making this small change makes it possible to use the cast
in many situations where it was previously impossible.
The tests have to be updated to reflect the change,
some of them now deviate from Cassandra, so they have
to be marked scylla_only.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-03-09 18:31:41 +01:00
Jan Ciolek
63a7235017 prepare_expr: improve readability in cast_prepare_expression
cast_prepare_expression takes care of preparing expr::cast,
which is responsible for CQL C-style casts.

At first glance it can be hard to figure out what exactly
it does, so I added some comments to make things clearer.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-03-08 03:24:17 +01:00