Commit Graph

51 Commits

Author SHA1 Message Date
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Avi Kivity
d768e9fac5 cql3, related: switch to data_dictionary
Stop using database (and including database.hh) for schema related
purposes and use data_dictionary instead.

data_dictionary::database::real_database() is called from several
places, for these reasons:

 - calling yet-to-be-converted code
 - callers with a legitimate need to access data (e.g. system_keyspace)
   but with the ::database accessor removed from query_processor.
   We'll need to find another way to supply system_keyspace with
   data access.
 - to gain access to the wasm engine for testing whether used
   defined functions compile. We'll have to find another way to
   do this as well.

The change is a straightforward replacement. One case in
modification_statement had to change a capture, but everything else
was just a search-and-replace.

Some files that lost "database.hh" gained "mutation.hh", which they
previously had access to through "database.hh".
2021-12-15 13:54:23 +02:00
Jan Ciolek
dcd3199037 cql3: Rename prepare_term to prepare_expression
prepare_term now takes an expression and returns a prepared expression.
It should be renamed to prepare_expression.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-11-04 15:56:45 +01:00
Jan Ciolek
219f1a4359 cql3: Make prepare_term return an expression instead of term
prepare_term is now the only function that uses terms.
Change it so that it returns expression instead of term
and remove all occurences of expr::to_expression(prepare_term(...))

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-11-04 15:56:45 +01:00
Jan Ciolek
805ba145d7 cql3: Remove term in column_condition
Replace all uses of term with expression in cql3/column_condition

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-10-28 20:55:09 +02:00
Avi Kivity
be44b579a1 cql3: expr: introduce as/as_if/is
Simple wrappers for std::get, std::get_if, std::holds_alternative.

The new names are shorter and IMO more readable.

Call sites are updated.

We will later replace the implementation.
2021-09-28 23:49:11 +03:00
Jan Ciolek
2523c9ba48 cql3: Replace most uses of terminal with expr::constant
constant is now ready to replace terminal as a final value representation.
Replace bind() with evaluate and shared_ptr<terminal> with constant.

We can't get rid of terminal yet. Sometimes terminal is converted back
to term, which constant can't do. This won't be a problem once we
replace term with expression.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-09-21 16:28:15 +02:00
Jan Ciolek
221ed38e94 cql3: Replace all uses of bind_and_get with evaluate_to_raw_view
Start using evaluate_to_raw_value instead of bind_and_get.
This is a step towards using only evaluate.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-09-21 16:20:30 +02:00
Avi Kivity
8c0f2f9e3d Revert "Merge 'cql3: Add expr::constant to replace terminal' from Jan Ciołek"
This reverts commit e9343fd382, reversing
changes made to 27138b215b. It causes a
regression in v2 serialization_format support:

collection_serialization_with_protocol_v2_test fails with: marshaling error: read_simple_bytes - not enough bytes (requested 1627390306, got 3)

Fixes #9360
2021-09-20 15:15:09 +03:00
Jan Ciolek
a0ec2113ae cql3: Replace most uses of terminal with expr::constant
constant is now ready to replace terminal as a final value representation.
Replace bind() with evaluate and shared_ptr<terminal> with constant.

We can't get rid of terminal yet. Sometimes terminal is converted back
to term, which constant can't do. This won't be a problem once we
replace term with expression.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-09-13 17:47:17 +02:00
Jan Ciolek
c3fb2f2b57 cql3: Replace all uses of bind_and_get with evaluate_to_raw_view
Start using evaluate_to_raw_value instead of bind_and_get.
This is a step towards using only evaluate.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2021-09-13 17:44:06 +02:00
Avi Kivity
158822c1a6 cql3: term::raw: remove term::raw and scaffolding
Nothing now uses term::raw, remove it and the scaffolding used to
migrate it to expressions.
2021-08-26 16:14:47 +03:00
Avi Kivity
793aca8e4e cql3: column_condition: convert term::raw to expressions
Change term::raw in column_condition::raw to expressions. Because a single
raw class is used to represent multiple shapes (IN ? and IN (x, y, z)),
some of the expressions are optional, corresponding to nullables before the
conversion.

to_term() is not converted, since it's part of the larger relation
hierarchy.
2021-08-26 15:34:13 +03:00
Avi Kivity
2c42a65db1 cql3: expr, constants: convert constants::literal to untyped_constant
Introduce a new expression untyped_constant that corresponds to
constants::literal, which is removed. untyped_constant is rather
ugly in that it won't exist post-prepare. We should probably instead
replace it with typed constants that use the widest possible type
(decimal and varint), and select a narrower type during the prepare
phase when we perform type inference. The conversion itseld is
straightforward.
2021-08-26 15:03:07 +03:00
Avi Kivity
0876248c2b Merge "cql3: cache function calls evaluation for non-deterministic functions" from Pavel S
"
`function_call` AST nodes are created for each function
with side effects in a CQL query, i.e. non-deterministic
functions (`uuid()`, `now()` and some others timeuuid-related).

These nodes are evaluated either when a query itself is executed
or query restrictions are computed (e.g. partition/clustering
key ranges for LWT requests).

We need to cache the calls since otherwise when handling a
`bounce_to_shard` request for an LWT query, we can possibly
enter an infinite bouncing loop (in case a function is used
to calculate partition key ranges for a query), since the
results can be different each time.

Furthermore, we don't support bouncing more than one time.
Returning `bounce_to_shard` message more than one time
will result in a crash.

Caching works only for LWT statements and only for the function
calls that affect partition key range computation for the query.

`variable_specifications` class is renamed to `prepare_context`
and generalized to record information about each `function_call`
AST node and modify them, as needed:
* Check whether a given function call is a part of partition key
  statement restriction.
* Assign ids for caching if above is true and the call is a part
  of an LWT statement.

There is no need to include any kind of statement identifier
in the cache key since `query_options` (which holds the cache)
is limited to a single statement, anyway.

Function calls are indexed by the order in which they appear
within a statement while parsing. There is no need to
include any kind of statement identifier to the cache key
since `query_options` (which holds the cache) is limited
to a single statement, anyway.

Note that `function_call::raw` AST nodes are not created
for selection clauses of a SELECT statement hence they
can only accept only one of the following things as parameters:
* Other function calls.
* Literal values.
* Parameter markers.

In other words, only parameters that can be immediately reduced
to a byte buffer are allowed and we don't need to handle
database inputs to non-pure functions separately since they
are not possible in this context. Anyhow, we don't even have
a single non-pure function that accepts arguments, so precautions
are not needed at the moment.

Add a test written in `cql-pytest` framework to verify
that both prepared and unprepared lwt statements handle
`bounce_to_shard` messages correctly in such scenario.

Fixes: #8604

Tests: unit(dev, debug)

NOTE: the patchset uses `query_options` as a container for
cached values. This doesn't look clean and `service::query_state`
seems to be a better place to store them. But it's not
forwarded to most of the CQL code and would mean that a huge number
of places would have to be amended.
The series presents a trade-off to avoid forwarding `query_state`
everywhere (but maybe it's the thing that needs to be done, nonetheless).
"

* 'lwt_bounce_to_shard_cached_fn_v6' of https://github.com/ManManson/scylla:
  cql-pytest: add a test for non-pure CQL functions
  cql3: cache function calls evaluation for non-deterministic functions
  cql3: rename `variable_specifications` to `prepare_context`
2021-07-30 14:21:11 +03:00
Avi Kivity
e52ebe2da5 types: convert abstract_type::compare and related to std::strong_ordering
Change comparators around types to std::strong_ordering.

Ref #1449.
2021-07-28 13:19:24 +03:00
Pavel Solodovnikov
49ddd269ea cql3: rename variable_specifications to prepare_context
The class is repurposed to be more generic and also be able
to hold additional metadata related to function calls within
a CQL statement. Rename all methods appropriately.

Visitor functions in AST nodes (`collect_marker_specification`)
are also renamed to a more generic `fill_prepare_context`.

The name `prepare_context` designates that this metadata
structure is a byproduct of `stmt::raw::prepare()` call and
is needed only for "prepare" step of query execution.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-07-24 14:33:33 +03:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Michał Chojnowski
4baeea0199 cql3: in multi_item_terminal, return the vector of items by value
Returning by reference requires that the elements are internally stored in in
the multi_item_terminal as a std::vector, but in the next patch we will change
the internal type of lists::value from std::vector to utils::chunked_vector.

The copy is not a problem because all users of multi_item_terminal were copying
the returned vector.
2021-05-17 16:46:28 +02:00
Michał Chojnowski
0bb959e890 cql3: don't linearize elements of lists, tuples, and user types
This patch switches the type used to store collection elements inside the
intermediate form used in lists::value, tuples::value etc. from bytes
to managed_bytes. After this patch, tuple and list elements are only linearized
in from_serialized, which will be corrected soon.
This commit introduces some additional copies in expression.cc, which
will be dealt with in a future commit.
2021-04-01 10:44:21 +02:00
Michał Chojnowski
b9322a6b71 cql3: switch users of cql3::raw_value_view to internals-independent API
We want to change the internals of cql3::raw_value{_view}.
However, users of cql3::raw_value and cql3::raw_value_view often
use them by extracting the internal representation, which will be different
after the planned change.

This commit prepares us for the change by making all accesses to the value
inside cql3::raw_value(_view) be done through helper methods which don't expose
the internal representation publicly.

After this commit we are free to change the internal representation of
raw_value_{view} without messing up their users.
2021-04-01 10:42:04 +02:00
Dejan Mircevski
df3ea2443b cql3: Drop all uses_function methods
No one seems to call them except for other uses_function methods.

Tests: unit (dev)

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2020-09-04 17:27:30 +02:00
Dejan Mircevski
d97605f4f8 cql3: Drop operator_type from the parser
Replace operator_type with the nicer-behaved oper_t in CQL parser and,
consequently, in the relation hierarchy and column_condition.

After this, no references to operator_type remain in live code.

Signed-off-by: Dejan Mircevski <dejan@scylladb.com>
2020-08-18 12:27:00 +02:00
Pavel Solodovnikov
b183530f2c cql3: use lw_shared_ptr instead of shared_ptr for column_condition
Both `cql3::column_condition` and `cql3::column_condition::raw`
classes are marked as `final`: it's safe to use lw_shared_ptr
instead of generic `seastar::shared_ptr`.

Tests: unit(dev, debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200428202249.82785-1-pa.solodovnikov@scylladb.com>
2020-05-06 13:11:07 +03:00
Pavel Solodovnikov
f6e765b70f cql3: pass column_specification via lw_shared_ptr
`column_specification` class is marked as "final": it's safe
to use non-polymorphic pointer "lw_shared_ptr" instead of a
more generic "shared_ptr".

tests: unit(dev, debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200427084016.26068-1-pa.solodovnikov@scylladb.com>
2020-04-27 12:47:42 +03:00
Rafael Ávila de Espíndola
caef2ef903 everywhere: Don't assume sstring::begin() and sstring::end() are pointers
If we switch to using std::string we have to handle begin and end
returning iterators.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-03-10 13:13:48 -07:00
Alejo Sanchez
c3b157a80b lwt: support LIKE operator in conditional expressions
Adds support of LIKE operator in conditional column expressions.

Refs: #5777

Signed-off-by: Alejo Sanchez <alejo.sanchez@scylladb.com>
2020-03-01 14:22:10 +01:00
Pavel Solodovnikov
49bf936403 cql3: change signatures of several functions to return crefs instead of pointers
The following functions now accept const reference to
column_specification instead of shared_ptr:
 * lists::index_spec_of
 * lists::value_spec_of
 * lists::uuid_index_spec_of
 * sets::value_spec_of

Changed maps::value_spec_of and maps::key_spec_of signatures
to accept const ref instead of non-const ref to
column_specification.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2020-02-16 17:23:56 +03:00
Pavel Solodovnikov
bf95bd0916 cql3: more functions marked as const
The following functions are now "const":
 * `term::collect_marker_specification`
 * `relation::to_term`
 * `multi_item_terminal::get_elements`
 * `raw_update::is_compatible_with`

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200213142445.35312-1-pa.solodovnikov@scylladb.com>
2020-02-16 11:22:30 +02:00
Pavel Solodovnikov
bcc4647552 lwt: fix handling of nulls in parameter markers for LWT queries
This patch affects the LWT queries with IF conditions of the
following form: `IF col in :value`, i.e. if the parameter
marker is used.

When executing a prepared query with a bound value
of `(None,)` (tuple with null, example for Python driver), it is
serialized not as NULL but as "empty" value (serialization
format differs in each case).

Therefore, Scylla deserializes the parameters in the request as
empty `data_value` instances, which are, in turn, translated
to non-empty `bytes_opt` with empty byte-string value later.

Account for this case too in the CAS condition evaluation code.

Example of a problem this patch aims to fix:

Suppose we have a table `tbl` with a boolean field `test` and
INSERT a row with NULL value for the `test` column.

Then the following update query fails to apply due to the
error in IF condition evaluation code (assume `v=(null)`):
`UPDATE tbl SET test=false WHERE key=0 IF test IN :v`
returns false in `[applied]` column, but is expected to succeed.

Tests: unit(debug, dev), dtest(prepared stmt LWT tests at https://github.com/scylladb/scylla-dtest/pull/1286)

Fixes: #5710

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200205102039.35851-1-pa.solodovnikov@scylladb.com>
2020-02-09 16:50:42 +02:00
Pavel Solodovnikov
f2feeb4b10 cql3: Propagate "const" to some virtual methods in cql hierarchy
Add "const" attributes to `assignment_testable::test_assignment`
and `term::raw::prepare` methods. These should have been marked as
"const" even before the change but for some reason were missing
these qualifiers.

Mark other supplementary methods with "const" attributes as
necessary.

Tests: unit(dev, debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200127213215.494000-1-pa.solodovnikov@scylladb.com>
2020-01-29 00:23:40 +02:00
Pavel Solodovnikov
e1b22b6a4c cql3: get rid of lw_shared_ptr for variable_specifications
`parsed_statement::get_bound_variables` is assumed to always
return a nonnull pointer to `variable_specifications` instance.

In this case using a pointer is superfluous and can be safely
replaced by a plain reference.

Also add a default ctor and a utility method `set_bound_variables`
to the `variable_specifications` class to actually reset the
contents of the class instance.

Tests: unit(dev, debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200120195839.164296-1-pa.solodovnikov@scylladb.com>
2020-01-22 12:51:02 +02:00
Pavel Solodovnikov
aba9a11ff0 cql: pass variable_specifications via lw_shared_ptr
Instances of `variable_specifications` are passed around as
shared_ptr's, which are redundant in this case since the class
is marked as `final`. Use `lw_shared_ptr` instead since we know
for sure it's not a polymorphic pointer.

Tests: unit(debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20191225232853.45395-1-pa.solodovnikov@scylladb.com>
2019-12-29 16:26:26 +02:00
Pavel Solodovnikov
55a1d46133 cql: some more missing const qualifiers
There are several virtual functions in public interfaces named "is_*"
that clearly should be marked as "const", so fix that.
2019-11-26 17:57:51 +03:00
Konstantin Osipov
e555dc502e lwt: implement basic lightweight transactions support
Support single-statement conditional updates and as well as batches.

This patch almost fully rewrites column_condition.cc, implementing
is_satisfied_by().

Most of the remaining complications in column_condition implementation
come from the need to properly handle frozen and multi-cell
collection in predicates - up until now it was not possible
to compare entire collection values between each other. This is further
complicated since multi-cell lists and sets are returned as maps.

We can no longer assume that the columns fetched by prefetch operation
are non-frozen collections. IF EXISTS/IF NOT EXISTS condition
fetches all columns, besides, a column may be needed to check other
condition.

When fetching the old row for LWT or to apply updates on list/columns,
we now calculate precisely the list of columns to fetch.

The primary key columns are also included in CAS batch result set,
and are thus also prefetched (the user needs them to figure out which
statements failed to apply).

The patch is cross-checked for compatibility with cassandra-3.11.4-1545-g86812fa502
but does deviate from the origin in handling of conditions on static
row cells. This is addressed in future series.
2019-10-27 23:42:49 +03:00
Konstantin Osipov
67e68dabf0 lwt: ensure we don't crash when we get a LIKE 2019-10-27 23:42:49 +03:00
Konstantin Osipov
f8f36d066c lwt: check for unsupported collection type in condition element access
We don't support conditions with element access on non-frozen UDTs,
check that only supported collection types are supplied.
2019-10-27 23:42:49 +03:00
Konstantin Osipov
c9f0adf616 lwt: rewrite cql3::raw::column_condition::prepare()
Restructure the code to avoid quite a bit of code duplication.
2019-10-27 23:42:47 +03:00
Konstantin Osipov
22b0240fe7 lwt: remove useless code in column_condition.hh
Each column_condition and raw::column_condition construction case had a
static method wrapping its constructor, simply supplying some defaults.

This neither improves clarity nor maintainability.
2019-10-27 23:42:03 +03:00
Kamil Braun
b904d04925 cql3: add a TODO to implement column_conditions for UDTs.
This will become relevant after LWT is implemented.
2019-10-25 12:04:44 +02:00
Rafael Ávila de Espíndola
de6d6c46a1 types: Remove collection_type_impl::kind
All uses have been switched to abstract_type::kind.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2019-08-14 10:02:00 -07:00
Avi Kivity
cb7ee5c765 cql3: convert sprint() to format()
sprint() recently became more strict, throwing on sprint("%s", 5). Replace
with the more modern format().

Mechanically converted with https://github.com/avikivity/unsprint.
2018-11-01 13:16:17 +00:00
Jesse Haber-Kucharsky
509626fe08 Support duration CQL native type
`duration` is a new native type that was introduced in Cassandra 3.10 [1].

Support for parsing and the internal representation of the type was added in
8fa47b74e8.

Important note: The version of cqlsh distributed with Scylla does not have
support for durations included (it was added to Cassandra in [2]). To test this
change, you can use cqlsh distributed with Cassandra.

Duration types are useful when working with time-series tables, because they can
be used to manipulate date-time values in relative terms.

Two interesting applications are:

- Aggregation by time intervals [3]:

`SELECT * FROM my_table GROUP BY floor(time, 3h)`

- Querying on changes in date-times:

`SELECT ... WHERE last_heartbeat_time < now() - 3h`

(Note: neither of these is currently supported, though columns with duration
values are.)

Internally, durations are represented as three signed counters: one for months,
for days, and for nanoseconds. Each of these counters is serialized using a
variable-length encoding which is described in version 5 of the CQL native
protocol specification.

The representation of a duration as three counters means that a semantic
ordering on durations doesn't exist: Is `1mo` greater than `1mo1d`? We cannot
know, because some months have more days than others. Durations can only have a
concrete absolute value when they are "attached" to absolute date-time
references. For example, `2015-04-31 at 12:00:00 + 1mo`.

That duration values are not comparable presents some difficulties for the
implementation, because most CQL types are. Like in Cassandra's implementation
[2], I adopted a similar strategy to the way restrictions on the `counter` type
are checked. A type "references" a duration if it is either a duration or it
contains a duration (like a `tuple<..., duration, ...>`, or a UDT with a
duration member).

The following restrictions apply on durations. Note that some of these contexts
are either experimental features (materialized views), or not currently
supported at run-time (though support exists in the parser and code, so it is
prudent to add the restrictions now):

- Durations cannot appear in any part of a primary key, either for tables or
  materialized views.

- Durations cannot be directly used as the element type of a `set`, nor can they
  be used as the key type of a `map`. Because internal ordering on durations is
  based on a byte-level comparison, this property of Cassandra was intended to
  help avoid user confusion around ordering of collection elements.

- Secondary indexes on durations are not supported.

- "Slice" relations (<=, <, >=, >) are not supported on durations with `WHERE`
   restrictions (like `SELECT ... WHERE span <= 3d`). Multi-column restrictions
   only work with clustering columns, which cannot be `duration` due to the
   first rule.

- "Slice" relations are not supported on durations with query conditions (like
  `UPDATE my_table ... IF span > 5us`).

Backwards incompatibility note:

As described in the documentation [4], duration literals take one of two
forms: either ISO 8601 formats (there are three), or a "standard" format. The ISO
8601 formats start with "P" (like "P5W"). Therefore, identifiers that have this
form are no longer supported.

Fixes #2240.

[1] https://issues.apache.org/jira/browse/CASSANDRA-11873

[2] bfd57d13b7

[3] https://issues.apache.org/jira/browse/CASSANDRA-11871

[4] http://cassandra.apache.org/doc/latest/cql/types.html#working-with-durations
2017-08-10 15:01:10 -04:00
Pekka Enberg
38a54df863 Fix pre-ScyllaDB copyright statements
People keep tripping over the old copyrights and copy-pasting them to
new files. Search and replace "Cloudius Systems" with "ScyllaDB".

Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>
2016-04-08 08:12:47 +03:00
Avi Kivity
d5cf0fb2b1 Add license notices 2015-09-20 10:43:39 +03:00
Avi Kivity
6290dee438 db: const correctness for abstract_type and friends
Types are immutable.
2015-04-29 15:40:38 +03:00
Avi Kivity
f1219c1d74 cql3: enable collections in column_condition 2015-04-27 14:43:43 +03:00
Avi Kivity
3d38708434 cql3: pass a database& instance to most foo::raw::prepare() variants
To prepare a user-defined type, we need to look up its name in the keyspace.
While we get the keyspace name as an argument to prepare(), it is useless
without the database instance.

Fix the problem by passing a database reference along with the keyspace.
This precolates through the class structure, so most cql3 raw types end up
receiving this treatment.

Origin gets along without it by using a singleton.  We can't do this due
to sharding (we could use a thread-local instance, but that's ugly too).

Hopefully the transition to a visitor will clean this up.
2015-04-20 16:15:34 +03:00
Tomasz Grabiec
e3422525c0 Use column_definition via const reference 2015-03-24 12:03:00 +01:00
Tomasz Grabiec
609e893055 unimplemented: Separate subject from behavior
You can now do:

  fail(unimplemented::cause::PAGING);

and:

  warn(unimplemented::cause::PAGING);
2015-02-27 10:48:56 +01:00