Commit Graph

64 Commits

Author SHA1 Message Date
Avi Kivity
0b418fa7cf cql3, transport, tests: remove "unset" from value type system
The CQL binary protocol introduced "unset" values in version 4
of the protocol. Unset values can be bound to variables, which
cause certain CQL fragments to be skipped. For example, the
fragment `SET a = :var` will not change the value of `a` if `:var`
is bound to an unset value.

Unsets, however, are very limited in where they can appear. They
can only appear at the top-level of an expression, and any computation
done with them is invalid. For example, `SET list_column = [3, :var]`
is invalid if `:var` is bound to unset.

This causes the code to be littered with checks for unset, and there
are plenty of tests dedicated to catching unsets. However, a simpler
way is possible - prevent the infiltration of unsets at the point of
entry (when evaluating a bind variable expression), and introduce
guards to check for the few cases where unsets are allowed.

This is what this long patch does. It performs the following:

(general)

1. unset is removed from the possible values of cql3::raw_value and
   cql3::raw_value_view.

(external->cql3)

2. query_options is fortified with a vector of booleans,
   unset_bind_variable_vector, where each boolean corresponds to a bind
   variable index and is true when it is unset.
3. To avoid churn, two compatiblity structs are introduced:
   cql3::raw_value{,_view}_vector_with_unset, which can be constructed
   from a std::vector<raw_value{,_view/}>, which is what most callers
   have. They can also be constructed with explicit unset vectors, for
   the few cases they are needed.

(cql3->variables)

4. query_options::get_value_at() now throws if the requested bind variable
   is unset. This replaces all the throwing checks in expression evaluation
   and statement execution, which are removed.
5. A new query_options::is_unset() is added for the users that can tolerate
   unset; though it is not used directly.
6. A new cql3::unset_operation_guard class guards against unsets. It accepts
   an expression, and can be queried whether an unset is present. Two
   conditions are checked: the expression must be a singleton bind
   variable, and at runtime it must be bound to an unset value.
7. The modification_statement operations are split into two, via two
   new subclasses of cql3::operation. cql3::operation_no_unset_support
   ignores unsets completely. cql3::operation_skip_if_unset checks if
   an operand is unset (luckily all operations have at most one operand that
   tolerates unset) and applies unset_operation_guard to it.
8. The various sites that accept expressions or operations are modified
   to check for should_skip_operation(). This are the loops around
   operations in update_statement and delete_statement, and the checks
   for unset in attributes (LIMIT and PER PARTITION LIMIT)

(tests)

9. Many unset tests are removed. It's now impossible to enter an
   unset value into the expression evaluation machinery (there's
   just no unset value), so it's impossible to test for it.
10. Other unset tests now have to be invoked via bind variables,
   since there's no way to create an unset cql3::expr::constant.
11. Many tests have their exception message match strings relaxed.
   Since unsets are now checked very early, we don't know the context
   where they happen. It would be possible to reintroduce it (by adding
   a format string parameter to cql3::unset_operation_guard), but it
   seems not to be worth the effort. Usage of unsets is rare, and it is
   explicit (at least with the Python driver, an unset cannot be
   introduced by ommission).

I tried as an alternative to wrap cql3::raw_value{,_view} (that doesn't
recognize unsets) with cql3::maybe_unset_value (that does), but that
caused huge amounts of churn, so I abandoned that in favor of the
current approach.

Closes #12517
2023-01-16 21:10:56 +02:00
Avi Kivity
de0c31b3b6 cql3: query_options: simplify batch query_options constructor
The batch constructor uses an unnecessarily complicated template,
where in fact it only vector<vector<raw_value | raw_value_view>>.

Simplify the constructor to allow exactly that. Delete some confusing
comments around it.

Closes #12488
2023-01-11 07:54:54 +02:00
Avi Kivity
2739ac66ed treewide: drop cql_serialization_format
Now that we don't accept cql protocol version 1 or 2, we can
drop cql_serialization format everywhere, except when in the IDL
(since it's part of the inter-node protocol).

A few functions had duplicate versions, one with and one without
a cql_serialization_format parameter. They are deduplicated.

Care is taken that `partition_slice`, which communicates
the cql_serialization_format across nodes, still presents
a valid cql_serialization_format to other nodes when
transmitting itself and rejects protocol 1 and 2 serialization\
format when receiving. The IDL is unchanged.

One test checking the 16-bit serialization format is removed.
2023-01-03 19:54:13 +02:00
Avi Kivity
654b96660a cql: modification_statement: drop protocol check for LWT
CQL protocol 1 did not support LWT, but since we don't support it
any more, we can drop the check and the supporting get_protocol_version()
helper.
2023-01-03 19:51:57 +02:00
Avi Kivity
5937b1fa23 treewide: remove empty comments in top-of-files
After fcb8d040 ("treewide: use Software Package Data Exchange
(SPDX) license identifiers"), many dual-licensed files were
left with empty comments on top. Remove them to avoid visual
noise.

Closes #10562
2022-05-13 07:11:58 +02:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Pavel Solodovnikov
3b6adf3a62 cql3: cache function calls evaluation for non-deterministic functions
And reuse these values when handling `bounce_to_shard` messages.

Otherwise such a function (e.g. `uuid()`) can yield a different
value when a statement re-executed on the other shard.

It can lead to an infinite number of `bounce_to_shard` messages
sent in case the function value is used to calculate partition
key ranges for the query. Which, in turn, will cause crashes
since we don't support bouncing more than one time and the second
hop will result in a crash.

Caching works only for LWT statements and only for the function
calls that affect partition key range computation for the query.

`variable_specifications` class is renamed to `prepare_context`
and generalized to record information about each `function_call`
AST node and modify them, as needed:
* Check whether a given function call is a part of partition key
  statement restriction.
* Assign ids for caching if above is true and the call is a part
  of an LWT statement.

There is no need to include any kind of statement identifier
in the cache key since `query_options` (which holds the cache)
is limited to a single statement, anyway.

Note that `function_call::raw` AST nodes are not created
for selection clauses of a SELECT statement hence they
can only accept only one of the following things as parameters:
* Other function calls.
* Literal values.
* Parameter markers.

In other words, only parameters that can be immediately reduced
to a byte buffer are allowed and we don't need to handle
database inputs to non-pure functions separately since they
are not possible in this context. Anyhow, we don't even have
a single non-pure function that accepts arguments, so precautions
are not needed at the moment.

Tests: unit(dev, debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-07-30 01:22:39 +03:00
Pavel Solodovnikov
76bea23174 treewide: reduce header interdependencies
Use forward declarations wherever possible.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>

Closes #8813
2021-06-07 15:58:35 +03:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Piotr Sarna
c5214eb096 treewide: remove timeout config from query options
Timeout config is now stored in each connection, so there's no point
in tracking it inside each query as well. This patch removes
timeout_config from query_options and follows by removing now
unnecessary parameters of many functions and constructors.
2021-02-25 17:20:27 +01:00
Konstantin Osipov
232ce6f611 lists: rewrite list prepend to use append machinery
Rewrite list prepend to use the same machinery
as append, and thus produce correct results when used in LWT.

After this patch, list prepend begins to honor user supplied timestamps.

If a user supplied timestamp for prepend is less than 2010-01-01 00:00:00
an exception is thrown.

Fixes #7611
2021-01-21 13:03:59 +03:00
Konstantin Osipov
2b8ce83eea lists: use query timestamp for list cell values during append
Scylla list cells are represented internally as a map of
timeuuid => value. To append a new value to a list
the coordinator generates a timeuuid reflecting the current time as key
and adds a value to the map using this key.

Before this patch, Scylla always generated a timeuuid for a new
value, even if the query had a user supplied or LWT timestamp.
This could break LWT linearizability. User supplied timestamps were
ignored.

This is reported as https://github.com/scylladb/scylla/issues/7611

A statement which appended multiple values to a list or a BATCH
generated an own microsecond-resolution timeuuid for each value:

BEGIN BATCH
  UPDATE ... SET a = a + [3]
  UPDATE ... SET a = a + [4]
APPLY BATCH

UPDATE ... SET a = a + [3, 4]

To fix the bug, it's necessary to preserve monotonicity of
timeuuids within a batch or multi-value append, but make sure
they all use the microsecond time, as is set by LWT or user.

To explain the fix, it's first necessary to recall the structure
of time-based UUIDs:

60 bits: time since start of GMT epoch, year 1582, represented
         in 100-nanosecond units
4 bits:  version
14 bits: clock sequence, a random number to avoid duplicates
         in case system clock is adjusted
2 bits:  type
48 bits: MAC address (or other hardware address)

The purpose of clockseq bits is as defined in
https://tools.ietf.org/html/rfc4122#section-4.1.5
is to reduce the probability of UUID collision in case clock
goes back in time or node id changes. The implementation should reset it
whenever one of these events may occur.

Since LWT microsecond time is guaranteed to be
unique by Paxos, the RFC provisioning for clockseq and MAC
slots becomes excessive.

The fix thus changes timeuuid slot content in the following way:
- time component now contains the same microsecond time for all
  values of a statement or a batch. The time is unique and monotonic in
  case of LWT. Otherwise it's most always monotonic, but may not be
  unique if two timestamps are created on different coordinators.
- clockseq component is used to store a sequence number which is
  unique and monotonic for all values within the statement/batch.
- to protect against time back-adjustments and duplicates
  if time is auto-generated, MAC component contains a random (spoof)
  MAC address, re-created on each restart. The address is different
  at each shard.

The change is made for all sources of time: user, generated, LWT.
Conditioning the list key generation algorithm on the source of
time would unnecessarily complicate the code while not increase
quality (uniqueness) of created list keys.

Since 14 bits of clockseq provide us with only 16383 distinct slots
per statement or batch, 3 extra bits in nanosecond part of the time
are used to extend the range to 131071 values per statement/batch.
If the rang is exceeded beyond the limit, an exception is produced.

A twist on the use of clockseq to extend timeuuid uniqueness is
that Scylla, like Cassandra, uses int8 compare to compare lower
bits of timeuuid for ordering. The patch takes this into account
and sign-complements the clockseq value to make it monotonic
according to the legacy compare function.

Fixes #7611

test: unit (dev)
2021-01-21 13:03:59 +03:00
Piotr Sarna
7055297649 cql3: remove query_options::linearize and _temporaries
query_options::linearize was the only user of _temporaries helper
attribute, and it turns out that this function is never used -
- and is therefore removed.
2020-08-26 09:45:49 +02:00
Piotr Sarna
c0a7eda2a8 cql3: remove make_temporary helper function
Since temporary values will no longer be stored inside query options,
the helper function is removed altogether.
2020-08-26 09:45:49 +02:00
Avi Kivity
a4c44cab88 treewide: update concepts language from the Concepts TS to C++20
Seastar recently lost support for the experimental Concepts Technical
Specification (TS) and gained support for C++20 concepts. Re-enable
concepts in Scylla by updating our use of concepts to the C++20
standard.

This change:
 - peels off uses of the GCC6_CONCEPT macro
 - removes inclusions of <seastar/gcc6-concepts.hh>
 - replaces function-style concepts (no longer supported) with
   equation-style concepts
 - semicolons added and removed as needed
 - deprecated std::is_pod replaced by recommended replacement
 - updates return type constraints to use concepts instead of
   type names (either std::same_as or std::convertible_to, with
   std::same_as chosen when possible)

No attempt is made to improve the concepts; this is a specification
update only.
Message-Id: <20200531110254.2555854-1-avi@scylladb.com>
2020-06-02 09:12:21 +03:00
Pavel Solodovnikov
f6e765b70f cql3: pass column_specification via lw_shared_ptr
`column_specification` class is marked as "final": it's safe
to use non-polymorphic pointer "lw_shared_ptr" instead of a
more generic "shared_ptr".

tests: unit(dev, debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200427084016.26068-1-pa.solodovnikov@scylladb.com>
2020-04-27 12:47:42 +03:00
Pavel Solodovnikov
d64fd52ae5 paging_state: switch from shared_ptr to lw_shared_ptr
Change the way `service::pager::paging_state` is passed around
from `shared_ptr` to `lw_shared_ptr`. It's safe since
`paging_state` is final.

Tests: unit(dev, debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2020-02-16 17:23:36 +03:00
Gleb Natapov
4893bc9139 tracing: split adding prepared query parameters from stopping of a trace
Currently query_options objects is passed to a trace stopping function
which makes it mandatory to make them alive until the end of the
query. The reason for that is to add prepared statement parameters to
the trace.  All other query options that we want to put in the trace are
copied into trace_state::params_values, so lets copy prepared statement
parameters there too. Trace enabled case will become a little bit more
expensive but on the other hand we can drop a continuation that holds
query_options object alive from a fast path. It is safe to drop the call
to stop_foreground_prepared() here since The tracing will be stopped
in process_request_one().

Message-Id: <20191205102026.GJ9084@scylladb.com>
2019-12-05 17:00:47 +02:00
Konstantin Osipov
383e17162a lwt: implement query_options::check_serial_consistency()
Both in a single-statement transaction and in a batch
we expect that serial consistency is provided. Move the
check to query_options class and make it available for
reuse.

Keep get_serial_consistency() around for use in
transport/server.cc.
Message-Id: <20191006154532.54856-2-kostja@scylladb.com>
2019-10-08 00:02:35 +02:00
Avi Kivity
3a44fa9988 cql3, treewide: introduce empty cql3::cql_config class and propagate it
We need a way to configure the cql interpreter and runtime. So far we relied
on accessing the configuration class via various backdoors, but that causes
its own problems around initialization order and testability. To avoid that,
this patch adds an empty cql_config class and propagates it from main.cc
(and from tests) to the cql interpreter via the query_options class, which is
already passed everywhere.

Later patches will fill it with contents.
2019-08-21 19:35:59 +02:00
Piotr Sarna
97d476b90f cql3: add a query options constructor with explicit page size
For internal use, there already exists a query_options constructor
that copies data from another query_options with overwritten paging
state. This commit adds an option to overwrite page size as well.
2019-06-24 13:21:32 +02:00
Piotr Sarna
fa89e220ef cql3: enable explicit copying of query_options 2019-06-24 12:57:04 +02:00
Duarte Nunes
fa2b0384d2 Replace std::experimental types with C++17 std version.
Replace stdx::optional and stdx::string_view with the C++ std
counterparts.

Some instances of boost::variant were also replaced with std::variant,
namely those that called seastar::visit.

Scylla now requires GCC 8 to compile.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20190108111141.5369-1-duarte@scylladb.com>
2019-01-08 13:16:36 +02:00
Paweł Dziepak
15775c958a cql3: query_options: make simple getter inlineable 2018-07-26 12:37:06 +01:00
Paweł Dziepak
8f4cb36ef2 cql3 query_options: add linearize()
Some code in the CQL3 layer requires bytes_view and it is fairly
reasonable to assume that it won't deal with large buffers (e.g.
statement restrictions). query_options already has make_temporary()
which takes ownership of a cql3::raw_value so that the rest of the code
can use cql3::raw_value_view. This patch adds similar linearize()
function which, if necessary, linearises a cql3::raw_value_view and
returns a bytes_view with lifetime tied to the life or query_options.
2018-07-18 12:28:06 +01:00
Paweł Dziepak
3810045f8f cql3: query_options: use bytes_ostream for temporaries
bytes_ostream is going to be more efficient than
std::vector<std::vector<char>> since it can put multiple small values in
a single buffer thus reducing the number of memory allocations.
2018-07-18 12:28:06 +01:00
Vlad Zolotarov
a469567605 cql3::query_options: add get_names() method
This method returns names of named prepared statement parameters.

Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2018-06-12 10:57:05 -04:00
Avi Kivity
7b5db486a0 query_options: augment with timeout_config
Add a timeout_config member to query_options. This lets the query
processor know what timeouts the user of this query want to apply.
2018-04-30 13:19:53 +03:00
Tomasz Grabiec
52c61df930 Relax includes
To avoid unnecessary recompilations.
Message-Id: <1522168295-994-1-git-send-email-tgrabiec@scylladb.com>
2018-03-28 10:49:07 +03:00
Amnon Heiman
45b3e8cd11 query_options: Allows creating query_options from query_options
query_options object cannot be changed after it was created. For
internal uses, like internal query paging, it is needed to create a new
object based on some of the data from an existing one with a new paging
state.

This patch adds a constructor from a unique_ptr and paging state.

using unique_ptr behave similar to move modify constructor.

Signed-off-by: Amnon Heiman <amnon@scylladb.com>
2017-07-13 14:02:11 +03:00
Vlad Zolotarov
fcef9d3b05 cql3::query_options: add a factory method for creation of options for a BATCH statement
Signed-off-by: Vlad Zolotarov <vladz@scylladb.com>
2017-04-12 12:24:08 -04:00
Pekka Enberg
be0351b49c cql3: Introduce raw_value and raw_value_view types
Currently, the code is using bytes_opt and bytes_view_opt to represent
CQL values, which can hold a value or null. In preparation for
supporting a third state, unset value introduced in CQL v4, introduce
new raw_value and raw_value_view types and use them instead.

The new types are based on boost::variant<> and are capable of holding
null, unset values, and blobs that represent a value.
2017-01-26 13:50:04 +02:00
Pekka Enberg
f92bbc6f44 cql3: Kill unimplemented query_options constructor
The constructor was added in commit 7f3ce39 ("query_options: Add
constructor for batch mode options (multi-level)") but apparently it was
never actually implemented.

Spotted by CLion.
Message-Id: <1474303017-23383-1-git-send-email-penberg@scylladb.com>
2016-09-20 10:01:10 +01:00
Duarte Nunes
2683a49c69 query_options: Remove value_views arg from ctor
Having both the values and value_views arguments in the query_options
ctor is confusing, since query_options uses only the value_views field
but that is not communicated to the caller.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-06-27 15:24:27 +02:00
Pekka Enberg
38a54df863 Fix pre-ScyllaDB copyright statements
People keep tripping over the old copyrights and copy-pasting them to
new files. Search and replace "Cloudius Systems" with "ScyllaDB".

Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>
2016-04-08 08:12:47 +03:00
Tomasz Grabiec
6709c0ac15 cql_serialization_format: Make it CQL protocol version aware
We want to serialize it as a single number, the CQL binary protocol
version to which it corresponds, so it needs to be aware of the
version number.
2016-02-15 17:05:55 +01:00
Tomasz Grabiec
9d11968ad8 Rename serialization_format to cql_serialization_format 2016-02-15 16:53:56 +01:00
Calle Wilund
32e480025f cql3::query_options: Add constructors for internal processing 2016-01-13 08:49:01 +00:00
Avi Kivity
d5cf0fb2b1 Add license notices 2015-09-20 10:43:39 +03:00
Calle Wilund
7f3ce3935e query_options: Add constructor for batch mode options (multi-level)
Added explicit move constructors as well as prohibit copy to help
disambiguate the constructor delegation
2015-09-15 11:20:13 +02:00
Pekka Enberg
10c6eee221 transport/server: Use sstring_view for query option names
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-24 09:06:13 +03:00
Pekka Enberg
23e9bf7162 cql3/query_options: make_temporary() helper
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-24 09:06:13 +03:00
Pekka Enberg
6dee204db2 cql3/query_options: Store values as bytes view
Store values as bytes view when possible. This improves the CQL protocol
option parsing path by avoiding allocating memory and copying individual
values as "bytes" objects.

Please note that we retain the non-view version for internal queries
where performance is not as important.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-24 09:06:13 +03:00
Pekka Enberg
ed92f8516c cql3/query_options: Fix formatting
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-17 09:48:59 +03:00
Pekka Enberg
b165d22443 cql3/query_options: Move implementation to source file
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-17 09:48:59 +03:00
Pekka Enberg
401c3668a4 cql3/query_options: Remove ifdef'd code
Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-17 09:48:58 +03:00
Pekka Enberg
5b9901d693 cql3/query_options: Encapsulate underlying values
Encapsulate the '_values' vector to make it easier to switch the
underlying type from bytes_opt to bytes_view_opt.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-08-17 09:48:58 +03:00
Calle Wilund
53c3067fc4 cql3::query_options - add convinience constructor for internals 2015-07-06 08:21:16 +02:00
Pekka Enberg
d50139351f cql3: Use pragma once everywhere
There's no benefit to using C include guards so switch to pragma once
everywhere for consistency.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-05-12 16:32:56 +03:00
Avi Kivity
f1fe44a407 cql3: add batch support to query_options
Transport support yet to be wired up.
2015-04-13 14:55:07 +03:00