Commit Graph

32 Commits

Author SHA1 Message Date
Benny Halevy
3feb759943 everywhere: use utils::chunked_vector for list of mutations
Currently, we use std::vector<*mutation> to keep
a list of mutations for processing.
This can lead to large allocation, e.g. when the vector
size is a function of the number of tables.

Use a chunked vector instead to prevent oversized allocations.

`perf-simple-query --smp 1` results obtained for fixed 400MHz frequency
and PGO disabled:

Before (read path):
```
enable-cache=1
Running test with config: {partitions=10000, concurrency=100, mode=read, query_single_key=no, counters=no}
Disabling auto compaction
Creating 10000 partitions...

89055.97 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.2 tasks/op,   39417 insns/op,   18003 cycles/op,        0 errors)
103372.72 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.2 tasks/op,   39380 insns/op,   17300 cycles/op,        0 errors)
98942.27 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.2 tasks/op,   39413 insns/op,   17336 cycles/op,        0 errors)
103752.93 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.2 tasks/op,   39407 insns/op,   17252 cycles/op,        0 errors)
102516.77 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.2 tasks/op,   39403 insns/op,   17288 cycles/op,        0 errors)
throughput:
	mean=   99528.13 standard-deviation=6155.71
	median= 102516.77 median-absolute-deviation=3844.59
	maximum=103752.93 minimum=89055.97
instructions_per_op:
	mean=   39403.99 standard-deviation=14.25
	median= 39406.75 median-absolute-deviation=9.30
	maximum=39416.63 minimum=39380.39
cpu_cycles_per_op:
	mean=   17435.81 standard-deviation=318.24
	median= 17300.40 median-absolute-deviation=147.59
	maximum=18002.53 minimum=17251.75
```

After (read path)
```
enable-cache=1
Running test with config: {partitions=10000, concurrency=100, mode=read, query_single_key=no, counters=no}
Disabling auto compaction
Creating 10000 partitions...
59755.04 tps ( 66.2 allocs/op,   0.0 logallocs/op,  14.2 tasks/op,   39466 insns/op,   22834 cycles/op,        0 errors)
71854.16 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.2 tasks/op,   39417 insns/op,   17883 cycles/op,        0 errors)
82149.45 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.2 tasks/op,   39411 insns/op,   17409 cycles/op,        0 errors)
49640.04 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.3 tasks/op,   39474 insns/op,   19975 cycles/op,        0 errors)
54963.22 tps ( 66.1 allocs/op,   0.0 logallocs/op,  14.3 tasks/op,   39474 insns/op,   18235 cycles/op,        0 errors)
throughput:
	mean=   63672.38 standard-deviation=13195.12
	median= 59755.04 median-absolute-deviation=8709.16
	maximum=82149.45 minimum=49640.04
instructions_per_op:
	mean=   39448.38 standard-deviation=31.60
	median= 39466.17 median-absolute-deviation=25.75
	maximum=39474.12 minimum=39411.42
cpu_cycles_per_op:
	mean=   19267.01 standard-deviation=2217.03
	median= 18234.80 median-absolute-deviation=1384.25
	maximum=22834.26 minimum=17408.67
```

`perf-simple-query --smp 1 --write` results obtained for fixed 400MHz frequency
and PGO disabled:

Before (write path):
```
enable-cache=1
Running test with config: {partitions=10000, concurrency=100, mode=write, query_single_key=no, counters=no}
Disabling auto compaction
63736.96 tps ( 59.4 allocs/op,  16.4 logallocs/op,  14.3 tasks/op,   49667 insns/op,   19924 cycles/op,        0 errors)
64109.41 tps ( 59.3 allocs/op,  16.0 logallocs/op,  14.3 tasks/op,   49992 insns/op,   20084 cycles/op,        0 errors)
56950.47 tps ( 59.3 allocs/op,  16.0 logallocs/op,  14.3 tasks/op,   50005 insns/op,   20501 cycles/op,        0 errors)
44858.42 tps ( 59.3 allocs/op,  16.0 logallocs/op,  14.3 tasks/op,   50014 insns/op,   21947 cycles/op,        0 errors)
28592.87 tps ( 59.3 allocs/op,  16.0 logallocs/op,  14.3 tasks/op,   50027 insns/op,   27659 cycles/op,        0 errors)
throughput:
	mean=   51649.63 standard-deviation=15059.74
	median= 56950.47 median-absolute-deviation=12087.33
	maximum=64109.41 minimum=28592.87
instructions_per_op:
	mean=   49941.18 standard-deviation=153.76
	median= 50005.24 median-absolute-deviation=73.01
	maximum=50027.07 minimum=49667.05
cpu_cycles_per_op:
	mean=   22023.01 standard-deviation=3249.92
	median= 20500.74 median-absolute-deviation=1938.76
	maximum=27658.75 minimum=19924.32
```

After (write path)
```
enable-cache=1
Running test with config: {partitions=10000, concurrency=100, mode=write, query_single_key=no, counters=no}
Disabling auto compaction
53395.93 tps ( 59.4 allocs/op,  16.5 logallocs/op,  14.3 tasks/op,   50326 insns/op,   21252 cycles/op,        0 errors)
46527.83 tps ( 59.3 allocs/op,  16.0 logallocs/op,  14.3 tasks/op,   50704 insns/op,   21555 cycles/op,        0 errors)
55846.30 tps ( 59.3 allocs/op,  16.0 logallocs/op,  14.3 tasks/op,   50731 insns/op,   21060 cycles/op,        0 errors)
55669.30 tps ( 59.3 allocs/op,  16.0 logallocs/op,  14.3 tasks/op,   50735 insns/op,   21521 cycles/op,        0 errors)
52130.17 tps ( 59.3 allocs/op,  16.0 logallocs/op,  14.3 tasks/op,   50757 insns/op,   21334 cycles/op,        0 errors)
throughput:
	mean=   52713.91 standard-deviation=3795.38
	median= 53395.93 median-absolute-deviation=2955.40
	maximum=55846.30 minimum=46527.83
instructions_per_op:
	mean=   50650.57 standard-deviation=182.46
	median= 50731.38 median-absolute-deviation=84.09
	maximum=50756.62 minimum=50325.87
cpu_cycles_per_op:
	mean=   21344.42 standard-deviation=202.86
	median= 21334.00 median-absolute-deviation=176.37
	maximum=21554.61 minimum=21060.24
```

Fixes #24815

Improvement for rare corner cases. No backport required

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>

Closes scylladb/scylladb#24919
2025-07-13 19:13:11 +03:00
Andrzej Jackowski
b4f0a5149a db: cql3: add comments regarding unsafe interval<clustering_key_prefix>
class clustering_range is a range of Clustering Key Prefixes implemented
as interval<clustering_key_prefix>. However, due to the nature of
Clustering Key Prefix, the ordering of clustering_range is complex and
does not satisfy the invariant of interval<>. To be more specific, as a
comment in interval<> implementation states: “The end bound can never be
smaller than the start bound”. As a range of CKP violates the invariant,
some algorithms, like intersection(), can return incorrect results.
For more details refer to scylladb#8157, scylladb#21604, scylladb#22817.

This commit:
 - Add a WARNING comment to discourage usage of clustering_range
 - Add WARNING comments to potentially incorrect uses of
   interval<clustering_key_prefix> non-trivial methods
 - Add a FIXME comment to incorrect use of
   interval<clustering_key_prefix_view>::deoverlap and WARNING comments
   to related interval<clustering_key_prefix_view> misuse.

Closes scylladb/scylladb#22913
2025-02-26 12:01:28 +01:00
Avi Kivity
f3eade2f62 treewide: relicense to ScyllaDB-Source-Available-1.0
Drop the AGPL license in favor of a source-available license.
See the blog post [1] for details.

[1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/
2024-12-18 17:45:13 +02:00
Jan Ciolek
ccdb26bf9e statements/cas_request: fix crash on empty clustering range in LWT
LWT queries with empty clustering range used to cause a crash.
For example in:
```cql
UPDATE tab SET r = 9000 WHERE p = 1  AND c = 2 AND c = 2000 IF r = 3
```
The range of `c` is empty - there are no valid values.

This caused a segfault when accessing the `first` range:
```c++
op.ranges.front()
```

To fix it let's throw en exception when the clustering range
is empty. Cassandra also rejects queries with `c = 1 AND c = 2`.

There's also a check for empty partition range, as it used
to crash in the past, can't really hurt to add it.

Signed-off-by: Jan Ciolek <jan.ciolek@scylladb.com>
2023-06-28 10:18:06 +02:00
Avi Kivity
b858a4669d cql3: expr: break up expression.hh header
Adding a function declaration to expression.hh causes many
recompilations. Reduce that by:

 - moving some restrictions-related definitions to
   the existing expr/restrictions.hh
 - moving evaluation related names to a new header
   expr/evaluate.hh
 - move utilities to a new header
   expr/expr-utilities.hh

expression.hh contains only expression definitions and the most
basic and common helpers, like printing.
2023-06-22 14:21:03 +03:00
Avi Kivity
32b27d6a08 cql3: expr: change evaluation_input vector components to take spans
Spans are slightly cleaner, slightly faster (as they avoid an indirection),
and allow for replacing some of the arguments with small_vector:s.

Closes #14313
2023-06-22 11:28:01 +02:00
Avi Kivity
c5e4bf51bd Introduce mutation/ module
Move mutation-related files to a new mutation/ directory. The names
are kept in the global namespace to reduce churn; the names are
unambiguous in any case.

mutation_reader remains in the readers/ module.

mutation_partition_v2.cc was missing from CMakeLists.txt; it's added in this
patch.

This is a step forward towards librarization or modularization of the
source base.

Closes #12788
2023-02-14 11:19:03 +02:00
Avi Kivity
3a2d8175fb cql3: update_parameters: use evaluation_inputs compatible row prefetch
update_parameters::prefetch_data is used for some list updates (which
need a read-before-write to determine the key to update) and for
LWT compare-and-swap. Currently they use a custom structure for
representing a read row.

Switch to the same structure that is used in evaluation_inputs (and
in SELECT statement evaluation) to the expression machinery can be reused.

The expression representation is irregular (with different fields for
the keys and regular/static columns), so we introduce an old_row
structure to hold both the clustering key and the regular row values
for cas_request.

A nice bonus is that we can use get_non_pk_values() to read the data
into the format expected by evaluation_inputs, but on the other hand
we have to adjust get_prefetched_list() to fix up the type of
the returned list (we return it as a map, not a list, so list updates
can access the index).
2023-02-12 17:25:41 +02:00
Botond Dénes
d1d53f1b84 query: add tombstone-limit to read-command
Propagate the tombstone-limit from coordinator to replicas, to make sure
all is using the same limit.
2022-08-10 06:01:47 +03:00
Konstantin Osipov
670b2562a1 lwt: Cassandrda compatibility when incarnating a row for UPDATE
When evaluating an LWT condition involving both static and non-static
cells, and matching no regular row, the static row must be used UNLESS
the IF condition is IF EXISTS/IF NOT EXISTS, in which case special rules
apply.

Before this fix, Scylla used to assume a row doesn't exist if there is
no matching primary key. In Cassandra, if there is a
non-empty static row in the partition, a regular row based
on the static row' cell values is created in this case, and then this
row is used to evaluate the condition.

This problem was reported as gh-10081.

The reason for Scylla behaviour before the patch was that when
implementing LWT I tried to converge Cassandra data model (or lack of
thereof) with a relational data model, and assumed a static row is a
"shared" portion of a regular row, i.e. a storage level concept intended
to save space, and doesn't have independent existence.
This was an oversimplification.

This patch fixes gh-10081, making Scylla semantics match the one of
Cassandra.

I will now list other known examples when a static row has an own
independent existence as part of a table, for cataloguing purposes.

SELECT * from a partition which has a partition key
and a static cell set returns 1 row. If later a regular row is added
to the partition, the SELECT would still return 1 row, i.e.
the static row will disappear, and a regular row will appear instead.

Another example showing a static row has an independent existence below:

CREATE TABLE t (p int, c int, s int static, PRIMARY KEY(p, c));
INSERT INTO t (p, c) VALUES(1, 1);
INSERT INTO t (p, s) VALUES(1, 1) IF NOT EXISTS;

In Cassandra (and Scylla), IF NOT EXISTS evaluates to TRUE, even though both
the regular row and the partition exist. But the static cells are not
set, and the insert only provides a partition key, so the database assumes the
insert is operating against a static row.

It would be wrong to assume that a static row exists when the partition
key exists:
INSERT INTO t (p, c, s) VALUES(1, 1, 1) IF NOT EXISTS;

 [applied] | p | c | s
 -----------+---+---+------
      False | 1 | 1 | null

evaluates to False, i.e. the regular row does exist when p and c exist.

Issue

CREATE TABLE t (p INT, c INT, r INT, s INT static, PRIMARY KEY(p, c))
INSERT INTO t (p, s) VALUES (1, 1);
UPDATE t SET s=2, r=1 WHERE p=1 AND c=1 IF s=1 and r=null;
- in this case, even though the regular row doesn't exist, the static
row does, and should be used for condition evaluation.

In other words, IF EXISTS/IF NOT EXISTS have contextual semantics.
They apply to the regular row if clustering key is used in the WHERE
clause, otherwise they apply to static row.

One analogy for static rows is that it is like a static member of C++ or
Java class. It's an attribute of the class (assuming class = partition),
which is accessible through every object of the class (object = regular
row). It is also present if there are no objects of the class, but the
class itself exists: i.e. a partition could have no regular rows, but
some static cells set, in this case it has a static row.

*Unlike C++/Java static class members* a static row is an optional
attribute of the partition. A partition may exist, but the static row
may be absent (e.g. no static cell is set). If the static row does exist,
all regular rows share its contents, *even if they do not exist*.
A regular row exists when its clustering key is present
in the table. A static row exists when at least one static cell is set.

Tests are updated because now when no matching row is found
for the update we show the value of the static row as the previous
value, instead of a non-matching clustering row.

Changes in v2:
- reworded the commit message
- added select tests

Closes #10711
2022-06-16 19:23:46 +03:00
Avi Kivity
5937b1fa23 treewide: remove empty comments in top-of-files
After fcb8d040 ("treewide: use Software Package Data Exchange
(SPDX) license identifiers"), many dual-licensed files were
left with empty comments on top. Remove them to avoid visual
noise.

Closes #10562
2022-05-13 07:11:58 +02:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Pavel Emelyanov
b29d3f1758 cas_request: Make read_command() accept query_processor
Just relpace the argument and patch the callers

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-12-23 10:54:28 +03:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Pavel Solodovnikov
2187a59089 treewide: move service::cas_request out from storage_proxy.hh
And remove all remaining inclusions of `storage_proxy.hh` in the
headers.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-06-06 19:18:49 +03:00
Pavel Solodovnikov
92fd515186 lwt: for each statement in cas_request provide a row in CAS result set
Previously batch statement result set included rows for only
those updates which have a prefetch data present (i.e. there
was an "old" (pre-existing) row for a key).

Also, these rows were sorted not in the order in which statements
appear in the batch, but in the order of updated clustering keys.

If we have a batch which updates a few non-existent keys, then
it's impossible to figure out which update inserted a new key
by looking at the query response. Not only because the responses
may not correspond to the order of statements in the batch, but
even some rows may not show up in the result set at all.

The patch proposes the following fix:

For conditional batch statements the result set now always
includes a row for each LWT statement, in the same order
in which individual statements appear in the batch.

This way we can always tell which update did actually insert
a new key or update the existing one.

`update_parameters::prefetch_data::row::is_in_cas_result_set`
member variable was removed as well as supporting code in
`cas_request::applies_to` which iterated through cas updates
and marked individual `prefetch_data` rows as "need to be in
cas result set".

Instead now `cas_request::applies_to` is significantly
simplified since it doesn't do anything more than checking
`stmt.applies_to()` in short-circuiting manner.

A few tests for the issue are written, other lwt-batch-related
tests were adjusted accordingly to include rows in result set
for each statement inside conditional batches.

Tests: unit(dev, debug)

Co-authored-by: Konstantin Osipov <kostja@scylladb.com>
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2020-09-04 13:13:26 +03:00
Pavel Solodovnikov
feaf2b6320 cas_request: move modification_statement::build_cas_result_set to cas_request
This is just a plain move of the code from `modification_statement`
to `cas_request` without changes in the logic, which will further
help to refactor `build_cas_result_set` behavior to include a row
for each LWT statement and order rows in the order of statements
in a batch.

Tests: unit(dev, debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2020-09-04 12:25:06 +03:00
Pavel Solodovnikov
0f0ff73a58 cas_request: extract find_old_row helper function
Factor out little helper function which finds a pre-existing
row for a given `cas_row_update` (matching the primary key).
Used in `cas_request::applies_to`.

Will be used in a subsequent patch to move
`modification_statement::build_cas_result_set` into `cas_request`.

Tests: unit(dev, debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2020-09-04 12:09:31 +03:00
Wojciech Mitros
45215746fe increase the maximum size of query results to 2^64
Currently, we cannot select more than 2^32 rows from a table because we are limited by types of
variables containing the numbers of rows. This patch changes these types and sets new limits.

The new limits take effect while selecting all rows from a table - custom limits of rows in a result
stay the same (2^32-1).

In classes which are being serialized and used in messaging, in order to be able to process queries
originating from older nodes, the top 32 bits of new integers are optional and stay at the end
of the class - if they're absent we assume they equal 0.

The backward compatibility was tested by querying an older node for a paged selection, using the
received paging_state with the same select statement on an upgraded node, and comparing the returned
rows with the result generated for the same query by the older node, additionally checking if the
paging_state returned by the upgraded node contained new fields with correct values. Also verified
if the older node simply ignores the top 32 bits of the remaining rows number when handling a query
with a paging_state originating from an upgraded node by generating and sending such a query to
an older node and checking the paging_state in the reply(using python driver).

Fixes #5101.
2020-08-03 17:32:49 +02:00
Botond Dénes
92a7b16cba query: read_command: add max_result_size
This field will replace max size which is currently passed once per
established rpc connection via the CLIENT_ID verb and stored as an
auxiliary value on the client_info. For now it is unused, but we update
all sites creating a read command to pass the correct value to it. In the
next patch we will phase out the old max size and use this field to pass
max size on each verb instead.
2020-07-28 18:00:29 +03:00
Pavel Emelyanov
3df4f3078f storage_proxy: Move hint_wrapper from .hh to .cc
It's only used there, but requires mutation_query.hh, which can thus be
removed from storage_proxy.hh

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-17 17:40:25 +03:00
Gleb Natapov
0fed86e4c6 lwt: change cas_request::apply signature
Change the way query result is passed from getting a reference to a
result to getting a foreign_ptr<lw_shared_ptr<query::result>>. This will
allow cas_request to keep it without copying.
2020-05-05 12:38:23 +03:00
Avi Kivity
6c84dd0045 cql3: update_statement: do not set query option always_return_static_content for list read-before-write
The query option always_return_static_content was added for lightweight
transations in commits e0b31dd273 (infrastructure) and 65b86d155e
(actual use). However, the flag was added unconditionally to
update_parameters::options. This caused it to be set for list
read-modify-write operations, not just for lightweight transactions.
This is a little wasteful, and worse, it breaks compatibility as old
nodes do not understand the always_return_static_content flag and
complain when they see it.

To fix, remove the always_return_static_content from
update_parameters::options and only set it from compare-and-swap
operations that are used to implement lightweight transactions.

Fixes #5593.

Reviewed-by: Gleb Natapov <gleb@scylladb.com>
Message-Id: <20200114135133.2338238-1-avi@scylladb.com>
2020-01-14 16:15:20 +02:00
Konstantin Osipov
90346236ac cql: propagate const property through prepared statement tree.
cql_statement is a class representing a prepared statement in Scylla.
It is used concurrently during execution, so it is important that its
change is not changed by execution.

Add const qualifier to the execution methods family, throghout the
cql hierarchy.

Mark a few places which do mutate prepared statement state during
execution as mutable. While these are not affecting production today,
as code ages, they may become a source of latent bugs and should be
moved out of the prepared state or evaluated at prepare eventually:

cf_property_defs::_compaction_strategy_class
list_permissions_statement::_resource
permission_altering_statement::_resource
property_definitions::_properties
select_statement::_opts
2019-11-26 14:18:17 +03:00
Avi Kivity
6aed3b7471 Merge "cql: trivial cleanup" from Vova
* 'cql-trivial-cleanup' of ssh://github.com/scylladb/scylla-dev:
  cql: rename modification_statement::_sets_a_collection to _selects_a_collection
  cql: rename _column_conditions to _regular_conditions
  cql: remove unnecessary optional around prefetch_data
2019-11-13 15:12:10 +02:00
Konstantin Osipov
6159c012db schema: pre-allocate the bitset of column_set
The number of columns is usually small, and avoiding
a resize speeds up bit manipulation functions.
2019-11-13 11:41:51 +03:00
Konstantin Osipov
191acec7ab schema: rename column_mask to column_set
Since it contains a precise set of columns, it's more
accurate to call it a set, not a mask. Besides, the name
column_mask is already used for column options on storage
level.
2019-11-13 11:41:30 +03:00
Konstantin Osipov
0ad0369684 cql: remove unnecessary optional around prefetch_data 2019-11-12 20:15:24 +03:00
Vladimir Davydov
65b86d155e cql: add static row to CAS failure result if there are static conditions
Even if no rows match clustering key restrictions of a conditional
statement with static columns conditions, we still must include the
static column value into the CAS failure result set. For example,
the following conditional DELETE statement

  create table t(k int, c int, s int static, v int, primary key(k, c));
  insert into t(k, s) values(1, 1);
  delete v from t where k=1 and c=1 if v=1 and s=1;

must return

  [applied=False, v=null, s=1]

not just

  [applied=False, v=null, s=null]

To fix that, set partition_slice::option::always_return_static_content
for querying rows used for checking conditions so that we have the
static row in update_parameters::prefetch_data even if no regular row
matches clustering column restrictions. Plus modify cas_request::
applies_to() so that it sets is_in_cas_result_set flag for the static
row in case there are static column conditions, but the result set
happens to be empty.

As pointed out by Tomek, there's another reason to set partition_slice::
option::always_return_static_content apart from building a correct
result set on CAS failure. There could be a batch with two statements,
one with clustering key restrictions which select no row, and another
statement with only static column conditions. If we didn't enable this
flag, we wouldn't get a static row even if it exists, and static column
conditions would evaluate as if the static row didn't exist, for
example, the following batch

  create table t(k int, c int, s int static, primary key(k, c));
  insert into t(k, s) values(1, 1);
  begin batch
  insert into t(k, c) values(1, 1) if not exists
  update t set s = 2 where k = 1 if s = 1
  apply batch;

would fail although it clearly must succeed.
2019-10-28 22:30:37 +03:00
Vladimir Davydov
57d284d254 cql: exclude statements not checked by cas from result set
Apart from conditional statements, there may be other reading statements
in a batch, e.g. manipulating lists. We must not include rows fetched
for them into the CAS result set. For instance, the following CAS batch:

  create table t(p int, c int, i int, l list<int>, primary key(p, c));
  insert into t(p, c, i) values(1, 1, 1)
  insert into t(p, c, i, l) values(1, 1, 1, [1, 2, 3])
  begin batch
  update t set i=3 where p=1 and c=1 if i=2
  update t set l=l-[2] where p=1 and c=2
  apply batch;

is supposed to return

  [applied] | p | c | i
  ----------+---+---+---
     False  | 1 | 1 | 1

not

  [applied] | p | c | i
  ----------+---+---+---
     False  | 1 | 1 | 1
     False  | 1 | 2 | 1

To filter out such collateral rows from the result set, let's mark rows
checked by conditional statements with a special flag.
2019-10-28 21:50:43 +03:00
Vladimir Davydov
8fbf344f03 cql: ignore clustering key if statement checks only static columns
In case a CQL statement has only static columns conditions, we must
ignore clustering key restrictions.

Example:

  create table t(p int, c int, s int static, v int, primary key(p, c));
  insert into t(p, s) values(1, 1);
  update t set v=1 where p=1 and c=1 if s=1;

This conditional statement must successfully insert row (p=1, c=1, v=1)
into the table even though there's no regular row with p=1 and c=1 in
the table before it's executed, because the statement condition only
applies to the static column s, which exists and matches.
2019-10-28 21:13:19 +03:00
Konstantin Osipov
e555dc502e lwt: implement basic lightweight transactions support
Support single-statement conditional updates and as well as batches.

This patch almost fully rewrites column_condition.cc, implementing
is_satisfied_by().

Most of the remaining complications in column_condition implementation
come from the need to properly handle frozen and multi-cell
collection in predicates - up until now it was not possible
to compare entire collection values between each other. This is further
complicated since multi-cell lists and sets are returned as maps.

We can no longer assume that the columns fetched by prefetch operation
are non-frozen collections. IF EXISTS/IF NOT EXISTS condition
fetches all columns, besides, a column may be needed to check other
condition.

When fetching the old row for LWT or to apply updates on list/columns,
we now calculate precisely the list of columns to fetch.

The primary key columns are also included in CAS batch result set,
and are thus also prefetched (the user needs them to figure out which
statements failed to apply).

The patch is cross-checked for compatibility with cassandra-3.11.4-1545-g86812fa502
but does deviate from the origin in handling of conditions on static
row cells. This is addressed in future series.
2019-10-27 23:42:49 +03:00