69 Commits

Author SHA1 Message Date
Ernest Zaslavsky
5ba5aec1f8 treewide: Move mutation related files to a mutation directory
As requested in #22104, moved the files and fixed other includes and build system.

Moved files:
 - combine.hh
 - collection_mutation.hh
 - collection_mutation.cc
 - converting_mutation_partition_applier.hh
 - converting_mutation_partition_applier.cc
 - counters.hh
 - counters.cc
 - timestamp.hh

Fixes: #22104

This is a cleanup, no need to backport

Closes scylladb/scylladb#25085
2025-09-24 13:23:38 +03:00
Ernest Zaslavsky
d624413ddd treewide: Move query related files to a new query directory
As requested in #22120, moved the files and fixed other includes and build system.

Moved files:
- query.cc
- query-request.hh
- query-result.hh
- query-result-reader.hh
- query-result-set.cc
- query-result-set.hh
- query-result-writer.hh
- query_id.hh
- query_result_merger.hh

Fixes: #22120

This is a cleanup, no need to backport

Closes scylladb/scylladb#25105
2025-09-16 23:40:47 +03:00
Avi Kivity
f3eade2f62 treewide: relicense to ScyllaDB-Source-Available-1.0
Drop the AGPL license in favor of a source-available license.
See the blog post [1] for details.

[1] https://www.scylladb.com/2024/12/18/why-were-moving-to-a-source-available-license/
2024-12-18 17:45:13 +02:00
Nadav Har'El
a4a318f394 cql: USING TTL 0 means unlimited, not default TTL
Our documentation states that writing an item with "USING TTL 0" means it
should never expire. This should be true even if the table has a default
TTL. But Scylla mistakenly handled "USING TTL 0" exactly like having no
USING TTL at all (i.e., it took the default TTL, instead of unlimited).
We had two xfailing tests demonstrating that Scylla's behavior in this
case differs from Cassandra's. Scylla's behavior in this case was also
undocumented.

By the way, Cassandra used to have the same bug (CASSANDRA-11207) but
it was fixed already in 2016 (Cassandra 3.6).

So in this patch we fix Scylla's "USING TTL 0" behavior to match the
documentation and Cassandra's behavior since 2016. One xfailing test
starts to pass, and the second test gets past this bug but fails on a
different one. This patch also adds a third test for "USING TTL ?"
with UNSET_VALUE - it behaves, on both Scylla and Cassandra, like a
missing "USING TTL".

The origin of this bug was that after parsing the statement, we saved
the USING TTL in an integer, and used 0 for the case of no USING TTL
given. This meant that we couldn't tell if we have USING TTL 0 or
no USING TTL at all. This patch uses an std::optional so we can tell
the case of a missing USING TTL from the case of USING TTL 0.
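
The distinction described above can be sketched as follows. This is a minimal illustration, not Scylla's actual code: `effective_ttl`, `ttl_seconds` and the parameter names are assumptions made for the example.

```cpp
#include <cassert>
#include <chrono>
#include <optional>

using ttl_seconds = std::chrono::seconds;

// Hypothetical helper, not Scylla's real API.
// `using_ttl` is empty when the statement has no USING TTL clause.
// A result of zero means "never expires".
inline ttl_seconds effective_ttl(std::optional<ttl_seconds> using_ttl,
                                 ttl_seconds table_default_ttl) {
    // Missing clause: fall back to the table's default TTL.
    // Explicit value, including 0: use it exactly as given.
    return using_ttl.value_or(table_default_ttl);
}
```

With a plain integer and 0 as the "missing" sentinel, the first and second cases of this function could not be told apart, which is the bug the patch describes.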

Fixes #6447

Signed-off-by: Nadav Har'El <nyh@scylladb.com>

Closes #13079
2023-03-08 16:18:23 +02:00
Avi Kivity
69a385fd9d Introduce schema/ module
Schema related files are moved there. This excludes schema files that
also interact with mutations, because the mutation module depends on
the schema. Those files will have to go into a separate module.

Closes #12858
2023-02-15 11:01:50 +02:00
Avi Kivity
c5e4bf51bd Introduce mutation/ module
Move mutation-related files to a new mutation/ directory. The names
are kept in the global namespace to reduce churn; the names are
unambiguous in any case.

mutation_reader remains in the readers/ module.

mutation_partition_v2.cc was missing from CMakeLists.txt; it's added in this
patch.

This is a step forward towards librarization or modularization of the
source base.

Closes #12788
2023-02-14 11:19:03 +02:00
Avi Kivity
3a2d8175fb cql3: update_parameters: use evaluation_inputs compatible row prefetch
update_parameters::prefetch_data is used for some list updates (which
need a read-before-write to determine the key to update) and for
LWT compare-and-swap. Currently they use a custom structure for
representing a read row.

Switch to the same structure that is used in evaluation_inputs (and
in SELECT statement evaluation) so the expression machinery can be reused.

The expression representation is irregular (with different fields for
the keys and regular/static columns), so we introduce an old_row
structure to hold both the clustering key and the regular row values
for cas_request.

A nice bonus is that we can use get_non_pk_values() to read the data
into the format expected by evaluation_inputs, but on the other hand
we have to adjust get_prefetched_list() to fix up the type of
the returned list (we return it as a map, not a list, so list updates
can access the index).
2023-02-12 17:25:41 +02:00
Avi Kivity
5937b1fa23 treewide: remove empty comments in top-of-files
After fcb8d040 ("treewide: use Software Package Data Exchange
(SPDX) license identifiers"), many dual-licensed files were
left with empty comments on top. Remove them to avoid visual
noise.

Closes #10562
2022-05-13 07:11:58 +02:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes were applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Avi Kivity
0909e3c17d treewide: remove redundant "x <=> 0" compares
If x is of type std::strong_ordering, then "x <=> 0" is equivalent to
x. These no-ops were inserted during #1449 fixes, but are now unnecessary.
They have potential for harm, since they can hide an accidental change of the
type of x to an arithmetic type, so remove them.

Ref #1449.
2021-07-28 13:30:32 +03:00
Avi Kivity
a55b434a2b treewide: extend copyright statements to present day 2021-06-06 19:18:49 +03:00
Michał Chojnowski
c555e84a77 cql3: update_parameters: remove unused version of make_cell for bytes_view
It became unused after previous patches in this series changed the
representation of collections in cql3 from bytes_view to managed_bytes_view.
2021-04-01 10:44:21 +02:00
Michał Chojnowski
9777026e71 cql3: update_parameters: add make_cell version for managed_bytes_view
We will use it to port the representation of collections in cql3/
from bytes to managed_bytes.
The duplicate version for bytes_view will be removed after that transition
is complete.
2021-04-01 10:42:07 +02:00
Michał Chojnowski
c2c6b2abfa cql3: remove operation::make_*cell
The operation::make_*cell functions are useless aliases to methods of
update_parameters, and are used interchangeably with them throughout the code.
Remove them.

Also, remove the now-unused update_parameters::make_cell version for
fragmented_temporary_buffer::view.
2021-04-01 10:42:07 +02:00
Michał Chojnowski
b9322a6b71 cql3: switch users of cql3::raw_value_view to internals-independent API
We want to change the internals of cql3::raw_value{_view}.
However, users of cql3::raw_value and cql3::raw_value_view often
use them by extracting the internal representation, which will be different
after the planned change.

This commit prepares us for the change by making all accesses to the value
inside cql3::raw_value(_view) be done through helper methods which don't expose
the internal representation publicly.

After this commit we are free to change the internal representation of
raw_value_{view} without messing up their users.
2021-04-01 10:42:04 +02:00
Avi Kivity
58b7f225ab keys: convert trichotomic comparators to return std::strong_ordering
A trichotomic comparator returning an int can easily be mistaken
for a less comparator as the return types are convertible.

Use the new std::strong_ordering instead.

A caller in cql3's update_parameters.hh is also converted, following
the path of least resistance.

Ref #1449.

Test: unit (dev)

Closes #8323
2021-03-21 09:30:43 +02:00
Pavel Solodovnikov
92fd515186 lwt: for each statement in cas_request provide a row in CAS result set
Previously batch statement result set included rows for only
those updates which have a prefetch data present (i.e. there
was an "old" (pre-existing) row for a key).

Also, these rows were sorted not in the order in which statements
appear in the batch, but in the order of updated clustering keys.

If we have a batch which updates a few non-existent keys, then
it's impossible to figure out which update inserted a new key
by looking at the query response. Not only because the responses
may not correspond to the order of statements in the batch, but
even some rows may not show up in the result set at all.

The patch proposes the following fix:

For conditional batch statements the result set now always
includes a row for each LWT statement, in the same order
in which individual statements appear in the batch.

This way we can always tell which update did actually insert
a new key or update the existing one.

`update_parameters::prefetch_data::row::is_in_cas_result_set`
member variable was removed as well as supporting code in
`cas_request::applies_to` which iterated through cas updates
and marked individual `prefetch_data` rows as "need to be in
cas result set".

Instead now `cas_request::applies_to` is significantly
simplified since it doesn't do anything more than checking
`stmt.applies_to()` in short-circuiting manner.
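
A minimal sketch of such a short-circuiting check, assuming a simplified `statement` type whose condition is a callable. The real `cas_request` and statement classes are more involved; the names below are illustrative only.

```cpp
#include <algorithm>
#include <cassert>
#include <functional>
#include <vector>

// Hypothetical stand-in for one conditional statement of a batch.
struct statement {
    std::function<bool()> applies_to;
};

// True only if every statement's condition holds.
inline bool cas_applies(const std::vector<statement>& statements) {
    // std::all_of stops at the first statement whose condition is false,
    // so later conditions are not evaluated.
    return std::all_of(statements.begin(), statements.end(),
                       [](const statement& s) { return s.applies_to(); });
}
```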

A few tests for the issue are written, other lwt-batch-related
tests were adjusted accordingly to include rows in result set
for each statement inside conditional batches.

Tests: unit(dev, debug)

Co-authored-by: Konstantin Osipov <kostja@scylladb.com>
Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2020-09-04 13:13:26 +03:00
Pavel Emelyanov
757a7145b9 headers: Remove mutation.hh from trace_state.hh
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-17 17:40:23 +03:00
Pavel Emelyanov
4fa12f2fb8 header: De-bloat schema.hh
The header sits in many other headers, but there's a handy
schema_fwd.hh that's tiny and contains needed declarations
for other headers. So replace schema.hh with schema_fwd.hh
in most of the headers (and remove completely from some).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20200303102050.18462-1-xemul@scylladb.com>
2020-03-03 11:34:00 +01:00
Pavel Solodovnikov
a46f235092 cql3: prefer passing schema as const ref instead of shared_ptr
De-pointerize cql3 code APIs further: change some call sites
to pass `schema` as const-ref instead of `shared_ptr`.

The affected functions are known to always expect a non-null
pointer to schema and do not store the pointer or pass it elsewhere,
so it is safe to give them just a reference.
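
The shape of the change can be sketched like this. The `schema` struct below is a heavily simplified, hypothetical stand-in, and the two helper functions are invented for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <memory>
#include <string>
#include <vector>

// Hypothetical, heavily simplified stand-in for the schema class.
struct schema {
    std::vector<std::string> column_names;
};
using schema_ptr = std::shared_ptr<const schema>;

// Before: the signature suggests shared ownership and a possibly null
// pointer, although the function neither stores nor forwards the pointer.
inline std::size_t column_count_ptr(schema_ptr s) {
    return s->column_names.size();
}

// After: a const reference states that the schema must exist and that the
// callee only reads it for the duration of the call.
inline std::size_t column_count(const schema& s) {
    return s.column_names.size();
}
```

A caller that holds a `schema_ptr` simply passes `*ptr`, which also avoids a reference-count update per call.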

Tests: unit(dev, debug)

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
Message-Id: <20200218142338.69824-1-pa.solodovnikov@scylladb.com>
2020-02-18 20:13:10 +02:00
Avi Kivity
6c84dd0045 cql3: update_statement: do not set query option always_return_static_content for list read-before-write
The query option always_return_static_content was added for lightweight
transactions in commits e0b31dd273 (infrastructure) and 65b86d155e
(actual use). However, the flag was added unconditionally to
update_parameters::options. This caused it to be set for list
read-modify-write operations, not just for lightweight transactions.
This is a little wasteful, and worse, it breaks compatibility as old
nodes do not understand the always_return_static_content flag and
complain when they see it.

To fix, remove the always_return_static_content from
update_parameters::options and only set it from compare-and-swap
operations that are used to implement lightweight transactions.

Fixes #5593.

Reviewed-by: Gleb Natapov <gleb@scylladb.com>
Message-Id: <20200114135133.2338238-1-avi@scylladb.com>
2020-01-14 16:15:20 +02:00
Vladimir Davydov
65b86d155e cql: add static row to CAS failure result if there are static conditions
Even if no rows match clustering key restrictions of a conditional
statement with static columns conditions, we still must include the
static column value into the CAS failure result set. For example,
the following conditional DELETE statement

  create table t(k int, c int, s int static, v int, primary key(k, c));
  insert into t(k, s) values(1, 1);
  delete v from t where k=1 and c=1 if v=1 and s=1;

must return

  [applied=False, v=null, s=1]

not just

  [applied=False, v=null, s=null]

To fix that, set partition_slice::option::always_return_static_content
for querying rows used for checking conditions so that we have the
static row in update_parameters::prefetch_data even if no regular row
matches clustering column restrictions. Plus modify cas_request::
applies_to() so that it sets is_in_cas_result_set flag for the static
row in case there are static column conditions, but the result set
happens to be empty.

As pointed out by Tomek, there's another reason to set partition_slice::
option::always_return_static_content apart from building a correct
result set on CAS failure. There could be a batch with two statements,
one with clustering key restrictions which select no row, and another
statement with only static column conditions. If we didn't enable this
flag, we wouldn't get a static row even if it exists, and static column
conditions would evaluate as if the static row didn't exist, for
example, the following batch

  create table t(k int, c int, s int static, primary key(k, c));
  insert into t(k, s) values(1, 1);
  begin batch
  insert into t(k, c) values(1, 1) if not exists
  update t set s = 2 where k = 1 if s = 1
  apply batch;

would fail although it clearly must succeed.
2019-10-28 22:30:37 +03:00
Vladimir Davydov
57d284d254 cql: exclude statements not checked by cas from result set
Apart from conditional statements, there may be other reading statements
in a batch, e.g. manipulating lists. We must not include rows fetched
for them into the CAS result set. For instance, the following CAS batch:

  create table t(p int, c int, i int, l list<int>, primary key(p, c));
  insert into t(p, c, i) values(1, 1, 1);
  insert into t(p, c, i, l) values(1, 2, 1, [1, 2, 3]);
  begin batch
  update t set i=3 where p=1 and c=1 if i=2
  update t set l=l-[2] where p=1 and c=2
  apply batch;

is supposed to return

  [applied] | p | c | i
  ----------+---+---+---
     False  | 1 | 1 | 1

not

  [applied] | p | c | i
  ----------+---+---+---
     False  | 1 | 1 | 1
     False  | 1 | 2 | 1

To filter out such collateral rows from the result set, let's mark rows
checked by conditional statements with a special flag.
2019-10-28 21:50:43 +03:00
Vladimir Davydov
74b9e80e4c cql: fix EXISTS check that applies only to static columns
If a CQL statement only updates static columns, i.e. has no clustering
key restrictions, we still fetch a regular row so that we can check it
against EXISTS condition. In this case we must be especially careful: we
can't simply pass the row to modification_statement::applies_to, because
it may turn out that the row has no static columns set, i.e. there is in
fact no static row in the partition. So we filter out such rows without
static columns right in cas_request::applies_to before passing them
further to modification_statement::applies_to.

Example:

  create table t(p int, c int, s int static, primary key(p, c));
  insert into t(p, c) values(1, 1);
  insert into t(p, s) values(1, 1) if not exists;

The conditional statement must succeed in this case.
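
The filtering step can be sketched as below. `prefetched_row` and `static_row_exists` are hypothetical simplifications and do not reproduce the real `cas_request::applies_to` logic.

```cpp
#include <algorithm>
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified prefetched row: cell values are plain ints.
struct prefetched_row {
    std::map<std::string, int> static_cells;
    std::map<std::string, int> regular_cells;

    bool has_static_columns() const { return !static_cells.empty(); }
};

// For a statement that touches only static columns, a fetched row counts as
// "existing" only when it carries at least one static cell.
inline bool static_row_exists(const std::vector<prefetched_row>& rows) {
    return std::any_of(rows.begin(), rows.end(),
                       [](const prefetched_row& r) { return r.has_static_columns(); });
}
```

In the example above the partition holds a regular row but no static cells, so `static_row_exists` is false and the IF NOT EXISTS insert applies.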
2019-10-28 21:49:37 +03:00
Vladimir Davydov
934a87999f cql: turn prefetch_data::row into struct
This will allow us to add helper methods and store extra info in each
row. For example, we can add a method for checking if a row has static
columns. Also, to build CAS result set, we need to differentiate rows
fetched to check conditions from those fetched for reading operations.
Using a struct as the row container will allow us to store this information in
each prefetched row.
2019-10-28 21:12:52 +03:00
Konstantin Osipov
e555dc502e lwt: implement basic lightweight transactions support
Support single-statement conditional updates as well as batches.

This patch almost fully rewrites column_condition.cc, implementing
is_satisfied_by().

Most of the remaining complications in column_condition implementation
come from the need to properly handle frozen and multi-cell
collections in predicates - up until now it was not possible
to compare entire collection values with each other. This is further
complicated since multi-cell lists and sets are returned as maps.

We can no longer assume that the columns fetched by prefetch operation
are non-frozen collections. An IF EXISTS/IF NOT EXISTS condition
fetches all columns; besides, a column may be needed to check another
condition.

When fetching the old row for LWT or to apply updates on list/columns,
we now calculate precisely the list of columns to fetch.

The primary key columns are also included in CAS batch result set,
and are thus also prefetched (the user needs them to figure out which
statements failed to apply).

The patch is cross-checked for compatibility with cassandra-3.11.4-1545-g86812fa502
but does deviate from the origin in handling of conditions on static
row cells. This is addressed in future series.
2019-10-27 23:42:49 +03:00
Konstantin Osipov
f32a7a0763 lwt: move option set for modification statement read command
Move the option set for read command to update_parameters
class, since this class encapsulates the logic of working
with the read command result.
2019-10-16 22:41:00 +03:00
Konstantin Osipov
a2b629c3a1 lwt: boost update_parameters to serve as a CAS result set
In modification_statement/batch_statement, we need to prefetch data to
1) apply list operations
2) evaluate CAS conditions
3) return CAS result set.

Boost update_parameters::prefetch_data to serve as a single result set
for all of the above. In case of a batch, store multiple rows for
multiple clustering keys involved in the batch.

Use an ordered set for columns and rows to make sure 3) CAS result set
is returned to the client in an ordered manner.

Deserialize the primary key and add it to result set rows since
it is returned to the client as part of CAS result set.

Index columns using ordinal_id - this allows having a single
set for all columns and makes columns easy to look up.

Remove an extra memcpy to build view objects when looking
up a cell by primary key, use partition_key/clustering_key
objects for lookup.
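
A rough sketch of such a container, under strong simplifying assumptions: clustering keys are plain ints, cell values are strings, and all type and member names are invented for the example.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

using clustering_key = int;      // assumption: real keys are composite
using ordinal_id = unsigned;     // position of a column within the schema
using cell_value = std::string;  // assumption: real cells hold serialized bytes

struct prefetched_rows {
    // Both maps are ordered, so iteration visits rows by clustering key and
    // cells by column ordinal.
    std::map<clustering_key, std::map<ordinal_id, cell_value>> rows;

    void add_cell(clustering_key ck, ordinal_id col, cell_value v) {
        rows[ck][col] = std::move(v);
    }

    // Returns nullptr when the row or the column was not prefetched.
    const cell_value* find_cell(clustering_key ck, ordinal_id col) const {
        auto r = rows.find(ck);
        if (r == rows.end()) {
            return nullptr;
        }
        auto c = r->second.find(col);
        return c == r->second.end() ? nullptr : &c->second;
    }
};
```

Keying every column by its ordinal lets key, static and regular columns share one map per row.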
2019-10-16 15:56:50 +03:00
Konstantin Osipov
a450c25946 lwt: remove dead code in cql3/update_parameters.hh 2019-10-16 15:48:40 +03:00
Konstantin Osipov
a4ccbece5c lwt: remove an unnecessary optional around prefetch_data
Get rid of an unnecessary optional around
update_parameters::prefetch_data.

update_parameters won't own prefetch_data in the future anyway,
since prefetch_data can be shared among multiple modification
statements of a batch, each statement having its own options
and hence its own update_parameters instance.
2019-10-16 15:48:25 +03:00
Konstantin Osipov
7a399ebe0d lwt: move prefetch_data_builder to update_parameters.cc
Move prefetch_data_builder class from modification_statement.cc
to update_parameters.cc.

We're going to share the same builder to build a result set
for condition evaluation and to apply updates of batch statements, so we
need to share it.

No other changes.
2019-10-16 15:48:08 +03:00
Duarte Nunes
fa2b0384d2 Replace std::experimental types with C++17 std version.
Replace stdx::optional and stdx::string_view with the C++ std
counterparts.

Some instances of boost::variant were also replaced with std::variant,
namely those that called seastar::visit.

Scylla now requires GCC 8 to compile.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20190108111141.5369-1-duarte@scylladb.com>
2019-01-08 13:16:36 +02:00
Avi Kivity
cb7ee5c765 cql3: convert sprint() to format()
sprint() recently became more strict, throwing on sprint("%s", 5). Replace
with the more modern format().

Mechanically converted with https://github.com/avikivity/unsprint.
2018-11-01 13:16:17 +00:00
Paweł Dziepak
dff6cd3e2f cql3: operation: make make_cell accept fragmented_temporary_buffer::view 2018-07-18 12:28:06 +01:00
Paweł Dziepak
0ea6d14cf5 atomic_cell: explicitly state when atomic_cell is a collection member
Collections are not going to be fully converted to the IMR just yet and
still use the old serialisation format. This means that they still don't
support fragmented values very well. This patch passes the information
when an atomic_cell is created as a member of a collection so that later
we can avoid fragmenting the value in such cases.
2018-05-31 15:51:11 +01:00
Paweł Dziepak
e9d6fc48ac treewide: require type for creating atomic_cell 2018-05-31 15:51:11 +01:00
Duarte Nunes
9e88b60ef5 mutation: Set cell using clustering_key_prefix
Change the clustering key argument in mutation::set_cell from
exploded_clustering_prefix to clustering_key_prefix, which allows for
some overall code simplification and fewer copies. This mostly affects
the cql3 layer.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-05-04 15:59:50 +02:00
Paweł Dziepak
bdac487b5a do not use long_type for counter update 2017-03-01 16:33:37 +00:00
Paweł Dziepak
d6ebf84edf cql3: add counter increment and decrement operations 2017-02-02 10:35:14 +00:00
Pekka Enberg
38a54df863 Fix pre-ScyllaDB copyright statements
People keep tripping over the old copyrights and copy-pasting them to
new files. Search and replace "Cloudius Systems" with "ScyllaDB".

Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>
2016-04-08 08:12:47 +03:00
Tomasz Grabiec
63006e5dd2 query: Serialize collection cells using CQL format
We want the format of query results to be eventually defined in the
IDL and be independent of the format we use in memory to represent
collections. This change is a step in this direction.

The change decouples format of collection cells in query results from
our in-memory representation. We currently use collection_mutation_view;
after the change we will use the CQL binary protocol format. We use that because
it requires fewer transformations on the coordinator side.

One complication is that some list operations need to retrieve keys
used in list cells, not only values. To satisfy this need, a new query
option called "collections_as_maps" was added, which will cause lists
and sets to be reinterpreted as maps matching their underlying
representation. This allows the coordinator to generate mutations
referencing existing items in lists.
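
Why the keys matter can be sketched as follows. The representation is an assumption made for illustration: `cell_key` is an opaque integer here, and `keys_matching` is a hypothetical helper, not part of the query code.

```cpp
#include <cassert>
#include <cstdint>
#include <map>
#include <vector>

using cell_key = std::uint64_t;               // assumption: opaque per-element key
using list_as_map = std::map<cell_key, int>;  // a list seen as key -> element

// Collects the keys of every element equal to `value`. An operation that
// removes `value` from the list needs these keys to address the stored
// cells; the element values alone would not identify them.
inline std::vector<cell_key> keys_matching(const list_as_map& list, int value) {
    std::vector<cell_key> keys;
    for (const auto& [key, element] : list) {
        if (element == value) {
            keys.push_back(key);
        }
    }
    return keys;
}
```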
2016-02-15 17:05:55 +01:00
Tomasz Grabiec
383296c05b cql3: Fix handling of lists with static columns
List operations and prefetching were not handling static columns
correctly. One issue was that prefetching was attaching static column
data to row data using ids which might overlap with clustered columns.

Another problem was that list operations were always constructing a
clustering key even if they worked on a static column. For static
columns the key would always be empty and the lookup would fail.

The effect was that list operations which depend on current state had
no effect. A similar problem could be observed on C* 2.1.9, but not on 2.2.3.

Fixes #903.
2016-02-15 17:05:55 +01:00
Avi Kivity
79f7431a03 db: change collection_mutation::{one,view} not to use nested classes
Nested classes cannot be forward-declared, so change the naming
not to use them.  Follows atomic_cell{,_view}.
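
The language rule behind this change can be shown with a short sketch. The class names `payload` and `payload_view` are hypothetical and chosen only for the example.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>

// --- what a lightweight "forward" header could contain ---
class payload;       // hypothetical owning type
class payload_view;  // hypothetical non-owning view

// Declarations may mention incomplete namespace-scope types by reference.
std::size_t view_size(const payload_view& v);

// By contrast, `class payload::view;` cannot be written here: a nested class
// can only be declared inside the definition of its enclosing class.

// --- what the full header would contain ---
class payload {
public:
    explicit payload(std::string data) : _data(std::move(data)) {}
    const std::string& data() const { return _data; }
private:
    std::string _data;
};

class payload_view {
public:
    explicit payload_view(const payload& p) : _data(&p.data()) {}
    std::size_t size() const { return _data->size(); }
private:
    const std::string* _data;  // points into the viewed payload
};

std::size_t view_size(const payload_view& v) { return v.size(); }
```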
2015-11-13 17:13:07 +02:00
Avi Kivity
d5cf0fb2b1 Add license notices 2015-09-20 10:43:39 +03:00
Paweł Dziepak
f5fff734ed cql3: update_parameters: add getters for ttl and expiry
Signed-off-by: Paweł Dziepak <pdziepak@cloudius-systems.com>
2015-07-30 14:10:06 +02:00
Pekka Enberg
d50139351f cql3: Use pragma once everywhere
There's no benefit to using C include guards so switch to pragma once
everywhere for consistency.

Signed-off-by: Pekka Enberg <penberg@cloudius-systems.com>
2015-05-12 16:32:56 +03:00
Tomasz Grabiec
b1e45e4401 db: Store ttl in atomic_cell
Origin does that, so should we. Both ttl and expiry time are stored in
sstables. The value of ttl seems to be used to calculate the read
digest (expiry is not used for that).

The API for creating atomic_cells changed a bit.

To create a non-expiring cell:

  atomic_cell::make_live(timestamp, value);

To create an expiring cell:

  atomic_cell::make_live(timestamp, value, expiry, ttl);

or:

  // Expiry is calculated based on current clock reading
  atomic_cell::make_live(timestamp, value, ttl_optional);
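
How the three forms relate can be sketched with a toy cell type. Everything below is an assumption made for illustration: `live_cell`, the clock choice and the explicit `now` parameter are not the real atomic_cell API.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>
#include <optional>
#include <string>
#include <utility>

using clock_type = std::chrono::system_clock;  // assumption: any clock would do
using ttl_type = std::chrono::seconds;

// Hypothetical toy cell that stores both the ttl and the expiry time point.
struct live_cell {
    std::int64_t timestamp;
    std::string value;
    std::optional<clock_type::time_point> expiry;
    std::optional<ttl_type> ttl;
};

// Non-expiring cell.
inline live_cell make_live(std::int64_t ts, std::string value) {
    return {ts, std::move(value), std::nullopt, std::nullopt};
}

// Expiring cell with an explicit expiry and ttl.
inline live_cell make_live(std::int64_t ts, std::string value,
                           clock_type::time_point expiry, ttl_type ttl) {
    return {ts, std::move(value), expiry, ttl};
}

// Expiry derived from the clock reading; an empty ttl means "no expiry".
inline live_cell make_live(std::int64_t ts, std::string value,
                           std::optional<ttl_type> ttl,
                           clock_type::time_point now = clock_type::now()) {
    if (!ttl) {
        return make_live(ts, std::move(value));
    }
    return make_live(ts, std::move(value), now + *ttl, *ttl);
}
```

Passing `now` explicitly is only a convenience for testing the sketch; by default the third overload reads the clock itself.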
2015-05-06 19:42:38 +02:00
Tomasz Grabiec
5ba1486ae7 db: Rename "ttl" to "expiry" when it's used as time point
To avoid confusion with "ttl" the duration.
2015-05-06 17:27:22 +02:00
Tomasz Grabiec
731a63e371 schema: Embed raw_schema inside schema
Public fields got encapsulated.
2015-04-24 18:01:01 +02:00
Tomasz Grabiec
00f99cefd4 db: split query.hh to reduce header dependencies 2015-04-15 20:44:59 +02:00