Commit Graph

39 Commits

Author SHA1 Message Date
Botond Dénes
21584262be query_result_builder: remove v1 support
Amounts to dropping (the noop) range tombstone consume() overload.
2022-03-11 09:24:17 +02:00
Botond Dénes
728c14549f query_result_writer: add v2 support
Add a consume() overload which takes a range tombstone change and drops
it just like the existing range tombstone overload does: query results
don't care about range tombstones.
2022-03-11 09:22:14 +02:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes were applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Avi Kivity
a55b434a2b treewide: extend copyright statements to present day
2021-06-06 19:18:49 +03:00
Benny Halevy
fa6d6c17f2 mutation_partition: mark query_result_builder constructor noexcept
It is trivially so.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:35:07 +03:00
Botond Dénes
950150c6df query_result_builder: make it a public type
We will want to use it in multishard_mutation_query.cc.
2021-03-02 07:53:53 +02:00
Avi Kivity
00864b26c3 query-result-writer: fix idl definition order related failures with clang
Following ad48d8b43c, fix a similar problem which popped up
with higher inlining thresholds in query-result-writer.hh. Since
idl/query depends on idl/keys, it must follow in definition order.

Closes #7384
2020-10-11 17:57:12 +03:00
Wojciech Mitros
45215746fe increase the maximum size of query results to 2^64
Currently, we cannot select more than 2^32 rows from a table because we are limited by types of
variables containing the numbers of rows. This patch changes these types and sets new limits.

The new limits take effect while selecting all rows from a table - custom limits of rows in a result
stay the same (2^32-1).

In classes which are being serialized and used in messaging, in order to be able to process queries
originating from older nodes, the top 32 bits of new integers are optional and stay at the end
of the class - if they're absent we assume they equal 0.

Backward compatibility was tested by querying an older node for a paged selection, using the
received paging_state with the same select statement on an upgraded node, and comparing the returned
rows with the result generated for the same query by the older node, additionally checking that the
paging_state returned by the upgraded node contained the new fields with correct values. It was also
verified that the older node simply ignores the top 32 bits of the remaining rows number when handling
a query with a paging_state originating from an upgraded node, by generating and sending such a query
to an older node and checking the paging_state in the reply (using the Python driver).

Fixes #5101.
2020-08-03 17:32:49 +02:00
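
The optional top-32-bits scheme described in 45215746fe can be illustrated with a minimal sketch. This is not the code from the patch: the type and function names (wire_row_count, encode_row_count, decode_row_count, decode_as_old_node) are hypothetical, and the real serialization is generated from the IDL. The sketch only assumes what the message states: the high half is optional and is treated as 0 when absent.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical wire form: the low 32 bits are always present; the high
// 32 bits were added later and are absent when an older node sent the value.
struct wire_row_count {
    std::uint32_t low_bits;
    std::optional<std::uint32_t> high_bits;
};

// Upgraded node: split a 64-bit count into the two halves.
wire_row_count encode_row_count(std::uint64_t rows) {
    return wire_row_count{static_cast<std::uint32_t>(rows & 0xffffffffu),
                          static_cast<std::uint32_t>(rows >> 32)};
}

// Upgraded node: a missing high half is assumed to be 0.
std::uint64_t decode_row_count(const wire_row_count& w) {
    const std::uint64_t high = w.high_bits.value_or(0);
    return (high << 32) | w.low_bits;
}

// Older node: it only knows about the low half and ignores the rest.
std::uint32_t decode_as_old_node(const wire_row_count& w) {
    return w.low_bits;
}

// Small usage example covering both directions of compatibility.
void row_count_roundtrip_example() {
    const std::uint64_t rows = (std::uint64_t{1} << 32) + 2;
    assert(decode_row_count(encode_row_count(rows)) == rows);
    assert(decode_row_count(wire_row_count{7, std::nullopt}) == 7);
    assert(decode_as_old_node(encode_row_count(rows)) == 2);
}
```

Keeping the new half optional and at the end of the structure is what lets both directions work: an upgraded reader fills in 0, and an older reader never looks past the fields it knows.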
Duarte Nunes
b2e1a91f4d query-result: Use digester instead of md5_hasher
Use the digester class instead of md5_hasher to encapsulate the
decision of which hash algorithm to use.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2018-02-01 00:22:50 +00:00
Duarte Nunes
6b4b429883 query-result: Introduce class result_options
Introduce class result_options to carry result options through the
request pipeline, which at this point mean the result type and the
digest algorithm. This class allows us to encapsulate the concrete
digest algorithm to use.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2018-02-01 00:22:50 +00:00
Avi Kivity
e428805ba5 Merge "Optimize query result partition and row counts" from Duarte
"Now that range queries go through the normal digest path, we rely on
query::result::calculate_counts() to count the number of partitions
and rows returned.

This series optimizes it, in case it is needed, and also changes the
result message to include the partition and row counts, avoiding the
calculation altogether."

* 'calculate-counts/v3' of github.com:duarten/scylla:
  query-result: Send row and partition count over the wire
  query::result: Optimize calculate_counts()
2017-08-17 13:41:21 +03:00
Duarte Nunes
a17cef76b2 query-result-writer: Remove unneeded field
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20170811102940.22747-1-duarte@scylladb.com>
2017-08-14 12:33:33 +01:00
Duarte Nunes
3b9a9b7321 query-result: Send row and partition count over the wire
To avoid calculating them on the coordinator side.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-08-14 10:29:06 +02:00
Paweł Dziepak
38ee69dee0 idl: allow writers to use any output stream
Original IDL generated code was hardcoded to always use bytes_ostream.
This patch makes the output stream a template parameter so that any
valid output stream can be used.
Unfortunately, making IDL writers generic requires updates in the code
that uses them. This is fixed in C++17, which would be able to deduce the
parameter in most cases.
2016-12-22 13:35:04 +01:00
Duarte Nunes
93be8d7cef query::result: Add partition count
This patch adds a partition count to query::result, filled by the
query::result::builder. The partition count is present whenever the
result carries data, being absent only for the case where the result
contains only a digest.

We also ensure that counts are present for an empty query::result.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-15 10:27:46 +00:00
Duarte Nunes
2409b6b250 query::result::builder: Add partition count
This patch adds a partition count to the query::result::builder. It is
intended to be incremented by users, and later used to build a
query::result.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-12-15 10:27:46 +00:00
Paweł Dziepak
0bce4047bd query_builder: add partition_slice getter
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-12-14 14:10:02 +00:00
Paweł Dziepak
15de8de9e5 reconcilable_result: keep result_memory_tracker object
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-12-14 14:10:02 +00:00
Paweł Dziepak
da7ca85040 query: allow short reads
When paging is used, the cluster is allowed to return fewer rows than the
client asked for. However, if this possibility is used, we need a way of
telling that to the coordinator and the paging implementation so that
they can differentiate between short reads caused by the replica running
out of data to send and short reads caused by any other means.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-12-14 14:10:01 +00:00
Avi Kivity
18078bea9b storage_proxy: avoid calculating digest when only one replica is contacted
If we're talking to just one replica, the digest is not going to be used,
so better not to calculate it at all.  The optimization helps with
LOCAL_ONE queries where the result is large, but does not contain large
blobs (many small rows).

This patch adds a digest_algorithm parameter to the READ_DATA verb that
can take on two values: none and MD5 (default), and sets it to none when
we're reading from one replica.

In the future we may add other values for more hardware-friendly digest
algorithms.
Message-Id: <1479380600-19206-1-git-send-email-avi@scylladb.com>
2016-11-17 13:04:30 +02:00
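
The decision described in 18078bea9b can be sketched as follows. The enum and function names are hypothetical and are not taken from storage_proxy; the sketch only assumes what the message states, namely that the parameter takes the values none and MD5 and that none is chosen when a single replica is read.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical mirror of the two values named in the commit message.
enum class digest_algorithm_sketch { none, md5 };

// With a single replica there is nothing to compare the digest against,
// so computing it would be wasted work.
digest_algorithm_sketch choose_digest_algorithm(std::size_t replicas_contacted) {
    assert(replicas_contacted > 0);  // a read contacts at least one replica
    return replicas_contacted == 1 ? digest_algorithm_sketch::none
                                   : digest_algorithm_sketch::md5;
}
```

For example, choose_digest_algorithm(1) yields none (the LOCAL_ONE case mentioned above), while choose_digest_algorithm(3) yields md5.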
Gleb Natapov
1e6f64f4ab query: add latest modification timestamp to result structure
2016-05-24 13:27:34 +03:00
Gleb Natapov
5fef0717cc query: find latest modification timestamp while calculating result digest
2016-05-24 13:27:34 +03:00
Gleb Natapov
db322d8f74 query: put live row count into query::result
The patch calculates the row count during result building and while merging.
If one of the results being merged does not have a row count, the
merged result will not have one either.
2016-05-02 15:10:15 +03:00
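
The merging rule stated in db322d8f74 can be sketched with std::optional. The function name and the use of a 32-bit counter are assumptions made for illustration; the actual merge code is not shown in this log.

```cpp
#include <cassert>
#include <cstdint>
#include <optional>

// Hypothetical helper: the merged count is known only if both inputs
// carry a count; otherwise the merged result has no count either.
std::optional<std::uint32_t> merge_row_counts(std::optional<std::uint32_t> a,
                                              std::optional<std::uint32_t> b) {
    if (!a || !b) {
        return std::nullopt;
    }
    return *a + *b;
}

// Small usage example for both outcomes.
void merge_row_counts_example() {
    assert(merge_row_counts(3, 4) == std::optional<std::uint32_t>(7));
    assert(!merge_row_counts(3, std::nullopt).has_value());
}
```

Propagating the absence rather than treating it as 0 avoids reporting a count that silently undercounts the rows of the result that lacked one.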
Pekka Enberg
38a54df863 Fix pre-ScyllaDB copyright statements
People keep tripping over the old copyrights and copy-pasting them to
new files. Search and replace "Cloudius Systems" with "ScyllaDB".

Message-Id: <1460013664-25966-1-git-send-email-penberg@scylladb.com>
2016-04-08 08:12:47 +03:00
Paweł Dziepak
82d2a2dccb specify whether query::result, result_digest or both are needed
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-03-11 18:27:13 +00:00
Paweł Dziepak
21e2ebcf8c query: build only result, only digest or both
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-03-11 18:27:13 +00:00
Paweł Dziepak
46079f763b query: add keys and tombstones to result digest
Query result digest is used to verify that all replicas have the same
data. Therefore, it needs to contain more information than the query
result itself in order to ensure proper detection of disagreements.

Generally, adding clustering keys to the digest regardless of whether
the client asked for them will guarantee correctness. However, adding
tombstones as well improves the chances of early detection of nodes
containing stale data.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-03-11 18:27:13 +00:00
Tomasz Grabiec
6cec131432 query: Switch to IDL-generated views and writers
The query result footprint for cassandra-stress mutation as reported
by tests/memory-footprint increased by 18% from 285 B to 337 B.

perf_simple_query shows slight regression in throughput (-8%):

  build/release/tests/perf/perf_simple_query -c4 -m1G --partitions 100000

Before: ~433k tps
After:  ~400k tps
2016-02-26 12:26:13 +01:00
Tomasz Grabiec
63006e5dd2 query: Serialize collection cells using CQL format
We want the format of query results to be eventually defined in the
IDL and be independent of the format we use in memory to represent
collections. This change is a step in this direction.

The change decouples the format of collection cells in query results from
our in-memory representation. We currently use collection_mutation_view;
after the change we will use the CQL binary protocol format. We use that because
it requires fewer transformations on the coordinator side.

One complication is that some list operations need to retrieve the keys
used in list cells, not only the values. To satisfy this need, a new query
option called "collections_as_maps" was added, which causes lists
and sets to be reinterpreted as maps matching their underlying
representation. This allows the coordinator to generate mutations
referencing existing items in lists.
2016-02-15 17:05:55 +01:00
Tomasz Grabiec
916a91c913 query: Split send_timestamp_and_expiry into two separate options
It's cleaner that way. They don't need to come together.
2016-02-15 16:53:56 +01:00
Avi Kivity
79f7431a03 db: change collection_mutation::{one,view} not to use nested classes
Nested classes cannot be forward-declared, so change the naming
not to use them.  Follows atomic_cell{,_view}.
2015-11-13 17:13:07 +02:00
Calle Wilund
284b10cabe Make partition_slice::row_ranges multiplex on partition
Allows for having more than one clustering row range set, depending on
PK queried (although right now limited to one - which happens to be exactly
the number of multiplexing paging needs... What a coincidence...)

Encapsulates the row_ranges member in a query function, and if needed holds
ranges outside the default one in an extra object.

Query result::builder::add_partition now fetches the correct row range for
the partition, and this is the range used in subsequent iteration.
2015-11-10 13:12:33 +01:00
Avi Kivity
d5cf0fb2b1 Add license notices
2015-09-20 10:43:39 +03:00
Tomasz Grabiec
c08742e4fc query-result-writer: Remove assert(_finished) guards from destructors
They're not exception-safe. If an exception is thrown before finish() is
called, they'll trigger.
2015-09-06 21:24:58 +02:00
Tomasz Grabiec
9724b84bb3 db: Fix query of partitions with no live clustered rows
When a partition has no live regular rows, but has some live data in the
static row, then it should appear in the results, even though we
didn't select any static column.

To reproduce:

  create table cf (k blob, c blob, v blob, s1 blob static, primary key (k, c));
  update cf set s1 = 0x01 where k = 0x01;
  update cf set s1 = 0x02 where k = 0x02;
  select k from cf;

The "select" statement should return 2 rows, but was returning 0.

The following query worked fine, because static columns were included:

  select * from cf;

The data query should contain only live data, so we shouldn't write a
partition entry if it's supposed to be absent from the results. We
can't tell that, though, until we've processed all the data. To solve
this problem, the query result writer uses an optimistic approach,
where the partition header is retracted from the buffer
(cheaply) if it turns out there's no live data in it.
2015-07-09 19:55:00 +02:00
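
The optimistic approach described in 9724b84bb3 can be sketched as below. The class and function names are hypothetical and a std::string stands in for the real output buffer; the sketch only assumes what the message states: the header is written first and retracted cheaply if the partition turns out to hold no live data.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical stand-in for the result buffer: retracting is a cheap
// truncation back to a previously remembered position.
class result_buffer_sketch {
    std::string _buf;
public:
    std::size_t mark() const { return _buf.size(); }
    void append(const std::string& s) { _buf += s; }
    void retract_to(std::size_t pos) {
        assert(pos <= _buf.size());
        _buf.resize(pos);
    }
    const std::string& contents() const { return _buf; }
};

// Write the header optimistically, then the live rows. If there were no
// live rows and no live static data, undo the header.
void write_partition(result_buffer_sketch& out,
                     const std::string& key,
                     const std::vector<std::string>& live_rows,
                     bool has_live_static_data) {
    const std::size_t start = out.mark();
    out.append("[" + key + "]");
    for (const auto& row : live_rows) {
        out.append(row + ";");
    }
    if (live_rows.empty() && !has_live_static_data) {
        out.retract_to(start);
    }
}
```

In this sketch a partition with only live static data keeps its header, which matches the behaviour the fix is meant to restore for "select k from cf".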
Tomasz Grabiec
09ed972068 mutation_partition: Remove redundant slice parameter from query()
The slice used by partition_writer must match the one used by query()
anyway.
2015-07-09 19:47:32 +02:00
Tomasz Grabiec
eaceb61801 db: Add atomic_cell::deletion_time()
Deleted cells store deletion time, not expiry time. This change makes
expiry() valid only for live cells with TTL and adds deletion_time(),
which is intended to be used with deleted cells.
2015-05-10 12:03:26 +03:00
Tomasz Grabiec
5ba1486ae7 db: Rename "ttl" to "expiry" when it's used as time point
To avoid confusion with "ttl" the duration.
2015-05-06 17:27:22 +02:00
Tomasz Grabiec
00f99cefd4 db: split query.hh to reduce header dependencies
2015-04-15 20:44:59 +02:00