Commit Graph

98 Commits

Author SHA1 Message Date
Michał Sala
fff454761a messaging_service: add verb for count(*) request forwarding
In addition to adding the verb, this commit also defines forward_request
and forward_result structures, used as an argument and result of the new
rpc. forward_request is used to forward information about select
statement that does count(*) (or other aggregating functions such as
max, min, avg in the future). Due to the inability to serialize
cql3::statements::select_statement, I chose to include
query::read_command, dht::partition_range_vector and some configuration
options in forward_request. They can be serialized and are sufficient
to allow creation of a service::pager::query_pagers::pager.
2022-02-01 21:14:41 +01:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes were applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Kamil Braun
b2b242d0ad query-request: add comment about clustering ranges with non-full prefix key bounds
2021-11-29 11:10:49 +01:00
Tomasz Grabiec
cc56a971e8 database, treewide: Introduce partition_slice::is_reversed()
Cleanup, reduces noise.

Message-Id: <20211014093001.81479-1-tgrabiec@scylladb.com>
2021-10-14 12:39:16 +03:00
Michał Radwański
dac2509a7f query: reverse clustering_range
2021-10-05 16:47:04 +02:00
Kamil Braun
4bd601c6fd query-request: introduce half_reverse_slice
A utility function for converting between forward and half-reversed (or
'legacy'-reversed) slices to be used in the next commit.
2021-09-28 17:03:57 +03:00
Botond Dénes
502a45ad58 treewide: switch to native reversed format for reverse reads
We define the native reverse format as a reversed mutation fragment
stream that is identical to one that would be emitted by a table with
the same schema but with reversed clustering order. The main difference
to the current format is how range tombstones are handled: instead of
looking at their start or end bound depending on the order, we always
use them as-usual and the reversing reader swaps their bounds to
facilitate this. This allows us to treat reversed streams completely
transparently: just pass them a reversed schema and all the
reader, compacting and result building code is happily ignorant about
the fact that it is a reversed stream.
2021-09-09 15:42:15 +03:00
Botond Dénes
5d33d76cfd query: add slice reversing functions
2021-09-09 14:18:32 +03:00
Botond Dénes
a2eb0f7d7e partition_slice_builder: add constructor with slice
Intended to be used to modify an existing slice. We want to move the
slice into the direction where the schema is at: make it completely
immutable, all mutations happening through the slice builder class.
2021-09-09 14:15:42 +03:00
Botond Dénes
34abbe82fe query: specific_ranges: add non-const ranges accessor
2021-09-09 12:09:08 +03:00
Avi Kivity
a55b434a2b treewide: extend copyright statements to present day
2021-06-06 19:18:49 +03:00
Botond Dénes
f15551d23a query: partition_slice: add range_scan_data_variant option
Switching to the data variant of range scans has to be coordinated by
the coordinator to avoid replicas noticing the availability of the
respective feature at different times, resulting in some using the
mutation variant and some using the data variant.
So the plan is that it will be the coordinator's job to check the
cluster feature and set the option in the partition slice which will
tell the replicas to use the data variant for the query.
2021-03-02 07:53:53 +02:00
Wojciech Mitros
45215746fe increase the maximum size of query results to 2^64
Currently, we cannot select more than 2^32 rows from a table because we are limited by types of
variables containing the numbers of rows. This patch changes these types and sets new limits.

The new limits take effect while selecting all rows from a table - custom limits of rows in a result
stay the same (2^32-1).

In classes which are being serialized and used in messaging, in order to be able to process queries
originating from older nodes, the top 32 bits of new integers are optional and stay at the end
of the class - if they're absent we assume they equal 0.

The backward compatibility was tested by querying an older node for a paged selection, using the
received paging_state with the same select statement on an upgraded node, and comparing the returned
rows with the result generated for the same query by the older node, additionally checking if the
paging_state returned by the upgraded node contained new fields with correct values. Also verified
if the older node simply ignores the top 32 bits of the remaining rows number when handling a query
with a paging_state originating from an upgraded node by generating and sending such a query to
an older node and checking the paging_state in the reply (using the Python driver).

Fixes #5101.
2020-08-03 17:32:49 +02:00
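The compatibility scheme described in 45215746fe (low 32 bits always present, high 32 bits optional and trailing, treated as 0 when absent) can be sketched roughly as follows. This is a minimal illustration, not the actual serialization code: `wire_row_limit`, `encode_row_limit` and `decode_row_limit` are hypothetical names introduced here.

```cpp
#include <cstdint>
#include <optional>

// Hypothetical wire form of a 64-bit limit: the low 32 bits are always
// sent, the high 32 bits are a trailing optional field that an older
// node never writes.
struct wire_row_limit {
    std::uint32_t low_bits;
    std::optional<std::uint32_t> high_bits;
};

// An upgraded node always fills in both halves.
inline wire_row_limit encode_row_limit(std::uint64_t limit) {
    return {static_cast<std::uint32_t>(limit),
            static_cast<std::uint32_t>(limit >> 32)};
}

// If the high half is absent (message from an older node), assume 0.
inline std::uint64_t decode_row_limit(const wire_row_limit& w) {
    return (static_cast<std::uint64_t>(w.high_bits.value_or(0)) << 32) | w.low_bits;
}
```

An older node that reads only `low_bits` sees the truncated value, which matches the behaviour the commit message says was verified.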
Botond Dénes
92a7b16cba query: read_command: add max_result_size
This field will replace max size which is currently passed once per
established rpc connection via the CLIENT_ID verb and stored as an
auxiliary value on the client_info. For now it is unused, but we update
all sites creating a read command to pass the correct value to it. In the
next patch we will phase out the old max size and use this field to pass
max size on each verb instead.
2020-07-28 18:00:29 +03:00
Botond Dénes
8992bcd1f8 query: read_command: use tagged ints for limit ctor params
The convenience constructor of read_command now has two integer
parameters next to each other. In the next patch we intend to add another
one. This is a recipe for disaster, so to avoid mistakes this patch
converts these parameters to tagged integers. This makes sure callers
pass what they meant to pass. As a matter of fact, while fixing up
call-sites, I already found several ones passing `query::max_partitions`
to the `row_limit` parameter. No harm done yet, as
`query::max_partitions` == `query::max_rows` but this shows just how
easy it is to mix up parameters with the same type.
2020-07-28 18:00:29 +03:00
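The tagged-integer idea from 8992bcd1f8 can be illustrated with a small sketch. The names `tagged_uint64`, `row_limit`, `partition_limit` and `read_command_sketch` are assumptions made for this example and are not claimed to match the real definitions in the tree.

```cpp
#include <cstdint>
#include <type_traits>

// Hypothetical tagged integer: the Tag parameter turns otherwise
// identical wrappers into distinct types with no conversion between them.
template <typename Tag>
class tagged_uint64 {
    std::uint64_t _value;
public:
    explicit constexpr tagged_uint64(std::uint64_t v) : _value(v) {}
    constexpr std::uint64_t value() const { return _value; }
};

using row_limit = tagged_uint64<struct row_limit_tag>;
using partition_limit = tagged_uint64<struct partition_limit_tag>;

// Stand-in for a constructor taking two adjacent limits.
struct read_command_sketch {
    std::uint64_t rows;
    std::uint64_t partitions;
    read_command_sketch(row_limit r, partition_limit p)
        : rows(r.value()), partitions(p.value()) {}
};

// Swapping the arguments, or passing raw integers, no longer compiles.
static_assert(!std::is_constructible<read_command_sketch, partition_limit, row_limit>::value,
              "swapped limits must be rejected");
static_assert(!std::is_constructible<read_command_sketch, std::uint64_t, std::uint64_t>::value,
              "raw integers must be rejected");
```

A caller then writes `read_command_sketch(row_limit(100), partition_limit(10))`, which makes the intent visible at the call site.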
Botond Dénes
2ca118b2d5 query: read_command: add separate convenience constructor
query::read_command currently has a single constructor, which serves
both as an idl constructor (order of parameters is fixed) and a convenience one
(most parameters have default values). This makes it very error prone to
add new parameters, that everyone should fill. The new parameter has to
be added as last, with a default value, as the previous ones have a
default value as well. This means the compiler's help cannot be enlisted
to make sure all usages are updated.

This patch adds a separate convenience constructor to be used by normal
code. The idl constructor loses all default parameters. New parameters
can be added to any position in the convenience constructor (to force
users to fill in a meaningful value) while the removed default
parameters from the idl constructor means code cannot accidentally use
it without noticing.
2020-07-28 18:00:29 +03:00
Botond Dénes
e778b072b1 read_command: use bool_class for is_first_page parameter
The constructor of `read_command` is used both by IDL and clients in the
code. However, this constructor has a parameter that is not used by IDL:
`read_timestamp`. This requires that this parameter is the very last in
the list and that new parameters that are used by IDL are added before
it. One such new parameter was `bool is_first_page`. Adding this
parameter right before the read timestamp one created a situation where
the last parameter (read_timestamp) implicitly converts to the one
before it (is_first_page). This means that some call sites passing
`read_timestamp` were now silently converting this to `is_first_page`,
effectively dropping the timestamp.

This patch aims to rectify this, while also avoiding similar accidents
in the future, by making `is_first_page` a `bool_class` which doesn't
have any implicit conversions defined. This change does not break the
ABI as `bool_class` is also sent as a `bool` on the wire.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Tests: unit(dev)
Message-Id: <20200422073657.87241-1-bdenes@scylladb.com>
2020-04-22 11:01:22 +03:00
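The failure mode described in e778b072b1, and why a `bool_class`-style type prevents it, can be shown with a simplified stand-in. `bool_class_sketch` below is a hypothetical reduction written for this example; it is not Seastar's `bool_class`, and `timestamp_type` is assumed here to be a 64-bit integer.

```cpp
#include <cstdint>
#include <type_traits>

// Simplified stand-in for a strongly typed boolean: construction from
// bool is explicit, so integers cannot silently turn into it.
template <typename Tag>
class bool_class_sketch {
    bool _value;
public:
    explicit constexpr bool_class_sketch(bool v) : _value(v) {}
    explicit constexpr operator bool() const { return _value; }
};

using is_first_page = bool_class_sketch<struct is_first_page_tag>;
using timestamp_type = std::int64_t;  // assumption for this sketch

// A plain bool parameter accepts a timestamp through implicit conversion,
// which is how a timestamp argument could land in the wrong parameter.
static_assert(std::is_convertible<timestamp_type, bool>::value,
              "integers convert to bool implicitly");

// The typed boolean rejects the same argument at compile time.
static_assert(!std::is_convertible<timestamp_type, is_first_page>::value,
              "integers do not convert to the typed boolean");
```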
Benny Halevy
dafbd88349 query: initialize read_command timestamp to now
This was initialized to api::missing_timestamp but
should be set to either a client provided-timestamp or
the server's.

Unlike write operations, this timestamp need not be unique
as the one generated by client_state::get_timestamp.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20200108074021.282339-2-bhalevy@scylladb.com>
2020-01-08 10:19:07 +02:00
Konstantin Osipov
191acec7ab schema: rename column_mask to column_set
Since it contains a precise set of columns, it's more
accurate to call it a set, not a mask. Besides, the name
column_mask is already used for column options on storage
level.
2019-11-13 11:41:30 +03:00
Vladimir Davydov
e0b31dd273 query: add flag to return static row on partition with no rows
A SELECT statement that has clustering key restrictions isn't supposed
to return static content if no regular rows match the restrictions,
see #589. However, for the CAS statement we do need to return static
content on failure so this patch adds a flag that allows the caller to
override this behavior.
2019-10-28 21:50:44 +03:00
Konstantin Osipov
c0f0ab5edd lwt: introduce column mask
Introduce a bitset container which can be used to compute
all columns used in a query.

Add a partition_slice constructor which uses the bitset.
2019-10-16 22:40:55 +03:00
Botond Dénes
87973498a1 query: refactor trim_clustering_row_ranges_to()
Allow expressing `pos` in terms of a `position_in_partition_view`, which
allows finer control of the exact position, allowing specifying position
before, at or after a certain key.
The previous overload is kept for backward compatibility, invoking the
new overload behind the curtains.
2019-08-13 09:47:55 +03:00
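The idea behind trimming ranges to a position (181bf64858, refactored in 87973498a1) can be sketched over plain integer keys. This is only an illustration under simplifying assumptions: keys are `int`, ranges are inclusive, sorted and non-overlapping, and only "before" and "after" a key are modelled. `trim_ranges_to`, `position`, `weight` and `int_range` are hypothetical names, not the real signatures.

```cpp
#include <algorithm>
#include <vector>

// Where the position sits relative to its key (sketch: two cases only).
enum class weight { before_key, after_key };

struct position {
    int key;
    weight w;
};

// Inclusive integer range, assumed to satisfy start <= end.
struct int_range {
    int start;
    int end;
};

// Keeps only the parts of `ranges` that lie after `pos`.
// Assumes `ranges` is sorted and non-overlapping, and pos.key < INT_MAX.
inline void trim_ranges_to(std::vector<int_range>& ranges, position pos) {
    // First key that is still wanted after trimming.
    const int first = pos.w == weight::before_key ? pos.key : pos.key + 1;

    // Drop ranges that end before the first wanted key.
    ranges.erase(std::remove_if(ranges.begin(), ranges.end(),
                                [first](const int_range& r) { return r.end < first; }),
                 ranges.end());

    // At most the first remaining range can straddle the position.
    if (!ranges.empty() && ranges.front().start < first) {
        ranges.front().start = first;
    }
}
```

Expressing the position as "key plus weight" rather than a bare key is what lets a caller resume either at a row or just past it.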
Botond Dénes
181bf64858 query: add trim_clustering_row_ranges_to()
This algorithm was already duplicated in two places
(service/pager/query_pagers.cc and mutation_reader.cc). Soon it will be
used in a third place. Instead of triplicating, move it into a function
that everybody can use.
2019-02-08 16:30:17 +02:00
Duarte Nunes
fa2b0384d2 Replace std::experimental types with C++17 std version.
Replace stdx::optional and stdx::string_view with the C++ std
counterparts.

Some instances of boost::variant were also replaced with std::variant,
namely those that called seastar::visit.

Scylla now requires GCC 8 to compile.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20190108111141.5369-1-duarte@scylladb.com>
2019-01-08 13:16:36 +02:00
Paweł Dziepak
9024187222 partition_slice: use small_vector for column_ids
2018-12-06 14:21:04 +00:00
Botond Dénes
77b758707c query::partition_slice: add clear_ranges() method
Allows for clearing any custom partition ranges, effectively resetting
them to the default ones. Useful for code that needs to set several
different specific partition ranges, one after the other, but doesn't
want to remember the last key it set a range for to be able to clear the
previous range with `clear_range()`.
2018-12-04 08:51:05 +02:00
Avi Kivity
b835b93ee6 db: add query option to bypass cache
With the option enabled, we bypass the cache unconditionally and only
read from memtables+sstables. This is useful for analytics queries.
2018-11-25 16:26:08 +02:00
Nadav Har'El
fa284f6307 Add query UUID to read command
This patch adds the parameter to read_command which is needed for
caching of readers during multiple pages of a paged query, which
we will introduce in the next patches.

The query_uuid is a UUID of a previously saved reader, which
the replica is now asked to recall and resume (if this saved reader is
no longer in the cache, it is fine, a new reader will be started).

Additionally a helper flag is_first_page is added so that the replica
can avoid doing any cache lookups (and incrementing miss counters) for
the first page.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2018-03-13 10:34:34 +02:00
Duarte Nunes
4ea2f52ddb query::partition_slice: Add option to specify when digest is requested
Having this option enables us to communicate from the upper to the
lower layers whether a digest was requested, so that we can pre-calculate
and cache a cell's hash in the readers that have access to the actual
in-memory cells (within the memtable and the row cache).

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2018-02-01 01:02:50 +00:00
Tomasz Grabiec
90796893ee query: Introduce is_single_row()
2017-11-02 11:05:19 +01:00
Avi Kivity
d6cd44a725 Revert "Merge 'Single key sstable reader optimization' from Botond"
This reverts commit 5e9cd128ad, reversing
changes made to 1f4e6759a7. Tomek found
some serious issues.
2017-10-19 12:47:21 +03:00
Botond Dénes
6cdeca1846 Add selects_only_full_rows() and selects_only_full_rows_with_atomic_columns()
2017-10-18 17:24:03 +03:00
Duarte Nunes
baeec0935f Replace query::full_slice with schema::full_slice()
query::full_slice doesn't select any regular or static columns, which
is at odds with the expectations of its users. This patch replaces it
with the schema::full_slice() version.

Refs #2885

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1507732800-9448-2-git-send-email-duarte@scylladb.com>
2017-10-17 11:25:53 +02:00
Tomasz Grabiec
0073df30aa query: Introduce full_clustering_range
2017-02-23 18:50:53 +01:00
Duarte Nunes
21d1bbb527 view: Add may_be_affected_by function
This patch adds the may_be_affected_by() function to the view class,
which is responsible for determining whether an update to a base class
affects one of its views.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-02-06 13:35:30 +01:00
Asias He
e5485f3ea6 Get rid of query::partition_range
Use dht::partition_range instead
2016-12-19 08:09:25 +08:00
Paweł Dziepak
da7ca85040 query: allow short reads
When paging is used the cluster is allowed to return fewer rows than the
client asked for. However, if this possibility is used we need a way of
telling that to the coordinator and the paging implementation so that
they can differentiate between short reads caused by the replica running
out of data to send and short reads caused by any other means.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-12-14 14:10:01 +00:00
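The distinction drawn in da7ca85040 can be sketched as a flag carried with each page. The names `page_result` and `is_exhausted` are hypothetical and only illustrate the reasoning; they do not reproduce the actual result or pager types.

```cpp
#include <cstdint>

// Hypothetical summary of one page returned by a replica.
struct page_result {
    std::uint32_t row_count;  // rows actually returned
    bool short_read;          // true if the replica stopped early on purpose
};

// A page with fewer rows than requested means "no more data" only when
// the replica did not mark it as a deliberate short read.
inline bool is_exhausted(const page_result& r, std::uint32_t requested_rows) {
    return r.row_count < requested_rows && !r.short_read;
}
```

Without the flag, a pager could not tell a deliberately shortened page from the end of the data and might stop fetching too early.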
Avi Kivity
a35136533d Convert ring_position and token ranges to be nonwrapping
Wrapping ranges are a pain, so we are moving wrap handling to the edges.

Since cql can't generate wrapping ranges, this means thrift and the ring
maintenance code; also range->ring transformations need to merge the first
and last ranges.

Message-Id: <1478105905-31613-1-git-send-email-avi@scylladb.com>
2016-11-02 21:04:11 +02:00
Raphael S. Carvalho
768aced741 partition_slice: introduce key-independent function to get ranges
That will be important for sstable code that will rule out a sstable
if it doesn't cover a given clustering key range.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
2016-09-02 10:50:56 -03:00
Piotr Jastrzebski
b05b90b3a5 Introduce clustering_key_filter_ranges.
This fixes the problem of multiple concurrent get_ranges calls.
Previously each call was invalidating the result of the previous
call. Now they don't step on each other's toes.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2016-08-30 19:46:38 +02:00
Duarte Nunes
5161ea283f query: query::clustering_range can't wrap around
This patch changes the type of query::clustering_range to express that
ranges that wrap around are not allowed, and ranges that have the
start bound after the end bound are considered empty.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-08-15 14:50:20 +00:00
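The semantics stated in 5161ea283f (no wrap-around, and a start bound after the end bound means an empty range) can be illustrated with a toy range over integers. `nonwrapping_int_range` is a hypothetical type for this sketch with inclusive bounds; it is not the real range template.

```cpp
// Toy non-wrapping range over ints with inclusive bounds.
struct nonwrapping_int_range {
    int start;
    int end;

    // A start bound after the end bound selects nothing; it does not
    // wrap around to cover "everything except the middle".
    bool empty() const { return start > end; }

    bool contains(int v) const { return start <= v && v <= end; }
};
```

With a wrapping range type, `{5, 2}` would instead select every value at or above 5 and every value at or below 2, which is the interpretation this change rules out.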
Duarte Nunes
b0c5996580 read_command: Add comment explaining partition_limit
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-08-02 21:17:06 +00:00
Vlad Zolotarov
c1bb4d147d query::read_command: std::move() std::experimental::optional when initializing trace_info
Signed-off-by: Vlad Zolotarov <vladz@cloudius-systems.com>
2016-07-19 18:21:58 +03:00
Duarte Nunes
aaa76d58ba query: Move to_partition_range to dht namespace
This patch moves to_partition_range, from the query namespace
to the dht namespace, where it is a more natural fit.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1468498060-19251-1-git-send-email-duarte@scylladb.com>
2016-07-15 10:41:52 +02:00
Duarte Nunes
21d0a2c764 query: Optionally send cell ttl
This patch adds support to send a cell's ttl as part of a query's
result. This is needed for thrift support.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-07-14 15:36:23 +02:00
Duarte Nunes
f013425bb5 query: Ensure timestamp is last param in read_command
Since the timestamp is not serialized, it must always be the last
parameter of query::read_command. This patch reorders it with the
partition_limit parameters and updates callers that specified a
timestamp argument.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1468312334-10623-1-git-send-email-duarte@scylladb.com>
2016-07-12 10:41:54 +01:00
Avi Kivity
28fab55e6e Merge "Convert sstable writes to streamed mutations" from Paweł
"This series converts sstable writers (including compaction) to streamed
mutations and makes them use consumer-style interface.

Code related to sstable writes and compaction is converted to consumers
that can be used with consume_flattened_in_thread() (which is a variant
of consume_flattened() intended to be run inside a thread).
compact_for_query is improved so that it can be reused by sstable
compaction."
2016-07-04 15:07:47 +03:00
Paweł Dziepak
3c08ffb275 query: add full_slice
query::full_slice is a partition slice which has full clustering row
ranges for all partition keys and no per-partition row limit.
Options and columns are not set.

It is used as a helper object in cases when a reference to
partition_slice is needed but the user code needs just all data there is
(an example of such case would be sstable compaction).

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-30 11:37:54 +01:00
Duarte Nunes
0ae6eafadd query: Make partition_limit last parameter
The partition_limit should have been added to the end of the ctor
argument list, as its current placement causes some callers to pass it
the timestamp instead of the limit.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1467239360-6853-3-git-send-email-duarte@scylladb.com>
2016-06-30 12:31:11 +02:00
Duarte Nunes
69798df95e query: Limit number of partitions returned
This is required to implement a thrift verb.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-06-22 09:48:13 +02:00