Commit Graph

80 Commits

Author SHA1 Message Date
Avi Kivity
88322086cb Merge "Add fuzzer-type unit test for range scans" from Botond
"
This series adds a fuzzer-type unit test for range scans, which
generates a semi-random dataset and executes semi-random range scans
against it, validating the result.
This test aims to cover a wide range of corner cases with the help of
randomness. Data and queries against it are generated in such a way that
various corner cases and their combinations are likely to be covered.

The infrastructure underlying range scans has undergone massive changes in
the last year, growing in complexity and scope. The correctness of range
scans is critical for the correct functioning of any Scylla cluster, and
while the current unit tests served well in detecting any major problems
(mostly while developing), they are too simplistic and can only be
relied on to check the correctness of the basic functionality. This test
aims to extend coverage drastically, testing cases that the authors of
the range-scan code and of the existing unit tests didn't even think
existed, by relying on some randomness.

Fixes: #3954 (deprecates really)
"

* 'more-extensive-range-scan-unit-tests/v2' of https://github.com/denesb/scylla:
  tests/multishard_mutation_query_test: add fuzzy test
  tests/multishard_mutation_query_test: refactor read_all_partitions_with_paged_scan()
  tests/test_table: add advanced `create_test_table()` overload
  tests/test_table: make `create_test_table()` customizable
  query: add trim_clustering_row_ranges_to()
  tests/test_table: add keyspace and table name params
  tests/test_table: s/create_test_cf/create_test_table/
  tests: move create_test_cf() to tests/test_table.{hh,cc}
  tests/multishard_mutation_query_test: drop many partition test
  tests/multishard_mutation_query_test: drop range tombstone test
2019-02-27 17:26:53 +02:00
Piotr Sarna
c743617236 cql3: unify max value for row limit and per-partition limit
Limits are stored as uint32_t everywhere, but in some places
int32_t was used, which created inconsistencies when comparing
the value to std::numeric_limits<Type>::max().
In order to solve inconsistencies, the types are unified to uint32_t,
and instead of explicitly calling numeric limit max,
an already existing constant value query::max_rows is utilized.

Fixes #4253

Message-Id: <4234712ff61a0391821acaba63455a34844e489b.1550683120.git.sarna@scylladb.com>
2019-02-21 13:56:02 +02:00
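The type mismatch described in c743617236 can be sketched as follows. This is a minimal illustration, not the actual patch: the `sketch` namespace and both helper functions are hypothetical, and `sketch::max_rows` is only an assumed stand-in for the existing `query::max_rows` constant.

```cpp
#include <cstdint>
#include <limits>

namespace sketch {

// Assumed stand-in for the existing query::max_rows constant.
constexpr uint32_t max_rows = std::numeric_limits<uint32_t>::max();

// Before: the limit is held as int32_t, so its largest value is INT32_MAX,
// which can never equal the uint32_t "no limit" sentinel.
constexpr bool is_unlimited_mixed(int32_t limit) {
    return static_cast<int64_t>(limit) == static_cast<int64_t>(max_rows);
}

// After: one type and one named constant on both sides of the comparison.
constexpr bool is_unlimited(uint32_t limit) {
    return limit == max_rows;
}

} // namespace sketch
```

With mixed types, a limit set to `std::numeric_limits<int32_t>::max()` is not recognized as "no limit"; with a single type and a single constant, the check is unambiguous.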
Piotr Sarna
acf7bedad4 idl,service: add persistent last partition row count
In order to process paged queries with per-partition limits properly,
paging state needs to keep additional information: what was the row
count of last partition returned in previous run.
That's necessary because the end of previous page and the beginning
of current one might consist of rows with the same partition key
and we need to be able to trim the results to the number indicated
by per-partition limit.
2019-02-18 11:06:44 +01:00
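The trimming that acf7bedad4 makes possible could look roughly like this. It is a sketch under stated assumptions: `paging_state_rows_sketch` and `rows_allowed()` are hypothetical names, and partition keys are modelled as plain strings rather than Scylla's key types.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical, simplified paging state: remembers which partition the
// previous page ended in and how many of its rows were already returned.
struct paging_state_rows_sketch {
    std::string last_partition_key;
    uint32_t last_partition_row_count = 0;
};

// How many rows of partition `pk` may still be returned on the current page.
inline uint32_t rows_allowed(const paging_state_rows_sketch& st,
                             const std::string& pk,
                             uint32_t per_partition_limit) {
    if (pk != st.last_partition_key) {
        return per_partition_limit;  // new partition: full budget
    }
    // Same partition as the end of the previous page: subtract what was
    // already returned, never going below zero.
    uint32_t used = std::min(per_partition_limit, st.last_partition_row_count);
    return per_partition_limit - used;
}
```

For example, with a per-partition limit of 3 and two rows of `pk1` already returned on the previous page, only one more row of `pk1` may appear on the current page.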
Piotr Sarna
1dadae212a cql3: add checking for previous partition count to filtering
Filtering now needs to take into account per partition limits as well,
and for that it's essential to be able to compare partition keys
and decide which rows should be dropped - if previous page(s) contained
rows with the same partition key, these need to be taken into
consideration too.
2019-02-18 11:06:43 +01:00
Piotr Sarna
82a3883575 pager: add adjusting per-partition row limit
For filtering pagers, per partition limit should be set
to page size every time a query is executed, because some rows
may potentially get dropped from results.
2019-02-18 10:55:52 +01:00
Piotr Sarna
b965c3778f cql3: obey per partition limit for filtering
Filtering queries now take into account the limit of rows
per single partition provided by the user.
2019-02-18 10:29:34 +01:00
Botond Dénes
181bf64858 query: add trim_clustering_row_ranges_to()
This algorithm was already duplicated in two places
(service/pager/query_pagers.cc and mutation_reader.cc). Soon it will be
used in a third place. Instead of triplicating, move it into a function
that everybody can use.
2019-02-08 16:30:17 +02:00
Piotr Sarna
87c23372fb cql3: fix filtering with LIMIT with regard to paging
Previously the limit was erroneously applied per page
instead of being accumulated, which might have caused returning
too many rows. As of now, LIMIT is handled properly inside
restrictions filter.

Fixes #4100
2019-01-17 13:25:09 +01:00
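The difference between applying LIMIT per page and accumulating it across pages, as described in 87c23372fb, can be sketched as below. `limit_tracker` is a hypothetical helper for illustration only; the actual fix lives inside the restrictions filter.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical helper: tracks how much of the user's LIMIT is still
// available across all pages of one query.
class limit_tracker {
    uint32_t _remaining;
public:
    explicit limit_tracker(uint32_t limit) : _remaining(limit) {}

    // Takes the number of rows that survived filtering on this page and
    // returns how many of them may actually be returned to the client.
    uint32_t accept_page(uint32_t rows_after_filtering) {
        uint32_t accepted = std::min(rows_after_filtering, _remaining);
        assert(accepted <= _remaining);
        _remaining -= accepted;
        return accepted;
    }

    bool exhausted() const { return _remaining == 0; }
};
```

With `LIMIT 5` and pages that each yield three matching rows, the tracker accepts 3 rows, then 2, then 0. Applying the limit independently to every page would have accepted 3 rows each time.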
Duarte Nunes
fa2b0384d2 Replace std::experimental types with C++17 std version.
Replace stdx::optional and stdx::string_view with the C++ std
counterparts.

Some instances of boost::variant were also replaced with std::variant,
namely those that called seastar::visit.

Scylla now requires GCC 8 to compile.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20190108111141.5369-1-duarte@scylladb.com>
2019-01-08 13:16:36 +02:00
Piotr Sarna
5b052bdae5 service/pager: use dropped_rows to adjust how many rows to read
Filtering pager may drop some rows and as a result return less
than what was fetched from the replica. To properly adjust how
many rows were actually read, dropped_rows variable is introduced.
2018-11-29 14:53:29 +01:00
Piotr Sarna
021caeddf7 service/pager: virtualize max_rows_to_fetch function
Regular pagers use max_rows to figure out how many rows to fetch,
but filtering pager potentially needs the whole page to be fetched
in order to filter the results.
2018-11-29 14:14:37 +01:00
Piotr Sarna
4f5ee3dfcd cql3: add counting dropped rows in filtering pager
Counter for dropped rows is added to the filtering pager.
This metric can be used later to implement applying LIMIT
to filtering queries properly.
Dropped rows are returned on visitor::accept_partition_end.
2018-11-29 14:06:59 +01:00
Avi Kivity
775b7e41f4 Update seastar submodule
* seastar d59fcef...b924495 (2):
  > build: Fix protobuf generation rules
  > Merge "Restructure files" from Jesse

Includes fixup patch from Jesse:

"
Update Seastar `#include`s to reflect restructure

All Seastar header files are now prefixed with "seastar" and the
configure script reflects the new locations of files.

Signed-off-by: Jesse Haber-Kucharsky <jhaberku@scylladb.com>
Message-Id: <5d22d964a7735696fb6bb7606ed88f35dde31413.1542731639.git.jhaberku@scylladb.com>
"
2018-11-21 00:01:44 +02:00
Piotr Sarna
b3685342a6 service/pager: avoid dereferencing null partition key
The pager::state() function returns a valid paging object even
if the pager itself is exhausted. It may also not contain the partition
key, so using it unconditionally was a bug - now, in case there is no
partition key present, paging state will contain an empty partition key.

Fixes #3829

Message-Id: <28401eb21ab8f12645c0a33d9e92ada9de83e96b.1539074813.git.sarna@scylladb.com>
2018-10-09 12:13:52 +03:00
Piotr Sarna
b6d90b2869 pager: make state() defined for exhausted pagers
If service::pager is exhausted, the state() function used to return
a nullptr instead of a pointer to a valid paging state, and the
documented return value in this case was 'unspecified'.
Sometimes a paging state may be needed anyway, even if the pager
is already exhausted; thus, the state() return value becomes defined
after this commit. Exhausted pagers will return a pointer to a valid
state object with the _remaining field set to 0.
2018-09-27 15:29:28 +02:00
Piotr Sarna
336cc70438 pager: add setters for partition/clustering keys 2018-09-27 15:18:06 +02:00
Piotr Sarna
1d34ef38a8 cql3: make pagers use time_point instead of duration
A standard way for passing a timeout parameter is specifying
a time_point, while pagers used to take a duration in order
to compute time points on the fly. This patch adds a timeout
parameter, which is a time_point, to fetch_page().
2018-09-27 15:18:06 +02:00
Paweł Dziepak
a3746d3b05 paging: make may_need_paging() more conservative
There is a bad interaction between may_need_paging() and query result
size limiter. The former is trying to avoid the complexity of paged
queries when the number of returned rows is going to be smaller than the
page size. The latter uses the fact that paged queries need not return
all requested rows to limit the size of query results. Since
may_need_paging() may turn a paged query into a non-paged one, as a side
effect it disables the oversized result protection.

This patch limits the cases when may_need_paging() disables paging to
the situations when we know for sure that query result size limiter
won't be needed, i.e.: the result is not going to contain more than one
row. If the client knows for sure that the paging is not needed and
the performance impact is worthwhile it can disable paging on its side.
Otherwise, let's default to the safer behaviour.

Fixes #3620.

Message-Id: <20180925134431.24329-1-pdziepak@scylladb.com>
2018-09-25 17:01:04 +03:00
Botond Dénes
cd49c23a66 query_pagers: generate query_uuid for range-scans as well
And thus enable stateful range scans.
2018-09-03 10:31:44 +03:00
Piotr Sarna
8c18aaa511 cql3: pass query options to restrictions filter
Query options may contain bound values needed for checking filtering
restrictions. Previously, empty query_options{} were used, which
caused prepared statements to fail.

Fixes #3677
2018-08-09 17:44:45 +02:00
Paweł Dziepak
757d9e3b5d query_pager: avoid visiting result_view if not needed
query::result_visitor provides get_last_partition_and_clustering_key()
which allows getting those without iterating through the whole result.
Moreover, the row count may be precomputed in the result; if it isn't, there
is query::result_view::count_partitions_and_rows() for getting it.
2018-07-26 12:14:48 +01:00
Piotr Sarna
03f2f8633b cql3: add updating ALLOW FILTERING metrics
Metrics related to ALLOW FILTERING queries are now properly
updated on read requests.
2018-07-06 12:00:29 +02:00
Duarte Nunes
c126b00793 Merge 'ALLOW FILTERING support' from Piotr
"
The main idea of this series is to provide a filtering_visitor
as a specialised result_set_builder::visitor implementation
that keeps restriction info and applies it on query results.
Also, since allow_filtering checking is not correct now (e.g. #2025)
on select_statement level, this series tries to fix any issues
related to it.

Still in TODO:
 * handling CONTAINS relation in single column restriction filtering
 * handling multi-column restrictions - especially EQ, which can be
   split into multiple single-column restrictions
 * more tests - it's never enough; especially esoteric cases
   like filtering queries which also use secondary indexes,
   paging tests, etc.

Tests: unit (release)
"

* 'allow_filtering_6' of https://github.com/psarna/scylla:
  tests: add allow_filtering tests to cql_query_test
  cql3: enable ALLOW FILTERING
  service: add filtering_pager
  cql3: optimize filtering partition keys and static rows
  cql3: add filtering visitor
  cql3: move result_set_builder functions to header
  cql3: amend need_filtering()
  cql3: add single column primary key restrictions getters
  cql3: expose single column primary key restrictions
  cql3: add needs_filtering to primary key restrictions
  cql3: add simpler single_column_restriction::is_satisfied_by
2018-07-05 10:18:08 +01:00
Piotr Sarna
7b018f6fd6 service: add filtering_pager
For paged results of an 'ALLOW FILTERING' query, a filtering pager
is provided. It's based on a filtering_visitor for result_builder.
2018-07-05 10:50:43 +02:00
Botond Dénes
8084ce3a8e query_pager: use query::is_single_partition() to check for singular range
Use query::is_single_partition() to check whether the queried ranges are
singular or not. The current method of using
`dht::partition_range::is_singular()` is incorrect, as it is possible to
build a singular range that doesn't represent a single partition.
`query::is_single_partition()` correctly checks for this so use it
instead.

Found during code-review.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <f671f107e8069910a2f84b14c8d22638333d571c.1530675889.git.bdenes@scylladb.com>
2018-07-04 10:04:50 +01:00
Botond Dénes
59a30f0684 query_pager: be prepared to _ranges being empty
do_fetch_page() checks in the beginning whether there is a saved query
state already, meaning this is not the first page. If there is not, it
checks whether the query is for a singular partition or a range scan
to decide whether to enable stateful queries or not. This check
assumed that there is at least one range in _ranges, which will not hold
under some circumstances. Add a check for _ranges being empty.

Fixes: #3564
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <cbe64473f8013967a93ef7b2104c7ca0507afac9.1530610709.git.bdenes@scylladb.com>
2018-07-03 11:05:01 +01:00
Vladimir Krivopalov
82f76b0947 Use std::reference_wrapper instead of a plain reference in bound_view.
The presence of a plain reference prohibits the bound_view class from
being copyable. The trick employed to work around that was to use
'placement new' for copy-assigning bound_view objects, but this approach
is ill-formed and causes undefined behaviour for classes that have const
and/or reference members.

The solution is to use a std::reference_wrapper instead.

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
Message-Id: <a0c951649c7aef2f66612fc006c44f8a33713931.1530113273.git.vladimir@scylladb.com>
2018-06-28 11:24:06 +01:00
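The effect described in 82f76b0947 can be reproduced with two small types. These are hypothetical stand-ins, not the real `bound_view`: a plain reference member implicitly deletes copy assignment, while a `std::reference_wrapper` member keeps the type assignable and rebinds on assignment.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <type_traits>

// A plain reference member: the implicit copy assignment operator is deleted.
struct view_with_ref {
    const std::string& prefix;
};
static_assert(!std::is_copy_assignable<view_with_ref>::value,
              "reference members disable copy assignment");

// A std::reference_wrapper member: copy assignment rebinds the wrapper.
struct view_with_wrapper {
    std::reference_wrapper<const std::string> prefix;
    const std::string& get() const { return prefix.get(); }
};
static_assert(std::is_copy_assignable<view_with_wrapper>::value,
              "reference_wrapper members keep the type assignable");
```

Assigning one `view_with_wrapper` to another makes the target refer to the same string as the source, with no need for the placement-new workaround.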
Paweł Dziepak
1cf3cb285f pager: add fetch_page_generator()
fetch_page_generator() is an equivalent of fetch_page(), but instead of
building a cql3::result_set it returns a cql3::result_generator().
2018-06-25 09:21:47 +01:00
Paweł Dziepak
f6fe831d49 pager: make the visitor handle_result() accept a template parameter 2018-06-25 09:21:47 +01:00
Paweł Dziepak
fc87ca5926 pager: make query_result_visitor base class a template parameter
So far query_result_visitor was tied to result_set_builder. The goal is
to enable result_generator to work with paged queries as well so we need
to decouple them.
2018-06-25 09:21:47 +01:00
Paweł Dziepak
dc9a65ea76 pager: make myvisitor a member class of query_pager
It is going to become a class template.
2018-06-25 09:21:47 +01:00
Paweł Dziepak
319b2cde7e pager: make shared pointers to selection constant
Shared pointers make code harder to reason about, it is not easy to get
rid of them in this piece of the code, but we can restore at least a bit
of sanity by adding consts.
2018-06-25 09:21:47 +01:00
Paweł Dziepak
327d3de51e pager: merge query_pager and query_pagers::impl
There is just a single implementation of query_pager and there is no
reason to make anything virtual. Devirtualising this code will allow
higher layers to pass visitors via templates.
2018-06-25 09:21:47 +01:00
Duarte Nunes
a23bda3393 Merge 'Implement separate timeout for range queries' from Avi
"
This patchset implements separate timeouts for range queries, and lays
the foundations for separate timeouts for other query types.

While the feature in itself is worthy, the real motivation is to have
the timeouts decided by the caller, instead of storage_proxy. This in
turn is required to disentangle each layer behaving differently
depending on whether the query is internal or not; instead, the goal
is to have each caller declare its needs in terms of consistency level
and timeouts, and have the lower layers implement its requirements
instead of making their own decisions.

Fixes #3013.

Tests: unit (release)
"

* tag '3013/v1.1' of https://github.com/avikivity/scylla:
  storage_proxy: remove default_query_timeout()
  storage_proxy: don't use default timeouts
  query_options: augment with timeout_config
  thrift: configure thrift transport and handler with a timeout_config
  transport: configure native transport with a timeout_config
  cql3: define and populate timeout_config_selector
  timeout_config: introduce timeout configuration
2018-05-13 20:05:50 +02:00
Botond Dénes
ddd70dc113 Use dht::token_range alias for last/preferred replicas
Use the pre-existing type alias instead of fully spelling out the type
everywhere.
2018-05-10 06:22:39 +03:00
Avi Kivity
d8dd7e05a7 storage_proxy: don't use default timeouts
Require all callers to supply timeouts instead of relying on defaults.

Since all callers now have the timeouts set up, they can easily supply
them.
2018-04-30 13:19:53 +03:00
Botond Dénes
eee9bda85b Make the read-repair decision only once
Make the read-repair decision on the first page of a paged-query and use
it for all the remaining pages. This helps querier-cache hit-rates as
reads to nodes will be sent consistently throughout the query.
2018-03-19 16:29:43 +02:00
Botond Dénes
2e2abf6edb storage_proxy: add coordinator_query_options and coordinator_query_result
As yet more parameters and return-values are about to be added to all
storage_proxy::query_* methods we need a way that scales better than
changing the signatures every time. To this end we aggregate all
non-mandatory query parameters into `coordinator_query_options` and all
return values into `coordinator_query_result`.
This way new fields can be simply added to the respective structs while
the signatures of the methods themselves and their client code can
remain unchanged.
2018-03-19 15:17:35 +02:00
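The parameter-object idea behind 2e2abf6edb can be sketched as follows. Only the two struct names come from the commit message; the `_sketch` suffixes, the fields, and `query_sketch()` are assumptions made for illustration and do not mirror the real storage_proxy signatures.

```cpp
#include <cassert>
#include <chrono>
#include <string>
#include <vector>

using replica_list = std::vector<std::string>;

// All optional inputs live in one struct; new fields can be added here
// without changing the signature of query_sketch().
struct coordinator_query_options_sketch {
    std::chrono::milliseconds timeout{1000};
    replica_list preferred_replicas;
};

// All outputs live in one struct for the same reason.
struct coordinator_query_result_sketch {
    std::vector<std::string> rows;
    replica_list last_replicas;
};

// Hypothetical query entry point: a mandatory command plus aggregated options.
inline coordinator_query_result_sketch query_sketch(
        const std::string& command,
        coordinator_query_options_sketch options = {}) {
    coordinator_query_result_sketch result;
    result.rows.push_back("row for " + command);
    result.last_replicas = options.preferred_replicas;
    return result;
}
```

Callers that do not care about a newly added option keep compiling unchanged, because the default-constructed options struct supplies its default value.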
Botond Dénes
b55dcc2ce5 Add query_read_repair_decision to paging-state
This new field will store the repair-decision made on the first page of
the query. This decision will be sticky to all pages of the query.
In mixed clusters the decision might not happen on the first page and it
might even change during the query as old coordinators will not store
nor respect the decision.
2018-03-19 15:17:31 +02:00
Botond Dénes
f1171803b5 Use the last_replicas stored in the page_state
Pass the last_replicas from the page_state as the preferred_replicas
for query() and save the returned last_replicas as the last_replicas
field of the next page_state. The circle is now complete. The first page
of any query will pass an empty list as the preferred replicas (having
no previous paging_state) so the replicas will be selected according to
the load-balancing strategy. Any subsequent page will use the last
replicas from the last page as the preferred ones for the current one.
Thus if all goes well all pages of a query will hit the same replicas.
2018-03-13 10:34:34 +02:00
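The feedback loop described in f1171803b5 can be sketched as below. All names are hypothetical and replicas are modelled as strings; the real code passes these values through query() and the paging state. The sketch assumes the preferred replicas are always available, which the real coordinator does not guarantee.

```cpp
#include <cassert>
#include <string>
#include <vector>

using replica_ids = std::vector<std::string>;

// Simplified page state: only the replicas that served the page.
struct page_state_sketch {
    replica_ids last_replicas;
};

// Stand-in for replica selection: honour the hint when there is one,
// otherwise fall back to what the load balancer would pick.
inline replica_ids choose_replicas(const replica_ids& preferred,
                                   const replica_ids& load_balanced) {
    return preferred.empty() ? load_balanced : preferred;
}

// Fetch one page. `previous` is null for the first page of a query.
inline page_state_sketch fetch_page_sketch(const page_state_sketch* previous,
                                           const replica_ids& load_balanced) {
    replica_ids preferred = previous ? previous->last_replicas : replica_ids{};
    return page_state_sketch{choose_replicas(preferred, load_balanced)};
}
```

The first page has no previous state, so it uses the load-balanced choice; every later page reuses the replicas recorded by the page before it, even if the load balancer would now pick different ones.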
Botond Dénes
eac597d726 Add preferred and last replicas to the signature of query()
preferred_replicas are added to the parameters and last_replicas are
added to the return type. The preferred replicas will be used as a hint
for the selection of the replicas to send the read requests to. The last
replicas (returned) are the replicas actually selected for the read.
This will allow queries to consistently hit the same replicas for each
page thus reusing readers created on these replicas.
For convenience a query() overload is provided that doesn't take or
return the preferred and last replicas.

This patch only adds the parameters and propagates them down to
query_singular() and query_partition_key_range(). The code to actually
use these preferred-replicas will be added in later patches.
The reason for separating this is to reduce noise and improve
reviewability for those functional changes later.
2018-03-13 10:34:34 +02:00
Botond Dénes
f281b3e923 Add last_replicas to paging_state
Helps paged queries consistently hit the same replicas for each
subsequent page. Replicas that already served a page will keep the
readers used for filling it around in a cache. Subsequent page request
hitting the same replicas can reuse these readers to fill the pages
avoiding the work of creating these readers from scratch on every page.
In a mixed cluster older coordinators will ignore this value.
The value of last_replicas may change between pages as nodes may become
available/unavailable or the coordinator may decide to send the read
requests to different replicas at its discretion.
Replicas are identified by an opaque uuid which should only make sense
to the storage-proxy.
2018-03-13 10:34:34 +02:00
Nadav Har'El
fa284f6307 Add query UUID to read command
This patch adds the query_uuid parameter to read_command, which is needed
for caching of readers across multiple pages of a paged query, which
we will introduce in the next patches.

The query_uuid is a UUID of a previously saved reader, which
the replica is now asked to recall and resume (if this saved reader is
no longer in the cache, it is fine, a new reader will be started).

Additionally a helper flag is_first_page is added so that the replica
can avoid doing any cache lookups (and incrementing miss counters) for
the first page.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2018-03-13 10:34:34 +02:00
Nadav Har'El
ec7c56d18a Add query UUID to paging state
This patch adds to the "paging_state", the opaque cookie that clients are
supposed to provide when asking for the next page on a paged query, a
unique id field. This new field will be used to tell that a new request
for a page really continues the previous page, and doesn't just by chance
start at the same position the previous page stopped.

We need to support setups with mixed versions - a client may get a paging
state from a coordinator running a new version of Scylla and send it to
a different coordinator running an old version - or vice versa. So the new
uuid field is set up to have a default uuid of UUID() (a recognizable
invalid uuid 0), so new versions receiving no uuid from an old version will
set this invalid uuid, and old versions receiving a uuid from a new version
will simply ignore it.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2018-03-13 10:34:34 +02:00
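The mixed-version behaviour described in ec7c56d18a relies on a recognizable "null" default. A minimal sketch, assuming a two-word UUID representation and hypothetical type and function names:

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical UUID: a default-constructed value is the invalid uuid 0.
struct uuid_sketch {
    uint64_t msb = 0;
    uint64_t lsb = 0;
    bool is_null() const { return msb == 0 && lsb == 0; }
    bool operator==(const uuid_sketch& o) const {
        return msb == o.msb && lsb == o.lsb;
    }
};

// Hypothetical paging state: a state produced by an old version carries no
// uuid, so the field keeps its null default when deserialized.
struct paging_state_with_uuid {
    uint32_t remaining = 0;
    uuid_sketch query_uuid;
};

// A page request continues a saved query only if it carries a real uuid
// that matches the saved one; matching positions alone are not enough.
inline bool continues_saved_query(const paging_state_with_uuid& st,
                                  const uuid_sketch& saved) {
    return !st.query_uuid.is_null() && st.query_uuid == saved;
}
```

A paging state coming from an old coordinator therefore never matches a saved query, and the request is simply treated as a fresh read.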
Duarte Nunes
9254a9a6fe db/system_keyspace: Move dependency on db/schema_tables to source file
And add missing dependencies to header file.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20180307111304.2914-1-duarte@scylladb.com>
2018-03-07 14:45:36 +02:00
Avi Kivity
ebaeefa02b Merge seastar upstream (seastar namespace)
 - introduced "seastarx.hh" header, which does a "using namespace seastar";
 - 'net' namespace conflicts with seastar::net, renamed to 'netw'.
 - 'transport' namespace conflicts with seastar::transport, renamed to
   cql_transport.
 - "logger" global variables now conflict with logger global type, renamed
   to xlogger.
 - other minor changes
2017-05-21 12:26:15 +03:00
Calle Wilund
2049303399 query_pagers: bugfix: must count pk only/pk + static rows as 1
Previously only counted clustered/regular

Message-Id: <1494249013-4069-1-git-send-email-calle@scylladb.com>
2017-05-08 16:35:27 +03:00
Tomasz Grabiec
bc6486b304 Use gc_clock instead of db_clock where possible
Some code paths were obtaining db_clock timestamp to only convert it
to gc_clock later. Avoid this. In the future we could make gc_clock
cheaper because it has low precision.

Message-Id: <1482401190-2035-1-git-send-email-tgrabiec@scylladb.com>
2016-12-22 13:27:55 +02:00
Duarte Nunes
d7e607ff51 query_pagers: Fix over-counting of rows
This patch fixes a regression introduced in 0518895, where we counted
one extra row per partition when it contained live, non static rows.

We also simplify the visitor logic further, since now we don't need to
count rows one by one. Also remove a bunch of unused fields.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <1482234083-2447-1-git-send-email-duarte@scylladb.com>
2016-12-20 11:58:37 +00:00
Asias He
937f28d2f1 Convert to use dht::partition_range_vector and dht::token_range_vector 2016-12-19 14:08:50 +08:00