Commit Graph

82 Commits

Author SHA1 Message Date
Botond Dénes
ae366868fb multishard_mutation_query: save_reader(): avoid round-trip for destroying rparts
Force its destruction when saving the reader.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20210514140844.119362-1-bdenes@scylladb.com>
2021-05-18 10:07:13 +03:00
Benny Halevy
cd0991f28d multishard_mutation_query: read_context::stop: properly close unregistered inactive_reads
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:35:07 +03:00
Benny Halevy
93d6dcdbcf multishard_mutation_query: read_context: stop: wait on unregistering inactive reads
Currently unregister_inactive_read for other shards is moved
to the background with nothing keep the respective
reader_concurrency_semaphore around.

This change runs the loop in parallel_for_each
so that we don't have to serially wait on all of them
but rather they can run in parallel on all shards, but
all are waited on via the returned future<>.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:35:07 +03:00
Benny Halevy
8421e1f61e mutlishard_mutation_query: read_context: close: unregister all inactive reads
Currently only if the reader_meta is in the saved state
we unregister its inactive_read, yet it is possible
that it will hold an inactive_read also in the lookup state.

To cover all cases, rather than testing the reader_state,
unregister if the inactive_read_handle is engaged.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:35:07 +03:00
Benny Halevy
53889ef9b0 multishard_mutation_query: read_page: close reader when done
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:35:07 +03:00
Benny Halevy
afa2fe0b76 multishard_mutation_query: read_page: make compaction_state first
To simplify error handling for always closing the reader
in this function.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:35:07 +03:00
Benny Halevy
e2a767bef7 multishard_mutation_query: page_consume_result: mark constructor noexcept
As it can't throw. This is needed to simplify the following
patch that will always close the reader in read_page.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:35:07 +03:00
Benny Halevy
2c1edb1a94 mutation_reader: reader_lifecycle_policy: return future from destroy_reader
So we can wait on it from to-be-introduced shard_reader::close().

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:35:07 +03:00
Botond Dénes
45471419d0 multishard_mutation_query: re-enable reverse queries
034cb81323 and 0f0c3be disallowed reverse partition-range scans based on
the observation that the CQL frontend disallows them, assuming that
other client APIs also disallow them. As it turns out this is not true
and there it at least one client API (Thrift) which does allows reverse
range scans. So re-enable them.

Fixes: #8211

Tests: unit(release), dtest(thrift_tests.py)
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20210304142249.164247-1-bdenes@scylladb.com>
2021-03-04 17:06:16 +02:00
Botond Dénes
0f0c3be63e multishard_mutation_query: query_mutations_on_all_shards(): refuse reverse queries
Refuse reverse queries just like in the new
`query_data_on_all_shards()`. The reason is the same, reverse range
scans are not supported on the client API level and hence they are
underspecified and more importantly: not tested.
2021-03-02 07:53:53 +02:00
Botond Dénes
034cb81323 multishard_mutation_query: add query_data_on_all_shards()
A data query variant of the existing `query_mutations_on_all_shards()`.
This variant builds a `query::result`, instead of `reconcilable_result`.
This is actually the result format coordinators want when executing
range scans, the reason for using the reconcilable result for these
queries is historic, and it just introduces an unnecessary intermediate
format.
This new method allows the storage proxy to skip this intermediate
format and the associated conversion to `query::result`, just like we do
for single partition queries.

Reverse queries are refused because they are not supported on the client
API (CQL) level anyway and hence it is unspecified how they should work
and more importantly: they are not tested.
2021-03-02 07:53:53 +02:00
Botond Dénes
f19ab5cff1 multishard_mutation_query: generalize query code w.r.t. the result builder used
We want to add support to building `query::result` directly and reuse
the code path we use to build reconcilable result currently for it.
So templatize said code path on the result builder used. Since the
different result builders don't have a source level compatible interface
an adaptor class is used.
2021-03-02 07:53:53 +02:00
Botond Dénes
bddb0d35d6 multishard_mutation_query: query_mutations_on_all_shards(): extract logic into new method
In the next patches we are going to generalize the query logic w.r.t.
the result builder used, so query_mutations_on_all_shards() will be just
a facade parametrizing the actual query code with the right result
builder.
2021-03-02 07:53:53 +02:00
Botond Dénes
b0b620b501 multishard_mutation_query: query_mutations_on_all_shards(): convert to coroutine
In preparation to generalizing it w.r.t. the result builder used.
This change will be much simpler with the coroutine code.
2021-03-02 07:53:53 +02:00
Botond Dénes
5d85615698 multishar_mutation_query: do_query_mutations(): convert to coroutine
In preparation to generalizing it w.r.t. the result builder used.
This change will be much simpler with the coroutine code.
2021-03-02 07:53:53 +02:00
Botond Dénes
8138bdb434 multishard_mutation_query: read_page(): convert to coroutine
In preparation to generalizing it w.r.t. the result builder used. This
change will be much simpler with the coroutine code.
2021-03-02 07:53:53 +02:00
Botond Dénes
29195f67f1 multishard_mutation_query: extract page reading logic into separate method
The block of code moved also coincides with the scope in which the
reader has to be alive, making the code more clear.
2021-03-02 07:53:53 +02:00
Benny Halevy
d565e3fb57 reader_lifecycle_policy: retire low level try_resume method
The caller can now just call sem.unregister_inactive_read(irh) directly.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-02-08 20:32:40 +02:00
Botond Dénes
226088d12e mutation_reader: reader_lifecycle_policy::stopped_reader: drop pending_next_partition flag
Its not used anymore.
2021-01-22 16:18:59 +02:00
Benny Halevy
29002e3b48 flat_mutation_reader: return future from next_partition
To allow it to asynchronously close underlying readers
on next_partition().

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-01-13 17:35:07 +02:00
Benny Halevy
ff931c2ecc multishard_mutation_query: read_context: save_reader: destroy reader_meta from the calling shard
The reader_meta in _readers[shard] is created on shard 0 and must
be destroyed on it as well.

A following patch changes next_partition() to return a future<>
thus it introduces a continuation that requires access to `rm`.

We cannot move it down to the conuation safely, since it will be
wrongly destroyed in the invoked shard, so use do_with to hold it
in the scope of the calling shard until the invoked function
completes.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-01-13 17:35:07 +02:00
Tomasz Grabiec
ba42e7fcc5 multishard_mutation_query: Propagate mutation_reader::forwarding flag
Otherwise all readers will be created with the default forwarding::yes.
This inhibits some optimizations (e.g. results in more sstable read-ahead).

It will also be problematic when we introduce mutation sources which don't support
forwarding::yes in the future.

Message-Id: <1604065206-3034-1-git-send-email-tgrabiec@scylladb.com>
2020-11-02 15:24:36 +02:00
Botond Dénes
ff623e70b3 reader_concurrency_semaphore: name permits
Require a schema and an operation name to be given to each permit when
created. The schema is of the table the read is executed against, and
the operation name, which is some name identifying the operation the
permit is part of. Ideally this should be different for each site the
permit is created at, to be able to discern not only different kind of
reads, but different code paths the read took.

As not all read can be associated with one schema, the schema is allowed
to be null.

The name will be used for debugging purposes, both for coredump
debugging and runtime logging of permit-related diagnostics.
2020-10-13 12:32:13 +03:00
Botond Dénes
307cdf1e0d multishard_combining_reader: reader_lifecycle_policy: add permit param to create_reader()
Allow the evictable reader managing the underlying reader to pass its
own permit to it when creating it, making sure they share the same
permit. Note that the two parts can still end up using different
permits, when the underlying reader is kept alive between two pages of a
paged read and thus keeps using the permit received on the previous
page.

Also adjust the `reader_context` in multishard_mutation_query.cc to use
the passed-in permit instead of creating a new one when creating a new
reader.
2020-10-12 15:56:56 +03:00
Botond Dénes
e09ab09fff multishard_combining_reader: add permit parameter
Don't create an own permit, take one as a parameter, like all other
readers do, so the permit can be provided by the higher layer, making
sure all parts of the logical read use the same permit.
2020-10-12 15:56:56 +03:00
Botond Dénes
256140a033 mutation_fragment: memory_usage(): remove unused schema parameter
The memory usage is now maintained and updated on each change to the
mutation fragment, so it needs not be recalculated on a call to
`memory_usage()`, hence the schema parameter is unused and can be
removed.
2020-09-28 11:27:47 +03:00
Botond Dénes
6ca0464af5 mutation_fragment: add schema and permit
We want to start tracking the memory consumption of mutation fragments.
For this we need schema and permit during construction, and on each
modification, so the memory consumption can be recalculated and pass to
the permit.

In this patch we just add the new parameters and go through the insane
churn of updating all call sites. They will be used in the next patch.
2020-09-28 11:27:23 +03:00
Botond Dénes
0518571e56 flat_mutation_reader: make _buffer a tracked buffer
Via a tracked_allocator. Although the memory allocations made by the
_buffer shouldn't dominate the memory consumption of the read itself,
they can still be a significant portion that scales with the number of
readers in the read.
2020-09-28 10:53:56 +03:00
Piotr Sarna
71bb277cbc multishard_mutation_query: fix a typo in variable name
s/allwoed/allowed
Message-Id: <eedb62b1f13ebf4ab1e6e92642a77fab32379d73.1596799589.git.sarna@scylladb.com>
2020-08-09 12:52:40 +03:00
Wojciech Mitros
45215746fe increase the maximum size of query results to 2^64
Currently, we cannot select more than 2^32 rows from a table because we are limited by types of
variables containing the numbers of rows. This patch changes these types and sets new limits.

The new limits take effect while selecting all rows from a table - custom limits of rows in a result
stay the same (2^32-1).

In classes which are being serialized and used in messaging, in order to be able to process queries
originating from older nodes, the top 32 bits of new integers are optional and stay at the end
of the class - if they're absent we assume they equal 0.

The backward compatibility was tested by querying an older node for a paged selection, using the
received paging_state with the same select statement on an upgraded node, and comparing the returned
rows with the result generated for the same query by the older node, additionally checking if the
paging_state returned by the upgraded node contained new fields with correct values. Also verified
if the older node simply ignores the top 32 bits of the remaining rows number when handling a query
with a paging_state originating from an upgraded node by generating and sending such a query to
an older node and checking the paging_state in the reply(using python driver).

Fixes #5101.
2020-08-03 17:32:49 +02:00
Botond Dénes
f7a4d19fb1 mutation_partition: abort read when hard limit is exceeded for non-paged reads
If the read is not paged (short read is not allowed) abort the query if
the hard memory limit is reached. On reaching the soft memory limit a
warning is logged. This should allow users to adjust their application
code while at the same time protecting the database from the really bad
queries.
The enforcement happens inside the memory accounter and doesn't require
cooperation from the result builders. This ensures memory limit set for
the query is respected for all kind of reads. Previously non-paged reads
simply ignored the memory accounter requesting the read to stop and
consumed all the memory they wanted.
2020-07-29 08:32:31 +03:00
Botond Dénes
9eab5bca27 query_*(): use the coordinator specified memory limit for unlimited queries
It is important that all replicas participating in a read use the same
memory limits to avoid artificial differences due to different amount of
results. The coordinator now passes down its own memory limit for reads,
in the form of max_result_size (or max_size). For unpaged or reverse
queries this has to be used now instead of the locally set
max_memory_unlimited_query configuration item.

To avoid the replicas accidentally using the local limit contained in
the `query_class_config` returned from
`database::make_query_class_config()`, we refactor the latter into
`database::get_reader_concurrency_semaphore()`. Most of its callers were
only interested in the semaphore only anyway and those that were
interested in the limit as well should get it from the coordinator
instead, so this refactoring is a win-win.
2020-07-28 18:00:29 +03:00
Botond Dénes
159d37053d storage_proxy: use read_command::max_result_size to pass max result size around
Use the recently added `max_result_size` field of `query::read_command`
to pass the max result size around, including passing it to remote
nodes. This means that the max result size will be sent along each read,
instead of once per connection.
As we want to select the appropriate `max_result_size` based on the type
of the query as well as based on the query class (user or internal) the
previous method won't do anymore. If the remote doesn't fill this
field, the old per-connection value is used.
2020-07-28 18:00:29 +03:00
Botond Dénes
fbbbc3e05c query: result_memory_limiter: use the new max_result_size type 2020-07-28 18:00:29 +03:00
Botond Dénes
eeeef0a0f1 multishard_mutation_query: use cached semaphore
Instead of requesting the query class config from the database every
time the semaphore is needed, use the cached one by calling
`semaphore()`.
2020-07-27 12:17:22 +03:00
Botond Dénes
b7cfa4ea97 multishard_mutation_query: validate the semaphore of the looked-up reader
To make sure it belongs to the same semaphore that the database thinks
is appropriate for the current query. Since a semaphore mismatch points
to a serious bug, we use `on_internal_error()` to allow generating
coredumps on-demand.
2020-07-23 16:43:37 +03:00
Botond Dénes
63309f925c mutation_reader: reader_lifecycle_policy: make semaphore() available early
Currently all reader lifecycle policy implementations assume that
`semaphore()` will only be called after at least one call to
`make_reader()`. This assumption will soon not hold, so make sure
`semaphore()` can be called at any time, including before any calls are
made to `make_reader()`.
2020-06-23 10:01:38 +03:00
Botond Dénes
3cd2598ab3 reader_permit: forbid empty permits
Remove `no_reader_permit()` and all ways to create empty (invalid)
permits. All permits are guaranteed to be valid now and are only
obtainable from a semaphore.

`reader_permit::semaphore()` now returns a reference, as it is
guaranteed to always have a valid semaphore reference.
2020-05-28 11:34:35 +03:00
Botond Dénes
e4c591aa67 database: introduce make_query_class_config()
And use it to obtain any query-class specific configuration that was
obtained from `table::config` before, such as the read concurrency
semaphore and the max memory limit for unlimited queries. As all users
of these items get these from the query class config now, we can remove
them from `table::config`.
2020-05-28 11:34:35 +03:00
Botond Dénes
d5ebd763ff multishard_mutation_query: pass a valid permit to shard mutation sources
In preparation of a valid permit being required to be passed to all
mutation sources, create a permit before creating the shard readers and
pass it to the mutation source when doing so. The permit is also
persisted in the `shard_mutation_querier` object when saving the reader,
which is another forward looking change, to allow the querier-cache to
use it to obtain the semaphore the read is actually registered with.
2020-05-28 11:34:35 +03:00
Botond Dénes
0b4ec62332 flat_mutation_reader: flat_multi_range_reader: add reader_permit parameter
Mutation sources will soon require a valid permit so make sure we have
one and pass it to the mutation sources when creating the underlying
readers.
For now, pass no_reader_permit() on call sites, deferring the obtaining
of a valid permit to later patches.
2020-05-28 11:34:35 +03:00
Piotr Jastrzebski
e72696a8e6 sharding_info: rename the class to sharder
Also rename all variables that were named si or sinfo
to sharder.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-30 18:42:33 +02:00
Piotr Jastrzebski
d8ac8fd6e8 multishard_mutation_query: use sharding_info::shard_of
This patch replaces all the uses of i_partitioner:shard_of
with sharding_info::shard_of in read_context.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-30 18:42:33 +02:00
Piotr Jastrzebski
031f589dba multishard_combining_reader: use token_for_next_shard from sharding info not partitioner
Previously this function was accessing sharding logic
through partitioner obtained from the schema.

While converting tests, dummy_partitioner is turned into
dummy_sharding_info.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-30 18:42:25 +02:00
Rafael Ávila de Espíndola
c5795e8199 everywhere: Replace engine().cpu_id() with this_shard_id()
This is a bit simpler and might allow removing a few includes of
reactor.hh.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200326194656.74041-1-espindola@scylladb.com>
2020-03-27 11:40:03 +03:00
Piotr Jastrzebski
924ed7bb1c make_multishard_combining_reader: stop taking partitioner
The function already takes schema so there's no need
for it to take partitioner. It can be obtained using
schema::get_partitioner

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-15 10:25:20 +01:00
Botond Dénes
7bdeec4b00 flat_mutation_reader: make_reversing_reader(): add memory limit
If the reversing requires more memory than the limit, the read is
aborted. All users are updated to get a meaningful limit, from the
respective table object, with the exception of tests of course.
2020-02-27 18:11:54 +02:00
Piotr Jastrzebski
0677bafd16 multishard_mutation_query: stop calling global_partitioner()
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-02-17 10:59:15 +01:00
Botond Dénes
dfc8b2fc45 treewide: replace reader_resource_tracer with reader_permit
The former was never really more than a reader_permit with one
additional method. Currently using it doesn't even save one from any
includes. Now that readers will be using reader_permit we would have to
pass down both to mutation_source. Instead get rid of
reader_resource_tracker and just use reader_permit. Instead of making it
a last and optional parameter that is easy to ignore, make it a
first class parameter, right after schema, to signify that permits are
now a prominent part of the reader API.

This -- mostly mechanical -- patch essentially refactors mutation_source
to ask for the reader_permit instead of reader_resource_tracking and
updates all usage sites.
2020-01-28 08:13:16 +02:00
Avi Kivity
ba64ec78cf messaging_service: use rpc::tuple instead of variadic futures for rpc
Since variadic future<> is deprecated, switch to rpc::tuple for multiple
return values in rpc calls. This is more or less mechanical translation.
2019-09-26 12:09:31 +02:00