Commit Graph

15 Commits

Author SHA1 Message Date
Avi Kivity
b06dffcc19 Merge "messaging: make verb handler registering independent of current scheduling group" from Botond
"
0c6bbc8 refactored `get_rpc_client_idx()` to select different clients
for statement verbs depending on the current scheduling group.
The goal was to allow statement verbs to be sent on different
connections depending on the current scheduling group. The new
connections use per-connection isolation. For backward compatibility the
already existing connections fall-back to per-handler isolation used
previously. The old statement connection, called the default statement
connection, also used this. `get_rpc_client_idx()` was changed to select
the default statement connection when the current scheduling group is
the statement group, and a non-default connection otherwise.

This inadvertently broke `scheduling_group_for_verb()` which also used
this method to get the scheduling group to be used to isolate a verb at
handle register time. This method needs the default client idx for each
verb, but if verb registering is run under the system group it instead
got the non-default one, resulting in the per-handler isolation not
being set-up for the default statement connection, resulting in default
statement verb handlers running in whatever scheduling group the process
loop of the rpc is running in, which is the system scheduling group.

This caused all sorts of problems, even beyond user queries running in
the system group. Also as of 0c6bbc8 queries on the replicas are
classified based on the scheduling group they are running on, so user
reads also ended up using the system concurrency semaphore.

In particular this caused severe problems with ranges scans, which in
some cases ended up using different semaphores per page resulting in a
crash. This could happen because when the page was read locally the code
would run in the statement scheduling group, but when the request
arrived from a remote coordinator via rpc, it was read in a system
scheduling group. This caused a mismatch between the semaphore the saved
reader was created with and the one the new page was read with. The
result was that in some cases when looking up a paused reader from the
wrong semaphore, a reader belonging to another read was returned,
creating a disconnect between the lifecycle between readers and that of
the slice and range they were referencing.

This series fixes the underlying problem of the scheduling group
influencing the verb handler registration, as well as adding some
additional defenses if this semaphore mismatch ever happens in the
future. Inactive read handles are now unique across all semaphores,
meaning that it is not possible anymore that a handle succeeds in
looking up a reader when used with the wrong semaphore. The range scan
algorithm now also makes sure there is no semaphore mismatch between the
one used for the current page and that of the saved reader from the
previous page.

I manually checked that each individual defense added is already
preventing the crash from happening.

Fixes: #6613
Fixes: #6907
Fixes: #6908

Tests: unit(dev), manual(run the crash reproducer, observe no crash)
"

* 'query-classification-regressions/v1' of https://github.com/denesb/scylla:
  multishard_mutation_query: use cached semaphore
  messaging: make verb handler registering independent of current scheduling group
  multishard_mutation_query: validate the semaphore of the looked-up reader
  reader_concurrency_semaphore: make inactive read handles unique across semaphores
  reader_concurrency_semaphore: add name() accessor
  reader_concurrency_semaphore: allow passing name to no-limit constructor

(cherry picked from commit 3f84d41880)
2020-07-27 17:41:51 +03:00
Botond Dénes
e678f06a5e querier_cache: get semaphore from querier
Currently the `querier_cache` is passed a semaphore during its
construction and it uses this semaphore to do all the inactive reader
registering/unregistering. This is inaccurate as in theory cached reads
could belong to different semaphores (although currently this is not yet
the case). As all queriers store a valid permit now, use this
permit to obtain the semaphore the querier is associated with, and
register the inactive read with this semaphore.
2020-05-28 11:34:35 +03:00
Botond Dénes
e4c591aa67 database: introduce make_query_class_config()
And use it to obtain any query-class specific configuration that was
obtained from `table::config` before, such as the read concurrency
semaphore and the max memory limit for unlimited queries. As all users
of these items get these from the query class config now, we can remove
them from `table::config`.
2020-05-28 11:34:35 +03:00
Botond Dénes
a08467da29 test: move away from reader_concurrency_semaphore::wait_admission()
And use the reader_permit for this instead. This refactoring has
revealed a pre-existing bug in the `test_lifecycle_policy`, which is
also addressed in this patch. The bug is that said policy executes
reader destructions in the background, and these are not waited for. For
some reason, the semaphore -> permit transition pushes these races over
the edge and we start seeing some of these destruction fibers still
being unfinished when test scopes are exited, causing all sorts of
trouble. The solution is to introduce a special gate that tests can use
to wait for all background work to finish, before the test scope is
exited.
2020-05-28 11:34:35 +03:00
Botond Dénes
d5ebd763ff multishard_mutation_query: pass a valid permit to shard mutation sources
In preparation of a valid permit being required to be passed to all
mutation sources, create a permit before creating the shard readers and
pass it to the mutation source when doing so. The permit is also
persisted in the `shard_mutation_querier` object when saving the reader,
which is another forward looking change, to allow the querier-cache to
use it to obtain the semaphore the read is actually registered with.
2020-05-28 11:34:35 +03:00
Botond Dénes
bad53c4245 querier: add reader_permit parameter and forward it to the mutation_source
In preparation of a valid permit being required to be passed to all
mutation sources, also add a permit to the querier object, which is then
passed to the source when it is used to create a reader.
2020-05-28 11:34:35 +03:00
Avi Kivity
11698aafc1 tests: querier_cache_test: don't exhaust random number entropy
rand_int() re-creates a random device each time it is called.
Change it to use a static random_device, and get random numbers
from a random_engine instead of from the device directly.

This avoids exhausting entropy, see [1] for details.

[1] https://gcc.gnu.org/bugzilla/show_bug.cgi?id=94087
2020-05-26 20:51:16 +03:00
Avi Kivity
157fe4bd19 Merge "Remove default timeouts" from Botond
"
Timeouts defaulted to `db::no_timeout` are dangerous. They allow any
modifications to the code to drop timeouts and introduce a source of
unbounded request queue to the system.

This series removes the last such default timeouts from the code. No
problems were found, only test code had to be updated.

tests: unit(dev)
"

* 'no-default-timeouts/v1' of https://github.com/denesb/scylla:
  database: database::query*(), database::apply*(): remove default timeouts
  database: table::query(): remove default timeout
  mutation_query: data_query(): remove default timeout
  mutation_query: mutation_query(): remove default timeout
  multishard_mutation_query: query_mutations_on_all_shards(): remove default timeout
  reader_concurrency_semaphore: wait_admission(): remove default timeout
  utils/logallog: run_when_memory_available(): remove default timeout
2020-03-01 17:29:17 +02:00
Botond Dénes
f6013a39ec reader_concurrency_semaphore: wait_admission(): remove default timeout 2020-02-27 18:43:12 +02:00
Botond Dénes
7bdeec4b00 flat_mutation_reader: make_reversing_reader(): add memory limit
If the reversing requires more memory than the limit, the read is
aborted. All users are updated to get a meaningful limit, from the
respective table object, with the exception of tests of course.
2020-02-27 18:11:54 +02:00
Botond Dénes
dfc8b2fc45 treewide: replace reader_resource_tracer with reader_permit
The former was never really more than a reader_permit with one
additional method. Currently using it doesn't even save one from any
includes. Now that readers will be using reader_permit we would have to
pass down both to mutation_source. Instead get rid of
reader_resource_tracker and just use reader_permit. Instead of making it
a last and optional parameter that is easy to ignore, make it a
first class parameter, right after schema, to signify that permits are
now a prominent part of the reader API.

This -- mostly mechanical -- patch essentially refactors mutation_source
to ask for the reader_permit instead of reader_resource_tracking and
updates all usage sites.
2020-01-28 08:13:16 +02:00
Botond Dénes
c0f96db2d9 reader_concurrency_semaphore: mv reader_resources and reader_permit to reader_permit.hh
In the next patches we will replace `reader_resource_tracker` and have
code use the `reader_permit` directly. In subsequent patches, the
`reader_permit` will get even more usages as we attempt to make the
tracking of reader resource more accurate by tracking more parts of it.
So the grand plan is that the current `reader_concurrency_semaphore.hh`
is split into two headers:
* `reader_concurrency_semaphore.hh` - containing the semaphore proper.
* `reader_permit.hh` - a very lightweight header, to be used by
  components which only want to track various parts of the resource
  consumption of reads.
2020-01-28 08:13:16 +02:00
Botond Dénes
2005495857 reader_concurrency_semaphore: reader_permit: make it a value type
Currently `reader_permit` is passed around as
`lw_shared_ptr<reader_permit>`, which is clunky to write and use and is
also an unnecessary leak of details on how permit ownership is managed.
Make `reader_permit` a simple value type, making it a little bit easier
and safer to use.
In the next patches we will get rid of `reader_resource_tracker` and
instead have code use the permit instance directly, so this small
improvement in usability will go a long way towards preventing eye sore.
2020-01-28 08:13:16 +02:00
Rafael Ávila de Espíndola
dca1bc480f everywhere: Use serialized(foo) instead of data_value(foo).serialize()
This is just a simple cleanup that reduces the size of another patch I
am working on and is an independent improvement.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200114051739.370127-1-espindola@scylladb.com>
2020-01-14 12:17:12 +02:00
Konstantin Osipov
1c8736f998 tests: move all test source files to their new locations
1. Move tests to test (using singular seems to be a convention
   in the rest of the code base)
2. Move boost tests to test/boost, other
   (non-boost) unit tests to test/unit, tests which are
   expected to be run manually to test/manual.

Update configure.py and test.py with new paths to tests.
2019-12-16 17:47:42 +03:00