Commit Graph

83 Commits

Author SHA1 Message Date
Botond Dénes
ad02f313dd test: mutation_reader_test: add test for permit cleanup
Check that a permit correctly restores the units on the semaphore in
each state it can be destroyed in.
2021-03-18 16:18:22 +02:00
Botond Dénes
1a337d0ec1 reader_concurrency_semaphore: fix clear_inactive_reads()
Broken by the move to an intrusive container (9cbbf40), which caused
said method to only clear the container but not destroy the inactive
reads contained therein. This patch restores the previous behaviour and
also adds a call the destructor (to ensure inactive reads are cleaned up
under any circumstances), as well as a unit test.
2021-03-18 14:57:57 +02:00
Botond Dénes
bc1fcd3db2 multishard_combining_reader: only read from needed shards
The multishard combining reader currently assumes that all shards have
data for the read range. This however is not always true and in extreme
cases (like reading a single token) it can lead to huge read
amplification. Avoid this by not pushing shards to
`_shard_selection_min_heap` if the first token they are expected to
produce falls outside of the read range. Also change the read ahead
algorithm to select the shards from `_shard_selection_min_heap`, instead
of walking them in shard order. This was wrong in two ways:
* Shards may be ordered differently with respect to the first partition
  they will produce; reading ahead on the next shard in shard order
  might not bring in data on the next shard the read will continue on.
  Shard order is only correct when starting a new range and shards are
  iterated over in the order they own tokens according to the sharding
  algorithm.
* Shards that may not have data relevant to the read range are also
  considered for read ahead.

After this patch, the multishard reader will only read from shards that
have data relevant to the read range, both in the case of normal reads
and also for read-ahead.

Fixes: #8161

Tests: unit(release)
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20210226132536.85438-1-bdenes@scylladb.com>
2021-02-26 23:29:20 +02:00
Botond Dénes
c3b4c3f451 evictable_reader: reset _range_override after fast-forwarding
`_range_override` is used to store the modified range the reader reads
after it has to be recreated (when recreating a reader it's read range
is reduced to account for partitions it already read). When engaged,
this field overrides the `_pr` field as the definitive range the reader
is supposed to be currently reading. Fast forwarding conceptually
overrides the range the reader is currently reading, however currently
it doesn't reset the `_range_override` field. This resulted in
`_range_override` (containing the modified pre-fast-forward range)
incorrectly overriding the fast-forwarded-to range in `_pr` when
validating the first partition produced by the just recreated reader,
resulting in a false-positive validation failure.

Fixes: #8059

Tests: unit(release)
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20210217164744.420100-1-bdenes@scylladb.com>
2021-02-17 19:11:00 +02:00
Michał Chojnowski
6b8a69e01f test: mutation_test: fix memory calculations in make_fragments_with_non_monotonic_positions
The off-by-one error would cause
test_multishard_combining_reader_non_strictly_monotonic_positions to fail if
the added range_tombstones filled the buffer exactly to the end.
In such situation, with the old loop condition,
make_fragments_with_non_monotonic_positions would add one range_tombstone too
many to the deque, violating the test assumptions.
2021-02-16 21:35:14 +01:00
Botond Dénes
dd26a96e63 tests: don't pass temporary ranges to readers
The sstable_mutation_reader, like all other mutation readers expects
that the partition-range passed to it is kept alive by its creator
for the duration of its lifetime. However, the single-key constructor
of the sstable reader was more tolerant, as it only extracted the key
from the range, essentially requiring only the key to be kept alive (but
not the containing range). Naturally in time some code come to rely on
it and ended up passing temporary ranges to the reader. This behaviour
will no longer be acceptable as we are about to consolidate the various
sstable reader constructors, uniformly requiring that the range is kept
alive. So this patch fixes up the tests so they work with this stricter
requirement. Only two occurences were found.
2021-01-27 17:38:17 +02:00
Botond Dénes
c3b4e990a2 tree-wide: use sstables::make_reader() instead of the read_.*row.*_flat() methods 2021-01-27 17:38:17 +02:00
Benny Halevy
1847d49971 test: test_env: pick the highest sstable version by default
If possible, test the highest sstable format version,
as it's the mostly used.

If there pre-written sstables we need to load from the
test directory from an older version, either specify their
version explicitly, or use the new test_env::reusable_sst
method that looks up the latest sstable version in the
given directory and generation.

Test: unit(release)
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20201210161822.2833510-1-bhalevy@scylladb.com>
2021-01-24 10:38:55 +02:00
Kamil Braun
570d15c7bc multishard_combining_reader: do not use smp::count
`multishard_combining_reader` currently only works under the assumption
that every table uses the same sharder configured using the node's number
of shards. But we could potentially specify a different sharder for a chosen table,
e.g. one that puts everything on shard 0.
Then this assumption will be broken and the reader causes a segfault.

Fixes #7945.
2021-01-21 18:28:18 +02:00
Benny Halevy
29002e3b48 flat_mutation_reader: return future from next_partition
To allow it to asynchronously close underlying readers
on next_partition().

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-01-13 17:35:07 +02:00
Kamil Braun
b41139a07f mutation_reader_test: test clustering_order_reader_merger with time_series_sstable_set 2020-12-18 16:33:27 +01:00
Kamil Braun
5e846b33b8 clustering_order_reader_merger: fix the 0 readers case
With 0 readers the merger would produce a `partition_end` fragment
when it should immediately return `end_of_stream` instead.
2020-12-18 12:30:40 +01:00
Avi Kivity
f802356572 Revert "Revert "Merge "raft: fix replication if existing log on leader" from Gleb""
This reverts commit dc77d128e9. It was reverted
due to a strange and unexplained diff, which is now explained. The
HEAD on the working directory being pulled from was set back, so git
thought it was merging the intended commits, plus all the work that was
committed from HEAD to master. So it is safe to restore it.
2020-12-08 19:19:55 +02:00
Avi Kivity
dc77d128e9 Revert "Merge "raft: fix replication if existing log on leader" from Gleb"
This reverts commit 0aa1f7c70a, reversing
changes made to 72c59e8000. The diff is
strange, including unrelated commits. There is no understanding of the
cause, so to be safe, revert and try again.
2020-12-06 11:34:19 +02:00
Kamil Braun
502ed2e9f7 tests: mutation_source_test for clustering_order_reader_merger 2020-12-02 11:13:58 +01:00
Kamil Braun
4f7e2bf920 mutation_reader_test: test clustering_order_reader_merger in memory 2020-11-30 11:55:44 +01:00
Kamil Braun
0b36c5e116 mutation_reader: introduce clustering_order_reader_merger
This abstraction is used to merge the output of multiple readers, each
opened for a single partition query, into a non-decreasing stream
of mutation_fragments.

It is similar to `mutation_reader_merger`,
an important difference is that the new merger may select new readers
in the middle of a partition after it already returned some fragments
from that partition.  It uses the new `position_reader_queue` abstraction
to select new readers. It doesn't support multi-partition (ring range) queries.

The new merger will be later used when reading from sstable sets created
by TimeWindowCompactionStrategy. This strategy creates many sstables
that are mostly disjoint w.r.t the contained clustering keys, so we can
delay opening sstable readers when querying a partition until after we have
processed all mutation fragments with positions before the keys
contained by these sstables.
2020-11-30 11:55:44 +01:00
Kamil Braun
40d8bfa394 sstables: move sstable reader creation functions to sstable_set
Lower level functions such as `create_single_key_sstable_reader`
were made methods of `sstable_set`.

The motivation is that each concrete sstable_set
may decide to use a better sstable reading algorithm specific to the
data structures used by this sstable_set. For this it needs to access
the set's internals.

A nice side effect is that we moved some code out of table.cc
and database.hh which are huge files.
2020-11-19 17:52:39 +01:00
Botond Dénes
f5323b29d9 mutation_reader: queue_reader: don't set EOS flag on abort
If the consumer happens to check the EOS flag before it hits the
exception injected by the abort (by calling fill_buffer()), they can
think the stream ended normally and expect it to be valid. However this
is not guaranteed when the reader is aborted. To avoid consumers falsely
thinking the stream ended normally, don't set the EOS flag on abort at
all.

Additionally make sure the producer is aborted too on abort. In theory
this is not needed as they are the one initiating the abort, but better
to be safe then sorry.

Fixes: #7411
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20201102100732.35132-1-bdenes@scylladb.com>
2020-11-11 13:44:25 +02:00
Avi Kivity
1041521eb8 test: mutation_reader_test: drop <ranges>
Clang has trouble with some parts of <ranges>. Replace with
boost range adaptors for now.
2020-10-19 10:23:31 +03:00
Avi Kivity
e6d55e2778 test: mutation_reader_test: don't capture structured bindings in lambdas
Clang does not yet implement p1091r3, which allows lambdas
to capture structured bindings. To accomodate it, don't
use structured bindings for variables that are later
captured.
2020-10-16 15:25:15 +03:00
Botond Dénes
ff623e70b3 reader_concurrency_semaphore: name permits
Require a schema and an operation name to be given to each permit when
created. The schema is of the table the read is executed against, and
the operation name, which is some name identifying the operation the
permit is part of. Ideally this should be different for each site the
permit is created at, to be able to discern not only different kind of
reads, but different code paths the read took.

As not all read can be associated with one schema, the schema is allowed
to be null.

The name will be used for debugging purposes, both for coredump
debugging and runtime logging of permit-related diagnostics.
2020-10-13 12:32:13 +03:00
Botond Dénes
e09ab09fff multishard_combining_reader: add permit parameter
Don't create an own permit, take one as a parameter, like all other
readers do, so the permit can be provided by the higher layer, making
sure all parts of the logical read use the same permit.
2020-10-12 15:56:56 +03:00
Botond Dénes
dd372c8457 flat_mutation_reader: de-virtualize buffer_size()
The main user of this method, the one which required this method to
return the collective buffer size of the entire reader tree, is now
gone. The remaining two users just use it to check the size of the
reader instance they are working with.
So de-virtualize this method and reduce its responsibility to just
returning the buffer size of the current reader instance.
2020-10-06 08:22:56 +03:00
Botond Dénes
256140a033 mutation_fragment: memory_usage(): remove unused schema parameter
The memory usage is now maintained and updated on each change to the
mutation fragment, so it needs not be recalculated on a call to
`memory_usage()`, hence the schema parameter is unused and can be
removed.
2020-09-28 11:27:47 +03:00
Botond Dénes
6ca0464af5 mutation_fragment: add schema and permit
We want to start tracking the memory consumption of mutation fragments.
For this we need schema and permit during construction, and on each
modification, so the memory consumption can be recalculated and pass to
the permit.

In this patch we just add the new parameters and go through the insane
churn of updating all call sites. They will be used in the next patch.
2020-09-28 11:27:23 +03:00
Botond Dénes
3fab83b3a1 flat_mutation_reader: impl: add reader_permit parameter
Not used yet, this patch does all the churn of propagating a permit
to each impl.

In the next patch we will use it to track to track the memory
consumption of `_buffer`.
2020-09-28 10:53:48 +03:00
Botond Dénes
d692993bdc mutation_reader_test: test_multishard_combining_reader_non_strictly_monotonic_positions: reset size between buffer fills
Current code uses a single counter to produce multiple buffer worth of
data. This uses carry-on from on buffer to the other, which happens to
work with the current memory accounting but is very fragile. Account
each buffer separately, resetting the counter between them.
2020-09-28 08:46:22 +03:00
Botond Dénes
3c73cc2a4e tests: prepare for permit forwarding consumption post admission
Some tests rely on `consume*()` calls on the permit to take effect
immediately. Soon this will only be true once the permit has been
admitted, so make sure the permit is admitted in these tests.
2020-09-28 08:46:22 +03:00
Tomasz Grabiec
97c99ea9f3 Merge "evictable_reader: validate buffer on reader recreation" from Botond
The reader recreation mechanism is a very delicate and error-prone one,
as proven by the countless bugs it had. Most of these bugs were related
to the recreated reader not continuing the read from the expected
position, inserting out-of-order fragments into the stream.
This patch adds a defense mechanism against such bugs by validating the
start position of the recreated reader.
The intent is to prevent corrupt data from getting into the system as
well as to help catch these bugs as close to the source as possible.

Fixes: #7208

Tests: unit(dev), mutation_reader_test:debug (v4)

* botond/evictable-reader-validate-buffer/v5:
  mutation_reader_test: add unit test for evictable reader self-validation
  evictable_reader: validate buffer after recreation the underlying
  evictable_reader: update_next_position(): only use peek'd position on partition boundary
  mutation_reader_test: add unit test for evictable reader range tombstone trimming
  evictable_reader: trim range tombstones to the read clustering range
  position_in_partition_view: add position_in_partition_view before_key() overload
  flat_mutation_reader: add buffer() accessor
2020-09-25 17:02:51 +02:00
Botond Dénes
076c27318b mutation_reader_test: add unit test for evictable reader self-validation
Add both positive (where the validation should succeed) and negative
(where the validation should fail) tests, covering all validation cases.
2020-09-25 12:09:01 +03:00
Botond Dénes
d1b0573e1c mutation_reader_test: add unit test for evictable reader range tombstone trimming 2020-09-25 12:09:00 +03:00
Avi Kivity
6fd4601cf8 test: mutation_reader_test: avoid constructing temporary sstables::test_env
A test_env contains an sstables_manager, which will soon have a close() method.
As such, it can no longer be a temporary. Switch to using test_env::do_with_async().
2020-09-23 20:55:08 +03:00
Benny Halevy
c9cade833c sstables: metadata_collector: make only for write path
make a metadata_collector only when writing the sstable,
no need to make one when reading.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2020-08-10 18:51:12 +03:00
Botond Dénes
5de0afdab7 mutation_reader: expose new_reader_base_cost
So that test code can use it.
2020-07-20 11:23:39 +03:00
Rafael Ávila de Espíndola
6fe7706fce mutation_reader_test: Wait for a future
Nothing was waiting for this future. Found while testing another
patch.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
Message-Id: <20200630183929.1704908-1-espindola@scylladb.com>
2020-07-03 08:24:41 +02:00
Botond Dénes
cb69406f6c test/boost/mutation_reader_test: multishard reader: add tests for fast-forwarding with read-ahead 2020-07-01 10:15:49 +03:00
Botond Dénes
d6e2033d8a test/boost/mutation_reader_test: extract multishard read-ahead test setup
Testing the multishard reader's various read-ahead related corner cases
requires a non-trivial setup. Currently there is just one such test,
but we plan to add more so in this patch we extract this setup code to a
free function to allow reuse across multiple tests.
2020-07-01 10:15:49 +03:00
Botond Dénes
851ae8c650 test/boost/mutation_reader_test: puppet_reader: fast-forward-support
A fast-forwarded puppet reader goes immediately to EOS. A counter is
added to the remote control to allow tests to check which readers were
actually fast forwarded.
2020-07-01 10:15:49 +03:00
Botond Dénes
741f0c276d mutation_reader_test: puppet_reader: make interface more predictable
Currently the puppet reader will do an automatic (half) buffer-fill in
the constructor. This makes it very hard to reason about when and how
the action that was passed to it will be executed. Refactor it to take a
list of actions and only execute those, no hidden buffer-fill anymore.
No better proof is needed for this than the fact that the test which is
supposed to test the multishard reader being destroyed with a pending
read-ahead was silently broken (not testing what it should).
This patch fixes this test too.

Also fixed in this patch is the `pending` and `destroyed` fields of the
remote control, tests can now rely on these to be correct and add
additional checkpoints to ensure the test is indeed doing what it was
intended to do.
2020-07-01 10:15:49 +03:00
Botond Dénes
542d9c3711 mutation_reader: expose evictable_reader
Expose functions for the outside world to create evictable readers. We
expose two functions, which create an evictable reader with
`auto_pause::yes` and `auto_pause::no` respectively. The function
creating the latter also returns a handle in addition to the reader,
which can be used to pause the reader.
2020-06-23 21:08:21 +03:00
Raphael S. Carvalho
fb6976f1b9 Make sure SSTables created by streaming are added to backlog tracker
New SStables are only added to backlog tracker if set_unshared() was
called on their behalf. SStables created for streaming are not being
added to the tracker because make_streaming_sstable_for_write()
doesn't call set_unshared() nor does it caller. Which results in backlog
not accounting for their existence, which means backlog will be much
lower than expected.

This problem could be fixed by adding a set_unshared() call but it
turns out we don't even need set_unshared() anymore. It was introduced
when Scylla metadata didn't exist, now a SSTable has built-in knowledge
of whether or not it's shared. Relying on every SSTable creator calling
set_unshared() is bug prone. Let's get rid of it and let the SStable
itself say whether or not it's shared. If an imported SSTable has not
Scylla metadata, Scylla will still be able to compute shards using
token range metadata.

Refs #6021.
Refs #6227.
Fixes #6441.

Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20200512220226.134481-1-raphaelsc@scylladb.com>
2020-06-03 17:35:22 +03:00
Botond Dénes
7ea64b1838 test: mutation_reader_test: use <ranges>
Replace all the ranges stuff we use from boost with the std equivalents.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200529141407.158960-3-bdenes@scylladb.com>
2020-05-31 12:58:59 +03:00
Botond Dénes
d68ac8bf18 treewide: remove all uses of no_reader_permit() 2020-05-28 11:34:35 +03:00
Botond Dénes
a08467da29 test: move away from reader_concurrency_semaphore::wait_admission()
And use the reader_permit for this instead. This refactoring has
revealed a pre-existing bug in the `test_lifecycle_policy`, which is
also addressed in this patch. The bug is that said policy executes
reader destructions in the background, and these are not waited for. For
some reason, the semaphore -> permit transition pushes these races over
the edge and we start seeing some of these destruction fibers still
being unfinished when test scopes are exited, causing all sorts of
trouble. The solution is to introduce a special gate that tests can use
to wait for all background work to finish, before the test scope is
exited.
2020-05-28 11:34:35 +03:00
Botond Dénes
4409579352 mutation_reader: restricted_reader: work in terms of reader_permit
We want to refactor all read resource tracking code to work through the
read_permit, so refactor the restricted reader to also do so.
2020-05-28 11:34:35 +03:00
Botond Dénes
9ede82ebf8 memtable: pass a valid permit to the delegate reader
All reader are soon going to require a valid permit, so make sure we
have a valid permit which we can pass to the delegate reader when
creating it. This means `memtable::make_flat_reader()` now also requires
a permit to be passed to it.
Internally the permit is stored in `scanning_reader`, which is used both
for flushes and normal reads. In the former case a permit is not
required.
2020-05-28 11:34:35 +03:00
Botond Dénes
196dd5fa9b treewide: throw std::bad_function_call with backtraces
We typically use `std::bad_function_call` to throw from
mandatory-to-implement virtual functions, that cannot have a meaningful
implementation in the derived class. The problem with
`std::bad_function_call` is that it carries absolutely no information
w.r.t. where was it thrown from.

I originally wanted to replace `std::bad_function_call` in our codebase
with a custom exception type that would allow passing in the name of the
function it is thrown from to be included in the exception message.
However after I ended up also including a backtrace, Benny Halevy
pointed out that I might as well just throw `std:bad_function_call` with
a backtrace instead. So this is what this patch does.

All users are various unimplemented methods of the
`flat_mutation_reader::impl` interface.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200408075801.701416-1-bdenes@scylladb.com>
2020-04-08 13:54:06 +02:00
Piotr Jastrzebski
c44f019eee dummy_sharder: rename dummy_sharding_info.* to dummy_sharder.*
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-30 18:42:33 +02:00
Piotr Jastrzebski
e72696a8e6 sharding_info: rename the class to sharder
Also rename all variables that were named si or sinfo
to sharder.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-30 18:42:33 +02:00