Clang does not yet implement p1091r3, which allows lambdas
to capture structured bindings. To accomodate it, don't
use structured bindings for variables that are later
captured.
Require a schema and an operation name to be given to each permit when
created. The schema is of the table the read is executed against, and
the operation name, which is some name identifying the operation the
permit is part of. Ideally this should be different for each site the
permit is created at, to be able to discern not only different kind of
reads, but different code paths the read took.
As not all read can be associated with one schema, the schema is allowed
to be null.
The name will be used for debugging purposes, both for coredump
debugging and runtime logging of permit-related diagnostics.
Don't create an own permit, take one as a parameter, like all other
readers do, so the permit can be provided by the higher layer, making
sure all parts of the logical read use the same permit.
The main user of this method, the one which required this method to
return the collective buffer size of the entire reader tree, is now
gone. The remaining two users just use it to check the size of the
reader instance they are working with.
So de-virtualize this method and reduce its responsibility to just
returning the buffer size of the current reader instance.
The memory usage is now maintained and updated on each change to the
mutation fragment, so it needs not be recalculated on a call to
`memory_usage()`, hence the schema parameter is unused and can be
removed.
We want to start tracking the memory consumption of mutation fragments.
For this we need schema and permit during construction, and on each
modification, so the memory consumption can be recalculated and pass to
the permit.
In this patch we just add the new parameters and go through the insane
churn of updating all call sites. They will be used in the next patch.
Not used yet, this patch does all the churn of propagating a permit
to each impl.
In the next patch we will use it to track to track the memory
consumption of `_buffer`.
Current code uses a single counter to produce multiple buffer worth of
data. This uses carry-on from on buffer to the other, which happens to
work with the current memory accounting but is very fragile. Account
each buffer separately, resetting the counter between them.
Some tests rely on `consume*()` calls on the permit to take effect
immediately. Soon this will only be true once the permit has been
admitted, so make sure the permit is admitted in these tests.
The reader recreation mechanism is a very delicate and error-prone one,
as proven by the countless bugs it had. Most of these bugs were related
to the recreated reader not continuing the read from the expected
position, inserting out-of-order fragments into the stream.
This patch adds a defense mechanism against such bugs by validating the
start position of the recreated reader.
The intent is to prevent corrupt data from getting into the system as
well as to help catch these bugs as close to the source as possible.
Fixes: #7208
Tests: unit(dev), mutation_reader_test:debug (v4)
* botond/evictable-reader-validate-buffer/v5:
mutation_reader_test: add unit test for evictable reader self-validation
evictable_reader: validate buffer after recreation the underlying
evictable_reader: update_next_position(): only use peek'd position on partition boundary
mutation_reader_test: add unit test for evictable reader range tombstone trimming
evictable_reader: trim range tombstones to the read clustering range
position_in_partition_view: add position_in_partition_view before_key() overload
flat_mutation_reader: add buffer() accessor
A test_env contains an sstables_manager, which will soon have a close() method.
As such, it can no longer be a temporary. Switch to using test_env::do_with_async().
Testing the multishard reader's various read-ahead related corner cases
requires a non-trivial setup. Currently there is just one such test,
but we plan to add more so in this patch we extract this setup code to a
free function to allow reuse across multiple tests.
A fast-forwarded puppet reader goes immediately to EOS. A counter is
added to the remote control to allow tests to check which readers were
actually fast forwarded.
Currently the puppet reader will do an automatic (half) buffer-fill in
the constructor. This makes it very hard to reason about when and how
the action that was passed to it will be executed. Refactor it to take a
list of actions and only execute those, no hidden buffer-fill anymore.
No better proof is needed for this than the fact that the test which is
supposed to test the multishard reader being destroyed with a pending
read-ahead was silently broken (not testing what it should).
This patch fixes this test too.
Also fixed in this patch is the `pending` and `destroyed` fields of the
remote control, tests can now rely on these to be correct and add
additional checkpoints to ensure the test is indeed doing what it was
intended to do.
Expose functions for the outside world to create evictable readers. We
expose two functions, which create an evictable reader with
`auto_pause::yes` and `auto_pause::no` respectively. The function
creating the latter also returns a handle in addition to the reader,
which can be used to pause the reader.
New SStables are only added to backlog tracker if set_unshared() was
called on their behalf. SStables created for streaming are not being
added to the tracker because make_streaming_sstable_for_write()
doesn't call set_unshared() nor does it caller. Which results in backlog
not accounting for their existence, which means backlog will be much
lower than expected.
This problem could be fixed by adding a set_unshared() call but it
turns out we don't even need set_unshared() anymore. It was introduced
when Scylla metadata didn't exist, now a SSTable has built-in knowledge
of whether or not it's shared. Relying on every SSTable creator calling
set_unshared() is bug prone. Let's get rid of it and let the SStable
itself say whether or not it's shared. If an imported SSTable has not
Scylla metadata, Scylla will still be able to compute shards using
token range metadata.
Refs #6021.
Refs #6227.
Fixes#6441.
Signed-off-by: Raphael S. Carvalho <raphaelsc@scylladb.com>
Message-Id: <20200512220226.134481-1-raphaelsc@scylladb.com>
And use the reader_permit for this instead. This refactoring has
revealed a pre-existing bug in the `test_lifecycle_policy`, which is
also addressed in this patch. The bug is that said policy executes
reader destructions in the background, and these are not waited for. For
some reason, the semaphore -> permit transition pushes these races over
the edge and we start seeing some of these destruction fibers still
being unfinished when test scopes are exited, causing all sorts of
trouble. The solution is to introduce a special gate that tests can use
to wait for all background work to finish, before the test scope is
exited.
All reader are soon going to require a valid permit, so make sure we
have a valid permit which we can pass to the delegate reader when
creating it. This means `memtable::make_flat_reader()` now also requires
a permit to be passed to it.
Internally the permit is stored in `scanning_reader`, which is used both
for flushes and normal reads. In the former case a permit is not
required.
We typically use `std::bad_function_call` to throw from
mandatory-to-implement virtual functions, that cannot have a meaningful
implementation in the derived class. The problem with
`std::bad_function_call` is that it carries absolutely no information
w.r.t. where was it thrown from.
I originally wanted to replace `std::bad_function_call` in our codebase
with a custom exception type that would allow passing in the name of the
function it is thrown from to be included in the exception message.
However after I ended up also including a backtrace, Benny Halevy
pointed out that I might as well just throw `std:bad_function_call` with
a backtrace instead. So this is what this patch does.
All users are various unimplemented methods of the
`flat_mutation_reader::impl` interface.
Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200408075801.701416-1-bdenes@scylladb.com>
dummy_partitioner was renamed to dummy_sharding_info in
the previous patch. This patch cleans up the names of
files. It's done in a separate patch to not obstruct
the diff of previous patch.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Previously this function was accessing sharding logic
through partitioner obtained from the schema.
While converting tests, dummy_partitioner is turned into
dummy_sharding_info.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
The class does not depend on partitioning logic but only uses
sharding logic. This means it is possible and desirable to limit its
dependency to only sharding_info.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
sstable_test.hh started as collection of utilities shared between the
various `_sstable_test.cc` files. Predictably other tests started using
it as well, among them some that are non boost unit tests. This poses a
problem as if we add the missing boost/test/unit_test.hpp include to
sstable_test.hh these tests will suddenly have missing symbols from
boost::test. To avoid linking boost::test into all these users, extract
utilities more widely used into sstable_utils.hh
Detach test_multishard_combining_reader_as_mutation_source into
individual file.
This particular test runs ~13 minutes. What's left in the origin
completes a bit faster.
The split also requires moving the reader_lifecycle_policy and
the dummy_partitioner into lib/
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
"
Allow adding compacting to any reader pipeline. The intended users are
streaming and repair, with the goal to prevent wasting transfer
bandwidth with data that is purgeable.
No current user in the tree.
Tests: unit(dev), mutation_reader_test.compacting_reader_*(debug)
"
* 'compacting-reader/v3' of https://github.com/denesb/scylla:
test: boost/mutation_reader_test: add unit test for compacting_reader
test: lib/flat_mutation_reader_assertions: be more lenient about empty mutations
test: lib/mutation_source_test: make data compaction friendly
test: random_mutation_generator: add generate_uncompactable mode
mutation_reader: introduce compacting_reader
The function already takes schema so there's no need
for it to take partitioner. It can be obtained using
schema::get_partitioner
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
We use boost test logging primarily to generate nice XML xunit
files used in Jenkins. These XML files can be bloated
with messages from BOOST_TEST_MESSAGE(), hundreds of megabytes
of build archives, on every build.
Let's use seastar logger for test logging instead, reserving
the use of boost log facilities for boost test markup information.
"
Timeouts defaulted to `db::no_timeout` are dangerous. They allow any
modifications to the code to drop timeouts and introduce a source of
unbounded request queue to the system.
This series removes the last such default timeouts from the code. No
problems were found, only test code had to be updated.
tests: unit(dev)
"
* 'no-default-timeouts/v1' of https://github.com/denesb/scylla:
database: database::query*(), database::apply*(): remove default timeouts
database: table::query(): remove default timeout
mutation_query: data_query(): remove default timeout
mutation_query: mutation_query(): remove default timeout
multishard_mutation_query: query_mutations_on_all_shards(): remove default timeout
reader_concurrency_semaphore: wait_admission(): remove default timeout
utils/logallog: run_when_memory_available(): remove default timeout
and replace all dht::global_partitioner().decorate_key
with dht::decorate_key
It is an improvement because dht::decorate_key takes schema
and uses it to obtain partitioner instead of using global
partitioner as it was before.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
i_partitioner.hh is widely included while sharders are used
only in 6 places so there's no need to include them in
the whole codebase.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Since token representation is fixed now, all the partitioners
will share the sharding logic. It makes sense now to keep
the logic in common super class and separate header that's
included only in i_partitioner.cc.
shard_of and token_for_next_shard are now implemented in
i_partitioner. They would be non-virtual but we have to
keep them virtual because one test is overriding them
to enforce some specific sharding.
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>