Commit Graph

210 Commits

Author SHA1 Message Date
Botond Dénes
6c8e98e33d mutation_reader: queue_reader_handle: make abandoned() exception safe
Allocating the exception might fail, terminating the application as
`abandoned()` is called in a noexcept context. Handle this case by
catching the bad-alloc and aborting the reader with that instead when
this happens.
2021-10-21 06:50:22 +03:00
Kamil Braun
27238eaa0f sstables: mx: implement reversed single-partition reads
We use partition_reversing_data_source and the new `index_reader` methods
to implement single-partition reads in `mx_sstable_mutation_reader`.

The parsing logic does not need to change: the buffers returned by the
source already contain rows in reversed clustering order.

Some changes were required in `mp_row_consumer_m` which processes the
parsed rows and emits appropriate mutation fragments. The consumer uses
`mutation_fragment_filter` underneath to decide whether a fragment
should be ignored or not (e.g. the parsed fragment may come from outside
the requested clustering range), among other things. Previously
`mutation_fragment_filter` was provided a `partition_slice`. If the
slice was reversed, the filter would use
`clustering_key_filter_ranges::get_ranges` to obtain the clustering
ranges from the slice in unreversed order (they were reversed in the
slice) since we didn't perform any reversing in the reader. Now the
reader provides the ranges directly instead of the slice; furthermore,
the ranges are provided in native-reversed format (the order of ranges
is reversed and the ranges themselves are also reversed), and the schema
provided to the filter is also reversed. Thus to the filter everything
appears as if it was used during a non-reversed query but on a table
with reversed schema, which works correctly given the fact that the
reader is feeding parsed rows into the consumer in reversed order.

During reversed queries the reader uses alternative logic for skipping
to a later range (or, speaking in non-reversed terms, to an earlier range),
which happens in `advance_context`. It asks the index to advance its
upper bound in reverse so that the reversing_data_source notices the
change of the index end position and returns following buffers with rows
from the new range.

There is a slight difference in behavior of the reader from
`mp_row_consumer_m`'s point of view. For non-reversed reads, after
the consumer obtains the beginning of a row (`consume_row_start`)
- which contains the row's position but not the columns - and tells the
reader that the row won't be emitted because we need to skip to a later
range, the reader would tell the data source (the 'context') immediately
to skip to a later range by calling `skip_to`. This caused the source
not to return the rest of the row, and the rest of the row would not
be fed to the consumer (`consume_row_end`). However, for reversed reads,
the data source performs skipping 'on its own', after it notices that
the index end position has changed. This may happen 'too late', causing
the rest of the row to be returned anyway. We are prepared for this
situation inside `mp_row_consumer` by consulting the mutation fragment
filter again when the rest of the row arrives.

Fast forwarding is not supported at this point, which is fine given that
the cache is disabled for reversed queries for now (and the cache is the
only user of fast forwarding).

The `partition_slice` provided by callers is provided in 'half-reversed'
format for reversed queries, where the order of clustering ranges is
reversed, but the ranges themselves are not. This means we need to modify
the slice sometimes: for non-single-partition queries the mx reader must
use a non-reversed slice, and for single-partition queries the mx reader
must use a native-reversed slice (where the clustering ranges themselves
are reversed as well). The modified slice must be stored somewhere; we
store it inside the mx reader itself so we don't need to allocate more
intermediate readers at the call sites.  This causes the interface of
`mx::make_reader` to be a bit weird: for non-single-partition queries
where the provided slice is reversed the reader will actually return a
non-reversed stream of fragments, telling the user to reverse the stream
on their own. The interface has been documented in detail with
appropriate comments.
2021-10-04 15:24:12 +02:00
Botond Dénes
41facb3270 treewide: move reversing to the mutation sources
Push down reversing to the mutation-sources proper, instead of doing it
on the querier level. This will allow us to test reverse reads on the
mutation source level.
The `max_size` parameter of `consume_page()` is now unused but is not
removed in this patch, it will be removed in a follow-up to reduce
churn.
2021-09-29 12:15:45 +03:00
Avi Kivity
daf028210b build: enable -Winconsistent-missing-override warning
This warning can catch a virtual function that thinks it
overrides another, but doesn't, because the two functions
have different signatures. This isn't very likely since most
of our virtual functions override pure virtuals, but it's
still worth having.

Enable the warning and fix numerous violations.

Closes #9347
2021-09-15 12:55:54 +03:00
Benny Halevy
4476800493 flat_mutation_reader: get rid of timeout parameter
Now that the timeout is taken from the reader_permit.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-08-24 16:30:51 +03:00
Benny Halevy
605a1e6943 multishard_mutation_query: create_reader: validate saved reader permit
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-08-24 16:30:51 +03:00
Piotr Wojtczak
cb2a0ab858 flat_mutation_reader: Add a new filtering reader factory method
Introduce a new function creating a filtering reader using query slice
and partition range.
2021-07-20 14:00:47 +02:00
Botond Dénes
16d3cb4777 mutation_reader: remove now unused restricting_reader
Move the now orphaned new_reader_base_cost constant to
database.hh/table.cc, as its main user is now
`table::estimate_read_memory_cost()`.
2021-07-14 17:19:02 +03:00
Botond Dénes
426b46c4ed mutation_reader: reader_lifecycle_policy: add obtain_reader_permit()
This method is both a convenience method to obtain the permit, as well
as an abstraction to allow different implementations to get creative.
For example, the main implementation, the one in multishard mutation
query returns the permit of the saved reader one was successful. This
ensures that on a multi-paged read the same permit is used across as
much pages as possible. Much more importantly it ensures the evictable
reader wrapping the actual reader both use the same permit.
2021-07-14 16:48:43 +03:00
Avi Kivity
d27e88e785 Merge "compaction: prevent broken_promise or dangling reader errors" from Benny
"
This series prevents broken_promise or dangling reader errors
when (resharding) compaction is stopped, e.g. during shutdown.

At the moment compaction just closes the reader unilaterally
and this yanks the reader from under the queue_reader_handle feet,
causing dangling queue reader and broken_promise errors
as seen in #8755.

Instead, fix queue_reader::close to set value on the
_full/_not_full promises and detach from the handle,
and return _consume_fut from bucket_writer::consume
if handle is terminated.

Fixes #8755

Test: unit(dev)
DTest: materialized_views_test.py:TestMaterializedViews.interrupt_build_process_and_resharding_half_to_max_test(debug)
"

* tag 'propagate-reader-abort-v3' of github.com:bhalevy/scylla:
  mutation_writer: bucket_writer: consume: propagate _consume_fut if queue_reader_handle is_terminated
  queue_reader_handle: add get_exception method
  queue_reader: close: set value on promises on detach from handle
2021-06-22 18:52:11 +03:00
Avi Kivity
0948908502 Merge "mutation_reader: multishard_combining_reader clean-up close path" from Botond
"
The close path of the multishard combining reader is riddled with
workarounds the fact that the flat mutation reader couldn't wait on
futures when destroyed. Now that we have a close() method that can do
just that, all these workarounds can be removed.
Even more workarounds can be found in tests, where resources like the
reader concurrency semaphore are created separately for each tested
multishard reader and then destroyed after it doesn't need it, so we had
to come up with all sorts of creative and ugly workarounds to keep
these alive until background cleanup is finished.
This series fixes all this. Now, after calling close on the multishard
reader, all resources it used, including the life-cycle policy, the
semaphores created by it can be safely destroyed. This greatly
simplifies the handling of the multishard reader, and makes it much
easier to reason about life-cycle dependencies.

Tests: unit(dev, release:v2, debug:v2,
    mutation_reader_test:debug -t test_multishard,
    multishard_mutation_query_test:debug,
    multishard_combining_reader_as_mutation_source:debug)
"

* 'multishard-combining-reader-close-cleanup/v3' of https://github.com/denesb/scylla:
  mutation_reader: reader_lifecycle_policy: remove convenience methods
  mutation_reader: multishard_combining_reader: store shard_reader via unique ptr
  test/lib/reader_lifecycle_policy: destroy_reader: cleanup context
  test/lib/reader_lifecycle_policy: get rid of lifecycle workarounds
  test/lib/reader_lifecycle_policy: destroy_reader(): stop the semaphore
  test/lib/reader_lifecycle_policy: use a more robust eviction mechanism
  reader_concurrency_semaphore: wait for all permits to be destroyed in stop()
  test/lib/reader_lifcecycle_policy: fix indentation
  mutation_reader: reader_lifecycle_policy::destroy_reader(): require to be called on native shard
  reader_lifecycle_policy implementations: fix indentation
  mutation_reader: reader_lifecycle_policy::destroy_reader(): de-futurize reader parameter
  mutation_reader: shard_reader::close(): wait on the remote reader
  multishard_mutation_query: destroy remote parts in the foreground
  mutation_reader: shard_reader::close(): close _reader
  mutation_reader: reader_lifcecycle_policy::destroy_reader(): remove out-of-date comment
2021-06-16 17:25:50 +03:00
Benny Halevy
b5efc3ceac queue_reader_handle: add get_exception method
To be used by the mutation_writer in the following patch.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-06-16 17:25:16 +03:00
Botond Dénes
28c2b54875 mutation_reader: reader_lifecycle_policy: remove convenience methods
These convenience methods are not used as much anymore and they are not
even really necessary as the register/unregister inactive read API got
streamlined a lot to the point where all of these "convenience methods"
are just one-liners, which we can just inline into their few callers
without loosing readability.
2021-06-16 11:29:37 +03:00
Botond Dénes
8c7447effd mutation_reader: reader_lifecycle_policy::destroy_reader(): require to be called on native shard
Currently shard_reader::close() (its caller) goes to the remote shard,
copies back all fragments left there to the local shard, then calls
`destroy_reader()`, which in the case of the multishard mutation query
copies it all back to the native shard. This was required before because
`shard_reader::stop()` (`close()`'s) predecessor) couldn't wait on
`smp::submit_to()`. But close can, so we can get rid of all this
back-and-forth and just call `destroy_reader()` on the shard the reader
lives on, just like we do with `create_reader()`.
2021-06-16 11:29:35 +03:00
Botond Dénes
a7e59d3e2c mutation_reader: reader_lifecycle_policy::destroy_reader(): de-futurize reader parameter
The shard reader is now able to wait on the stopped reader and pass the
already stopped reader to `destroy_reader()`, so we can de-futurize the
reader parameter of said method. The shard reader was already patched to
pass a ready future so adjusting the call-site is trivial.
The most prominent implementation, the multishard mutation query, can
now also drop its `_dismantling_gate` which was put in place so it can
wait on the background stopping if readers.

A consequence of this move is that handling errors that might happen
during the stopping of the reader is now handled in the shard reader,
not all lifecycle policy implementations.
2021-06-16 11:21:38 +03:00
Tomasz Grabiec
79795a1a61 mutation_source: Introduce make_reader_v2()
Mutation sources can now produce natively either v1 or v2 streams. We
still have both v1 and v2 make_reader() variants, which wrap in
appropriate converters under the hood.
2021-06-16 00:23:49 +02:00
Botond Dénes
98e5f0429b mutation_reader: reader_lifcecycle_policy::destroy_reader(): remove out-of-date comment
About the multishard reader not being able to wait on returned future.
It can now via the `close()` method.
2021-06-15 15:23:32 +03:00
Benny Halevy
8ecc626c15 queue_reader_handle: mark copy constructor noexcept
It is trivially so, as std::exception_ptr is nothrow default
constructible.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210609135925.270883-2-bhalevy@scylladb.com>
2021-06-09 20:09:01 +03:00
Benny Halevy
3100cdcc65 queue_reader_handle: move-construct also _ex
We're only moving the other reader without the
other's exception (as it maybe already be abandoned
or aborted).

While at it, mark the constructor noexcept.

Fixes #8833

Test: unit(dev)
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210609135925.270883-1-bhalevy@scylladb.com>
2021-06-09 20:09:01 +03:00
Pavel Solodovnikov
76bea23174 treewide: reduce header interdependencies
Use forward declarations wherever possible.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>

Closes #8813
2021-06-07 15:58:35 +03:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Benny Halevy
2c1edb1a94 mutation_reader: reader_lifecycle_policy: return future from destroy_reader
So we can wait on it from to-be-introduced shard_reader::close().

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:35:07 +03:00
Benny Halevy
7d42a71310 mutation_reader: position_reader_queue: add close method
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:35:07 +03:00
Benny Halevy
32ab957f82 mutation_reader: filtering_reader: implement close method
Close underlying reader.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:35:07 +03:00
Benny Halevy
d565e3fb57 reader_lifecycle_policy: retire low level try_resume method
The caller can now just call sem.unregister_inactive_read(irh) directly.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-02-08 20:32:40 +02:00
Botond Dénes
226088d12e mutation_reader: reader_lifecycle_policy::stopped_reader: drop pending_next_partition flag
Its not used anymore.
2021-01-22 16:18:59 +02:00
Benny Halevy
29002e3b48 flat_mutation_reader: return future from next_partition
To allow it to asynchronously close underlying readers
on next_partition().

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-01-13 17:35:07 +02:00
Benny Halevy
75c0c05f71 mutation_reader: filtering_reader: fill_buffer: futurize inner loop
Prepare for futurizing next_partition().

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-01-13 17:35:07 +02:00
Avi Kivity
f802356572 Revert "Revert "Merge "raft: fix replication if existing log on leader" from Gleb""
This reverts commit dc77d128e9. It was reverted
due to a strange and unexplained diff, which is now explained. The
HEAD on the working directory being pulled from was set back, so git
thought it was merging the intended commits, plus all the work that was
committed from HEAD to master. So it is safe to restore it.
2020-12-08 19:19:55 +02:00
Avi Kivity
dc77d128e9 Revert "Merge "raft: fix replication if existing log on leader" from Gleb"
This reverts commit 0aa1f7c70a, reversing
changes made to 72c59e8000. The diff is
strange, including unrelated commits. There is no understanding of the
cause, so to be safe, revert and try again.
2020-12-06 11:34:19 +02:00
Kamil Braun
0b36c5e116 mutation_reader: introduce clustering_order_reader_merger
This abstraction is used to merge the output of multiple readers, each
opened for a single partition query, into a non-decreasing stream
of mutation_fragments.

It is similar to `mutation_reader_merger`,
an important difference is that the new merger may select new readers
in the middle of a partition after it already returned some fragments
from that partition.  It uses the new `position_reader_queue` abstraction
to select new readers. It doesn't support multi-partition (ring range) queries.

The new merger will be later used when reading from sstable sets created
by TimeWindowCompactionStrategy. This strategy creates many sstables
that are mostly disjoint w.r.t the contained clustering keys, so we can
delay opening sstable readers when querying a partition until after we have
processed all mutation fragments with positions before the keys
contained by these sstables.
2020-11-30 11:55:44 +01:00
Kamil Braun
708093884c mutation_reader: move mutation_reader::forwarding to flat_mutation_reader.hh
Files which need this definition won't have to include
mutation_reader.hh, only flat_mutation_reader.hh (so the inclusions are
in total smaller; mutation_reader.hh includes flat_mutation_reader.hh).
2020-11-19 17:52:39 +01:00
Botond Dénes
307cdf1e0d multishard_combining_reader: reader_lifecycle_policy: add permit param to create_reader()
Allow the evictable reader managing the underlying reader to pass its
own permit to it when creating it, making sure they share the same
permit. Note that the two parts can still end up using different
permits, when the underlying reader is kept alive between two pages of a
paged read and thus keeps using the permit received on the previous
page.

Also adjust the `reader_context` in multishard_mutation_query.cc to use
the passed-in permit instead of creating a new one when creating a new
reader.
2020-10-12 15:56:56 +03:00
Botond Dénes
e09ab09fff multishard_combining_reader: add permit parameter
Don't create an own permit, take one as a parameter, like all other
readers do, so the permit can be provided by the higher layer, making
sure all parts of the logical read use the same permit.
2020-10-12 15:56:56 +03:00
Botond Dénes
dd372c8457 flat_mutation_reader: de-virtualize buffer_size()
The main user of this method, the one which required this method to
return the collective buffer size of the entire reader tree, is now
gone. The remaining two users just use it to check the size of the
reader instance they are working with.
So de-virtualize this method and reduce its responsibility to just
returning the buffer size of the current reader instance.
2020-10-06 08:22:56 +03:00
Botond Dénes
0518571e56 flat_mutation_reader: make _buffer a tracked buffer
Via a tracked_allocator. Although the memory allocations made by the
_buffer shouldn't dominate the memory consumption of the read itself,
they can still be a significant portion that scales with the number of
readers in the read.
2020-09-28 10:53:56 +03:00
Botond Dénes
3fab83b3a1 flat_mutation_reader: impl: add reader_permit parameter
Not used yet, this patch does all the churn of propagating a permit
to each impl.

In the next patch we will use it to track to track the memory
consumption of `_buffer`.
2020-09-28 10:53:48 +03:00
Avi Kivity
257c17a87a Merge "Don't depend on seastar::make_(lw_)?shared idiosyncrasies" from Rafael
"
While working on another patch I was getting odd compiler errors
saying that a call to ::make_shared was ambiguous. The reason was that
seastar has both:

template <typename T, typename... A>
shared_ptr<T> make_shared(A&&... a);

template <typename T>
shared_ptr<T> make_shared(T&& a);

The second variant doesn't exist in std::make_shared.

This series drops the dependency in scylla, so that a future change
can make seastar::make_shared a bit more like std::make_shared.
"

* 'espindola/make_shared' of https://github.com/espindola/scylla:
  Everywhere: Explicitly instantiate make_lw_shared
  Everywhere: Add a make_shared_schema helper
  Everywhere: Explicitly instantiate make_shared
  cql3: Add a create_multi_column_relation helper
  main: Return a shared_ptr from defer_verbose_shutdown
2020-08-02 19:51:24 +03:00
Rafael Ávila de Espíndola
e15c8ee667 Everywhere: Explicitly instantiate make_lw_shared
seastar::make_lw_shared has a constructor taking a T&&. There is no
such constructor in std::make_shared:

https://en.cppreference.com/w/cpp/memory/shared_ptr/make_shared

This means that we have to move from

    make_lw_shared(T(...)

to

    make_lw_shared<T>(...)

If we don't want to depend on the idiosyncrasies of
seastar::make_lw_shared.

Signed-off-by: Rafael Ávila de Espíndola <espindola@scylladb.com>
2020-07-21 10:33:49 -07:00
Botond Dénes
5de0afdab7 mutation_reader: expose new_reader_base_cost
So that test code can use it.
2020-07-20 11:23:39 +03:00
Botond Dénes
542d9c3711 mutation_reader: expose evictable_reader
Expose functions for the outside world to create evictable readers. We
expose two functions, which create an evictable reader with
`auto_pause::yes` and `auto_pause::no` respectively. The function
creating the latter also returns a handle in addition to the reader,
which can be used to pause the reader.
2020-06-23 21:08:21 +03:00
Avi Kivity
a4c44cab88 treewide: update concepts language from the Concepts TS to C++20
Seastar recently lost support for the experimental Concepts Technical
Specification (TS) and gained support for C++20 concepts. Re-enable
concepts in Scylla by updating our use of concepts to the C++20
standard.

This change:
 - peels off uses of the GCC6_CONCEPT macro
 - removes inclusions of <seastar/gcc6-concepts.hh>
 - replaces function-style concepts (no longer supported) with
   equation-style concepts
 - semicolons added and removed as needed
 - deprecated std::is_pod replaced by recommended replacement
 - updates return type constraints to use concepts instead of
   type names (either std::same_as or std::convertible_to, with
   std::same_as chosen when possible)

No attempt is made to improve the concepts; this is a specification
update only.
Message-Id: <20200531110254.2555854-1-avi@scylladb.com>
2020-06-02 09:12:21 +03:00
Botond Dénes
d68ac8bf18 treewide: remove all uses of no_reader_permit() 2020-05-28 11:34:35 +03:00
Botond Dénes
4409579352 mutation_reader: restricted_reader: work in terms of reader_permit
We want to refactor all read resource tracking code to work through the
read_permit, so refactor the restricted reader to also do so.
2020-05-28 11:34:35 +03:00
Glauber Costa
e44b2826ab compaction: avoid abandoned futures when using interposers
When using interposers, cancelling compactions can leave futures
that are not waited for (resharding, twcs)

The reason is when consume_end_of_stream gets called, it tries to
push end_of_stream into the queue_reader_handle. Because cancelling
a compaction is done through an exception, the queue_reader_handle
is terminated already at this time. Trying to push to it generates
another exception and prevents us from returning the future right
below it.

This patch adds a new method is_terminated() and if we detect
that the queue_reader_handle is already terminated by this point,
we don't try to push. We call it is_terminated() because the check
is to see if the queue_reader_handle has a _reader. The reader is
also set to null on successful destruction.

Signed-off-by: Glauber Costa <glauber@scylladb.com>
Reviewed-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200430175839.8292-1-glauber@scylladb.com>
2020-05-01 16:30:23 +03:00
Piotr Jastrzebski
e72696a8e6 sharding_info: rename the class to sharder
Also rename all variables that were named si or sinfo
to sharder.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-30 18:42:33 +02:00
Piotr Jastrzebski
031f589dba multishard_combining_reader: use token_for_next_shard from sharding info not partitioner
Previously this function was accessing sharding logic
through partitioner obtained from the schema.

While converting tests, dummy_partitioner is turned into
dummy_sharding_info.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2020-03-30 18:42:25 +02:00
Pavel Emelyanov
96e3d0fa36 mutation_partition: Debloat header form others
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20200317191051.12623-1-xemul@scylladb.com>
2020-03-18 11:53:36 +02:00
Avi Kivity
342c967b6a Merge "Introduce compacting reader" from Botond
"
Allow adding compacting to any reader pipeline. The intended users are
streaming and repair, with the goal to prevent wasting transfer
bandwidth with data that is purgeable.
No current user in the tree.

Tests: unit(dev), mutation_reader_test.compacting_reader_*(debug)
"

* 'compacting-reader/v3' of https://github.com/denesb/scylla:
  test: boost/mutation_reader_test: add unit test for compacting_reader
  test: lib/flat_mutation_reader_assertions: be more lenient about empty mutations
  test: lib/mutation_source_test: make data compaction friendly
  test: random_mutation_generator: add generate_uncompactable mode
  mutation_reader: introduce compacting_reader
2020-03-16 16:41:50 +02:00
Botond Dénes
8286a0b1bd mutation_reader: introduce compacting_reader
Compacting reader compacts the output of another reader on-the-fly.
Performs compaction-type compaction (`compact_for_sstables::yes`).
It will be used in streaming and repair to eliminate purgeable data from
the stream, thus prevent wasting transfer bandwidth.
2020-03-16 13:58:13 +02:00