Commit Graph

112 Commits

Author SHA1 Message Date
Benny Halevy
4476800493 flat_mutation_reader: get rid of timeout parameter
Now that the timeout is taken from the reader_permit.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-08-24 16:30:51 +03:00
Benny Halevy
e9aff2426e everywhere: make deferred actions noexcept
Prepare for updating seastar submodule to a change
that requires deferred actions to be noexcept
(and return void).

Test: unit(dev, debug)

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-08-22 21:11:52 +03:00
Nadav Har'El
6c27000b98 Merge 'Propagate exceptions without throwing' from Piotr Sarna
NOTE: this series depends on a Seastar submodule update, currently queued in next: 0ed35c6af052ab291a69af98b5c13e023470cba3

In order to avoid needless throwing, exceptions are passed
directly wherever possible. Two mechanisms which help with that are:
 1. `make_exception_future<>` for futures
 2. `co_return coroutine::exception(...)` for coroutines
    which return `future<T>` (the mechanism does not work for `future<>`
    without parameters, unfortunately)

Tests: unit(release)

Closes #9079

* github.com:scylladb/scylla:
  system_keyspace: pass exceptions without throwing
  sstables: pass exceptions without throwing
  storage_proxy: pass exceptions without throwing
  multishard_mutation_query: pass exceptions without throwing
  client_state: pass exceptions without throwing
  flat_mutation_reader: pass exceptions without throwing
  table: pass exceptions without throwing
  commitlog: pass exceptions without throwing
  compaction: pass exceptions without throwing
  database: pass exceptions without throwing
2021-08-01 16:47:47 +03:00
Pavel Emelyanov
1bf643d4fd mutation_partition: Pin mutable access to range tombstones
Some callers of mutation_partition::row_tomstones() don't want
(and shouldn't) modify the list itself, while they may want to
modify the tombstones. This patch explicitly locates those that
need to modify the collection, because the next patch will
return immutable collection for the others.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-27 20:06:53 +03:00
Pavel Emelyanov
ad27bf40e6 mutation_partition: Pin mutable access to rows
Some callers of mutation_partition::clustered_rows() don't want
(and shouldn't) modify the tree of rows, while they may want to
modify the rows themselves. This patch explicitly locates those
that need to modify the collection, because the next patch will
return immutable collection for the others.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-27 20:06:53 +03:00
Piotr Sarna
e5925d4980 flat_mutation_reader: pass exceptions without throwing
In order to avoid needless throwing, exceptions are passed
directly wherever possible. Two mechanisms which help with that are:
 1. make_exception_future<> for futures
 2. co_return coroutine::exception(...) for coroutines
    which return future<T> (the mechanism does not work for future<>
    without parameters, unfortunately)
2021-07-26 17:04:20 +02:00
Piotr Wojtczak
cb2a0ab858 flat_mutation_reader: Add a new filtering reader factory method
Introduce a new function creating a filtering reader using query slice
and partition range.
2021-07-20 14:00:47 +02:00
Michał Radwański
67d99e02a7 flat_mutation_reader: downgrade_to_v1 - reset state of rt_assembler
The downgrade_to_v1 didn't reset the state of range tombstone assembler
in case of the calls to next_partition or fast_forward_to, which caused
a situation where the closing range tombstone change is cleared from the
buffer before being emitted, without notifying the assembler. This patch
fixes the behaviour in fast_forward_to as well.

Fixes #9022
2021-07-19 15:54:26 +02:00
Tomasz Grabiec
1f255c420e flat_mutation_reader_v2: Make is_end_of_stream() reflect consumer-side state of the stream
Currently, flat_mutation_reader_v2::is_end_of_stream() returns
flat_mutation_reader_v2::impl::_end_of_stream, which means the producer
is done. The stream may be still not yet fully consumed even if
producer is done, due to internal buffering. So consumers need to make
a more elaborate check:

  rd.is_end_of_stream() && rd.is_buffer_empty()

It would be cleaner if flat_mutation_reader_v2::is_end_of_stream()
returned the state of the consumer-side of the stream, since it
belongs to the consumer-side of the API. The consumption will be as
simple as:

  while (!rd.is_end_of_stream()) {
    consume_fragment(rd());
  }

This patch makes the change on the v2 of the reader interface. v1 is
not changed to avoid problems which could happen when backporting code
which assumes new semantics into a version with the old semantics. v2
is not in any old branch yet so it doesn't have this problem and it's
a good time to make the API change.

Note that it's always safe to use the new semantics in the context
which assumes the old semantics, so v1 users can be safely converted
to v2 even if they are unware of the change.

Fixes #3067

Message-Id: <20210715102833.146914-1-tgrabiec@scylladb.com>
2021-07-15 14:00:48 +03:00
Botond Dénes
7cf5b43bbc mutation_fragment_stream_validator: add schema() accessor 2021-07-12 07:11:29 +03:00
Tomasz Grabiec
558d88ea17 flat_mutation_reader: Trim range tombstones in make_flat_mutation_reader_from_fragments()
This is needed to change the guarantees of flat_mutation_reader v1 to
produce only range tombstones trimmed to clustering restrictions. The
reason for this is so that v2 has a canonical representation in which
all fragments have position inside clustering restrictions. Conversion
from v1 to v2 can guarantee that only if v1 trims range tombstones.
2021-06-16 00:23:49 +02:00
Tomasz Grabiec
1ef73abd82 flat_mutation_reader_v2: Implement read_mutation_from_flat_mutation_reader() 2021-06-15 13:14:45 +02:00
Tomasz Grabiec
9996b7ca18 flat_mutation_reader: Introduce adaptors between v1 and v2 of mutation fragment stream
The transition to v2 will be incremental. To support that, we need
adaptors between v1 and v2 which will be inserted at places which are
boundaries of conversion.

The v1 -> v2 converter needs to accumulate range tombstones, so has unbounded worst case memory footprint.

The v2 -> v1 converter trims range tombstones around clustering rows,
so generates more fragments than necessary.

Because of that, adpators are a temporary solution and we should not
release with them on the produciton code paths.
2021-06-15 13:10:47 +02:00
Tomasz Grabiec
e3309322c3 Clone flat_mutation_reader related classes into v2 variants
To make review easier, first clone the classes without chaning the
logic. Logic and API will change in subsequent commits.
2021-06-15 13:10:09 +02:00
Pavel Solodovnikov
76bea23174 treewide: reduce header interdependencies
Use forward declarations wherever possible.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>

Closes #8813
2021-06-07 15:58:35 +03:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Botond Dénes
104a47699c mutation_fragment_stream_validator: add reset methods
Allow resetting the validator to a given partition or mutation fragment.
This allows a user which is able to fix corrupt streams to reset the
validator to a partition or row which the validator normally wouldn't
accept and hence it wouldn't advance its internal state to it.
2021-05-05 12:03:42 +03:00
Benny Halevy
b134640829 flat_mutation_reader: abort if not closed before destroyed
The motivation to abort if the reader is not closed before its destroyed
is mainly driven by:
1. Aborting will force us find and fix missing closes.
Otherwise, log warnings can easily be lost in the noise.
(ERRORs however are caught by dtests but won't be necessarily
caught in SCT / production environments)

2. Following patches remove existing cleanup code
in destructors that is not needed once close() is mandated.
If we don't abort on missing close we'll have to keep maintaining
both cleanup paths forever.

3. Not enforcing close exposes us to leaks and potential
use-after-free from background tasks that are left behind.
We want to stop guranteeing the safety of the background tasks
post close().

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:35:07 +03:00
Benny Halevy
5b22731f9a flat_mutation_reader: require close
Make flat_mutation_reader::impl::close pure virtual
so that all implementations are required to implemnt it.

With that, provide a trivial implementation to
all implementations that currently use the default,
trivial close implementation.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:35:07 +03:00
Benny Halevy
0da2eea211 flat_mutation_reader: flat_multi_range_mutation_reader: close underlying reader
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:16:10 +03:00
Benny Halevy
18268ab474 flat_mutation_reader: forwardable_empty_mutation_reader: close optional underlying reader
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:16:10 +03:00
Benny Halevy
e2e642b1b1 flat_mutation_reader: make_forwardable, make_nonforwardable: close underlying reader
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:16:10 +03:00
Benny Halevy
978501c336 flat_mutation_reader: partition_reversing_mutation_reader: implement no-op close
We don't own _source therefore do not close it.
That said, we still need to make sure that the reversing reader
itself is closed to calm down the check when it's destroyed.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:16:10 +03:00
Benny Halevy
f4dfaaa6c9 flat_mutation_reader: delegating_reader: close reader when moved to it
The underlying reader is owned by the caller if it is moved to it,
but not if it was constructed with a reference to the underlying reader.
Close the underlying reader on close() only in the former case.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:16:10 +03:00
Benny Halevy
ca06d3c92a flat_mutation_reader: log a warning if destroyed without closing
We cannot close in the background since there are use cases
that require the impl to be destroyed synchronously.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:16:10 +03:00
Benny Halevy
a471579bd7 flat_mutation_reader: introduce close
Allow closing readers before destorying them.
This way, outstanding background operations
such as read-aheads can be gently canceled
and be waited upon.

Note that similar to destructors, close must not fail.
There is nothing to do about errors after the f_m_r is done.
Enforce that in flat_mutation_reader::close() so if the f_m_r
implementation did return a failure, report it and abort as internal
error.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-04-25 11:16:10 +03:00
Botond Dénes
727bc0f5d4 mutation_fragment_stream_validator: add token validation level
In some cases the full-blown partition key validation and especially the
associated key copy per partition might be deemed too costly. As a next
best thing this patch adds a token only validation, which should cover
99% (number pulled out of my sleeve) of the cases. Let's hope no one
gets unlucky.
2021-03-01 07:49:23 +02:00
Botond Dénes
694f8a4ec6 mutation_fragment_stream_validating_filter: make validation levels more fine-grained
Currently key order validation for the mutation fragment stream
validating filter is all or nothing. Either no keys (partition or
clustering) are validated or all of them. As we suspect that clustering
key order validation would add a significant overhead, this discourages
turning key validation on, which means we miss out on partition key
monotonicity validation which has a much more moderate cost.
This patch makes this configurable in a more fine-grained fashion,
providing separate levels for partition and clustering key monotonicity
validation.

As the choice for the default validation level is not as clear-cut as
before, the default value for the validation level is removed in the
validating filter's constructor.
2021-03-01 07:49:23 +02:00
Pavel Emelyanov
bfcd6a4bb7 flat_mutation_reader: Use clear() in destroy_current_mutation()
Currently the code uses a look of unlink_leftmost_without_rebalance
calls. B-tree does have it, but plain clearing of the tree is a bit
faster with clear().

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-02-02 09:30:30 +03:00
Benny Halevy
29002e3b48 flat_mutation_reader: return future from next_partition
To allow it to asynchronously close underlying readers
on next_partition().

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-01-13 17:35:07 +02:00
Botond Dénes
8dae6152bf mutation_fragment_stream_validator: make it easier to validate concrete fragment types
The current API is tailored to the `mutation_fragment` type. In
the next patch we will want to use the validator from a context where
the mutation fragments are already decomposed into their respective
concrete types, e.g. static_row, clustering_row, etc. To avoid having to
reconstruct a mutation fragment type just to use the validator, add an
API which allows validating these concrete types conveniently too.
2021-01-11 08:07:42 +02:00
Botond Dénes
495f9d54ba flat_mutation_reader: extract fragment stream validator into its own header
To allow using it without pulling in the huge `flat_mutation_reader.hh`.
2021-01-11 08:07:42 +02:00
Botond Dénes
dd372c8457 flat_mutation_reader: de-virtualize buffer_size()
The main user of this method, the one which required this method to
return the collective buffer size of the entire reader tree, is now
gone. The remaining two users just use it to check the size of the
reader instance they are working with.
So de-virtualize this method and reduce its responsibility to just
returning the buffer size of the current reader instance.
2020-10-06 08:22:56 +03:00
Botond Dénes
256140a033 mutation_fragment: memory_usage(): remove unused schema parameter
The memory usage is now maintained and updated on each change to the
mutation fragment, so it needs not be recalculated on a call to
`memory_usage()`, hence the schema parameter is unused and can be
removed.
2020-09-28 11:27:47 +03:00
Botond Dénes
6ca0464af5 mutation_fragment: add schema and permit
We want to start tracking the memory consumption of mutation fragments.
For this we need schema and permit during construction, and on each
modification, so the memory consumption can be recalculated and pass to
the permit.

In this patch we just add the new parameters and go through the insane
churn of updating all call sites. They will be used in the next patch.
2020-09-28 11:27:23 +03:00
Botond Dénes
0518571e56 flat_mutation_reader: make _buffer a tracked buffer
Via a tracked_allocator. Although the memory allocations made by the
_buffer shouldn't dominate the memory consumption of the read itself,
they can still be a significant portion that scales with the number of
readers in the read.
2020-09-28 10:53:56 +03:00
Botond Dénes
3fab83b3a1 flat_mutation_reader: impl: add reader_permit parameter
Not used yet, this patch does all the churn of propagating a permit
to each impl.

In the next patch we will use it to track to track the memory
consumption of `_buffer`.
2020-09-28 10:53:48 +03:00
Pavel Emelyanov
f19b85b61d range_tombstone_list: Introduce and use pop-and-lock helper
There's an optimization in flat_mutation_reader_from_mutations
that folds the list from left-to-right in linear time. In case
of currently used boost::set the .unlink_leftmost_without_rebalance
helper is used, so wrap this exception with a method of the
range_tombstone_list. This is the last place where caller need
to mess with the exact internal collection.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-09-07 23:17:41 +03:00
Pavel Emelyanov
a89c7198c2 range_tombstone_list: Introduce and use pop_as<>()
The method extracts an element from the list, constructs
a desired object from it and frees. This is common usage
of range_tombstone_list. Having a helper helps encapsulating
the exact collection inside the class.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-09-07 23:17:41 +03:00
Pavel Emelyanov
27912375b2 flat_mutation_reader: Use range_tombstone_list begin/end API
The goal is to stop revealing the exact collection from
the range_tombstone_list, so make use of existing begin/end
methods and extend with rbegin() where needed.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-09-07 23:17:41 +03:00
Botond Dénes
92ce39f014 query: query_class_config: use max_result_size for the max_memory_for_unlimited_query field
We want to switch from using a single limit to a dual soft/hard limit.
As a first step we switch the limit field of `query_class_config` to use
the recently introduced type for this. As this field has a single user
at the moment -- reverse queries (and not a lot of propagation) -- we
update it in this same patch to use the soft/hard limit: warn on
reaching the soft limit and abort on the hard limit (the previous
behaviour).
2020-07-28 18:00:29 +03:00
Piotr Sarna
4cb79f04b0 treewide: replace libjsoncpp usage with rjson
In order to eventually switch to a single JSON library,
most of the libjsoncpp usage is dropped in favor of rjson.
Unfortunately, one usage still remains:
test/utils/test_repl utility heavily depends on the *exact textual*
format of its output JSON files, so replacing a library results
in all tests failing because of differences in formatting.
It is possible to force rjson to print its documents in the exact
matching format, but that's left for later, since the issue is not
critical. It would be nice though if our test suite compared
JSON documents with a real JSON parser, since there are more
differences - e.g. libjsoncpp keeps children of the object
sorted, while rapidjson uses an unordered data structure.
This change should cause no change in semantics, it strives
just to replace all usage of libjsoncpp with rjson.
2020-07-03 10:27:23 +02:00
Botond Dénes
0b4ec62332 flat_mutation_reader: flat_multi_range_reader: add reader_permit parameter
Mutation sources will soon require a valid permit so make sure we have
one and pass it to the mutation sources when creating the underlying
readers.
For now, pass no_reader_permit() on call sites, deferring the obtaining
of a valid permit to later patches.
2020-05-28 11:34:35 +03:00
Botond Dénes
196dd5fa9b treewide: throw std::bad_function_call with backtraces
We typically use `std::bad_function_call` to throw from
mandatory-to-implement virtual functions, that cannot have a meaningful
implementation in the derived class. The problem with
`std::bad_function_call` is that it carries absolutely no information
w.r.t. where was it thrown from.

I originally wanted to replace `std::bad_function_call` in our codebase
with a custom exception type that would allow passing in the name of the
function it is thrown from to be included in the exception message.
However after I ended up also including a backtrace, Benny Halevy
pointed out that I might as well just throw `std:bad_function_call` with
a backtrace instead. So this is what this patch does.

All users are various unimplemented methods of the
`flat_mutation_reader::impl` interface.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200408075801.701416-1-bdenes@scylladb.com>
2020-04-08 13:54:06 +02:00
Botond Dénes
7bdeec4b00 flat_mutation_reader: make_reversing_reader(): add memory limit
If the reversing requires more memory than the limit, the read is
aborted. All users are updated to get a meaningful limit, from the
respective table object, with the exception of tests of course.
2020-02-27 18:11:54 +02:00
Botond Dénes
091d80e8c3 flat_mutation_reader: expose reverse reader as a standalone reader
Currently reverse reads just pass a flag to
`flat_mutation_reader::consume()` to make the read happen in reverse.
This is deceptively simple and streamlined -- while in fact behind the
scenes a reversing reader is created to wrap the reader in question to
reverse partitions, one-by-one.

This patch makes this apparent by exposing the reversing reader via
`make_reversing_reader()`. This now makes how reversing works more
apparent. It also allows for more configuration to be passed to the
reversing reader (in the next patches).

This change is forward compatible, as in time we plan to add reversing
support to the sstable layer, in which case the reversing reader will
go.
2020-02-27 18:11:54 +02:00
Botond Dénes
1b7725af4b mutation_fragment_stream_validator: split into low-level and high-level API
The low-level validator allows fine-grained validation of different
aspects of monotonicity of a fragment stream. It doesn't do any error
handling. Since different aspects can be validated with different
functions, this allows callers to understand what exactly is invalid.

The high-level API is the previous fragment filter one. This is now
built on the low-level API.

This division allows for advanced use cases where the user of the
validator wants to do all error handling and wants to decide exactly
what monotonicity to validate. The motivating use-case is scrubbing
compaction, added in the next patches.
2020-02-13 15:02:32 +02:00
Botond Dénes
dfc8b2fc45 treewide: replace reader_resource_tracer with reader_permit
The former was never really more than a reader_permit with one
additional method. Currently using it doesn't even save one from any
includes. Now that readers will be using reader_permit we would have to
pass down both to mutation_source. Instead get rid of
reader_resource_tracker and just use reader_permit. Instead of making it
a last and optional parameter that is easy to ignore, make it a
first class parameter, right after schema, to signify that permits are
now a prominent part of the reader API.

This -- mostly mechanical -- patch essentially refactors mutation_source
to ask for the reader_permit instead of reader_resource_tracking and
updates all usage sites.
2020-01-28 08:13:16 +02:00
Botond Dénes
a74a82d4d2 flat_mutation_reader: mutation_fragment_stream_validator: add name
Add a name parameter to the validator, so that the validator can be
identified in log messages. Schema identity information is added to the
name automatically. This should help pinpoint the problematic place
where validation failed.
Although at the moment we have a single validator, it still benefits
from having a name, as we can now include in it the name of the sstable
being written and hence trace the source of the bad data.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20200117150616.895878-1-bdenes@scylladb.com>
2020-01-20 11:06:30 +01:00
Botond Dénes
08bb0bd6aa mutation_fragment_stream_validator: wrap exceptions into own exception type
So a higher level component using the validator to validate a stream can
catch only validation errors, and let any other incidental exception
through.

This allows building data correctors on top of the
`mutation_fragment_stream_validator`, by filtering a fragment stream
through a validator, catching invalid fragment stream exceptions and
dropping the respective fragments from the stream.

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <20191220073443.530750-1-bdenes@scylladb.com>
2019-12-20 12:05:00 +01:00