Commit Graph

29 Commits

Author SHA1 Message Date
Avi Kivity
e1c326a5ba Merge "Convert multishard writer to v2" from Botond
"
Also convert the foreign_reader used by it in the process.

Tests: unit(dev)
"

* 'multishard-writer-v2/v1' of https://github.com/denesb/scylla:
  mutation_writer/multishard_writer: remove now unused v1 factory overloads
  test/boost/mutation_writer_test: test the v2 variant of distribute_reader_and_consume_on_shards()
  flat_mutation_reader: add v2 variant of make_generating_reader()
  mutation_reader: multishard_writer: migrate implementation to v2
  mutation_reader: convert foreign_reader to v2
  streaming/consumer: convert to v2
  mutation_writer/multishard_writer: add v2 variant of distribute_reader_and_consume_on_shards()
2022-03-09 19:28:05 +02:00
Botond Dénes
7a119080ee flat_mutation_reader: add v2 variant of make_generating_reader() 2022-03-02 09:56:50 +02:00
Michael Livshin
8da28d0902 flat_mutation_reader_v2: add consume_partitions()
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-02-28 17:11:54 +02:00
Michael Livshin
fbbe27051e flat_mutation_reader_v2: add delegating_reader_v2
Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-02-28 17:11:54 +02:00
Michał Radwański
9ada63a9cb flat_mutation_reader: allow destructing readers which are not closed and didn't initiate any IO.
In functions such as upgrade_to_v2 (excerpt below), if the constructor
of transforming_reader throws, r needs to be destroyed even though it
hasn't been closed. However, if a reader didn't start any operations,
it is safe to destroy such a reader. This issue can potentially
manifest itself in many more readers and might be hard to track down.
This commit adds a bool indicating whether a close is anticipated,
thus avoiding errors in the destructor.

Code excerpt:
flat_mutation_reader_v2 upgrade_to_v2(flat_mutation_reader r) {
    class transforming_reader : public flat_mutation_reader_v2::impl {
        // ...
    };
    // If the transforming_reader constructor throws here, r is destroyed
    // without having been closed.
    return make_flat_mutation_reader_v2<transforming_reader>(std::move(r));
}
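
The guard described in the message can be sketched minimally as follows; the class and method names here are hypothetical stand-ins, not Scylla's actual API:

```cpp
#include <cassert>

// A reader must normally be close()d before destruction, but destroying it
// unclosed is fine as long as it never initiated any operation.
class guarded_reader {
    bool _close_expected = false;   // set once any operation is initiated
public:
    void fill_buffer() {            // stands in for any IO-initiating call
        _close_expected = true;
    }
    void close() {
        _close_expected = false;
    }
    bool close_expected() const { return _close_expected; }
    ~guarded_reader() {
        // Safe to destroy only if closed, or if no operation ever started.
        assert(!_close_expected);
    }
};
```

Constructing and immediately destroying such a reader is safe; once an operation has started, close() must precede destruction.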

Fixes #9065.
2022-02-28 17:11:54 +02:00
Botond Dénes
1fa6537a2f mutation_fragment_v2,flat_mutation_reader_v2: mirror v1 concept organization
Currently all concepts are in mutation_fragment_v2.hh and
flat_mutation_reader_v2.hh. Organize concepts similar to how the v1 ones
are: move high-level consume concepts into
mutation_consumer_concepts.hh.
2022-02-21 12:27:55 +02:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes were applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Avi Kivity
134601a15e Merge "Convert input side of mutation compactor to v2" from Botond
"
With this series the mutation compactor can now consume a v2 stream. On
the output side it still uses v1, so it can now act as an online
v2->v1 converter. This allows us to push out v2->v1 conversion to as far
as the compactor, usually the next to last component in a read pipeline,
just before the final consumer. For reads this is as far as we can go,
as the intra-node ABI and hence the result-sets built are v1. For
compaction we could go further and eliminate conversion altogether, but
this requires some further work on both the compactor and the sstable
writer and so it is left to be done later.
To summarize, this patchset enables v2 input for the compactor and
updates compaction and single partition reads to use it.
"

* 'mutation-compactor-consume-v2/v1' of https://github.com/denesb/scylla:
  table: add make_reader_v2()
  querier: convert querier_cache and {data,mutation}_querier to v2
  compaction: upgrade compaction::make_interposer_consumer() to v2
  mutation_reader: remove unnecessary stable_flattened_mutations_consumer
  compaction/compaction_strategy: convert make_interposer_consumer() to v2
  mutation_writer: migrate timestamp_based_splitting_writer to v2
  mutation_writer: migrate shard_based_splitting_writer to v2
  mutation_writer: add v2 clone of feed_writer and bucket_writer
  flat_mutation_reader_v2: add reader_consumer_v2 typedef
  mutation_reader: add v2 clone of queue_reader
  compact_mutation: make start_new_page() independent of mutation_fragment version
  compact_mutation: add support for consuming a v2 stream
  compact_mutation: extract range tombstone consumption into own method
  range_tombstone_assembler: add get_range_tombstone_change()
  range_tombstone_assembler: add get_current_tombstone()
2022-01-12 14:37:19 +02:00
Michael Livshin
221cd264db convert make_flat_multi_range_reader() to flat_mutation_reader_v2
Mechanical changes and a resulting downgrade in one caller (which is
itself converted later).

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2022-01-11 10:49:26 +02:00
Botond Dénes
2d7625f4b3 flat_mutation_reader_v2: add reader_consumer_v2 typedef
v2 version of the reader_consumer typedef.
2022-01-07 13:48:36 +02:00
Botond Dénes
62d82b8b0e flat_mutation_reader: convert make_flat_mutation_reader_from_mutation() to v2
Since this reader is also heavily used by tests to compose a specific
workload for readers above it, we just add a v2 variant, instead of
changing the existing v1 one.
The v2 variant was written from scratch to have built-in support for
reading in reverse. It is built on `mutation::consume()` to avoid
duplicating the logic of consuming the contents of the mutation.

A v2 native unit test is also added.
2022-01-05 09:06:16 +02:00
Michael Livshin
9f656b96ac short-circuit flat mutation reader upgrades and downgrades
When asked to upgrade a reader that itself is a downgrade, try to
return the original v2 reader instead, and likewise when downgrading
upgraded v1 readers.

This is desirable because version transformations can result from,
say, entering/leaving a reader concurrency semaphore, and the amount
of such transformations is practically unbounded.

Such short-circuiting is only done if it is safe, that is: the
transforming reader's buffer is empty and its internal range tombstone
tracking state is discardable.
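
The short-circuiting idea can be sketched roughly like this; all the type names below are hypothetical stand-ins, not Scylla's actual classes. A v1 reader obtained by downgrading a v2 reader keeps the original inside, so upgrading it can simply unwrap, provided the wrapper carries no buffered state:

```cpp
#include <cassert>
#include <memory>
#include <utility>

struct reader_v1 { virtual ~reader_v1() = default; };
struct reader_v2 { virtual ~reader_v2() = default; };

// A v1 reader produced by downgrading a v2 reader keeps the original inside.
struct downgrading_reader : reader_v1 {
    std::unique_ptr<reader_v2> original;
    bool buffer_empty = true;   // unwrapping is only safe with no pending state
    explicit downgrading_reader(std::unique_ptr<reader_v2> r)
        : original(std::move(r)) {}
};

struct upgrading_reader : reader_v2 {
    std::unique_ptr<reader_v1> inner;
    explicit upgrading_reader(std::unique_ptr<reader_v1> r)
        : inner(std::move(r)) {}
};

std::unique_ptr<reader_v2> upgrade(std::unique_ptr<reader_v1> r) {
    // Short-circuit: if r is itself a downgrade and holds no state, return
    // the original v2 reader instead of stacking another wrapper.
    if (auto* d = dynamic_cast<downgrading_reader*>(r.get()); d && d->buffer_empty) {
        return std::move(d->original);
    }
    return std::make_unique<upgrading_reader>(std::move(r));
}
```

With this, an upgrade of a downgrade hands back the very same v2 reader, keeping the wrapper depth bounded no matter how many times a reader crosses a version boundary.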

Signed-off-by: Michael Livshin <michael.livshin@scylladb.com>
2021-12-21 11:26:17 +02:00
Botond Dénes
c193bbed82 flat_mutation_reader_v2: add v2 version of empty reader
Convert the v1 implementation to v2, downgrade to v1 in the existing
`make_empty_flat_reader()`.
2021-12-16 14:57:49 +02:00
Botond Dénes
39426b1aa3 flat_mutation_reader_v2: add make_flat_mutation_reader_from_fragments()
The main difference compared to v1 (apart from having _v2 suffix at
relevant places) is how slicing and reversing works. The v2 variant has
native reverse support built-in because the reversing reader is not
something we want to convert to v2.

A native v2 mutation-source test is also added.
2021-12-10 15:48:49 +02:00
Botond Dénes
76ee3f029c flat_mutation_reader_v2: add make_forwardable()
Not a completely straightforward conversion, as the v2 version has to
make sure to emit the current range tombstone change after
fast_forward_to() (if it differs from the one active before fast
forwarding).
Changes are around the two new members `_tombstone_to_emit` and
`maybe_emit_tombstone()`.
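A simplified sketch of that mechanism; the two member names follow the commit message, but the types (plain longs for tombstones, with 0 as the neutral tombstone) are illustrative, not Scylla's fragment types:

```cpp
#include <cassert>
#include <optional>
#include <utility>

struct forwardable_sketch {
    long _active_tombstone = 0;            // 0 = neutral tombstone
    std::optional<long> _tombstone_to_emit;

    void on_range_tombstone_change(long ts) {
        _active_tombstone = ts;
    }

    void fast_forward_to(/* new range */) {
        // A tombstone active across the forwarding point must be re-emitted
        // at the start of the new range, or the consumer would miss it.
        if (_active_tombstone != 0) {
            _tombstone_to_emit = _active_tombstone;
        }
    }

    std::optional<long> maybe_emit_tombstone() {
        return std::exchange(_tombstone_to_emit, std::nullopt);
    }
};
```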
2021-12-10 15:48:49 +02:00
Michał Radwański
b68a6c63e9 flat_mutation_reader: remove unused reserve_one method
Closes #9410
2021-09-29 17:22:29 +02:00
Benny Halevy
4476800493 flat_mutation_reader: get rid of timeout parameter
Now that the timeout is taken from the reader_permit.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-08-24 16:30:51 +03:00
Benny Halevy
eeab5f77d9 repair: row_level: read_mutation_fragment: set reader timeout
The timeout needs to be propagated to the reader's permit.
Reset it to db::no_timeout in repair_reader::pause().

Warn if set_timeout asks to move the timeout more than 100ms into
the past. It is possible that it will be passed a past timeout from
the rpc path, where the message timeout is applied (as a duration)
on top of the local lowres_clock time, and parallel read_data
messages that share the query may end up having close, but
different, timeout values.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-08-24 16:30:40 +03:00
Benny Halevy
f25aabf1b2 flat_mutation_reader: maybe_timed_out: use permit timeout
Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-08-24 14:29:44 +03:00
Benny Halevy
8674746fdd flat_mutation_reader: detach_buffer: mark as noexcept
Since detach_buffer is used before closing and
destroying the reader, we want to mark it as noexcept
to simplify the caller's error handling.

Although it does construct a new circular_buffer,
none of the constructors used may throw.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210617114240.1294501-2-bhalevy@scylladb.com>
2021-07-25 12:02:27 +03:00
Benny Halevy
0e31cdf367 flat_mutation_reader: detach_buffer: clarify buffer constructor
detach_buffer exchanges the current _buffer with
a new buffer constructed using the circular_buffer(Alloc)
constructor. The compiler implicitly constructs a
tracking_allocator(reader_permit) and passes it
to the circular_buffer constructor.

This patch just makes that explicit so it would be
clearer to the reader what's going on here.
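
A generic illustration of the pattern (std::vector and std::allocator standing in for Scylla's circular_buffer and tracking_allocator, which are not reproduced here): detach_buffer() exchanges the current buffer with a fresh one, spelling the allocator construction out explicitly instead of relying on an implicit conversion.

```cpp
#include <cassert>
#include <memory>
#include <utility>
#include <vector>

// Return the current contents and leave behind a fresh, empty buffer
// constructed with an explicitly-spelled allocator.
std::vector<int> detach_buffer(std::vector<int>& buffer) {
    return std::exchange(buffer, std::vector<int>(std::allocator<int>()));
}
```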

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210617114240.1294501-1-bhalevy@scylladb.com>
2021-07-25 11:59:37 +03:00
Michał Radwański
c4089007a2 flat_mutation_reader: introduce public method returning the default size of internal buffer

This method is useful in tests that examine behaviour after the buffer
has been filled up.
2021-07-19 15:54:13 +02:00
Tomasz Grabiec
1f255c420e flat_mutation_reader_v2: Make is_end_of_stream() reflect consumer-side state of the stream
Currently, flat_mutation_reader_v2::is_end_of_stream() returns
flat_mutation_reader_v2::impl::_end_of_stream, which means the producer
is done. The stream may still not be fully consumed even if the
producer is done, due to internal buffering. So consumers need to make
a more elaborate check:

  rd.is_end_of_stream() && rd.is_buffer_empty()

It would be cleaner if flat_mutation_reader_v2::is_end_of_stream()
returned the state of the consumer-side of the stream, since it
belongs to the consumer-side of the API. The consumption will be as
simple as:

  while (!rd.is_end_of_stream()) {
    consume_fragment(rd());
  }

This patch makes the change on the v2 of the reader interface. v1 is
not changed to avoid problems which could happen when backporting code
which assumes new semantics into a version with the old semantics. v2
is not in any old branch yet so it doesn't have this problem and it's
a good time to make the API change.

Note that it's always safe to use the new semantics in a context
which assumes the old semantics, so v1 users can be safely converted
to v2 even if they are unaware of the change.

Fixes #3067

Message-Id: <20210715102833.146914-1-tgrabiec@scylladb.com>
2021-07-15 14:00:48 +03:00
Avi Kivity
9059514335 build, treewide: enable -Wpessimizing-move warning
This warning prevents using std::move() where it can hurt
- on an unnamed temporary or a named automatic variable being
returned from a function. In both cases the value could be
constructed directly in its final destination, but std::move()
prevents it.

Fix the handful of cases (all trivial), and enable the warning.
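
A minimal instance of what the warning catches: std::move() on a local being returned blocks copy elision (NRVO) and forces a move where the object could have been constructed directly in the caller's storage. (The function name here is illustrative.)

```cpp
#include <string>
#include <utility>

std::string make_greeting() {
    std::string s = "hello";
    // return std::move(s);   // pessimizing: triggers -Wpessimizing-move
    return s;                 // NRVO: constructed in place, no copy or move
}
```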

Closes #8992
2021-07-08 17:52:34 +03:00
Benny Halevy
9bbe7b1482 sstables: mx_sstable_mutation_reader: enforce timeout
Check if the timeout has expired before issuing I/O.

Note that the sstable reader input_stream is not closed
when the timeout is detected. The reader must be closed anyhow after
the error bubbles up the chain of readers and before the
reader is destroyed.  This might already happen if the reader
times out while waiting for reader_concurrency_semaphore admission.

Test: unit(dev), auth_test.test_alter_with_timeouts(debug)

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20210624073232.551735-1-bhalevy@scylladb.com>
2021-06-24 12:26:57 +02:00
Tomasz Grabiec
1ef73abd82 flat_mutation_reader_v2: Implement read_mutation_from_flat_mutation_reader() 2021-06-15 13:14:45 +02:00
Tomasz Grabiec
9996b7ca18 flat_mutation_reader: Introduce adaptors between v1 and v2 of mutation fragment stream
The transition to v2 will be incremental. To support that, we need
adaptors between v1 and v2 which will be inserted at places which are
boundaries of conversion.

The v1 -> v2 converter needs to accumulate range tombstones, so it has
an unbounded worst-case memory footprint.

The v2 -> v1 converter trims range tombstones around clustering rows,
so it generates more fragments than necessary.

Because of that, adaptors are a temporary solution and we should not
release with them on the production code paths.
2021-06-15 13:10:47 +02:00
Tomasz Grabiec
08b5773c12 Adapt flat_mutation_reader_v2 to the new version of the API
When compacting a mutation fragment stream (e.g. for sstable
compaction, data query, repair), the compactor needs to accumulate
range tombstones which are relevant for the yet-to-be-processed range.
See range_tombstone_accumulator. One problem is that it has unbounded
memory footprint because the accumulator needs to keep track of all
the tombstoned ranges which are still active.

Another, although more benign, problem is computational complexity
needed to maintain that data structure.

The fix is to get rid of the overlap of range tombstones in the
mutation fragment stream. In v2 of the stream, there is no longer a
range_tombstone fragment. Deletions of ranges of rows within a given
partition are represented with range_tombstone_change fragments. At
any point in the stream there is a single active clustered
tombstone. It is initially equal to the neutral tombstone when the
stream of each partition starts. The range_tombstone_change fragment
type signifies changes of the active clustered tombstone. All fragments
emitted while a given clustered tombstone is active are affected by
that tombstone. Like with the old range_tombstone fragments, the
clustered tombstone is independent from the partition tombstone
carried in partition_start.

The v2 stream is strict about range tombstone trimming. It emits range
tombstone changes which reflect range tombstones trimmed to query
restrictions and fast-forwarding ranges. This makes the stream more
canonical, meaning that for a given set of writes, querying the
database should produce the same stream of fragments for given
restrictions. There is less ambiguity in how the writes are
represented in the fragment stream. This wasn't the case with v1: for
example, a given set of deletions could be produced either as one
range_tombstone, or split and/or de-overlapped with other
fragments. A canonical stream is easier to calculate diffs on.

The classes related to mutation fragment streams were cloned:
flat_mutation_reader_v2, mutation_fragment_v2, and related concepts.

Refs #8625.
2021-06-15 13:10:47 +02:00
Tomasz Grabiec
e3309322c3 Clone flat_mutation_reader related classes into v2 variants
To make review easier, first clone the classes without changing the
logic. Logic and API will change in subsequent commits.
2021-06-15 13:10:09 +02:00