Commit Graph

29 Commits

Author SHA1 Message Date
Botond Dénes
a7d467d794 position_in_partition: add to_string(partition_region) and parse_partition_region()
And rebase operator<<(partition_region) on top of the former.
2022-06-23 11:19:55 +03:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes were applied mechanically with a script, except for
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Pavel Emelyanov
5515f7187d range_tombstone, code: Add range_tombstone& getters
Currently all the code operates on the range_tombstone class,
and many of those places get the range tombstone in question
from the range_tombstone_list. Next patches will make that list
carry (and return) some new object called range_tombstone_entry,
so all the code that expects to see the former one there will
need to be patched to get the range_tombstone from the _entry one.

This patch prepares the ground for that by introducing the

    range_tombstone& tombstone() { return *this; }

getter on the range_tombstone itself and patching all future
users of the _entry to call .tombstone() right now.

Next patch will remove those getters together with adding the new
range_tombstone_entry object thus automatically converting all
the patched places into using the entry in a proper way.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-03 19:34:45 +03:00
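The transitional getter described in 5515f7187d can be illustrated with a small sketch. The types below are hypothetical, simplified stand-ins (the real range_tombstone and the future range_tombstone_entry are more involved, and `newest_timestamp` is an invented caller); the point is only that code written against `.tombstone()` keeps compiling when the list element type changes.

```cpp
#include <limits>
#include <vector>

// Hypothetical, simplified stand-in for the real range_tombstone class.
struct range_tombstone {
    int start;
    int end;
    long timestamp;

    // Transitional getter from the commit message: returns the object
    // itself, so callers can already be written as `x.tombstone()`.
    range_tombstone& tombstone() { return *this; }
    const range_tombstone& tombstone() const { return *this; }
};

// Assumed shape of the future wrapper; the real range_tombstone_entry
// may look different.
struct range_tombstone_entry {
    range_tombstone rt;

    range_tombstone& tombstone() { return rt; }
    const range_tombstone& tombstone() const { return rt; }
};

// Invented caller: works unchanged with either element type because it
// only goes through .tombstone().
template <typename Element>
long newest_timestamp(const std::vector<Element>& list) {
    long newest = std::numeric_limits<long>::min();
    for (const auto& e : list) {
        if (e.tombstone().timestamp > newest) {
            newest = e.tombstone().timestamp;
        }
    }
    return newest;
}
```

Once every caller goes through `.tombstone()`, swapping the element type of the list becomes a mechanical change.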
Pavel Emelyanov
2e1b21d72b range_tombstone_list: De-templatize pop_as<>
The method pops the range tombstone from the containing list
and transparently "converts" it into some other type. Nowadays
all callers of it need the range tombstone as-is, so the template
can be relaxed down to a plain call.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-03 19:34:45 +03:00
Benny Halevy
4439e5c132 everywhere: cleanup defer.hh includes
Get rid of unused includes of seastar/util/{defer,closeable}.hh
and add a few that are missing from source files.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
2021-08-22 21:11:39 +03:00
Pavel Emelyanov
b5fee07527 mutation_fragment: Specialize appending_hash for it
Row-level repair hashes the mutation fragment and wraps this into a
private fragment_hasher class. For some reason it takes ~20 minutes
for clang to compile row_level.o at the -O3 level (release mode).
Putting the whole fragment_hasher into a dedicated file reduces the
compilation time ~9 times.

However, it seems more natural not to move the fragment_hasher around
but to specialize the appending_hash<> for mutation_fragment and make
row_level.cc code just call feed_hash().

Compilation times (release mode):

                       before     after
row_level.o            19m34s      2m4s
mutation_fragment.o       13s       17s

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-08-18 09:17:40 +03:00
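The approach in b5fee07527 (specialize appending_hash<> for a type and let callers use feed_hash()) can be sketched as follows. This is a simplified illustration, not the Scylla code: `toy_fragment`, `fnv1a_hasher` and `hash_fragment` are invented for the example, and the real appending_hash and feed_hash templates take additional parameters.

```cpp
#include <cstddef>
#include <cstdint>
#include <string>

// Primary template: each hashable type provides its own specialization.
template <typename T>
struct appending_hash;

// Generic entry point, modelled on the feed_hash() named in the commit.
template <typename Hasher, typename T>
void feed_hash(Hasher& h, const T& value) {
    appending_hash<T>()(h, value);
}

// Toy FNV-1a hasher (an assumption; any type with update() would do).
struct fnv1a_hasher {
    std::uint64_t state = 14695981039346656037ull;

    void update(const char* data, std::size_t size) {
        for (std::size_t i = 0; i < size; ++i) {
            state ^= static_cast<unsigned char>(data[i]);
            state *= 1099511628211ull;
        }
    }
};

template <>
struct appending_hash<std::string> {
    template <typename Hasher>
    void operator()(Hasher& h, const std::string& s) const {
        h.update(s.data(), s.size());
    }
};

// Hypothetical, heavily simplified fragment.
struct toy_fragment {
    std::string key;
    std::string value;
};

// The specialization lives next to the type, so the caller's translation
// unit only needs to call feed_hash().
template <>
struct appending_hash<toy_fragment> {
    template <typename Hasher>
    void operator()(Hasher& h, const toy_fragment& f) const {
        feed_hash(h, f.key);
        feed_hash(h, f.value);
    }
};

std::uint64_t hash_fragment(const toy_fragment& f) {
    fnv1a_hasher h;
    feed_hash(h, f);
    return h.state;
}
```

The sketch does not reproduce the compile-time effect reported in the commit; it only shows the shape of the specialization.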
Tomasz Grabiec
91868cf0cd range_tombstone_stream: Introduce peek_next() 2021-07-26 13:33:34 +02:00
Tomasz Grabiec
08b5773c12 Adapt flat_mutation_reader_v2 to the new version of the API
When compacting a mutation fragment stream (e.g. for sstable
compaction, data query, repair), the compactor needs to accumulate
range tombstones which are relevant for the yet-to-be-processed range.
See range_tombstone_accumulator. One problem is that it has unbounded
memory footprint because the accumulator needs to keep track of all
the tombstoned ranges which are still active.

Another, although more benign, problem is computational complexity
needed to maintain that data structure.

The fix is to get rid of the overlap of range tombstones in the
mutation fragment stream. In v2 of the stream, there is no longer a
range_tombstone fragment. Deletions of ranges of rows within a given
partition are represented with range_tombstone_change fragments. At
any point in the stream there is a single active clustered
tombstone. It is initially equal to the neutral tombstone when the
stream of each partition starts. The range_tombstone_change fragment
type signifies changes of the active clustered tombstone. All fragments
emitted while a given clustered tombstone is active are affected by
that tombstone. Like with the old range_tombstone fragments, the
clustered tombstone is independent from the partition tombstone
carried in partition_start.

The v2 stream is strict about range tombstone trimming. It emits range
tombstone changes which reflect range tombstones trimmed to query
restrictions, and fast-forwarding ranges. This makes the stream more
canonical, meaning that for a given set of writes, querying the
database should produce the same stream of fragments for given
restrictions. There is less ambiguity in how the writes are
represented in the fragment stream. It wasn't the case with v1. For
example, a given set of deletions could be produced either as one
range_tombstone, or many, split and/or deoverlapped with other
fragments. Making a stream canonical is easier for diff-calculating.

The classes related to mutation fragment streams were cloned:
flat_mutation_reader_v2, mutation_fragment_v2, and related concepts.

Refs #8625.
2021-06-15 13:10:47 +02:00
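The idea in 08b5773c12 of replacing overlapping range tombstones with a sequence of changes to a single active tombstone can be illustrated with a toy model. Everything here is an assumption made for the sketch: positions are plain integers, ranges are half-open, a tombstone is reduced to its timestamp, and 0 stands for the neutral tombstone. The real fragment types and comparison rules are richer.

```cpp
#include <algorithm>
#include <cstdint>
#include <vector>

// Toy range tombstone covering [start, end); timestamps are assumed > 0.
struct toy_range_tombstone {
    int start;
    int end;
    std::int64_t timestamp;
};

// Toy counterpart of range_tombstone_change: from `position` onwards the
// active tombstone is `timestamp` (0 meaning "no tombstone active").
struct toy_range_tombstone_change {
    int position;
    std::int64_t timestamp;
};

std::vector<toy_range_tombstone_change>
deoverlap(const std::vector<toy_range_tombstone>& rts) {
    // Every start or end is a point where the active tombstone may change.
    std::vector<int> points;
    for (const auto& rt : rts) {
        points.push_back(rt.start);
        points.push_back(rt.end);
    }
    std::sort(points.begin(), points.end());
    points.erase(std::unique(points.begin(), points.end()), points.end());

    std::vector<toy_range_tombstone_change> out;
    std::int64_t active = 0;  // neutral tombstone at partition start
    for (int p : points) {
        // The strongest tombstone covering p wins (quadratic, for clarity).
        std::int64_t strongest = 0;
        for (const auto& rt : rts) {
            if (rt.start <= p && p < rt.end) {
                strongest = std::max(strongest, rt.timestamp);
            }
        }
        if (strongest != active) {
            out.push_back({p, strongest});
            active = strongest;
        }
    }
    return out;
}
```

For the input `[1,10)@5` and `[3,6)@9` the sketch yields changes at positions 1, 3, 6 and 10, with the last one returning to the neutral tombstone, so a consumer only ever has to remember one active tombstone.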
Tomasz Grabiec
e3309322c3 Clone flat_mutation_reader related classes into v2 variants
To make review easier, first clone the classes without changing the
logic. Logic and API will change in subsequent commits.
2021-06-15 13:10:09 +02:00
Avi Kivity
a55b434a2b treewide: extend copyright statements to present day 2021-06-06 19:18:49 +03:00
Pavel Emelyanov
a7a5ad4ded range_tombstone_stream: Remove unused methods
Both methods apply a list of tombstones to the stream. One
was unused even before the set, the other one became unused
after previous patch.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-03-16 12:08:18 +03:00
Pavel Emelyanov
bbd7463960 range_tombstone: Remove unused trim-front arg from .apply()
The only caller of this method always passes true to it.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-11-06 15:13:05 +03:00
Botond Dénes
041d71bd6f mutation_fragment: track memory usage through the reader_permit
The memory usage of mutation fragments is now tracked throughout their
lifetime via a reader permit. This was the last major (to my current
knowledge) untracked piece of the reader pipeline.
2020-09-28 11:27:29 +03:00
Botond Dénes
6ca0464af5 mutation_fragment: add schema and permit
We want to start tracking the memory consumption of mutation fragments.
For this we need schema and permit during construction, and on each
modification, so the memory consumption can be recalculated and passed to
the permit.

In this patch we just add the new parameters and go through the insane
churn of updating all call sites. They will be used in the next patch.
2020-09-28 11:27:23 +03:00
Avi Kivity
19ffc9455d Merge "Don't expose exact collection from range_tombstone_list" from Pavel E
"
The range_tombstone_list provides an abstraction to work with a
sorted list of range tombstones with methods to add/retrieve
them. However, there's a tombstones() method that just returns a
modifiable reference to the used collection (boost::intrusive_set)
which makes it hard to track the exact usage of it.

This set encapsulates the collection of range tombstones inside
the mentioned ..._list class.

tests: unit(dev)
"

* 'br-range-tombstone-encapsulate-collection' of https://github.com/xemul/scylla:
  range_tombstone_list: Do not expose internal collection
  range_tombstone_list: Introduce and use pop-and-lock helper
  range_tombstone_list: Introduce and use pop_as<>()
  flat_mutation_reader: Use range_tombstone_list begin/end API
  repair: Mark some partition_hasher methods noexcept
  hashers: Mark hash updates noexcept
2020-09-15 10:09:15 +02:00
Pavel Emelyanov
4e264b9e4f clustering_row: Do not re-implement deletable_row
The clustering_row is deletable_row + clustering_key, all
its internals work exactly as the relevant deletable_row's
ones.

A similar relation exists between static_row and row, and
the former wraps the latter, so here's the same trick
for the non-static row classes.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-09-08 22:21:15 +03:00
Pavel Emelyanov
a89c7198c2 range_tombstone_list: Introduce and use pop_as<>()
The method extracts an element from the list, constructs
a desired object from it and frees the element. This is a common
usage of range_tombstone_list. Having a helper helps encapsulate
the exact collection inside the class.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-09-07 23:17:41 +03:00
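The helper described in a89c7198c2 (extract an element, construct the desired object from it, free the element) might look roughly like the sketch below. It assumes a std::list in place of the intrusive container the real class uses, and `toy_range_tombstone`, `toy_fragment` and `toy_range_tombstone_list` are invented names.

```cpp
#include <list>
#include <utility>

// Toy stand-in for a range tombstone.
struct toy_range_tombstone {
    int start;
    int end;
};

// Toy type that can be built from a range tombstone.
struct toy_fragment {
    toy_range_tombstone rt;
    explicit toy_fragment(toy_range_tombstone&& r) : rt(std::move(r)) {}
};

class toy_range_tombstone_list {
    // The collection stays private; callers never see its type.
    std::list<toy_range_tombstone> _tombstones;

public:
    void add(toy_range_tombstone rt) { _tombstones.push_back(std::move(rt)); }
    bool empty() const { return _tombstones.empty(); }

    // Pops the first tombstone and converts it to T.
    // Precondition: !empty().
    template <typename T>
    T pop_as() {
        T result(std::move(_tombstones.front()));
        _tombstones.pop_front();  // frees the list element
        return result;
    }
};
```

Because the conversion happens inside the class, the container type can later change without touching the callers.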
Avi Kivity
6728b96df7 clustering_interval_set: split to own header file
clustering_interval_set is a rarely used class, but one that requires
boost/icl, which is quite heavyweight. To speed up compilation, move
it to its own header and sprinkle #includes where needed.

Tests: unit (dev)
Message-Id: <20200214190507.1137532-1-avi@scylladb.com>
2020-02-16 17:40:47 +02:00
Avi Kivity
488c42408a position_in_partition_view: add type-aware printer
If the position_in_partition_view represents a clustering key,
we can now see it with the clustering key decoded according to
the schema.
Message-Id: <20191231151315.602559-1-avi@scylladb.com>
2020-01-07 12:15:09 +01:00
Botond Dénes
8d59c36165 partition_snapshot_reader: don't re-emit range tombstones overlapping multiple ck ranges
When entering a new ck range (of the partition-slice), the partition
snapshot reader will apply to its range tombstones stream all the
tombstones that are relevant to the new ck range. When the partition has
range tombstones that overlap with multiple ck ranges, these will be
applied to the range tombstone stream when entering any of the ck ranges
they overlap with. This will result in the violation of the monotonicity
of the mutation fragments emitted by the reader, as these range
tombstones will be re-emitted on each ck range, if the ck range has at
least one clustering row they apply to.
For example, given the following partition:
    rt{[1,10]}, cr{1}, cr{2}, cr{3}...

And a partition-slice with the following ck ranges:
    [1,2], [3, 4]

The reader will emit the following fragment stream:
    rt{[1,10]}, cr{1}, cr{2}, rt{[1,10]}, cr{3}, ...

Note how the range tombstone is emitted twice. In addition to violating
the monotonicity guarantee, this can also result in an explosion of the
number of emitted range tombstones.

Fix by trimming range tombstones to the start of the current ck range,
thus ensuring that they will not violate mutation fragment monotonicity
guarantees.

Refs: #4104

This is a much simpler fix for the above issue than the already
committed one (7049cd9374). The latter is reverted by the previous
patch and this patch applies the simpler fix.
2019-01-30 10:01:13 +02:00
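The fix in 8d59c36165 (trim range tombstones to the start of the current ck range) can be sketched with a toy model. Integer positions and inclusive bounds are assumptions made to mirror the `[1,10]` notation in the commit message; `trim_to_range_start` is a hypothetical helper, not the function the patch adds.

```cpp
#include <algorithm>
#include <optional>

// Toy range tombstone with inclusive integer bounds, as in rt{[1,10]}.
struct toy_range_tombstone {
    int start;
    int end;
};

// Toy clustering-key range with inclusive integer bounds.
struct toy_ck_range {
    int start;
    int end;
};

// Returns the tombstone trimmed so that it does not start before the
// current ck range, or nothing if it does not overlap the range at all.
std::optional<toy_range_tombstone>
trim_to_range_start(toy_range_tombstone rt, const toy_ck_range& range) {
    if (rt.end < range.start || rt.start > range.end) {
        return std::nullopt;  // irrelevant for this ck range
    }
    rt.start = std::max(rt.start, range.start);
    return rt;
}
```

With the example from the message, entering `[1,2]` yields `rt{[1,10]}` and entering `[3,4]` yields `rt{[3,10]}`, so the second tombstone is positioned after `cr{2}` and the emitted positions stay monotonic.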
Botond Dénes
ff2884f25b Revert "partition_snapshot_reader: don't re-emit range tombstones overlapping multiple ck ranges"
A much simpler and more complete fix was found. Let's revert this before
applying the simpler fix.

This reverts commit 7049cd9374.
2019-01-21 13:56:56 +02:00
Botond Dénes
7049cd9374 partition_snapshot_reader: don't re-emit range tombstones overlapping multiple ck ranges
When entering a new ck range (of the partition-slice), the partition
snapshot reader will apply to its range tombstones stream all the
tombstones that are relevant to the new ck range. When the partition has
range tombstones that overlap with multiple ck ranges, these will be
applied to the range tombstone stream when entering any of the ck ranges
they overlap with. This will result in the violation of the monotonicity
of the mutation fragments emitted by the reader, as these range
tombstones will be re-emitted on each ck range, if the ck range has at
least one clustering row they apply to.
For example, given the following partition:
    rt{[1,10]}, cr{1}, cr{2}, cr{3}...

And a partition-slice with the following ck ranges:
    [1,2], [3, 4]

The reader will emit the following fragment stream:
    rt{[1,10]}, cr{1}, cr{2}, rt{[1,10]}, cr{3}, ...

Note how the range tombstone is emitted twice. In addition to violating
the monotonicity guarantee, this can also result in an explosion of the
number of emitted range tombstones.

Fix by applying only those range tombstones to the range tombstone
stream, that have a position strictly greater than that of the last
emitted clustering row (or range tombstone), when entering a new ck
range.

Fixes: #4104

Signed-off-by: Botond Dénes <bdenes@scylladb.com>
Message-Id: <e047af76df75972acb3c32c7ef9bb5d65d804c82.1547916701.git.bdenes@scylladb.com>
2019-01-20 15:38:04 +02:00
Duarte Nunes
fa2b0384d2 Replace std::experimental types with C++17 std version.
Replace stdx::optional and stdx::string_view with the C++ std
counterparts.

Some instances of boost::variant were also replaced with std::variant,
namely those that called seastar::visit.

Scylla now requires GCC 8 to compile.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20190108111141.5369-1-duarte@scylladb.com>
2019-01-08 13:16:36 +02:00
Benny Halevy
206483e6af position_in_partition_view: print bound_weight as int
Rather than a non-printable char.

Signed-off-by: Benny Halevy <bhalevy@scylladb.com>
Message-Id: <20181226091115.18530-1-bhalevy@scylladb.com>
2018-12-26 11:19:30 +02:00
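The problem addressed by 206483e6af is a general C++ one: streaming a value whose type is a narrow character type prints a character, not a number. A minimal sketch, assuming the weight is an enum with an 8-bit underlying type (`toy_bound_weight` and its enumerators are illustrative, and `format_weight` is a hypothetical helper):

```cpp
#include <cstdint>
#include <sstream>
#include <string>

// Illustrative enum; the names and underlying type are assumptions.
enum class toy_bound_weight : std::int8_t {
    before_all_prefixed = -1,
    equal = 0,
    after_all_prefixed = 1,
};

std::string format_weight(toy_bound_weight w) {
    std::ostringstream os;
    // Casting to int selects the numeric operator<<; streaming the
    // underlying std::int8_t directly would emit a raw character.
    os << static_cast<int>(w);
    return os.str();
}
```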
Asias He
4e55d22a8f position_in_partition: Switch _bound_weight to use enum
The _bound_weight in position_in_partition will be sent on wire in rpc.
Make it enum instead of int.
2018-12-12 16:49:01 +08:00
Paweł Dziepak
637b9a7b3b atomic_cell_or_collection: make operator<< show cell content
After the new in-memory representation of cells was introduced there was
a regression in atomic_cell_or_collection::operator<< which stopped
printing the content of the cell. This makes debugging more inconvenient
and time-consuming. This patch fixes the problem. Schema is propagated
to the atomic_cell_or_collection printer and the full content of the
cell is printed.

Fixes #3571.

Message-Id: <20181024095413.10736-1-pdziepak@scylladb.com>
2018-10-24 13:29:51 +03:00
Vladimir Krivopalov
7a5c4f0a63 mutation_fragment: Add range_tombstone_stream::empty() method.
The method checks if the underlying range_tombstone_list is empty.

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-25 17:55:52 -07:00
Vladimir Krivopalov
0cf42e7fd2 range_tombstone_stream: Remove an unused boolean flag.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-06-18 14:22:12 -07:00
Piotr Jastrzebski
96c97ad1db Rename streamed_mutation* files to mutation_fragment*
Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
2018-01-24 20:56:49 +01:00