Commit Graph

48 Commits

Author SHA1 Message Date
David Garcia
bb21c3c869 Move dev docs to docs/dev 2022-06-24 18:07:08 +01:00
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes were applied mechanically with a script, except for
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Botond Dénes
f02632aeb0 range_tombstone_accumulator: drop _reversed flag 2021-09-09 15:42:15 +03:00
Botond Dénes
30f6f676b8 range_tombstone: add reverse()
Reverses the range tombstone, as if it had been emitted from a table
with reversed clustering order.
2021-09-09 11:49:05 +03:00
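A minimal sketch of what reversing a range tombstone could look like, assuming integer keys in place of clustering key prefixes. `toy_range_tombstone`, `bound_kind` and `reverse_kind()` are hypothetical stand-ins for illustration, not the actual Scylla types:

```cpp
#include <cassert>
#include <utility>

// Hypothetical, simplified bound kinds (not the real Scylla enum).
enum class bound_kind { incl_start, excl_start, incl_end, excl_end };

// A start bound becomes an end bound and vice versa; inclusiveness is kept.
inline bound_kind reverse_kind(bound_kind k) {
    switch (k) {
    case bound_kind::incl_start: return bound_kind::incl_end;
    case bound_kind::excl_start: return bound_kind::excl_end;
    case bound_kind::incl_end:   return bound_kind::incl_start;
    case bound_kind::excl_end:   return bound_kind::excl_start;
    }
    assert(false); // all enumerators are handled above
    return k;
}

struct toy_range_tombstone {
    int start;
    bound_kind start_kind;
    int end;
    bound_kind end_kind;
    long timestamp;

    // In reversed clustering order the old end comes first, so the bounds
    // trade places and each bound changes its start/end role.
    void reverse() {
        std::swap(start, end);
        std::swap(start_kind, end_kind);
        start_kind = reverse_kind(start_kind);
        end_kind = reverse_kind(end_kind);
    }
};
```

With this toy type, reversing `[1, 5)` gives a tombstone that starts at 5 (exclusive) and ends at 1 (inclusive); the timestamp is untouched.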
Pavel Emelyanov
7a0e56d7c1 range_tombstone: Drop without-link constructor
The thing was used to move a range tombstone without detaching it
from the containing list (well, intrusive set). Now that the linkage
is gone, this facility is no longer needed (and actually no longer
used).

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-03 19:34:50 +03:00
Pavel Emelyanov
f82b5f30f6 range_tombstone: Drop move_assign()
The helper was in use by move-assignment operator and by the .swap()
method. Since now the operator equals the helper, the code can be
merged and the .swap() can be prettified.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-03 19:34:50 +03:00
Pavel Emelyanov
d6af441eaa range_tombstone: Move linkage into range_tombstone_entry
Now it's time to remove the boost set's hook from the range_tombstone
and keep it wrapped into another class if the r._tombstone's location
is the range_tombstone_list.

Also, the previously added .tombstone() getters and the _entry alias
can be removed -- all the code can work with the new class.

Two places in the code that made use of without_link{} move-constructor
are patched to get the range_tombstone part from the respective _entry
with the same result.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-03 19:34:45 +03:00
Pavel Emelyanov
5515f7187d range_tombstone, code: Add range_tombstone& getters
Currently all the code operates on the range_tombstone class,
and many of those places get the range tombstone in question
from the range_tombstone_list. Next patches will make that list
carry (and return) some new object called range_tombstone_entry,
so all the code that expects to see the former one there will
need to be patched to get the range_tombstone from the _entry one.

This patch prepares the ground for that by introducing the

    range_tombstone& tombstone() { return *this; }

getter on the range_tombstone itself and patching all future
users of the _entry to call .tombstone() right now.

Next patch will remove those getters together with adding the new
range_tombstone_entry object thus automatically converting all
the patched places into using the entry in a proper way.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-03 19:34:45 +03:00
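The two-step refactoring described above can be sketched as follows. This is an illustration only: `toy_range_tombstone`, `toy_range_tombstone_entry` and `length_of()` are hypothetical names, and both types are shown side by side here although the real series removes the getter from range_tombstone once the entry class exists.

```cpp
#include <cassert>

struct toy_range_tombstone {
    int start;
    int end;
    // Step 1: a trivial getter, so callers can already write x.tombstone().
    toy_range_tombstone& tombstone() { return *this; }
};

// Step 2 (a later patch): the list element becomes a wrapper with the same getter.
struct toy_range_tombstone_entry {
    toy_range_tombstone _tombstone;
    toy_range_tombstone& tombstone() { return _tombstone; }
};

// A call site written against .tombstone() compiles unchanged for both element types.
template <typename Element>
int length_of(Element& e) {
    toy_range_tombstone& rt = e.tombstone();
    assert(rt.start <= rt.end);
    return rt.end - rt.start;
}
```

The point of the intermediate getter is that the large, mechanical call-site patch and the type change can land as separate commits.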
Pavel Emelyanov
fcc02c6bed range_tombstone(_list): Mark some bits noexcept
The range_tombstone's .empty() and .operator bool are trivially such.

The swap()'s noexceptness comes from what it calls -- the without-link
move constructor (noexcept) and .move_assign(). The latter is noexcept
because it's already called from noexcept move-assign operator and
because it calls noexcept move operators of tombstones' fields. The
update_node() is noexcept for the same reason.

The range_tombstone_list's clear() is noexcept because both the set's
clear and the disposer lambda are noexcept.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-09-03 19:31:43 +03:00
Pavel Emelyanov
0f53e83a8e range_tombstone_list, code: Mark external_memory_usage noexcept
The range_tombstone_list's method is at the top of the
stack of calls each not throwing anything, so do the
deep-dive noexcept marking.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-27 20:06:53 +03:00
Tomasz Grabiec
c9f2daaa8e range_tombstone: Introduce trim() 2021-06-15 13:14:45 +02:00
Avi Kivity
a55b434a2b treewide: extend copyright statements to present day 2021-06-06 19:18:49 +03:00
Pavel Solodovnikov
fff7ef1fc2 treewide: reduce boost headers usage in scylla header files
`dev-headers` target is also ensured to build successfully.

Signed-off-by: Pavel Solodovnikov <pa.solodovnikov@scylladb.com>
2021-05-20 01:33:18 +03:00
Tomasz Grabiec
6863a5e43b row_cache: Avoid generating overlapping range tombstones
Row cache reader can produce overlapping range tombstones in the
mutation fragment stream even if there is only a single range
tombstone in sstables, due to #2581. For every range between two rows,
the row cache reader queries for tombstones relevant for that
range. The result of the query is trimmed to the current position of
the reader (=position of the previous row) to satisfy key
monotonicity. The end position of range tombstones is left
unchanged. So cache reader will split a single range tombstone around
rows. Those range tombstones are transient, they will be only
materialized in the reader's stream, they are not persisted anywhere.

That is not a problem in itself, but it interacts badly with mutation
compactor due to #8625. The range_tombstone_accumulator which is used
to compact the mutation fragment stream needs to accumulate all
tombstones which are relevant for the current clustering position in
the stream. Adding a new range tombstone is O(N) in the number of
currently active tombstones. This means that producing N rows will be
O(N^2).

In a unit test, I saw reading 137'248 rows which overlap with a range
tombstone take 245 seconds. Almost all of CPU time is in
drop_unneeded_tombstones().

The solution is to make the cache reader trim range tombstone end to
the currently emitted sub-range, so that it emits non-overlapping range
tombstones.

Fixes #8626.
2021-05-12 00:10:24 +02:00
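A minimal sketch of the trimming described above, assuming integer positions and half-open ranges. `toy_rt` and `emit_between_rows()` are hypothetical names, not the row cache reader's actual code; the `trim_end` flag switches between the old and the fixed behaviour.

```cpp
#include <algorithm>
#include <cassert>
#include <limits>
#include <vector>

struct toy_rt { int start; int end; }; // half-open range [start, end)

// For every sub-range between consecutive row positions, emit the part of
// `rt` relevant to it. The start is always trimmed to the reader position;
// the end is trimmed to the sub-range only when trim_end is true (the fix).
std::vector<toy_rt> emit_between_rows(toy_rt rt, const std::vector<int>& rows, bool trim_end) {
    assert(std::is_sorted(rows.begin(), rows.end()));
    std::vector<toy_rt> out;
    int lo = std::numeric_limits<int>::min();
    auto visit = [&](int hi) {
        int start = std::max(rt.start, lo);
        int end = std::min(rt.end, hi);
        if (start < end) { // the tombstone overlaps this sub-range
            out.push_back({start, trim_end ? end : rt.end});
        }
        lo = hi;
    };
    for (int row : rows) {
        visit(row);
    }
    visit(std::numeric_limits<int>::max());
    return out;
}
```

For a tombstone `[0, 10)` and rows at 3 and 6, the untrimmed variant emits `[0, 10)`, `[3, 10)`, `[6, 10)`, which all overlap, while the trimmed variant emits `[0, 3)`, `[3, 6)`, `[6, 10)`.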
Michał Chojnowski
85048b349b memtable: fix accounting of managed_bytes in partition_snapshot_accounter
managed_bytes has a small overhead per each fragment. Due to that, managed_bytes
containing the same data can have different total memory usage in different
allocators. The smaller the preferred max allocation size setting is, the more
fragments are needed and the greater total per-fragment overhead is.
In particular, managed_bytes allocated in the LSA could grow in
memory usage when copied to the standard allocator, if the standard allocator
had a preferred max allocation setting smaller than the LSA.

partition_snapshot_accounter calculates the amount of memory used by
mutation fragments in the memtable (where they are allocated with LSA) based
on the memory usage after they are copied to the standard allocator.
This could result in an overestimation, as explained above.
But partition_snapshot_accounter must not overestimate the amount of freed
memory, as doing otherwise might result in OOM situations.

This patch prevents the overaccounting by adding minimal_external_memory_usage():
a new version of external_memory_usage(), which ignores allocator-dependent
overhead. In particular, it includes the per-fragment overhead in managed_bytes
only once, no matter how many fragments there are.
2021-01-15 18:21:13 +01:00
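The difference between the two accounting functions can be sketched with a toy model. The 16-byte overhead, the `toy_managed_bytes` type and the treatment of an empty buffer are assumptions made for illustration, not values taken from Scylla:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical per-fragment bookkeeping cost, in bytes.
constexpr std::size_t fragment_overhead = 16;

struct toy_managed_bytes {
    std::size_t size;              // payload bytes
    std::size_t max_fragment_size; // allocator's preferred max allocation size

    std::size_t fragment_count() const {
        assert(max_fragment_size > 0);
        return (size + max_fragment_size - 1) / max_fragment_size; // ceiling division
    }

    // Allocator-dependent: smaller fragments mean more overhead.
    std::size_t external_memory_usage() const {
        return size + fragment_count() * fragment_overhead;
    }

    // Allocator-independent: the overhead is counted once, however many
    // fragments there are (an empty buffer is assumed to cost nothing).
    std::size_t minimal_external_memory_usage() const {
        return size == 0 ? 0 : size + fragment_overhead;
    }
};
```

In this model, 1000 bytes stored as one fragment and the same 1000 bytes stored as ten 100-byte fragments report 1016 and 1160 bytes of external usage respectively, but the same minimal usage of 1016.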
Pavel Emelyanov
3da3d448c8 range_tombstone: Remove unused schema arg from .set_start
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-11-06 15:13:05 +03:00
Duarte Nunes
fa2b0384d2 Replace std::experimental types with C++17 std version.
Replace stdx::optional and stdx::string_view with the C++ std
counterparts.

Some instances of boost::variant were also replaced with std::variant,
namely those that called seastar::visit.

Scylla now requires GCC 8 to compile.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
Message-Id: <20190108111141.5369-1-duarte@scylladb.com>
2019-01-08 13:16:36 +02:00
Tomasz Grabiec
78d9205a50 Merge "Multiple fixes to tests/normalizing_reader" from Vladimir
This patchset addresses multiple errors in normalizing_reader
implementation found during review.

I have decided to not make a clustering key full inside
before_key()/after_key() helpers. The reason is that for this they
would need schema to be passed as another parameter so existing
methods don't suit. OTOH, introducing new members for a class used for
testing purposes only seems like overkill.

* github.com/argenet/scylla.git projects/sstables-30/normalizing_reader_fixes/v1:
  range_tombstone: Add constructor accepting position_in_partition_views
    for range bounds.
  tests: Make sure range tombstone is properly split over rows with
    non-full keys.
  tests: Multiple fixes for draining and clearing range tombstones in
    normalizing_reader.
2018-09-27 12:51:47 +02:00
Vladimir Krivopalov
653fb37ea5 range_tombstone: Remove code that duplicates logic.
The actions performed by the call to set_start() were duplicated by the
immediately following code lines that are removed with this patch.

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
Message-Id: <20eaa1338c1719ded34f5c9ada69ec03907936f5.1537989044.git.vladimir@scylladb.com>
2018-09-27 12:05:25 +02:00
Vladimir Krivopalov
fbccae0d15 range_tombstone: Add constructor accepting position_in_partition_views for range bounds.
Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
2018-09-26 17:17:18 -07:00
Botond Dénes
33d72efa49 mutation_compactor: add detach_state()
Allow the state of the compaction to be detached. The detached state is
a set of mutation fragments, which if replayed through a new compactor
object will result in the latter being in the same state as the previous
one was.
This allows for storing the compaction state in the compacted reader by
using `unpop_mutation_fragment()` to push back the fragments that
comprise the detached state into the reader. This way, if a new
compaction object is created it can just consume the reader and continue
where the previous compaction left off.
2018-09-03 10:31:44 +03:00
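A toy sketch of the detach-and-replay idea, assuming a compactor whose only state is the newest tombstone timestamp seen so far. `toy_fragment`, `toy_compactor` and `toy_reader` are hypothetical; only the names `detach_state()` and the notion of un-popping fragments come from the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <deque>
#include <utility>
#include <vector>

struct toy_fragment {
    enum class kind { range_tombstone, row };
    kind k;
    long timestamp;
};

// Toy compactor: a row survives only if it is newer than the active tombstone.
class toy_compactor {
    long _active_tombstone = 0; // 0 means "no tombstone seen"
public:
    // Returns true if the fragment is a row that survives compaction.
    bool consume(const toy_fragment& f) {
        if (f.k == toy_fragment::kind::range_tombstone) {
            _active_tombstone = std::max(_active_tombstone, f.timestamp);
            return false;
        }
        return f.timestamp > _active_tombstone;
    }
    // Fragments which, replayed through a fresh compactor, recreate this state.
    std::vector<toy_fragment> detach_state() const {
        std::vector<toy_fragment> state;
        if (_active_tombstone != 0) {
            state.push_back({toy_fragment::kind::range_tombstone, _active_tombstone});
        }
        return state;
    }
};

class toy_reader {
    std::deque<toy_fragment> _buffer;
public:
    explicit toy_reader(std::deque<toy_fragment> fragments) : _buffer(std::move(fragments)) {}
    toy_fragment pop() {
        assert(!_buffer.empty());
        toy_fragment f = _buffer.front();
        _buffer.pop_front();
        return f;
    }
    // Pushes a fragment back to the front, so it is the next one popped.
    void unpop(toy_fragment f) { _buffer.push_front(f); }
};
```

After the detached fragments are pushed back into the reader, a newly created compactor that simply consumes the reader ends up treating the remaining rows exactly as the old one would have. A state with several fragments would have to be un-popped in reverse order to preserve ordering.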
Vladimir Krivopalov
82f76b0947 Use std::reference_wrapper instead of a plain reference in bound_view.
The presence of a plain reference prohibits the bound_view class from
being copyable. The trick employed to work around that was to use
'placement new' for copy-assigning bound_view objects, but this approach
is ill-formed and causes undefined behaviour for classes that have const
and/or reference members.

The solution is to use a std::reference_wrapper instead.

Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com>
Message-Id: <a0c951649c7aef2f66612fc006c44f8a33713931.1530113273.git.vladimir@scylladb.com>
2018-06-28 11:24:06 +01:00
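The effect of the change can be checked with standard type traits. In this sketch `view_with_plain_ref` and `view_with_wrapper` are hypothetical stand-ins for the old and new shape of bound_view, with `std::string` in place of a clustering key prefix. Strictly speaking, a reference member blocks copy assignment rather than copy construction, which is why the old code resorted to placement new.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <type_traits>

// Old shape: a plain reference member deletes the implicit copy assignment.
struct view_with_plain_ref {
    const std::string& prefix;
    int kind;
};

// New shape: std::reference_wrapper is itself copy-assignable.
struct view_with_wrapper {
    std::reference_wrapper<const std::string> prefix;
    int kind;
};

static_assert(!std::is_copy_assignable<view_with_plain_ref>::value,
              "a reference member prevents copy assignment");
static_assert(std::is_copy_assignable<view_with_wrapper>::value,
              "reference_wrapper keeps the class assignable");

// Copy assignment rebinds the wrapper to the other string; no placement new needed.
inline void rebind(view_with_wrapper& dst, const view_with_wrapper& src) {
    dst = src;
    assert(&dst.prefix.get() == &src.prefix.get());
}
```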
Tomasz Grabiec
9975135110 row_cache: Make sure reader makes forward progress after each fill_buffer()
If reader's buffer is small enough, or preemption happens often
enough, fill_buffer() may not make enough progress to advance
_lower_bound. If iterators are also constantly invalidated across
fill_buffer() calls, the reader will not be able to make progress.

See row_cache_test.cc::test_reading_progress_with_small_buffer_and_invalidation()
for an example scenario.

Also reproduced in debug-mode row_cache_test.cc::test_concurrent_reads_and_eviction

Message-Id: <1528283957-16696-1-git-send-email-tgrabiec@scylladb.com>
2018-06-06 16:01:52 +03:00
Paweł Dziepak
ec9d166a4f treewide: require type to compute cell memory usage 2018-05-31 15:51:11 +01:00
Duarte Nunes
a0d748c71c range_tombstone: Replace feed_hash() member function with appending_hash
Replace range_tombstone::feed_hash() with the specialization of
appending_hash, so that we can use the general feed_hash() function.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2018-02-01 00:22:50 +00:00
Paweł Dziepak
bb54af66a9 range_tombstone: drop flip()
Flipped range tombstones violated the assumption that position() <
end_position() and therefore could only be used in some specific cases.
2017-11-16 17:15:36 +00:00
Paweł Dziepak
5f08831192 streamed_mutation: fix reversing range tombstones
Right now a reversed streamed mutation emits range tombstones after the
mutation fragments affected by them. This breaks queries.

This patch reworks the way range tombstones are handled in reversed
streams:
 - range tombstones are no longer flipped -- invariant that start bound
   is smaller than the end bound always holds
 - in reversed streams they are ordered by their end_position()

Fixes #2982.
2017-11-16 17:15:36 +00:00
Tomasz Grabiec
6ce08f2f9a range_tombstone: Introduce trim_front() 2017-06-24 18:06:11 +02:00
Avi Kivity
ebaeefa02b Merge seastar upstream (seastar namespace)
 - introduced "seastarx.hh" header, which does a "using namespace seastar";
 - 'net' namespace conflicts with seastar::net, renamed to 'netw'.
 - 'transport' namespace conflicts with seastar::transport, renamed to
   cql_transport.
 - "logger" global variables now conflict with logger global type, renamed
   to xlogger.
 - other minor changes
2017-05-21 12:26:15 +03:00
Tomasz Grabiec
72d74b7b40 range_tombstone: Introduce end_position() 2017-02-13 16:12:16 +01:00
Duarte Nunes
2ab9ba995a range_tombstone_accumulator: Expose current tombstone
Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-02-06 13:36:45 +01:00
Duarte Nunes
f3c5ea392a range_tombstone_accumulator: apply() takes value
range_tombstone_accumulator::apply() now takes a value so the caller
can decide whether to move or copy the argument.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2017-02-06 13:36:45 +01:00
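The by-value parameter is the usual C++ sink idiom. A minimal sketch, with `toy_tombstone` and `toy_accumulator` as hypothetical stand-ins for the real classes:

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <utility>
#include <vector>

struct toy_tombstone {
    std::string start; // stands in for a clustering key prefix
    long timestamp;
};

class toy_accumulator {
    std::vector<toy_tombstone> _tombstones;
public:
    // Taking the argument by value lets the caller choose: pass an lvalue
    // to copy it, or std::move() it to hand over ownership.
    void apply(toy_tombstone rt) {
        _tombstones.push_back(std::move(rt));
    }
    std::size_t size() const { return _tombstones.size(); }
    const toy_tombstone& last() const {
        assert(!_tombstones.empty());
        return _tombstones.back();
    }
};
```

A caller that still needs the tombstone afterwards writes `acc.apply(rt)`; one that is done with it writes `acc.apply(std::move(rt))` and avoids the copy.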
Paweł Dziepak
6b8bf030c0 streamed_mutation: add memory_usage() to mutation fragment types
This patch introduces memory_usage() to static_row, clustering_row and
range_tombstone so that we can avoid repeating sizeof(T) +
x.external_memory_usage().

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-11-18 11:25:36 +00:00
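The relationship between the two functions can be sketched like this. `toy_clustering_row` and `buffer_size()` are hypothetical, and the vector's capacity is used as a stand-in for externally allocated memory:

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct toy_clustering_row {
    std::vector<char> cells;

    // Memory owned by the object but allocated outside of it.
    std::size_t external_memory_usage() const { return cells.capacity(); }

    // Total footprint: the object itself plus what it owns externally.
    std::size_t memory_usage() const {
        return sizeof(toy_clustering_row) + external_memory_usage();
    }
};

// Callers no longer need to spell out sizeof(T) + x.external_memory_usage().
inline std::size_t buffer_size(const std::vector<toy_clustering_row>& rows) {
    std::size_t total = 0;
    for (const toy_clustering_row& r : rows) {
        assert(r.memory_usage() >= sizeof(toy_clustering_row));
        total += r.memory_usage();
    }
    return total;
}
```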
Paweł Dziepak
ef57b9a26f rename memory_usage() to external_memory_usage() where applicable
Renaming the function to external_memory_usage() makes it clear that
sizeof(T) is not included, something that was a source of confusion in
the past.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-11-18 11:25:36 +00:00
Piotr Jastrzebski
27726cecff Clean up position_in_partition.
Introduce position_in_partition_view and use it in
position() method in mutation_fragment, range_tombstone,
static_row and clustering_row.
Clean up comparators in position_in_partition.

Signed-off-by: Piotr Jastrzebski <piotr@scylladb.com>
Message-Id: <c65293c71a6aa23cf930ed317fb63df1fdc34fd1.1477399763.git.piotr@scylladb.com>
2016-10-25 15:13:20 +01:00
Duarte Nunes
878927d9d2 range_tombstone: Extract out bounds_view
This patch extracts bounds_view from range_tombstone so its comparator
can be reused elsewhere.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-08-22 17:52:36 +02:00
Nadav Har'El
09ede5f333 range tombstone accumulator: add method
Add to the range_tombstone_accumulator a range_tombstones_for_row(ck)
method.

Just like the existing tombstone_for_row(ck), this function drops from
the accumulator tombstones that end before ck. But while the existing
function returned just a single tombstone affecting the given row (the
most recent tombstone), the new function range_tombstones_for_row(ck)
returns all the accumulated range tombstones which cover ck.

This function will be useful for the promoted-index writing code later,
which divides a partition into blocks which may be read independently,
so each block needs to start with a repeat of the earlier tombstones
which still cover the first row in the new block.

Signed-off-by: Nadav Har'El <nyh@scylladb.com>
2016-08-07 17:47:10 +03:00
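A toy comparison of the two queries, assuming integer clustering keys, closed ranges and a timestamp of 0 meaning "no tombstone". `toy_rt` and `toy_accumulator` are hypothetical; only the two method names come from the commit message.

```cpp
#include <algorithm>
#include <cassert>
#include <vector>

struct toy_rt { int start; int end; long timestamp; }; // closed range [start, end]

class toy_accumulator {
    std::vector<toy_rt> _active;

    // Both queries first forget tombstones that end before ck.
    void drop_ended_before(int ck) {
        _active.erase(std::remove_if(_active.begin(), _active.end(),
                          [ck](const toy_rt& rt) { return rt.end < ck; }),
                      _active.end());
    }
public:
    void apply(toy_rt rt) {
        assert(rt.start <= rt.end);
        _active.push_back(rt);
    }

    // Existing query: only the most recent tombstone covering ck.
    long tombstone_for_row(int ck) {
        drop_ended_before(ck);
        long newest = 0;
        for (const toy_rt& rt : _active) {
            if (rt.start <= ck) {
                newest = std::max(newest, rt.timestamp);
            }
        }
        return newest;
    }

    // New query: every accumulated tombstone that covers ck.
    std::vector<toy_rt> range_tombstones_for_row(int ck) {
        drop_ended_before(ck);
        std::vector<toy_rt> covering;
        for (const toy_rt& rt : _active) {
            if (rt.start <= ck) {
                covering.push_back(rt);
            }
        }
        return covering;
    }
};
```

A block writer in this model would call `range_tombstones_for_row()` with the first key of a new block and re-emit everything it returns at the start of that block.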
Paweł Dziepak
5a790a9b49 range_tombstone: add flip()
range_tombstone::flip() flips range bounds. This is necessary in order
to use range tombstone in reversed mutation fragment streams.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-13 09:50:07 +01:00
Paweł Dziepak
e1d306fa0d range_tombstone: add memory_usage()
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-13 09:50:07 +01:00
Paweł Dziepak
91a866501d range_tombstone: add range_tombstone_accumulator
range_tombstone_accumulator is a helper class that allows determining
tombstone for a clustering row when range tombstones and clustering rows
are streamed from streamed_mutation.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-13 09:50:07 +01:00
Paweł Dziepak
cd7937d33b range_tombstone: add apply()
range_tombstone::apply() allows merging two, possibly overlapping, range
tombstones with the same start bound and produces one or two disjoint
range tombstones as a result.

It is intended to be used for merging tombstones coming from different
sources.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-07-13 09:50:07 +01:00
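The "one or two disjoint range tombstones" outcome can be sketched as follows, assuming integer keys, half-open ranges and a bare timestamp in place of a full tombstone. `toy_rt` and `merge_same_start()` are hypothetical and do not mirror the real `apply()` signature.

```cpp
#include <cassert>
#include <utility>
#include <vector>

struct toy_rt { int start; int end; long timestamp; }; // half-open range [start, end)

// Merges two non-empty tombstones that share a start bound into one or two
// disjoint tombstones covering the same keys with the newest timestamp.
std::vector<toy_rt> merge_same_start(toy_rt a, toy_rt b) {
    assert(a.start == b.start);
    assert(a.start < a.end && b.start < b.end);
    if (a.end > b.end) {
        std::swap(a, b); // from here on, a is the shorter (or equal) one
    }
    if (b.timestamp >= a.timestamp) {
        return {b}; // the longer tombstone shadows the shorter one entirely
    }
    if (a.end == b.end) {
        return {a}; // same range, a is newer
    }
    // a wins on [start, a.end); b still applies on the remainder.
    return {a, toy_rt{a.end, b.end, b.timestamp}};
}
```

Merging `[0, 5)` at timestamp 20 with `[0, 10)` at timestamp 10 yields `[0, 5)` at 20 followed by `[5, 10)` at 10; if the longer tombstone is also the newer one, the result is that single tombstone.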
Paweł Dziepak
f676d1779b range_tombstone: add flip_bound_kind()
flip_bound_kind() changes start bound to end bound and vice versa while
preserving the inclusiveness.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-20 21:29:52 +01:00
Paweł Dziepak
5a60f6d1ec range_tombstone: extract is_single_clustering_row_tombstone()
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-20 21:29:50 +01:00
Paweł Dziepak
df4c1c6293 range_tombstone: simplify bound_view::equal()
Bounds are equal only if they are of the same kind. No need to check
weights.

Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-20 21:29:48 +01:00
Paweł Dziepak
3a0e76d635 range_tombstone: check for adjacent instead of equal bounds
Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>
2016-06-20 21:29:48 +01:00
Duarte Nunes
284bb6b66f range_tombstone_list: Make it ReversiblyMergeable
This patch implements the ReversiblyMergeable cancellative monoid
for the range_tombstone_list.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-06-02 16:21:58 +02:00
Duarte Nunes
6a111fdd01 mutations: Introduce the range_tombstone class
This patch introduces the range_tombstone class, composed of
a [start, end] pair of clustering_key_prefixes, the type
of inclusiveness of each bound, and a tombstone.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-06-02 16:21:58 +02:00
Duarte Nunes
7f8c35dd8c idl: Add range tombstone IDL
This patch adds the range tombstone IDL, preserving backwards
compatibility.

Signed-off-by: Duarte Nunes <duarte@scylladb.com>
2016-06-02 16:21:36 +02:00