scylladb

mirror of https://github.com/scylladb/scylladb.git synced 2026-04-23 18:10:39 +00:00

Author	SHA1	Message	Date
Benny Halevy	2d80057617	range_tombstone_list: insert_from: correct rev.update range_tombstone in not overlapping case 2nd std::move(start) looks like a typo in `fe2fa3f20d`. Signed-off-by: Benny Halevy <bhalevy@scylladb.com> Message-Id: <20220404124741.1775076-1-bhalevy@scylladb.com>	2022-04-04 22:26:29 +02:00
Avi Kivity	fcb8d040e8	treewide: use Software Package Data Exchange (SPDX) license identifiers Instead of lengthy blurbs, switch to single-line, machine-readable standardized (https://spdx.dev) license identifiers. The Linux kernel switched long ago, so there is strong precedent. Three cases are handled: AGPL-only, Apache-only, and dual licensed. For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0), reasoning that our changes are extensive enough to apply our license. The changes were applied mechanically with a script, except to licenses/README.md. Closes #9937	2022-01-18 12:15:18 +01:00
Tomasz Grabiec	78a6474982	range_tombstone_list: Deoverlap adjacent empty ranges Appending an empty range adjacent to an existing range tombstone would not deoverlap (by dropping the empty range tombstone), resulting in a different (non-canonical) result depending on the order of appending. Suppose that [a, b] covers [x, x) Appending [a, x) then [x, b) then [x, x) would give [a, b) Appending [a, x) then [x, x) then [x, b) would give [a, x), [x, x), [x, b) Fix by dropping empty range tombstones.	2021-12-13 21:31:36 +01:00
Tomasz Grabiec	fe2fa3f20d	range_tombstone_list: Convert to work in terms of position_in_partition This makes it comprehensible, and a bit simpler.	2021-12-08 15:16:18 +01:00
Michał Radwański	35b1c3ff52	range_tombstone_list: {lower,upper,}slice share comparator implementation slice (2 overloads), upper_slice, lower_slice previously had implementations of a comparator. Move out the common structs, so that all 4 of them can share implementation.	2021-11-05 10:51:58 +01:00
Michał Radwański	07e78807e6	range_tombstone_list: add lower_slice lower_slice returns the range tombstones which have end inside range [start, before).	2021-11-02 10:50:31 +01:00
Pavel Emelyanov	d6af441eaa	range_tombstone: Move linkage into range_tombstone_entry Now it's time to remove the boost set's hook from the range_tombstone and keep it wrapped into another class if the r._tombstone's location is the range_tombstone_list. Also the added previously .tombstone() getters and the _entry alias can be removed -- all the code can work with the new class. Two places in the code that made use of without_link{} move-constructor are patched to get the range_tombstone part from the respective _entry with the same result. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-03 19:34:45 +03:00
Pavel Emelyanov	b8c585c54d	range_tombstone_list: Prepare to use range_tombstone_entry A continuation of the previous patch. The range_tombstone_list works with the range_tombstone very actively, kicking every single line doing this to call .tombstone() seems excessive. Instead, declare the range_tombstone_entry alias. When the entry will appear for real, the alias would go away and the range_tombstone_list will be switched into new entity right at once. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-03 19:34:45 +03:00
Pavel Emelyanov	5515f7187d	range_tombstone, code: Add range_tombstone& getters Currently all the code operates on the range_tombstone class, and many of those places get the range tombstone in question from the range_tombstone_list. Next patches will make that list carry (and return) some new object called range_tombstone_entry, so all the code that expects to see the former one there will need to be patched to get the range_tombstone from the _entry one. This patch prepares the ground for that by introducing the range_tombstone& tombstone() { return *this; } getter on the range_tombstone itself and patching all future users of the _entry to call .tombstone() right now. Next patch will remove those getters together with adding the new range_tombstone_entry object thus automatically converting all the patched places into using the entry in a proper way. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-03 19:34:45 +03:00
Pavel Emelyanov	ae8a5bd046	range_tombstone_list: Factor out tombstone construction Just add a helper for constructing the managed range tombstone object. This will also help further patch have less duplicating hunks in it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-09-03 19:34:45 +03:00
Avi Kivity	a55b434a2b	treewide: extend copyright statements to present day	2021-06-06 19:18:49 +03:00
Pavel Emelyanov	c8b2079705	range_tombstone_list: Add new slice() helper There are two of them now -- one to return iterator_range that covers the given query::clustering_range, the other to return it for two given positions. In the next patch the 3rd one is needed -- the slice() to get iterator_range that a) starts strictly after a given position and b) ends after the given clustering_range's end. It will be used to refresh the range tombstones iterators after some of them will have been emitted. The same thing is currently done by partition_snapshot_reader's refresh_state wrt rows: if (last_row) start = rows.upper_bound(last_row) // continuation else start = rows.lower_bound(range.start) // initial end = rows.upper_bound(range.end) // end is the same in // either case Respectively for range tombstones the goal is the same. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-03-16 11:55:28 +03:00
Pavel Emelyanov	7e1170ecb9	range_tombstone_list: Introduce iterator_range alias The range_tombstone_list::slice() set of methods return a pair of iterators representing a range. In the next patches this pair will be actively used, and it's handy to have a shorter alias for it. Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>	2021-03-16 11:55:28 +03:00
Avi Kivity	6f394e8e90	tombstone: use comparison operator instead of ad-hoc compare() function and with_relational_operators The comparison operator (<=>) default implementation happens to exactly match tombstone::compare(), so use the compiler-generated defaults. Also default operator== and operator!= (these are not brought in by operator<=>). These become slightly faster as they perform just an equality comparison, not three-way compare. shadowable_tombstone and row_tombstone depend on tombstone::compare(), so convert them too in a similar way. with_relational_operations.hh becomes unused, so delete it. Tests: unit (dev) Message-Id: <20200602055626.2874801-1-avi@scylladb.com>	2020-06-02 09:28:52 +03:00
Avi Kivity	f3da043230	Merge "Make in-memory partition version merging preemptable" from Tomasz " Partition snapshots go away when the last read using the snapshot is done. Currently we will synchronously attempt to merge partition versions on this event. If partitions are large, that may stall the reactor for a significant amount of time, depending on the size of newer versions. Cache update on memtable flush can create especially large versions. The solution implemented in this series is to allow merging to be preemptable, and continue in the background. Background merging is done by the mutation_cleaner associated with the container (memtable, cache). There is a single merging process per mutation_cleaner. The merging worker runs in a separate scheduling group, introduced here, called "mem_compaction". When the last user of a snapshot goes away the snapshot is slid to the oldest unreferenced version first so that the version is no longer reachable from partition_entry::read(). The cleaner will then keep merging preceding (newer) versions into it, until it merges a version which is referenced. The merging is preemptable. If the initial merging is preempted, the snapshot is enqueued into the cleaner, the worker woken up, and merging will continue asynchronously. When memtable is merged with cache, its cleaner is merged with cache cleaner, so any outstanding background merges will be continued by the cache cleaner without disruption. This reduces scheduling latency spikes in tests/perf_row_cache_update for the case of large partition with many rows. For -c1 -m1G I saw them dropping from >23ms to 1-2ms. System-level benchmark using scylla-bench shows a similar improvement. 
" * tag 'tgrabiec/merge-snapshots-gradually-v4' of github.com:tgrabiec/scylla: tests: perf_row_cache_update: Test with an active reader surviving memtable flush memtable, cache: Run mutation_cleaner worker in its own scheduling group mutation_cleaner: Make merge() redirect old instance to the new one mvcc: Use RAII to ensure that partition versions are merged mvcc: Merge partition version versions gradually in the background mutation_partition: Make merging preemptable tests: mvcc: Use the standard maybe_merge_versions() to merge snapshots	2018-07-01 15:32:51 +03:00
Vladimir Krivopalov	82f76b0947	Use std::reference_wrapper instead of a plain reference in bound_view. The presence of a plain reference prohibits the bound_view class from being copyable. The trick employed to work around that was to use 'placement new' for copy-assigning bound_view objects, but this approach is ill-formed and causes undefined behaviour for classes that have const and/or reference members. The solution is to use a std::reference_wrapper instead. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com> Message-Id: <a0c951649c7aef2f66612fc006c44f8a33713931.1530113273.git.vladimir@scylladb.com>	2018-06-28 11:24:06 +01:00
Tomasz Grabiec	4d3cc2867a	mutation_partition: Make merging preemptable	2018-06-27 12:48:30 +02:00
Tomasz Grabiec	40cc766cf2	database: Add API for incremental clearing of partition entries Partitions can get very large. Destroying them all at once can stall the reactor for significant amount of time. We want to avoid that by doing destruction incrementally, deferring in between. A new API is added for that at various levels: stop_iteration clear_gently() noexcept; It returns stop_iteration::yes when the object is fully cleared and can be now destroyed quickly. So a deferring destruction can look like this: return repeat([this] { return clear_gently(); }); The reason why clear_gently() doesn't return a future<> itself is that some contexts cannot defer, like memory reclamation.	2018-05-30 12:18:56 +02:00
Vladimir Krivopalov	5c3b32a9bf	Remove to_boost_visitor helper. The minimal Boost version required for Scylla now is 1.58 and this helper is no longer needed. Replaced it with more generic visitation utils from Seastar. Signed-off-by: Vladimir Krivopalov <vladimir@scylladb.com> Message-Id: <e589ace7ac411d3d55dead475a8a2271f51642f1.1520976010.git.vladimir@scylladb.com>	2018-03-14 23:49:07 +00:00
Tomasz Grabiec	dfe48bbbc7	range_tombstone_list: Fix insert_from() end_bound was not updated in one of the cases in which end and end_kind was changed, as a result later merging decision using end_bound were incorrect. end_bound was using the new key, but the old end_kind. Fixes #3083. Message-Id: <1513772083-5257-1-git-send-email-tgrabiec@scylladb.com>	2017-12-20 12:20:20 +00:00
Tomasz Grabiec	9c620e0246	range_tombstone_list: Introduce apply_monotonically()	2017-11-07 15:33:24 +01:00
Tomasz Grabiec	2fe53ac617	range_tombstone_list: Make reverter::erase() exception-safe erase_undo_op() constructor takes ownership of it, and destroys it when it goes out of scope. If emplace_back() fails, it would be destroyed before being removed from its container (_dst._tombstones). Fix by making sure _ops.emplace_back() won't fail.	2017-11-07 15:33:24 +01:00
Tomasz Grabiec	6190f9fc63	range_tombstone_list: Fix memory leaks in case of bad_alloc If insert() fails, the allocated range_tombstone would not be freed. Use alloc_strategy_unique_ptr.	2017-11-07 15:33:24 +01:00
Tomasz Grabiec	b6d349728f	range_tombstone_list: Introduce slice() working with position range	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	92d6456070	range_tombstone_list: Introduce equal()	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	19edb0b535	range_tombstone_list: Make printable	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	2e75595ecf	range_tombstone_list: Introduce trim()	2017-06-24 18:06:11 +02:00
Tomasz Grabiec	3c509308ab	range_tombstone_list: Merge adjacent range tombstones in apply() Needed for equivalence to work correctly with difference and addition: m1 + (m2 - m1) = m1 + m2 Fixes #2158.	2017-05-23 13:16:03 +02:00
Tomasz Grabiec	1dea251ca2	range_tombstone_list: Avoid violating set invariant The code was inserting an entry with the same key as its successor, and only later adjusting the key of the old entry. This is violating set's invariant of unique keys, and insertion may cause rebalancing. I don't know if this violation actually causes problems currently, but it's safer not to. Fix by first updating the existing entry and then inserting the new one.	2017-05-23 12:11:12 +02:00
Tomasz Grabiec	a2a22e5f00	range_tombstone_list: Make tombstone merging commutative Example of non-commutative case: a = [1, 5]@t2 b = {[2, 3]@t1, [4, 5]@t1} a + b = [1, 5]@t2 b + a = [1, 4)@t2, [4, 5]@t2 After this patch, both will yield [1, 5]@t2. The patch also changes the logic so that overlaps of tombstones with equal timestamps are handled symmetrically. They are now merged instead of split at either boundary. Refs #2158.	2017-05-23 12:11:12 +02:00
Tomasz Grabiec	c4dac7c80f	range_tombstone_list: Add erase() operation to the reverter	2017-05-23 12:11:12 +02:00
Tomasz Grabiec	935709cddc	range_tombstone_list: Make all undo operations ordered relative to each other Later operation may depend on the result of previous operation. Same dependency is present when reverting the operations. Fixes assertion failure in update reverter.	2017-05-23 12:11:12 +02:00
Tomasz Grabiec	fb42366552	range_tombstone_list: Introduce erase()	2017-02-13 16:12:15 +01:00
Tomasz Grabiec	8b7f93175c	range_tombstone_list: Introduce slice()	2017-02-13 16:12:15 +01:00
Duarte Nunes	85315d1760	range_tombstone_list: Correctly implement difference() The difference method wasn't properly implemented. The version in this patch correctly computes the difference and returns a range tombstone list containing those range tombstones in "this" but absent from the other, specified range tombstone list. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2017-01-23 18:14:33 +01:00
Avi Kivity	056b427855	range_tombstone_list: use non-template lambda for cloning tombstones Using a template lambda invokes a bug in Fedora 24's boost where the lambda's parameter is an internal boost type rather than a range_tombstone. Constraining the parameter with an explicit type avoids the problem. Message-Id: <1466844211-17298-1-git-send-email-avi@scylladb.com>	2016-06-27 10:48:59 +02:00
Paweł Dziepak	a200189541	range_tombstone_list: mark apply() argument as const Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-20 21:29:50 +01:00
Paweł Dziepak	c24f08a683	range_tombstone_list: compare full tombstones not just timestamps Signed-off-by: Paweł Dziepak <pdziepak@scylladb.com>	2016-06-20 21:29:48 +01:00
Duarte Nunes	95594b8171	mutations: Encapsulate row tombstones difference This patch moves the difference between two mutation_partition's row_tombstones inside the range_tombstone_list. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-06-02 16:21:59 +02:00
Duarte Nunes	284bb6b66f	range_tombstone_list: Make it ReversiblyMergeable This patch implements the ReversiblyMergeable cancellative monoid for the range_tombstone_list. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-06-02 16:21:58 +02:00
Duarte Nunes	86030885c8	mutations: Introduce range tombstone list This class is responsible for representing a set of range tombstones as non-overlapping disjoint sets of range tombstones. Signed-off-by: Duarte Nunes <duarte@scylladb.com>	2016-06-02 16:21:58 +02:00
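
Commit 40cc766cf2 above describes an incremental clearing API: clear_gently() does a bounded amount of work and returns stop_iteration::yes once the object is fully cleared, so a caller can repeat it and defer in between. The following is only a minimal, synchronous sketch of that pattern. The stop_iteration enum here is a local stand-in for Seastar's type, and big_partition, its batch size, and repeat_until_done() are hypothetical names invented for illustration; none of this is ScyllaDB's actual code.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <vector>

// Local stand-in for seastar::stop_iteration (assumption: only yes/no is needed).
enum class stop_iteration { no, yes };

// Hypothetical container that releases at most `batch` rows per call.
// `batch` must be greater than zero, otherwise clearing never finishes.
class big_partition {
    std::vector<int> _rows;
    std::size_t _batch;
public:
    big_partition(std::size_t n, std::size_t batch) : _rows(n, 0), _batch(batch) {}

    // Drops up to `_batch` rows and reports whether anything is left.
    stop_iteration clear_gently() noexcept {
        std::size_t n = std::min(_batch, _rows.size());
        _rows.resize(_rows.size() - n);  // shrinking a vector of ints does not allocate
        return _rows.empty() ? stop_iteration::yes : stop_iteration::no;
    }

    std::size_t size() const noexcept { return _rows.size(); }
};

// Synchronous stand-in for a repeat() loop: calls `f` until it returns
// stop_iteration::yes and returns how many calls that took. A real
// reactor would yield between calls instead of looping.
template <typename Func>
std::size_t repeat_until_done(Func f) {
    std::size_t steps = 0;
    do {
        ++steps;
    } while (f() == stop_iteration::no);
    return steps;
}
```

In this sketch, clearing 10 rows with a batch of 4 takes three calls. Returning a plain stop_iteration rather than a future mirrors the reasoning in the commit message: some callers, such as memory reclamation, cannot defer and need to drive the loop themselves.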

41 Commits
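
Commit 82f76b0947 in the list above replaces a plain reference member with std::reference_wrapper so that bound_view becomes copyable without resorting to placement new. A rough sketch of why that works is shown below. key_view is a hypothetical, simplified type made up for this illustration and is not ScyllaDB's bound_view; only std::reference_wrapper and the type trait are real standard library facilities.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <type_traits>

// Hypothetical simplified view type (not ScyllaDB's bound_view).
// With a `const std::string&` member the implicit copy assignment
// operator would be deleted; std::reference_wrapper rebinds instead.
struct key_view {
    std::reference_wrapper<const std::string> key;
    bool inclusive;

    key_view(const std::string& k, bool incl) : key(k), inclusive(incl) {}

    // Access the referenced string.
    const std::string& get() const { return key.get(); }
};

// The compiler-generated assignment is available, no placement new needed.
static_assert(std::is_copy_assignable<key_view>::value,
              "key_view should be copy-assignable");
```

Assigning one key_view to another rebinds the wrapper to the other string and leaves the originally referenced string untouched, which is the behaviour a view type usually wants. The referenced string must outlive the view in either design.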