18 Commits

Author SHA1 Message Date
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes were applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Pavel Emelyanov
5a405a4273 tests: Make B-tree tests use unique-ptrs for insertion
The non-smart-pointers overloads are going away, prepare
tests for that.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-12-10 12:35:12 +03:00
Avi Kivity
daf028210b build: enable -Winconsistent-missing-override warning
This warning can catch a virtual function that thinks it
overrides another, but doesn't, because the two functions
have different signatures. This isn't very likely since most
of our virtual functions override pure virtuals, but it's
still worth having.

Enable the warning and fix numerous violations.

Closes #9347
2021-09-15 12:55:54 +03:00
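
The override check discussed in daf028210b can be illustrated with a minimal sketch. The classes `reader` and `cached_reader` are hypothetical and do not come from the tree. As far as I know, clang's -Winconsistent-missing-override fires when a class marks some overriding functions with `override` but not others; once every overrider carries the keyword, a signature mismatch becomes a compile error rather than a silently unrelated function.

```cpp
#include <string>

// Hypothetical classes, for illustration only.
struct reader {
    virtual ~reader() = default;
    virtual std::string name() const { return "reader"; }
    virtual int priority(int hint) const { return hint; }
};

struct cached_reader : reader {
    // Marked 'override': the compiler verifies that a matching virtual
    // function exists in the base class.
    std::string name() const override { return "cached_reader"; }

    // If 'override' were omitted here, clang would report the inconsistency
    // with name() above under -Winconsistent-missing-override. With the
    // keyword present, dropping 'const' or changing the parameter type
    // would fail to compile instead of quietly declaring a new function.
    int priority(int hint) const override { return hint + 1; }
};

// Calls through the base class to show that the overrides are picked up.
inline std::string name_via_base(const reader& r) { return r.name(); }
inline int priority_via_base(const reader& r, int hint) { return r.priority(hint); }
```

Calling `name_via_base(cached_reader{})` dispatches to the derived implementation, which is what a mismatched signature would silently prevent.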
Michał Chojnowski
2aa0a2e6a1 test: perf: perf_collection: use the optimized version of bptree
Since key_compare does not conform to SimpleLessCompare, the benchmark
tests the non-optimized version of bptree (without SIMD key search).
We want to test the optimized version.

Closes #9180
2021-08-10 17:04:34 +03:00
Avi Kivity
0909e3c17d treewide: remove redundant "x <=> 0" compares
If x is of type std::strong_ordering, then "x <=> 0" is equivalent to
x. These no-ops were inserted during #1449 fixes, but are now unnecessary.
They have potential for harm, since they can hide an accidental change of the
type of x to an arithmetic type, so remove them.

Ref #1449.
2021-07-28 13:30:32 +03:00
Avi Kivity
11fa402ecc test: change some internal comparators to std::strong_ordering
Ref #1449.
2021-07-28 13:28:51 +03:00
Pavel Emelyanov
e652b03b4e btree tests: Don't use iterator erase
Next patches will mark btree::iterator methods that modify
the tree itself as private, so stop using them in tests.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-07-27 20:06:53 +03:00
Avi Kivity
a55b434a2b treewide: extend copyright statements to present day
2021-06-06 19:18:49 +03:00
Pavel Emelyanov
8bbe2eae5e btree: Convert comparator to <=>
It turned out that all the users of btree can already be converted
to use safer std::strong_ordering. The only meaningful change here
is the btree code itself -- no more ints there.

tests: unit(dev)

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
Message-Id: <20210330153648.27049-1-xemul@scylladb.com>
2021-04-01 12:56:08 +03:00
Pavel Emelyanov
aa85bc790b test: Add tests for radix tree
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-02-15 20:27:00 +03:00
Pavel Emelyanov
0ad361b380 test/perf_collection: Add callback to check the speed of clone
In some places scylla clones collections of objects, so it's
sometimes needed to measure the speed of this operation.

This patch adds a placeholder for it, but no implementations
for any supported collections. It will be added soon for radix
tree.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-02-15 17:46:37 +03:00
Pavel Emelyanov
2f7c03d84c utils: Intrusive B-tree (with tests)
The design of the tree follows from the row-cache needs, which are:

1. Insert/Remove do not invalidate iterators
2. Elements are LSA-manageable
3. Low key overhead
4. External tri-comparator
5. As few actions on insert/remove as possible

Given the above, the design is:

Two types of nodes -- inner and leaf. Both types keep a pointer to the parent node
and N pointers to keys (not the keys themselves). Two differences: inner nodes have
an array of pointers to kids, leaf nodes keep a pointer to the tree (to update the
left- and rightmost tree pointers on node move).

Nodes do not keep pointers/references to trees, thus we have O(1) move of any
object, but O(logN) to get the tree size. Fortunately, with a big keys-per-node
value this won't result in too many steps.

In turn, the tree has 3 pointers -- root, left- and rightmost leaves. The latter
two are for constant-time begin() and end().

Keys are managed by the user with the help of an embeddable member_hook instance,
which is 1 pointer in size.
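
The layout described above can be sketched roughly as follows. This is an illustrative approximation only: every field name, the `node_size` constant and the `cache_entry` type are assumptions and are not taken from the actual utils code. It follows the description in which leaf nodes carry the tree pointer.

```cpp
#include <array>
#include <cstddef>

constexpr std::size_t node_size = 8;  // assumed keys-per-node value

struct node_base;
struct tree;

// Embeddable hook stored inside the user's object; one pointer in size.
struct member_hook {
    node_base* node = nullptr;  // assumed: the node currently holding this key
};
static_assert(sizeof(member_hook) == sizeof(void*));

// Common part of both node types: parent pointer plus N pointers to keys.
struct node_base {
    node_base* parent = nullptr;
    std::size_t num_keys = 0;
    std::array<member_hook*, node_size> keys{};  // pointers, not the keys themselves
};

// Inner nodes additionally hold pointers to their kids.
struct inner_node : node_base {
    std::array<node_base*, node_size + 1> kids{};
};

// Leaf nodes additionally hold a pointer to the tree, so that the tree's
// left- and rightmost pointers can be updated when the node moves.
struct leaf_node : node_base {
    tree* owner = nullptr;
};

// The tree itself: root plus left- and rightmost leaves.
struct tree {
    node_base* root = nullptr;
    leaf_node* leftmost = nullptr;
    leaf_node* rightmost = nullptr;
};
static_assert(sizeof(tree) == 3 * sizeof(void*));

// A user object becomes insertable by embedding the hook.
struct cache_entry {
    int value = 0;
    member_hook hook;
};
```

Because nodes store pointers to hooks rather than the keys themselves, the per-key overhead inside the user's object stays at one pointer, which matches the "low key overhead" requirement.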

The code was copied from the B+ tree one and then heavily reworked; the internal
algorithms turned out to differ quite significantly.

For the sake of mutation_partition::apply_monotonically(), which needs to move
an element from one tree into another, there's a key_grabber helper wrapper
that allows doing this move while respecting the exception-safety requirement.

As measured by the perf_collection test, the B-tree with 8 keys is faster than
std::set, but slower than the B+tree:

            vs set        vs b+tree
   fill:     +13%           -6%
   find:     +23%          -35%

Another neat thing is that 1-key insertion-removal is ~40% faster than
for BST (the same number of allocations, but the key object is smaller,
fewer pointers to set up and fewer instructions to execute when linking
the node with the root).

v4:
- equip insertion methods with on_alloc_point() calls to catch
  potential exception guarantees violations earlier

- add unlink_leftmost_without_rebalance. The method is borrowed from
  boost intrusive set, and is added to kill two birds -- provide it,
  as it turns out to be popular, and use a slightly faster step-by-step
  tree destruction than a plain begin+erase loop

v3:
- introduce "inline" root node that is embedded into the tree object and in
  which the 1st key is inserted. This greatly improves the 1-key-tree
  performance, which is a pretty common case for the row cache

v2:
- introduce "linear" root leaf that grows on demand

  This improves the memory consumption for small trees. This linear node may
  and should over-grow the NodeSize parameter. This comes from the fact that
  there are two big per-key memory spikes on small trees -- 1-key root leaf
  and the first split, when the tree becomes 1-key root with two half-filled
  leaves. If the linear extension goes above NodeSize it can flatten even the
  2nd peak

- mitigate the keys indirection a bit

  Prefetching the keys while doing the intra-node linear scan and the nodes
  while descending the tree gives about +5% on fill and find

- generalize stress tests for B and B+ trees

- cosmetic changes

TODO:

- fix a few inefficiencies in the core code (walks the sub-tree twice sometimes)
- try to optimize the leaf nodes that are not left-/rightmost so they do not
  carry an unused tree pointer on board

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-02-02 09:30:29 +03:00
Pavel Emelyanov
8558339c63 perf_collection: Add test for full scan time
Scan here means walking the collection forward using an iterator.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-10-06 09:57:37 +03:00
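
A full-scan measurement of the kind described in 8558339c63 might look like the sketch below. `time_full_scan` and `scan_result` are hypothetical names, and std::set in the usage note is only a stand-in for the collections the real perf_collection test covers.

```cpp
#include <chrono>
#include <cstddef>
#include <set>

// Result of one scan: how many elements were visited and how long it took.
struct scan_result {
    std::size_t visited;
    std::chrono::nanoseconds elapsed;
};

// Walks the collection forward with an iterator, from begin() to end(),
// and times the whole walk with a monotonic clock.
template <typename Collection>
scan_result time_full_scan(const Collection& c) {
    const auto start = std::chrono::steady_clock::now();
    std::size_t visited = 0;
    for (auto it = c.begin(); it != c.end(); ++it) {
        ++visited;  // counting keeps the loop from being trivially empty
    }
    const auto stop = std::chrono::steady_clock::now();
    return {visited, std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start)};
}
```

For example, `time_full_scan(std::set<int>{1, 2, 3}).visited` evaluates to 3. A real benchmark would use much larger collections and repeat the scan to smooth out noise.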
Pavel Emelyanov
7284469b24 perf_collection: Add test for destruction with .clear()
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-10-06 09:57:37 +03:00
Pavel Emelyanov
72ccc43380 perf_collection: Add test for single element insertion
In some cases a collection is used to keep several elements,
so it's good to know this timing.

For example, a mutation_partition keeps a set of rows. If used
in cache it can grow large; if used in a mutation to apply, it's
typically small. Plain replacement of BST with B-tree caused
performance degradation of mutation application because B-tree
is only better at big sizes.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-10-06 09:57:37 +03:00
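
The single-element case from 72ccc43380 could be measured along the lines of this sketch. `time_single_insert` is a hypothetical helper, not the test's actual code. It clears the collection between iterations so that every timed insert starts from an empty collection.

```cpp
#include <chrono>
#include <cstddef>
#include <set>

// Times `iterations` rounds of "insert one element into an empty collection".
// Only the insert() call is timed; clear() runs outside the timed region.
template <typename Collection, typename Value>
std::chrono::nanoseconds time_single_insert(Collection& c, const Value& v, std::size_t iterations) {
    std::chrono::nanoseconds total{0};
    for (std::size_t i = 0; i < iterations; ++i) {
        c.clear();
        const auto start = std::chrono::steady_clock::now();
        c.insert(v);
        const auto stop = std::chrono::steady_clock::now();
        total += std::chrono::duration_cast<std::chrono::nanoseconds>(stop - start);
    }
    return total;
}
```

Reading the clock around every insert adds overhead of its own, so absolute numbers from such a loop are only useful for comparing collections against each other under the same harness.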
Pavel Emelyanov
207e1aa48f perf_collection: Add intrusive_set_external_comparator
This collection is widely used, so any replacement should be
compared against it to better understand pros-n-cons.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-10-06 09:57:37 +03:00
Pavel Emelyanov
2d09864627 perf_collection: Clear collection between iterations
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-10-06 09:57:37 +03:00
Pavel Emelyanov
c891f274dc test: Generalize perf_bptree into perf_collection
Rename into perf_collection and localize the B+ code.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-10-06 09:57:37 +03:00