Commit Graph

12 Commits

Author SHA1 Message Date
Avi Kivity
fcb8d040e8 treewide: use Software Package Data Exchange (SPDX) license identifiers
Instead of lengthy blurbs, switch to single-line, machine-readable
standardized (https://spdx.dev) license identifiers. The Linux kernel
switched long ago, so there is strong precedent.

Three cases are handled: AGPL-only, Apache-only, and dual licensed.
For the latter case, I chose (AGPL-3.0-or-later and Apache-2.0),
reasoning that our changes are extensive enough to apply our license.

The changes we applied mechanically with a script, except to
licenses/README.md.

Closes #9937
2022-01-18 12:15:18 +01:00
Avi Kivity
a55b434a2b treewide: extent copyright statements to present day 2021-06-06 19:18:49 +03:00
Avi Kivity
e9e5663731 build, utils/bptree.hh: drop -Wno-gnu-designator warning
Drop the warning about old-stye GNU designated initializers and
convert two violations in bptree.hh to the standard C++20 syntax.

Closes #8743
2021-05-31 18:51:49 +03:00
Pavel Emelyanov
0c4ba56594 bptree: Require dynamic object for nodes reconstruct
The B+ tree is not intrusive and supports both kinds of objects --
dynamic (in sense of previous patch) and fixed-size. Respectively,
the nodes provide .storage_size() method and get the embedded object
storage size themselves. Thus, if a dynamic object is used with the
tree but it misses the .storage_size() itself this would come
unnoticed. Fortunately, dynamic objects use the .reconstruct()
method, so the method should be equipeed with the DybnamicObject
concept.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-05-19 09:23:49 +03:00
Pavel Emelyanov
9216a5bc08 allocation_strategy, code: Conceptualize dynamic objects
Usually lsa allocation is performed with the construct() helper that
allocates a sizeof(T) slot and constructs it in place. Some rare
objects have dynamic size, so they are created by alloc()ating a
slot of some specific size and (!) must provide the correct overload
of size_for_allocation_strategy that reports back the relevant
storage size.

This "must provide" is not enforced, if missed a default sizer would
be instantiated, but won't work properly. This patch makes all users
of alloc() conform to DynamicObject concept which requires the
presense of .storage_size() method to tell that size.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-05-19 09:23:49 +03:00
Pavel Emelyanov
28f01aadc9 allocation_strategy, code: Simplify alloc()
Todays alloc() accepts migrate-fn, size and alignment. All the callers
don't really need to provide anything special for the migrate-fn and
are just happy with default alignof() for alignment. The simplification
is in providing alloc() that only accepts size arg and does the rest
itself.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2021-05-19 09:23:49 +03:00
Avi Kivity
9038a81317 treewide: drop SEASTAR_CONCEPT
Since Scylla requires C++20, there is no need to protect
concept definitions or usages with SEASTAR_CONCEPT; it just
clutters the code. This patch therefore removes all uses.

Closes #8236
2021-03-08 16:04:20 +01:00
Avi Kivity
765e632626 utils: bptree: remove redundant and possibly wrong friend declaration
Clang complains about befriending a constructor. It's possibly correct.
In any case it's redundant, so remove it.
2020-09-22 17:24:33 +03:00
Avi Kivity
c7105019b2 utils: bptree: add missing typename for clang
Clang does not implement p0634r3, so we must add more typenames.
2020-09-22 17:24:33 +03:00
Pavel Emelyanov
35a22ac48a bptree: Special intra-node key search when possible
If the key type is int64_t and the less-comparator is "natural" (i.e. it's
literally 'a < b') we may use the SIMD instructions to search for the key
on a node. Before doing so, the maybe_key and the searcher should be prepared
for that, in particular:

1. maybe_key should set unused keys to the minimal value
2. the searcher for this case should call the gt() helper with
   primitive types -- int64_t search key and array of int64_t values

To tell to B+ code that the key-less pair is such the less-er should define
the simplify_key() method converting search keys to int64_t-s.

This searcher is selected automatically, if any mismatch happens it silently
falls back to default one. Thus also add a static assertion to the row-cache
to mitigate this.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-08-06 15:41:31 +03:00
Pavel Emelyanov
14f0cdb779 bptree: Add lesses to maybe_key template
The way maybe_key works will be in-sync with the intra-node searching
code and will require to know what the Less type is, so prepare for that.

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-08-06 15:41:31 +03:00
Pavel Emelyanov
95f15ea383 utils: B+ tree implementation
// The story is at
// https://groups.google.com/forum/#!msg/scylladb-dev/sxqTHM9rSDQ/WqwF1AQDAQAJ

This is the B+ version which satisfies several specific requirements
to be suitable for row-cache usage.

1. Insert/Remove doesn't invalidate iterators
2. Elements should be LSA-compactable
3. Low overhead of data nodes (1 pointer)
4. External less-only comparator
5. As little actions on insert/delete as possible
6. Iterator walks the sorted keys

The design, briefly is:

There are 3 types of nodes: inner, leaf and data, inner and leaf
keep build-in array of N keys and N(+1) nodes. Leaf nodes sit in
a doubly linked list. Data nodes live separately from the leaf ones
and keep pointers on them. Tree handler keeps pointers on root and
left-most and right-most leaves. Nodes do _not_ keep pointers or
references on the tree (except 3 of them, see below).

changes in v9:

- explicitly marked keys/kids indices with type aliases
- marked the whole erase/clear stuff noexcept
- disposers now accept object pointer instead of reference
- clear tree in destructor
- added more comments
- style/readability review comments fixed

Prior changes

**
- Add noexcepts where possible
- Restrict Less-comparator constraint -- it must be noexcept
- Generalized node_id
- Packed code for beging()/cbegin()

**
- Unsigned indices everywhere
- Cosmetics changes

**
- Const iterators
- C++20 concepts

**
- The index_for() implmenetation is templatized the other way
  to make it possible for AVX key search specialization (further
  patching)

**
- Insertion tries to push kids to siblings before split

  Before this change insertion into full node resulted into this
  node being split into two equal parts. This behaviour for random
  keys stress gives a tree with ~2/3 of nodes half-filled.

  With this change before splitting the full node try to push one
  element to each of the siblings (if they exist and not full).
  This slows the insertion a bit (but it's still way faster than
  the std::set), but gives 15% less total number of nodes.

- Iterator method to reconstruct the data at the given position

  The helper creates a new data node, emplaces data into it and
  replaces the iterator's one with it. Needed to keep arrays of
  data in tree.

- Milli-optimize erase()
  - Return back an iterator that will likely be not re-validated
  - Do not try to update ancestors separation key for leftmost kid

  This caused the clear()-like workload work poorly as compared to
  std:set. In particular the row_cache::invalidate() method does
  exactly this and this change improves its timing.

- Perf test to measure drain speed
- Helper call to collect tree counters

**
- Fix corner case of iterator.emplace_before()
- Clean heterogenous lookup API
- Handle exceptions from nodes allocations
- Explicitly mark places where the key is copied (for future)
- Extend the tree.lower_bound() API to report back whether
  the bound hit the key or not
- Addressed style/cleanness review comments

Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
2020-07-14 16:29:43 +03:00