mirror of
https://github.com/scylladb/scylladb.git
synced 2026-05-30 03:30:49 +00:00
The design of the tree follows from the row-cache needs, which are:
1. Insert/Remove do not invalidate iterators
2. Elements are LSA-manageable
3. Low key overhead
4. External tri-comparator
5. As few actions on insert/remove as possible
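Requirement 4 can be illustrated with a minimal sketch. It assumes an int-returning comparator (negative, zero, positive); the names demo_key, demo_tri_compare, probe_position and locate are hypothetical and are not taken from the tree code. The point is that a single comparator call answers both "less?" and "equal?":

```cpp
#include <cassert>

// Hypothetical key type, for illustration only.
struct demo_key {
    int value;
};

// External tri-comparator: returns a negative value if a < b,
// zero if a == b and a positive value if a > b.
// The int convention is an assumption made for this sketch.
struct demo_tri_compare {
    int operator()(const demo_key& a, const demo_key& b) const noexcept {
        return (a.value > b.value) - (a.value < b.value);
    }
};

enum class probe_position { before, match, after };

// Classifies `probe` against one stored key with a single comparator call.
template <typename Compare>
probe_position locate(const demo_key& probe, const demo_key& stored, Compare cmp) {
    const int c = cmp(probe, stored);
    if (c < 0) {
        return probe_position::before;
    }
    if (c > 0) {
        return probe_position::after;
    }
    return probe_position::match;
}
```

Keeping the comparator outside the key type lets the same key objects be ordered differently by different trees.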
With the above, the design is:
Two types of nodes -- inner and leaf. Both types keep a pointer to the parent
node and N pointers to keys (not the keys themselves). Two differences: inner
nodes have an array of pointers to kids, leaf nodes keep a pointer to the tree
(to update the left- and rightmost tree pointers on node move).
Nodes do not keep pointers/references to trees, thus we have O(1) move of any
object, but O(logN) to get the tree size. Fortunately, with a big keys-per-node
value this won't result in too many steps.
In turn, the tree has 3 pointers -- root, left- and rightmost leaves. The latter
two are for constant-time begin() and end().
Keys are managed by the user with the help of an embeddable member_hook
instance, which is 1 pointer in size.
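The layout described above can be sketched roughly as follows. This is an illustration under assumptions, not the actual code: every sketch_* name, the node_size value, the N+1 kids array (the usual B-tree convention) and the idea that the hook points back at its node are hypothetical. link_key() only appends to a leaf and does no ordering or splitting.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical keys-per-node value (N in the text above).
constexpr std::size_t node_size = 8;

struct sketch_node;
struct sketch_tree;

// Embedded into the user's key object; one pointer in size.
// Pointing back at the owning node is an assumption of this sketch.
struct sketch_member_hook {
    sketch_node* node = nullptr;
};

// Common part: pointer to the parent and N pointers to keys (not keys themselves).
struct sketch_node {
    sketch_node* parent = nullptr;
    sketch_member_hook* keys[node_size] = {};
    std::size_t nr_keys = 0;
};

// Inner nodes additionally carry pointers to kids.
struct sketch_inner_node : sketch_node {
    sketch_node* kids[node_size + 1] = {};
};

// Leaf nodes additionally carry a pointer to the tree.
struct sketch_leaf_node : sketch_node {
    sketch_tree* tree = nullptr;
};

// The tree itself: root plus left- and rightmost leaves.
struct sketch_tree {
    sketch_node* root = nullptr;
    sketch_leaf_node* leftmost = nullptr;
    sketch_leaf_node* rightmost = nullptr;
};

static_assert(sizeof(sketch_member_hook) == sizeof(void*),
              "the hook is one pointer in size");

// Hypothetical user object that embeds the hook.
struct demo_row {
    int payload = 0;
    sketch_member_hook hook;
};

// Appends a key to a leaf without ordering or splitting; returns false if full.
inline bool link_key(sketch_leaf_node& leaf, sketch_member_hook& hook) {
    if (leaf.nr_keys == node_size) {
        return false;
    }
    leaf.keys[leaf.nr_keys++] = &hook;
    hook.node = &leaf;
    return true;
}
```

With this shape the per-key overhead on the user side is just the hook, and nodes hold only pointers, so keys never move when nodes are rearranged.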
The code was copied from the B+ tree one and then heavily reworked; the internal
algorithms turned out to differ quite significantly.
For the sake of mutation_partition::apply_monotonically(), which needs to move
an element from one tree into another, there's a key_grabber helper wrapper
that allows doing this move while respecting the exception-safety requirement.
As measured by the perf_collections test, the B-tree with 8 keys is faster than
std::set, but slower than the B+ tree:

            vs set    vs b+tree
    fill:    +13%        -6%
    find:    +23%       -35%
Another neat thing is that 1-key insertion-removal is ~40% faster than
for BST (the same number of allocations, but the key object is smaller,
fewer pointers to set up and fewer instructions to execute when linking
the node with the root).
v4:
- equip insertion methods with on_alloc_point() calls to catch
  potential exception-guarantee violations earlier
- add unlink_leftmost_without_rebalance. The method is borrowed from
  boost intrusive set, and is added to kill two birds with one stone --
  provide it, as it turns out to be popular, and use a bit faster
  step-by-step tree destruction than a plain begin+erase loop
v3:
- introduce an "inline" root node that is embedded into the tree object and
  in which the 1st key is inserted. This greatly improves the 1-key-tree
  performance, which is a pretty common case for the row cache
v2:
- introduce a "linear" root leaf that grows on demand
  This improves the memory consumption for small trees. This linear node may
  and should over-grow the NodeSize parameter. This comes from the fact that
  there are two big per-key memory spikes on small trees -- the 1-key root
  leaf and the first split, when the tree becomes a 1-key root with two
  half-filled leaves. If the linear extension goes above NodeSize it can
  flatten even the 2nd peak
- mitigate the keys indirection a bit
  Prefetching the keys while doing the intra-node linear scan and the nodes
  while descending the tree gives ~+5% of fill and find
- generalize stress tests for B and B+ trees
- cosmetic changes
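The key-prefetching idea from the v2 notes can be sketched as below. This is only an illustration under assumptions: scan_key, scan_compare, prefetch and scan_node are hypothetical names, the up-front prefetch order is one possible choice rather than what the tree does, and __builtin_prefetch is a GCC/Clang builtin, so the helper is a no-op on other compilers.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical key type; the node stores pointers to such keys.
struct scan_key {
    int value;
};

// Tri-comparison: negative, zero or positive (int convention assumed here).
inline int scan_compare(const scan_key& a, const scan_key& b) noexcept {
    return (a.value > b.value) - (a.value < b.value);
}

// Prefetch hint; only a hint, it never changes program results.
inline void prefetch(const void* p) noexcept {
#if defined(__GNUC__) || defined(__clang__)
    __builtin_prefetch(p);
#else
    (void)p;
#endif
}

// Linear scan over a node's key pointers.
// Returns the index of the first key that is not less than `probe`
// (or n if all keys are less) and sets `match` on an exact hit.
inline std::size_t scan_node(const scan_key* const* keys, std::size_t n,
                             const scan_key& probe, bool& match) {
    match = false;
    // Keys live outside the node, so each comparison dereferences a pointer.
    // Issue the prefetches first to overlap those memory accesses.
    for (std::size_t i = 0; i < n; ++i) {
        prefetch(keys[i]);
    }
    for (std::size_t i = 0; i < n; ++i) {
        const int c = scan_compare(probe, *keys[i]);
        if (c <= 0) {
            match = (c == 0);
            return i;
        }
    }
    return n;
}
```

The same hint could be applied to the kid pointer chosen for the next step while descending; whether it pays off depends on the node size and the hardware.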
TODO:
- fix a few inefficiencies in the core code (walks the sub-tree twice sometimes)
- try to optimize the leaf nodes that are not left-/rightmost so that they do
  not carry an unused tree pointer on board
Signed-off-by: Pavel Emelyanov <xemul@scylladb.com>
27 lines
985 B
YAML
# Suite test type. Supported types: unit, boost, cql
type: unit
# A list of tests that are only run in dev and release modes
skip_in_debug_mode:
    - lsa_async_eviction_test
    - lsa_sync_eviction_test
    - row_cache_alloc_stress_test
    - row_cache_stress_test
# Custom command line arguments for some of the tests
custom_args:
    lsa_async_eviction_test:
        - '-c1 -m200M --size 1024 --batch 3000 --count 2000000'
    lsa_sync_eviction_test:
        - '-c1 -m100M --count 10 --standard-object-size 3000000'
        - '-c1 -m100M --count 24000 --standard-object-size 2048'
        - '-c1 -m1G --count 4000000 --standard-object-size 128'
    row_cache_alloc_stress_test:
        - '-c1 -m2G'
    row_cache_stress_test:
        - '-c1 -m1G --seconds 10'
    btree_stress_test:
        - '-c1 -m1G --count=4132 --iter=9'
        - '-c1 -m1G --count=27 --iter=312'
    btree_compaction_test:
        - '-c1 -m1G --count=10000 --iter=13'
        - '-c1 -m1G --count=17 --iter=3'