Zach Brown efd9763355 scoutfs: use efficient btree block structures
This btree implementation was first built for the relatively light duty
of indexing segments in the LSM item implementation.  We're now using it
as the core metadata index.  It's already using a lot of CPU to do its
job with small blocks and it only gets more expensive as the block size
increases.  These changes reduce the CPU use of working with the btree
block structures.

We use a balanced binary tree to index items by key in the block.  This
gives us rare tree balancing cost on insertion and deletion instead of
the memmove overhead of maintaining a dense array of item offsets sorted
by key.  The keys are stored in the item structs, which are stored in an
array at the front of the block, so searching for an item uses contiguous
cachelines.
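
To illustrate the idea, here is a minimal sketch of a search tree whose
nodes live in a packed item array and link to each other by array index.
All names (sk_block, sk_item, sk_insert, sk_search) and the layout are
hypothetical and are not taken from the scoutfs sources.  Rebalancing is
deliberately left out; the point is only that an insert appends one item
and updates one link instead of shifting a sorted array.

```c
#include <assert.h>
#include <stdint.h>

#define SK_MAX_ITEMS 64
#define SK_NONE UINT16_MAX      /* hypothetical "no child" marker */

/* Hypothetical item: the key lives in the item next to its child links. */
struct sk_item {
	uint64_t key;
	uint16_t left;          /* array index of left child or SK_NONE */
	uint16_t right;         /* array index of right child or SK_NONE */
};

/* Hypothetical block: items are packed at the front of the block. */
struct sk_block {
	uint16_t root;
	uint16_t nr_items;
	struct sk_item items[SK_MAX_ITEMS];
};

static void sk_init(struct sk_block *blk)
{
	blk->root = SK_NONE;
	blk->nr_items = 0;
}

/* Walk the in-block tree; returns the item's index or SK_NONE. */
static uint16_t sk_search(const struct sk_block *blk, uint64_t key)
{
	uint16_t i = blk->root;

	while (i != SK_NONE) {
		const struct sk_item *it = &blk->items[i];

		if (key == it->key)
			return i;
		i = key < it->key ? it->left : it->right;
	}
	return SK_NONE;
}

/*
 * Append the item at the end of the array and hook it into the tree.
 * No existing item moves.  A real implementation would rebalance here;
 * this sketch does not.  Returns the new index, or SK_NONE if the block
 * is full or the key already exists.
 */
static uint16_t sk_insert(struct sk_block *blk, uint64_t key)
{
	uint16_t *link = &blk->root;
	uint16_t i;

	assert(blk->nr_items <= SK_MAX_ITEMS);
	if (blk->nr_items == SK_MAX_ITEMS)
		return SK_NONE;

	while (*link != SK_NONE) {
		struct sk_item *it = &blk->items[*link];

		if (key == it->key)
			return SK_NONE;
		link = key < it->key ? &it->left : &it->right;
	}

	i = blk->nr_items++;
	blk->items[i].key = key;
	blk->items[i].left = SK_NONE;
	blk->items[i].right = SK_NONE;
	*link = i;
	return i;
}
```

After sk_init(), inserting keys 50, 20 and 70 places them at indices 0,
1 and 2 in arrival order, and sk_search() finds each by following links
rather than by position in the array.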

We add a trailing owner offset to values so that we can iterate through
them.  This is used to track space freed up by values instead of paying
the memmove cost of keeping all the values at the end of the block.  We
occasionally reclaim the fragmented value free space instead of
splitting the block.
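
The reclaim step can be sketched roughly as follows.  This is an
illustration under stated assumptions, not the scoutfs layout: the names
(vb_block, vb_foot, vb_compact) are hypothetical, the items sit in a
separate array for brevity, and the trailing footer is assumed to carry
a length next to the owner so that freed values can be skipped during
the walk.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

#define VB_SIZE  256
#define VB_ITEMS 8

/* Assumed trailing footer; owner is item index + 1, 0 means freed. */
struct vb_foot {
	uint16_t owner;
	uint16_t len;
};

struct vb_item {
	uint16_t val_off;
	uint16_t val_len;
};

struct vb_block {
	struct vb_item items[VB_ITEMS];
	uint16_t val_start;     /* lowest offset used by any value */
	uint8_t buf[VB_SIZE];   /* values grow down from the end */
};

static void vb_init(struct vb_block *b)
{
	memset(b, 0, sizeof(*b));
	b->val_start = VB_SIZE;
}

/* Store a value plus footer just below the existing values. */
static int vb_set(struct vb_block *b, uint16_t idx, const void *val,
		  uint16_t len)
{
	struct vb_foot f = { .owner = (uint16_t)(idx + 1), .len = len };
	size_t rec = (size_t)len + sizeof(f);

	if (idx >= VB_ITEMS || rec > b->val_start)
		return -1;
	b->val_start -= (uint16_t)rec;
	memcpy(&b->buf[b->val_start], val, len);
	memcpy(&b->buf[b->val_start + len], &f, sizeof(f));
	b->items[idx].val_off = b->val_start;
	b->items[idx].val_len = len;
	return 0;
}

/* Free in place: clear the owner, move nothing.  Item must hold a value. */
static void vb_free(struct vb_block *b, uint16_t idx)
{
	struct vb_item *it = &b->items[idx];
	struct vb_foot f = { .owner = 0, .len = it->val_len };

	assert(idx < VB_ITEMS);
	memcpy(&b->buf[it->val_off + it->val_len], &f, sizeof(f));
	it->val_off = 0;
	it->val_len = 0;
}

/*
 * Walk the footers from the end of the block toward the front, sliding
 * live values up over freed ones and fixing each owner's val_off.
 * Returns the number of bytes reclaimed.
 */
static size_t vb_compact(struct vb_block *b)
{
	size_t old_start = b->val_start;
	size_t src_end = VB_SIZE;
	size_t dst_end = VB_SIZE;

	while (src_end > old_start) {
		struct vb_foot f;
		size_t rec, src;

		memcpy(&f, &b->buf[src_end - sizeof(f)], sizeof(f));
		rec = (size_t)f.len + sizeof(f);
		src = src_end - rec;
		if (f.owner != 0) {
			dst_end -= rec;
			memmove(&b->buf[dst_end], &b->buf[src], rec);
			b->items[f.owner - 1].val_off = (uint16_t)dst_end;
		}
		src_end = src;
	}
	b->val_start = (uint16_t)dst_end;
	return dst_end - old_start;
}
```

In this sketch a free costs one footer write, and the memmove work is
paid only when vb_compact() runs, which a caller might do when an insert
would otherwise force a split.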

Direct item lookups use a small hash table at the end of the block
which maps offsets to items.  It uses linear probing and is guaranteed
to have a light load factor so lookups are very likely to need only
a single cacheline access.
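
A linear-probing table of this kind might look like the sketch below.
The names (ht_block, ht_insert, ht_lookup), the hash function, and the
sizing are assumptions made for illustration; here slots hold item
indexes and the table has four times as many slots as items, which caps
the load factor at 25%.  Deletion is omitted because linear probing
needs tombstones or re-insertion to handle it.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define HT_ITEMS 32
#define HT_SLOTS 128            /* 4x the item count, assumed sizing */

struct ht_block {
	uint64_t keys[HT_ITEMS];
	uint16_t nr_items;
	uint16_t slots[HT_SLOTS];   /* item index + 1, 0 means empty */
};

static void ht_init(struct ht_block *b)
{
	memset(b, 0, sizeof(*b));
}

/* Multiplicative hash; any reasonable mixing function would do. */
static uint32_t ht_hash(uint64_t key)
{
	key *= 0x9E3779B97F4A7C15ull;
	return (uint32_t)(key >> 32);
}

/* Add a key that is not yet present; returns its index or -1 if full. */
static int ht_insert(struct ht_block *b, uint64_t key)
{
	uint32_t s;

	assert(b->nr_items <= HT_ITEMS);
	if (b->nr_items == HT_ITEMS)
		return -1;

	/* The table is never full, so an empty slot always exists. */
	s = ht_hash(key) % HT_SLOTS;
	while (b->slots[s] != 0)
		s = (s + 1) % HT_SLOTS;

	b->keys[b->nr_items] = key;
	b->slots[s] = ++b->nr_items;
	return b->nr_items - 1;
}

/* Probe from the hashed slot until the key or an empty slot is found. */
static int ht_lookup(const struct ht_block *b, uint64_t key)
{
	uint32_t s = ht_hash(key) % HT_SLOTS;

	while (b->slots[s] != 0) {
		uint16_t idx = b->slots[s] - 1;

		if (b->keys[idx] == key)
			return idx;
		s = (s + 1) % HT_SLOTS;
	}
	return -1;
}
```

With the table mostly empty, a lookup usually finds its answer in the
first slot it probes, and any further probes fall in adjacent slots.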

We adjust the watermark for triggering a join from half of a block down
to a quarter.  This results in less utilized blocks on average.  But it
creates distance between the join and split thresholds so we get less
CPU use from constantly joining and splitting if item populations happen
to hover around the previously shared threshold.
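
The effect of separating the two thresholds can be shown with a small
sketch.  The function names and the join_div parameter are hypothetical;
a split is assumed to happen when contents no longer fit in the block,
and a join when usage falls below the block size divided by join_div.

```c
#include <assert.h>
#include <stddef.h>

enum bt_action { BT_NONE, BT_SPLIT, BT_JOIN };

/* Split once the contents no longer fit in the block. */
static enum bt_action bt_after_insert(size_t used, size_t block_size)
{
	return used > block_size ? BT_SPLIT : BT_NONE;
}

/*
 * Join once usage drops below block_size / join_div.  join_div == 2
 * models the old half-block watermark, join_div == 4 the new quarter.
 */
static enum bt_action bt_after_delete(size_t used, size_t block_size,
				      unsigned int join_div)
{
	assert(join_div != 0);
	return used < block_size / join_div ? BT_JOIN : BT_NONE;
}
```

A full 4096 byte block splits into two blocks of roughly 2048 bytes
each.  With join_div of 2, deleting a single item from either half drops
it below 2048 and triggers a join, and the next insert can split again.
With join_div of 4 the block has to shrink below 1024 bytes before a
join is considered.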

While shifting the implementation we choose not to add support for some
features that no longer make sense.  There are no longer callers of
_before and _after, and having synthetic tests to use small btree blocks
no longer makes sense when we can easily create very tall trees.  Both
those btree interfaces and the tiny btree block support will be removed.

Signed-off-by: Zach Brown <zab@versity.com>
2020-08-26 14:39:12 -07:00