Fix btree join item movement

Refilling a btree block by moving items from its siblings as it falls
under the join threshold had some pretty serious mistakes.  It used the
target block's total item count instead of the sibling's when deciding
how many items to move.  It didn't take item moving overruns into
account when deciding to compact, so it could run out of contiguous
free space as it moved the last item.  And once it compacted it
returned without moving, because the return was meant to be in the
error case.

This is all fixed by correctly examining the sibling block to determine
whether we should join into a block that's up to 75% full or move a big
chunk over, by compacting if the free space doesn't have room for an
excessive worst-case overrun, and by fixing the compaction error
checking return typo.

Signed-off-by: Zach Brown <zab@versity.com>
Zach Brown
2021-06-14 17:10:16 -07:00
parent a7828a6410
commit 5c3fdb48af

@@ -108,10 +108,11 @@ static inline unsigned int item_bytes(struct scoutfs_btree_item *item)
 }
 
 /*
- * Join blocks when they both are 1/4 full. This puts some distance
- * between the join threshold and the full threshold for splitting.
- * Blocks that just split or joined need to undergo a reasonable amount
- * of item modification before they'll split or join again.
+ * Refill blocks from their siblings when they're under 1/4 full. This
+ * puts some distance between the join threshold and the full threshold
+ * for splitting. Blocks that just split or joined need to undergo a
+ * reasonable amount of item modification before they'll split or join
+ * again.
  */
 static unsigned int join_low_watermark(void)
 {
@@ -815,6 +816,7 @@ static int try_join(struct super_block *sb,
 	struct scoutfs_btree_block *sib;
 	struct scoutfs_block *sib_bl;
 	struct scoutfs_block_ref *ref;
+	const unsigned int lwm = join_low_watermark();
 	unsigned int sib_tot;
 	bool move_right;
 	int to_move;
@@ -840,18 +842,23 @@ static int try_join(struct super_block *sb,
 		return ret;
 	sib = sib_bl->data;
-	sib_tot = le16_to_cpu(bt->total_item_bytes);
-	if (sib_tot < join_low_watermark())
+
+	/* combine if resulting block would be up to 75% full, move big chunk otherwise */
+	sib_tot = le16_to_cpu(sib->total_item_bytes);
+	if (sib_tot <= lwm * 2)
 		to_move = sib_tot;
 	else
-		to_move = sib_tot - join_low_watermark();
+		to_move = lwm;
 
-	if (le16_to_cpu(bt->mid_free_len) < to_move) {
+	/* compact to make room for over-estimate of worst case move overrun */
+	if (le16_to_cpu(bt->mid_free_len) <
+	    (to_move + item_len_bytes(SCOUTFS_BTREE_MAX_VAL_LEN))) {
 		ret = compact_values(sb, bt);
-		if (ret < 0)
+		if (ret < 0) {
 			scoutfs_block_put(sb, sib_bl);
-		return ret;
+			return ret;
+		}
 	}
 
 	move_items(bt, sib, move_right, to_move);
 
 	/* update our parent's item */