Preserve item cache page max_seq as items move

The max_seq and active reader mechanisms in the item cache stop readers
from reading old items and inserting them in the cache after newer items
have been reclaimed by memory pressure.  The max_seq field in the pages
must reflect the greatest seq of the items in the page so that reclaim
knows that the page contains items newer than old readers and must not
be removed.

We update the page max_seq as items are inserted or as they're dirtied
in the page.   There's an additional subtle effect that the max_seq can
also protect items which have been erased.  Deletion items are erased
from the pages as a commit completes.   The max_seq in that page will
still protect it from being reclaimed even though no items have that seq
value themselves.

That protection fails if the range of keys containing the erased item is
moved to another page with a lower max_seq.   The item mover only
updated the destination page's max_seq for each item that was moved.  It
missed that the empty space between the items might have a larger
max_seq from an erased item.  We don't know where the erased item is so
we have to assume that a larger max_seq in the source page must be set
on the destination page.

This could explain very rare item cache corruption where nodes were
seeing deleted directory entry items reappearing.  It would take a
specific sequence of events involving large directories with an isolated
removal, a delayed item cache reader, a commit, and then enough
insertions to split the page all happening in precisely the wrong
sequence.

Signed-off-by: Zach Brown <zab@versity.com>
This commit is contained in:
Zach Brown
2022-01-12 09:06:54 -08:00
parent 166ab58b99
commit 99a1cc704f
2 changed files with 18 additions and 0 deletions

View File

@@ -14,6 +14,15 @@ v1.x
unmounted. This can be used to change the mounts that will
participate in quorum and the IP addresses they use.
* **Fix Rare Risk of Item Cache Corruption**
\
Code review found a rare potential source of item cache corruption.
If this happened it would look as though deleted parts of the filesystem
returned, but only at the time they were deleted. Old deleted items are
not affected. This problem only affected the item cache, never
persistent storage. Unmounting and remounting would drop the bad item
cache and resync it with the correct persistent data.
---
v1.0
\

View File

@@ -685,6 +685,12 @@ static void erase_page_items(struct cached_page *pg,
* to the dirty list after the left page, and by adding items to the
* tail of right's dirty list in key sort order.
*
* The max_seq of the source page might be larger than all the items
* while protecting an erased item from being reclaimed while an older
* read is in flight. We don't know where it might be in the source
* page so we have to assume that it's in the key range being moved and
* update the destination page's max_seq accordingly.
*
* The caller is responsible for page locking and managing the lru.
*/
static void move_page_items(struct super_block *sb,
@@ -726,6 +732,9 @@ static void move_page_items(struct super_block *sb,
erase_item(left, from);
}
if (left->max_seq > right->max_seq)
right->max_seq = left->max_seq;
}
enum page_intersection_type {