mirror of
https://github.com/versity/scoutfs.git
synced 2026-01-03 19:04:00 +00:00
Preserve item cache page max_seq as items move
The max_seq and active reader mechanisms in the item cache stop readers from reading old items and inserting them in the cache after newer items have been reclaimed by memory pressure. The max_seq field in the pages must reflect the greatest seq of the items in the page so that reclaim knows that the page contains items newer than old readers and must not be removed.

We update the page max_seq as items are inserted or as they're dirtied in the page. There's an additional subtle effect that the max_seq can also protect items which have been erased. Deletion items are erased from the pages as a commit completes. The max_seq in that page will still protect it from being reclaimed even though no items have that seq value themselves.

That protection fails if the range of keys containing the erased item is moved to another page with a lower max_seq. The item mover only updated the destination page's max_seq for each item that was moved. It missed that the empty space between the items might have a larger max_seq from an erased item. We don't know where the erased item is so we have to assume that a larger max_seq in the source page must be set on the destination page.

This could explain very rare item cache corruption where nodes were seeing deleted directory entry items reappearing. It would take a specific sequence of events involving large directories with an isolated removal, a delayed item cache reader, a commit, and then enough insertions to split the page all happening in precisely the wrong sequence.

Signed-off-by: Zach Brown <zab@versity.com>
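The failure described above can be modeled in a few lines of C. This is only a sketch under stated assumptions, not the scoutfs implementation: `toy_item`, `toy_page`, `toy_insert`, `toy_erase`, `toy_move`, and `toy_can_reclaim` are hypothetical names, the fixed-size array stands in for the real page structure, and the reclaim test (`max_seq` strictly below the oldest active reader's seq) is an assumed simplification of the real check. The `propagate` flag toggles the behavior this commit adds.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define TOY_MAX_ITEMS 8

/* Hypothetical, simplified stand-ins for a cached item and page. */
struct toy_item {
	int key;
	uint64_t seq;
};

struct toy_page {
	struct toy_item items[TOY_MAX_ITEMS];
	size_t nr;
	/* covers every item ever held here, including erased ones */
	uint64_t max_seq;
};

/* Insert an item and raise the page's max_seq to cover it. */
static void toy_insert(struct toy_page *pg, int key, uint64_t seq)
{
	assert(pg->nr < TOY_MAX_ITEMS);
	pg->items[pg->nr].key = key;
	pg->items[pg->nr].seq = seq;
	pg->nr++;
	if (seq > pg->max_seq)
		pg->max_seq = seq;
}

/* Erase an item by key; max_seq is deliberately left untouched. */
static void toy_erase(struct toy_page *pg, int key)
{
	for (size_t i = 0; i < pg->nr; i++) {
		if (pg->items[i].key == key) {
			pg->items[i] = pg->items[pg->nr - 1];
			pg->nr--;
			return;
		}
	}
}

/*
 * Move every item with key >= first_key from left to right, as a page
 * split would.  With propagate set, a larger max_seq in the source page
 * is also copied to the destination, because an erased item in the
 * moved key range may be what raised it.
 */
static void toy_move(struct toy_page *left, struct toy_page *right,
		     int first_key, bool propagate)
{
	size_t i = 0;

	while (i < left->nr) {
		if (left->items[i].key >= first_key) {
			toy_insert(right, left->items[i].key,
				   left->items[i].seq);
			left->items[i] = left->items[left->nr - 1];
			left->nr--;
		} else {
			i++;
		}
	}

	if (propagate && left->max_seq > right->max_seq)
		right->max_seq = left->max_seq;
}

/* Assumed rule: a page is reclaimable only if nothing in it is as new
 * as the oldest active reader. */
static bool toy_can_reclaim(const struct toy_page *pg,
			    uint64_t oldest_reader_seq)
{
	return pg->max_seq < oldest_reader_seq;
}
```

In this model, inserting keys 1 and 10 at seq 5 plus a deletion item for key 12 at seq 9, erasing key 12, and then moving keys 10 and above without propagation leaves the destination page with a `max_seq` of 5. A reader that started at seq 7 would then see that page as reclaimable, even though the key range it covers once held the seq 9 deletion. With propagation the destination keeps a `max_seq` of 9 and stays protected.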
@@ -14,6 +14,15 @@ v1.x
 unmounted. This can be used to change the mounts that will
 participate in quorum and the IP addresses they use.

+* **Fix Rare Risk of Item Cache Corruption**
+\
+Code review found a rare potential source of item cache corruption.
+If this happened it would look as though deleted parts of the filesystem
+returned, but only at the time they were deleted. Old deleted items are
+not affected. This problem only affected the item cache, never
+persistent storage. Unmounting and remounting would drop the bad item
+cache and resync it with the correct persistent data.
+
 ---
 v1.0
 \
@@ -685,6 +685,12 @@ static void erase_page_items(struct cached_page *pg,
  * to the dirty list after the left page, and by adding items to the
  * tail of right's dirty list in key sort order.
  *
+ * The max_seq of the source page might be larger than all the items
+ * while protecting an erased item from being reclaimed while an older
+ * read is in flight. We don't know where it might be in the source
+ * page so we have to assume that it's in the key range being moved and
+ * update the destination page's max_seq accordingly.
+ *
  * The caller is responsible for page locking and managing the lru.
  */
 static void move_page_items(struct super_block *sb,
@@ -726,6 +732,9 @@ static void move_page_items(struct super_block *sb,

 		erase_item(left, from);
 	}
+
+	if (left->max_seq > right->max_seq)
+		right->max_seq = left->max_seq;
 }

 enum page_intersection_type {