The max_seq and active reader mechanisms in the item cache stop readers from reading old items and inserting them in the cache after newer items have been reclaimed by memory pressure. The max_seq field in the pages must reflect the greatest seq of the items in the page so that reclaim knows that the page contains items newer than old readers and must not be removed. We update the page max_seq as items are inserted or as they're dirtied in the page.

There's an additional subtle effect that the max_seq can also protect items which have been erased. Deletion items are erased from the pages as a commit completes. The max_seq in that page will still protect it from being reclaimed even though no items have that seq value themselves. That protection fails if the range of keys containing the erased item is moved to another page with a lower max_seq.

The item mover only updated the destination page's max_seq for each item that was moved. It missed that the empty space between the items might have a larger max_seq from an erased item. We don't know where the erased item is so we have to assume that a larger max_seq in the source page must be set on the destination page.

This could explain very rare item cache corruption where nodes were seeing deleted directory entry items reappearing. It would take a specific sequence of events involving large directories with an isolated removal, a delayed item cache reader, a commit, and then enough insertions to split the page all happening in precisely the wrong sequence.

Signed-off-by: Zach Brown <zab@versity.com>
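The failure and the fix described in this message can be illustrated with a minimal sketch in C. It assumes a heavily simplified page model: `struct page`, `page_insert`, `page_erase`, `move_items`, `can_reclaim`, and the `inherit_max_seq` switch are all hypothetical names made up for illustration and are not the scoutfs item cache code. The reclaim rule shown is also an assumption based on the description above.

```c
#include <assert.h>
#include <stdint.h>

#define MAX_ITEMS 8

/* Hypothetical, simplified cached page: only item seqs and max_seq. */
struct page {
	uint64_t max_seq;		/* greatest seq ever stored in this page */
	uint64_t seqs[MAX_ITEMS];	/* seqs of the items, in key order */
	int nr;
};

/* Inserting (or dirtying) an item raises the page's max_seq. */
static void page_insert(struct page *pg, uint64_t seq)
{
	assert(pg->nr < MAX_ITEMS);
	pg->seqs[pg->nr++] = seq;
	if (seq > pg->max_seq)
		pg->max_seq = seq;
}

/* Erasing drops the item but leaves max_seq, which still protects the page. */
static void page_erase(struct page *pg, int idx)
{
	int i;

	assert(idx >= 0 && idx < pg->nr);
	for (i = idx; i < pg->nr - 1; i++)
		pg->seqs[i] = pg->seqs[i + 1];
	pg->nr--;
}

/*
 * Move items [from, nr) from src to dst.  With inherit_max_seq == 0 only
 * the moved items raise dst's max_seq (the behavior the message describes
 * as broken).  With inherit_max_seq != 0 dst also takes src's max_seq,
 * because an erased item with that seq may have lived in the moved range.
 */
static void move_items(struct page *dst, struct page *src, int from,
		       int inherit_max_seq)
{
	int i;

	assert(from >= 0 && from <= src->nr);
	for (i = from; i < src->nr; i++)
		page_insert(dst, src->seqs[i]);
	src->nr = from;

	if (inherit_max_seq && src->max_seq > dst->max_seq)
		dst->max_seq = src->max_seq;
}

/* Assumed reclaim rule: drop a page only if nothing in it is newer than the reader. */
static int can_reclaim(const struct page *pg, uint64_t reader_seq)
{
	return pg->max_seq <= reader_seq;
}
```

Take a source page holding items with seqs 1, 2, 9, 3, where the seq 9 deletion item is then erased. Moving the last two items without inheriting leaves the destination page at max_seq 3, so a delayed reader at seq 5 would not stop it from being reclaimed. Inheriting the source's max_seq leaves the destination at 9 and the page stays protected.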
Versity ScoutFS Release Notes
v1.x
TBD
- Add scoutfs(1) change-quorum-config command
Add a change-quorum-config command to scoutfs(1) to change the quorum configuration stored in the metadata device while the file system is unmounted. This can be used to change the mounts that will participate in quorum and the IP addresses they use.
- Fix Rare Risk of Item Cache Corruption
Code review found a rare potential source of item cache corruption. If it occurred, deleted parts of the file system would appear to return, but only around the time they were deleted. Items deleted earlier are not affected. This problem only affected the item cache, never persistent storage. Unmounting and remounting would drop the bad item cache and resync it with the correct persistent data.
v1.0
Nov 8, 2021
- Initial Release
Version 1.0 marks the first GA release.