scoutfs/kmod
Zach Brown 71ed4512dc Include primary lock write_seq for write_only vers
FS items are deleted by logging a deletion item that has a greater item
version than the item to delete.  The versions are usually maintained by
the write_seq of the exclusive write lock that protects the item.  Any
newer write hold will have a greater version than all previous write
holds so any items created under the lock will have a greater vers than
all previous items under the lock.  All deletion items will be merged
with the older item and both will be dropped.
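
As a rough illustration of that rule, here is a minimal sketch. The
struct and function names are hypothetical and are not taken from the
scoutfs source; it only models the version comparison described above.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical logged item: a version plus a deletion flag. */
struct sketch_item {
	uint64_t vers;
	bool deletion;
};

/*
 * Returns true when the deletion item supersedes the existing item,
 * which requires a strictly greater version.  In that case a merge
 * would drop both.  Otherwise the deletion has no effect on the item.
 */
static bool sketch_deletion_wins(const struct sketch_item *item,
				 const struct sketch_item *del)
{
	return del->deletion && del->vers > item->vers;
}
```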

This doesn't work for concurrent write-only locks.  The write-only locks
match with each other so their write_seqs are assigned in the order
that they are granted.  That grant order can be mismatched with item
creation order.  We can get deletion items with lesser versions than the
item to delete because of when each creation's write-only lock was
granted.
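
A small sketch of how the mismatch can arise, assuming a simple
counter hands out write_seqs at grant time.  The granter type and
helper are hypothetical, purely for illustration.

```c
#include <stdint.h>

/* Hypothetical granter: each granted lock takes the next write_seq. */
struct sketch_granter {
	uint64_t next_seq;
};

/* Grant a write-only lock and return the write_seq it was assigned. */
static uint64_t sketch_grant(struct sketch_granter *g)
{
	return g->next_seq++;
}
```

If writer B is granted its write-only lock before writer A, but A
creates the item first and B logs the deletion afterwards, the deletion
carries B's lesser write_seq and cannot supersede A's item.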

Write-only locks are used to maintain consistency between concurrent
writers and readers, not between writers.  Consistency between writers
is done with another primary write lock.  For example, if you're writing
seq items to a write-only region you need to have the write lock on the
inode for the specific seq item you're writing.

The fix, then, is to pass these primary write locks down to the item
cache so that it can choose an item version that is the greatest amongst
the transaction, the write-only lock, and the primary lock.  This now
ensures that the primary lock's increasing write_seq makes it down to
the item, bringing item version ordering in line with exclusive holds of
the primary lock.
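
A minimal sketch of that version choice follows.  The names are
hypothetical and the real item cache code may be structured
differently; the primary lock is assumed to be optional, so NULL is
accepted.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for a lock; only the write_seq matters here. */
struct sketch_lock {
	uint64_t write_seq;
};

static uint64_t sketch_max_u64(uint64_t a, uint64_t b)
{
	return a > b ? a : b;
}

/*
 * Choose the item version as the greatest of the transaction seq, the
 * write-only lock's write_seq, and the primary lock's write_seq.
 * primary may be NULL when no primary lock applies (an assumption).
 */
static uint64_t sketch_item_vers(uint64_t trans_seq,
				 const struct sketch_lock *lock,
				 const struct sketch_lock *primary)
{
	uint64_t vers = sketch_max_u64(trans_seq, lock->write_seq);

	if (primary)
		vers = sketch_max_u64(vers, primary->write_seq);
	return vers;
}
```

Because each exclusive hold of the primary lock has a greater write_seq
than the last, a deletion logged under a later hold gets a greater
version even when its write-only lock was granted earlier.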

All of this to fix concurrent inode updates sometimes leaving behind
duplicate meta_seq items because old seq item deletions ended up with
older versions than the seq item they tried to delete, nullifying the
deletion.

Signed-off-by: Zach Brown <zab@versity.com>
2022-11-15 13:26:32 -08:00