Zach Brown 57c7caf348 scoutfs: fix forest dirty log tracking
The forest code is responsible for constructing a consistent fs image
out of the items spread across all the btrees written by mounts in the
system.

Usually readers walk a btree looking for log trees that they should
read.  As a mount modifies items in its dirty log tree, readers need to
be sure to check that in-memory dirty log tree even though it isn't
present in the btree that records persistent log trees.

The code did this by setting a flag to indicate that readers using a
lock should check the dirty log tree.  But the flag usage wasn't
properly locked, which let a reader and a writer race and leave future
readers unaware that they should check the dirty log tree.  On the rare
occasions that we hit that race we'd see item errors that made no
sense, like not being able to find an inode item to update after having
just created it in the current transaction.

To fix this, we clean up the tree tracking in the forest code.

We get rid of the static forest_root structs in the lock_private that
were used to track the two special-case roots that aren't found in log
tree items: the in-memory dirty log root and the final fs root.  All
roots are now dynamically allocated.  We use a flag in the root to
identify it as the dirty log root, and identify the fs root by its
rid/nr.  This results in a bunch of caller churn as we remove lpriv from
root identifying functions.

We get rid of the idea of the writer adding a static root to the list as
well as marking the log as needing to read the root.  Instead we make
all root management happen as we refresh the list.  The forest maintains
a commit sequence and writers set state in the lock to indicate that the
lock has dirty items in the log during this transaction.  Iteration then
compares the state set by the commit, writer, and the last refresh to
determine if a new refresh needs to happen.
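
As a rough illustration of that comparison, here is a minimal userspace
sketch.  It is not the scoutfs code: every struct, field, and function
name below is hypothetical, and the locking that the real fix depends on
is left out to keep the comparison itself visible.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical forest-wide state; commit_seq starts at 1 so 0 means "never". */
struct forest_info {
	uint64_t commit_seq;
};

/* Hypothetical per-lock state; the field names are illustrative only. */
struct lock_state {
	uint64_t dirtied_seq;	/* commit_seq when a writer last dirtied items */
	bool have_roots;	/* has the root list ever been built? */
	bool roots_have_dirty;	/* did the last refresh add the dirty log root? */
};

/* Commit: items dirtied under the old sequence are now stable. */
static void forest_commit(struct forest_info *fi)
{
	fi->commit_seq++;
}

/* Writer: record that this lock has dirty items in the current transaction. */
static void writer_mark_dirty(const struct forest_info *fi,
			      struct lock_state *ls)
{
	assert(fi->commit_seq != 0);
	ls->dirtied_seq = fi->commit_seq;
}

/* The lock is dirty only if it was dirtied since the last commit. */
static bool lock_has_dirty(const struct forest_info *fi,
			   const struct lock_state *ls)
{
	return ls->dirtied_seq == fi->commit_seq;
}

/* Reader: refresh if there is no list yet or its dirty state is out of date. */
static bool need_refresh(const struct forest_info *fi,
			 const struct lock_state *ls)
{
	if (!ls->have_roots)
		return true;
	return lock_has_dirty(fi, ls) != ls->roots_have_dirty;
}

/* Refresh: rebuild the list and remember what it was built against. */
static void refresh_roots(const struct forest_info *fi, struct lock_state *ls)
{
	ls->roots_have_dirty = lock_has_dirty(fi, ls);
	ls->have_roots = true;
}
```

In this sketch a commit makes lock_has_dirty() false again without
touching the lock, so the next reader refreshes and drops the dirty log
root from its list.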

Properly tracking the presence of dirty items lets us recognize when the
lock no longer has dirty items in the log and we can stop locking and
reading the dirty log and fall back to reading the committed stable
version.  The previous code didn't do that; it would lock and read the
dirty root forever.

While we're in here, we fix the locking around setting bloom bits and
have it track the version of the log tree that was set so that we don't
have to clear set bits as the log version is rotated out by the server.
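
One way to picture the version tracking is the sketch below.  The
struct and helpers are hypothetical names, not the ones in the tree, and
a real implementation would hold a lock around both calls.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical per-lock bloom tracking; 0 means no bits have been set. */
struct bloom_track {
	uint64_t set_nr;	/* log tree version whose bloom bits we set */
};

/* True if the bits were already set in the current log tree version. */
static bool bloom_already_set(const struct bloom_track *bt, uint64_t cur_nr)
{
	return bt->set_nr != 0 && bt->set_nr == cur_nr;
}

/* Record that the bits are now set in log tree version cur_nr. */
static void bloom_mark_set(struct bloom_track *bt, uint64_t cur_nr)
{
	assert(cur_nr != 0);
	bt->set_nr = cur_nr;
}
```

Because the recorded version stops matching once the server rotates the
log tree, nothing has to be cleared: the next writer sees a mismatch and
sets the bits again in the new version.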

There was also a subtle bug where we could hit two stale errors for the
same root and return -EIO because triggering a refresh also returned
stale.  We rework the retrying logic to use a separate error code to
force refreshing so that we can't accidentally trigger -EIO by
conflating reading stale blocks with forcing a refresh.
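
The shape of that retry loop can be sketched as follows.  FORCE_REFRESH,
read_with_retry(), and the scripted reader are all hypothetical and only
show why a distinct code keeps forced refreshes out of the stale count;
the real code counts stale errors per root rather than per call.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical positive sentinel, distinct from every negative errno. */
#define FORCE_REFRESH 1

/* Retry a read, refreshing on demand; only real stale reads count to -EIO. */
static int read_with_retry(int (*read_once)(void *arg), void *arg)
{
	int stale = 0;
	int ret;

	assert(read_once != NULL);

	for (;;) {
		ret = read_once(arg);
		if (ret == FORCE_REFRESH)
			continue;		/* refresh and retry, not an error */
		if (ret == -ESTALE) {
			if (++stale > 1)
				return -EIO;	/* stale twice: persistent error */
			continue;		/* refresh and retry once */
		}
		return ret;			/* success or an unrelated error */
	}
}

/* Illustrative reader that replays a fixed list of return codes. */
struct script {
	const int *rets;
	int pos;
};

static int scripted_read(void *arg)
{
	struct script *s = arg;

	return s->rets[s->pos++];
}
```

With a script of FORCE_REFRESH, -ESTALE, 0 the loop returns 0, where
treating the forced refresh as a stale read would have returned -EIO on
the second attempt.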

And finally, we no longer record that we need the dirty log tree in a
root if we have a lock that could never read.  It's a minor optimization
that doesn't change functional behaviour.

Signed-off-by: Zach Brown <zab@versity.com>
2020-08-26 14:39:12 -07:00