The userspace fencing process wasn't careful about handling underlying
directories that disappeared while it was working.
On the server/fenced side, fencing requests can linger after they've
been resolved by writing 1 to fenced or error. The script could come
back around to see the directory before the server finally removes it,
causing all later uses of the request dir to fail. We saw this in the
logs as a bunch of cat errors for the various request files.
On the local fence script side, all the mounts can be in the process of
being unmounted so both the /sys/fs dirs and the mount itself can be
removed while we're working.
For both, when we're working with the /sys/fs files we read them without
logging errors and then test that the dir still exists before using what
we read. When fencing a mount, we stop if findmnt doesn't find the
mount and then raise a umount error if the /sys/fs dir exists after
umount fails.
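A minimal sketch of the read-then-check pattern described above. The
helper name read_req_file is hypothetical and the real scripts may
structure this differently. It reads a file with errors suppressed and
only reports the value if the directory is still present afterwards.

```shell
# Hypothetical helper, not taken from the actual scripts.
# Usage: read_req_file <dir> <file name>
read_req_file() {
	local dir="$1" name="$2" val

	# Read quietly; the dir may be removed underneath us, so a
	# failed cat is expected and shouldn't be logged.
	val=$(cat "$dir/$name" 2>/dev/null)

	# Only trust what we read if the dir still exists.
	[ -d "$dir" ] || return 1

	printf '%s\n' "$val"
}
```

A caller would treat a non-zero return as "the request or mount went
away" and skip it rather than report an error.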
And while we're at it, we have each script's logging append instead of
truncating (if, say, it's a log file instead of an interactive tty).
Signed-off-by: Zach Brown <zab@versity.com>
The fence script we use for our single node multi-mount tests only knows
how to fence by using forced unmount to destroy a mount. As of now, the
tests only generate failing nodes that need to be fenced by using forced
unmount as well. This results in the awkward situation where the
testing fence script doesn't have anything to do because the mount is
already gone.
When the test fence script has nothing to do we might not notice if it
isn't run. This adds explicit verification to the fencing tests that
the script was really run. It adds per-invocation logging to the fence
script and the test makes sure that it was run.
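As a rough illustration of the per-invocation logging and the check
the test performs, assuming hypothetical function names and a
hypothetical "fence run" log line format:

```shell
# Hypothetical: the fence script appends one line per invocation.
# Usage: log_invocation <log file> <description...>
log_invocation() {
	local log="$1"
	shift
	# Append (>>) so earlier invocations are preserved.
	printf '%s fence run: %s\n' "$(date +%s)" "$*" >> "$log"
}

# Hypothetical: the test succeeds only if a run was recorded.
# Usage: fence_was_run <log file>
fence_was_run() {
	[ -s "$1" ] && grep -q 'fence run' "$1"
}
```

The test can then fail explicitly when the log has no entries, even
though the mount was already gone and the script had nothing to do.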
While we're at it, we take the opportunity to tidy up some of the
scripting around this. We use a sysfs file with the data device
major:minor numbers so that the fencing script can find and unmount
mounts without having to ask them for their rid, since the mounts may
not be operational.
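One way the major:minor lookup could work is sketched below. The
function name targets_for_dev is hypothetical, and the sketch assumes
the MAJ:MIN column reported by findmnt matches the value read from the
sysfs file, which may not hold for every setup.

```shell
# Hypothetical helper, not taken from the actual script.
# Prints the mount points whose device matches <major:minor>.
# Expects "MAJ:MIN TARGET" lines on stdin, for example from:
#   findmnt -n -l -o MAJ:MIN,TARGET
# Usage: targets_for_dev <major:minor>
targets_for_dev() {
	local want="$1" majmin target

	# read strips the column padding that findmnt may add.
	while read -r majmin target; do
		[ "$majmin" = "$want" ] && printf '%s\n' "$target"
	done
	return 0
}
```

Each printed target could then be passed to a forced unmount without
ever querying the mount itself.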
Signed-off-by: Zach Brown <zab@versity.com>
The local-force-unmount fenced fencing script only works when all the
mounts are on the local host and it uses force unmount. It is only
used in our specific local testing scripts. Packaging it as an example
led people to believe that it could be used to cobble together a
multi-host testing network, however temporary.
Move it from being in utils and packaged to being private to our tests so
that it doesn't present an attractive nuisance.
Signed-off-by: Zach Brown <zab@versity.com>