Commit Graph

2146 Commits

Author SHA1 Message Date
Greg Cymbalski
7a1d9d0aba Consistency: set -x when VERBOSE set to 1 2025-12-15 18:41:40 -08:00
Greg Cymbalski
e3a6d31b1d Reorder for clarity/work around el7 naming conventions 2025-12-15 18:02:00 -08:00
Greg Cymbalski
3fae02a3db Set build scripts executable 2025-12-15 17:23:40 -08:00
Greg Cymbalski
9b007e434d Disable verbose logging of builds by default 2025-12-15 17:21:22 -08:00
Greg Cymbalski
dd94d689aa Remove vestigial configuration 2025-12-15 17:19:00 -08:00
Greg Cymbalski
8737af1fa5 Parameterize proxy used by mock 2025-12-15 17:17:21 -08:00
Greg Cymbalski
309e9c0cb0 Use tried-and-true non-docker builds for el7 2025-12-15 17:16:42 -08:00
Greg Cymbalski
d0eaaa2738 Don't use devel kernels as a proxy for available kernel packages 2025-12-03 17:51:55 -08:00
Greg Cymbalski
ba7a8fcbba Special cases for EL7 builds 2025-12-03 14:09:48 -08:00
Greg Cymbalski
a8ba5a6a5e Support building against specific kvers 2025-12-03 14:08:23 -08:00
Greg Cymbalski
7ccb33cd30 Consistency: Use 0/1 for false/true everywhere
And consolidate 'edge' distro configuration.
2025-12-03 14:07:44 -08:00
Greg Cymbalski
28a8dc83ef Critical dep 2025-12-02 14:00:12 -08:00
Greg Cymbalski
a85fff1934 Automatically handle building scoutfs containers 2025-12-02 13:53:15 -08:00
Greg Cymbalski
087f756e78 Dockerfile for scoutfs build environments 2025-12-02 13:47:55 -08:00
Greg Cymbalski
51a4a41f26 Revert "Experiment: conflict with nonequal packages to prevent upgrades to kernels that do not have scoutfs kmods prebuilt"
This reverts commit 680dcd4653.
2025-12-02 11:36:01 -08:00
Greg Cymbalski
680dcd4653 Experiment: conflict with nonequal packages to prevent upgrades to kernels that do not have scoutfs kmods prebuilt 2025-12-01 16:36:46 -08:00
Greg Cymbalski
cc55ae4ee8 Provide a version to the virtual package we provide 2025-12-01 16:10:31 -08:00
Greg Cymbalski
f3fffa9e4b WIP: Tooling to build all kversions, needs major refactoring before pushing 2025-12-01 11:08:24 -08:00
Greg Cymbalski
329b9e817b Closer target kernel tracking
- This makes ScoutFS packages more directly tied to a given kernel while
  still allowing for weak modules usage when possible.
- For EL9, this still prevents the installation of kmod packages across
  minor releases, which no longer have strict kABI gurantees.
2025-11-26 13:57:35 -08:00
Zach Brown
1c7678b6f5 Merge pull request #263 from versity/zab/v1.26
v1.26 Release
2025-11-18 09:39:27 -08:00
Zach Brown
22b5e79bbd v1.26 Release
Finish the release notes for the 1.26 release.

Signed-off-by: Zach Brown <zab@versity.com>
v1.26
2025-11-17 14:42:14 -08:00
Zach Brown
259e639271 Merge pull request #262 from versity/zab/ino_alloc_per_lock
Add ino_alloc_per_lock option
2025-11-14 13:57:49 -08:00
Zach Brown
4d66c38c71 Remove redundant WARN in commit_log_trees
The server's commit_log_trees has an error message that includes the
source of the error, but it's not used for all errors.  The WARN_ON is
redundant with the message and is removed because it isn't filtered out
when we see errors from forced unmount.

Signed-off-by: Zach Brown <zab@versity.com>
2025-11-14 10:04:30 -08:00
Zach Brown
7ef62894bd Add ino_alloc_per_lock option
Add an option that can limit the number of inode numbers that are
allocated per lock group.

Signed-off-by: Zach Brown <zab@versity.com>
2025-11-13 17:19:04 -08:00
Zach Brown
1f363a1ead Merge pull request #261 from versity/zab/log_merge_double_free
Zab/log merge double free
2025-11-13 17:18:30 -08:00
Zach Brown
8ddf9b8c8c Handle disappearing fencing requests and targets
The userspace fencing process wasn't careful about handling underlying
directories that disappear while it was working.

On the server/fenced side, fencing requests can linger after they've
been resolved by writing 1 to fenced or error.  The script could come
back around to see the directory before the server finally removes it,
causing all later uses of the request dir to fail.  We saw this in the
logs as a bunch of cat errors for the various request files.

On the local fence script side, all the mounts can be in the process of
being unmounted so both the /sys/fs dirs and the mount it self can be
removed while we're working.

For both, when we're working with the /sys/fs files we read them without
logging errors and then test that the dir still exists before using what
we read.  When fencing a mount, we stop if findmnt doesn't find the
mount and then raise a umount error if the /sys/fs dir exists after
umount fails.

And while we're at it, we have each scripts logging append instead of
truncating (if, say, it's a log file instead of an interactive tty).

Signed-off-by: Zach Brown <zab@versity.com>
2025-11-13 12:43:31 -08:00
Zach Brown
fd80c17ab6 Filter out kernel message when guests are slow
Ignore more kernel messages when debug guests are being slow.

Signed-off-by: Zach Brown <zab@versity.com>
2025-11-13 12:43:31 -08:00
Zach Brown
991e2cbdf8 Ignore slow quorum hb transfers in tests
We're getting test failures from messages that our guests can be
unresponsive.  They sure can be.  We don't need to fail for this one
specific case.

Signed-off-by: Zach Brown <zab@versity.com>
2025-11-13 12:43:31 -08:00
Zach Brown
92ac132873 Silence merge splice error when forcing
Silence another error warning and assertion that's assuming that the
result of the errors is going to be persistent.  When we're forcing an
unmount we've severed storage and networking.

Signed-off-by: Zach Brown <zab@versity.com>
2025-11-13 12:43:31 -08:00
Auke Kok
ad078cd93c Avoid lock stalling mmap_stress
mmap_stress gets completely stalled in lock messaging and starving
most of the mmap_stress threads, which causes it to delay and even
time out in CI.

Instead of spawning threads over all 5 test nodes, we reduce it
to spawning over only 2 artificially. This still does a good number
of operations on those node, and now the work is spread across the
two nodes evenly.

Additionaly, I've added a miniscule (10ms) delay in between operations
that should hopefully be sufficient for other locking attempts to
settle and allow the threads to better spread the work.

This now shows that all the threads exit within < 0.25s on my test
machine, which is a lot better than the 40s variation that I was seeing
locally. Hopefully this fares better in CI.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2025-11-13 12:43:31 -08:00
Auke Kok
90cb458cd5 Make mmap_stress not exceed a fixed amount of time.
There's a scenarion where mmap_stress gets enough resources that
twoe of the threads will starve the others, which then all take
a very long time catching up committing changes.

Because this test program didn't finish until all the threads had
completed a fixed amount of work, essentially these threads all
ended up tripping over eachother. In CI this would exceed 6h+,
while originally I intended this to run in about 100s or so.

Instead, cap the run time to ~30s by default. If threads exceed
this time, they will immediately exit, which causes any clog in
contention between the threads to drain relatively quickly.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2025-11-13 12:43:31 -08:00
Zach Brown
1ab798e7eb Silence inconsistent srch on forced unmount
Assembling a srch compaction operation creates an item and populates it
with allocator state.  It doesn't cleanly unwind the allocation and undo
the compaction item change if allocation filling fails and issues a
warning.

This warning isn't needed if the error shows that we're in forced
unmount.  The inconsistent state won't be applied, it will be dropped on
the floor as the mount is torn down.

Signed-off-by: Zach Brown <zab@versity.com>
2025-11-13 12:43:31 -08:00
Zach Brown
e182914e51 Fix double free of metadata blocks in log merging
The log merging process is meant to provide parallelism across workers
in mounts.  The idea is that the server hands out a bunch of concurrent
non-intersecting work that's based on the structure of the stable input
fs_root btree.

The nature of the parallel work (cow of the blocks that intersect a key
range) means that the ranges of concurrently issued work can't overlap
or the work will all cow the same input blocks, freeing that input
stable block multiple times.  We're seeing this in testing.

Correctness was intended by having an advancing key that sweeps sorted
ranges.  Duplicate ranges would never be hit as the key advanced past
each it visited.  This was broken by the mapping of the fs item keys to
log merge tree keys by clobbering the sk_zone key value.  It effectively
interleaves the ranges of each zone in the fs root (meta indexes,
orphans, fs items).  With just the right log merge conditions that
involve logged items in the right places and partial completed work to
insert remaining ranges behind the key, ranges can be stored at mapped
keys that end up with ranges out of order.  The server iterates over
these and ends up issueing overlapping work, which results in duplicated
frees of the input blocks.

The fix, without changing the format of the stored log tree items, is to
perform a full sweep of all the range items and determine the next item
by looking at the full precision stored keys.  This ensures that the
processed ranges always advance and never overlap.

Signed-off-by: Zach Brown <zab@versity.com>
2025-11-13 12:43:31 -08:00
Zach Brown
8484a58dd6 Have xfstest pass when using args
The xfstests's golden output includes the full set of tests we expect to
run when no args are specified.  If we specify args then the set of
tests can change and the test will always fail when they do.

This fixes that by having the test check the set of tests itself, rather
than relying on golden output.  If args are specified then our xfstest
only fails if any of the executed xfstest tests failed.  Without args,
we perform the same scraping of the check output and compare it against
the expected results ourself.

It would have been a bit much to put that large file inline in the test
file, so we add a dir of per-test files in revision control.  We can
also put the list of exclusions there.

We can also clean up the output redirection helper functions to make
them more clear.  After xfstests has executed we want to redirect output
back to the compared output so that we can catch any unexpected output.

Signed-off-by: Zach Brown <zab@versity.com>
2025-11-13 12:43:31 -08:00
Zach Brown
a077104531 Add crash monitor to run-tests
Add a little background function that runs during the test which
triggers a crash if it finds catastrophic failure conditions.

This is the second bg task we want to kill and we can only have one
function run on the EXIT trap, so we create a generic process killing
trap function.

We feed it the fenced pid as well.  run-tests didn't log much of value
into the fenced log, and we're not logging the kills into anymore, so we
just remove run-tests fenced logging.

Signed-off-by: Zach Brown <zab@versity.com>
2025-11-13 12:43:31 -08:00
Zach Brown
23aaa994df Add -l to run-tests for looping over tests
Add an option to run-tests to have it loop over each test that will be
run a number of times.  Looping stops if the test doesn't pass.

Most of the change in the per-test execution is indenting as we add the
for loop block.  The stats and kmsg output are lifted up before of the
loop.

Signed-off-by: Zach Brown <zab@versity.com>
2025-11-06 12:07:42 -08:00
Zach Brown
7d14b57b2d Export PATH once in run-tests
Might as well just export the PATH once as we change it, no need to
export it in every test iteration.

Signed-off-by: Zach Brown <zab@versity.com>
2025-11-06 11:02:38 -08:00
Zach Brown
3f252be4be Merge pull request #241 from versity/auke/waiter_err_data_version_obsolete
Ignore data_version in scoutfs_ioc_data_wait_err.
2025-11-04 10:09:57 -08:00
Auke Kok
a4d25d9b55 Ignore data_version in scoutfs_ioc_data_wait_err.
The data_wait_err ioctl currently requires the correct data_version
for the inode to be passed in, or else the ioctl returns -ESTALE. But
the ioctl itself is just a passthrough mechanism for notifying data
waiters, which doesn't involve the data_version at all.

Instead, we can just drop checking the value. The field remains in the
headers, but we've marked it as being ignored from now on. The reason
for the change is documented in the header file as well.

This all is a lot simpler than having to modify/rev the data_waiters
interface to support passing back the data_version, because there isn't
any space left to easily do this, and then userspace would just pass it
back to the data_wait_err ioctl.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2025-10-31 12:24:03 -04:00
Zach Brown
79cd25f693 Merge pull request #255 from versity/zab/compact_error_block_leak
Don't leak alloc blocks on srch compact error
2025-10-31 09:08:17 -07:00
Zach Brown
f2646130ae Don't leak alloc blocks on srch compact error
scoutfs_alloc_prepare_commit() is badly named.  All it really does is
put the references to the two dirty alloc list blocks in the allocator.
It must allways be called if allocation was attempted, but it's easier
to require that it always be paired with _alloc_init().

If the srch compaction worker in the client sees an error it will send
the error back to the server without writing its dirty blocks.  In
avoiding the write it also avoided putting the two block references,
leading to leaked blocks.  We've been seeing rare messages with leaked
blocks in tests.

Signed-off-by: Zach Brown <zab@versity.com>
2025-10-30 14:47:18 -07:00
Zach Brown
1c66f9a9a5 Merge pull request #227 from versity/auke/el96
RHEL 9.6 support.
2025-10-30 13:39:05 -07:00
Auke Kok
afb6ba00ad POSIX ACL changes.
The .get_acl() method now gets passed a mnt_idmap arg, and we can now
choose to implement either .get_acl() or .get_inode_acl(). Technically
.get_acl() is a new implementation, and .get_inode_acl() is the old.
That second method now also gets an rcu flag passed, but we should be
fine either way.

Deeper under the covers however we do need to hook up the .set_acl()
method for inodes, otherwise setfacl will just fail with -ENOTSUPP. To
make this not super messy (it already is) we tack on the get_acl()
changes here.

This is all roughly ca. v6.1-rc1-4-g7420332a6ff4.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2025-10-30 13:59:44 -04:00
Auke Kok
29e486e411 All vfs methods now take a mnt_idmap instead of user_namespace arg.
Similar to before when namespaces were added, they are now translated to
a mnt_idmap, since v6.2-rc1-2-gabf08576afe3.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2025-10-30 13:58:34 -04:00
Zach Brown
8f3177fe33 Merge pull request #254 from versity/zab/shrink_cleanup
Zab/shrink cleanup
2025-10-30 08:56:33 -07:00
Zach Brown
419079e606 Merge pull request #239 from versity/auke/keepalive
Add tcp_keepalive_timeout_ms option, change default to 60s
2025-10-29 17:15:17 -07:00
Zach Brown
6a70ee03b5 Dump block alloc stacks for leaked blocks
The typical pattern of spinning isolating a list_lru results in a
livelock if there are blocks with leaked refcounts.  We're rarely seeing
this in testing.

We can have a modest array in each block that records the stack of the
caller that initially allocated the block and dump that stack for any
blocks that we're unable to shrink/isolate.  Instead of spinning
shrinking, we can give it a good try and then print the blocks that
remain and carry on with unmount, leaking a few blocks.  (Past events
have had 2 blocks.)

Signed-off-by: Zach Brown <zab@versity.com>
2025-10-29 16:16:58 -07:00
Zach Brown
38a2ffe0c7 Add stacktrace kernelcompat
Signed-off-by: Zach Brown <zab@versity.com>
2025-10-29 10:12:52 -07:00
Zach Brown
4b41cf9789 Centralize port numbers and avoid ephemeral
The tests were using high ephemeral port numbers for the mount server's
listening port.  This caused occasional failure if the client's
ephemeral ports happened to collide with the ports used by the tests.

This ports all the port number configuration in one place and has a
quick check to make sure it doesn't wander into the current ephemeral
range.  Then it updates all the tests to use the chosen ports.

Signed-off-by: Zach Brown <zab@versity.com>
2025-10-29 10:12:52 -07:00
Zach Brown
102899290e Allow harmless srch compact commit errors
The server's srch commit error warnings were a bit severe.  The
compaction operations are a function of persistent state.  If they fail
then the inputs still exist and the next attempt will retry whatever
failed.  Not all errors are a problem, only those that result in partial
commits that leave inconsistent state.

In particular, we have to support the case where a client retransmits a
compaction request to a new server after a first server performed the
commit but couldn't respond.  Throwing warnings when the new server gets
ENOENT looking for the busy compaction item isn't helpful.  This came in
tests as background compaction was in flight as tests unmounted and
mounted servers repeatedly to test lock recovery.

Signed-off-by: Zach Brown <zab@versity.com>
2025-10-29 10:12:52 -07:00