Commit Graph

2206 Commits

Author SHA1 Message Date
Chris Kirby
9dfd02078d Suppress another forced shutdown error message
The "server error emptying freed" error was causing a
fence-and-reclaim test failure. In this case, the error
was -ENOLINK, which we should ignore for messaging purposes.

Signed-off-by: Chris Kirby <ckirby@versity.com>
2026-04-13 16:46:24 -05:00
Chris Kirby
b45620f899 Don't emit empty blocks in kway_merge()
It's possible for a srch compaction to collapse down to nothing
if given evenly paired create/delete entries. In this case, we
were emitting an empty block. This could cause problems for
search_sorted_file(), which assumes that every block it sees
has a valid first and last entry.

Fix this by keeping a temp entry and only emitting it if it differs
from the next entry in the block. Be sure to flush out a straggling
temp entry if we have one when we're done with the last block of
the merge.

Signed-off-by: Chris Kirby <ckirby@versity.com>
2026-04-13 16:46:24 -05:00
Chris Kirby
5e27552f83 Change the looping logic in run-tests.sh
If a set of tests is provided, loop over the entire set the
requested number of times. Start and stop any requested tracing
across the set boundary.

Signed-off-by: Chris Kirby <ckirby@versity.com>
2026-04-13 16:46:24 -05:00
Chris Kirby
ef70205853 Improve tracing for get_file_block()
Print the first and last entries, the entry_nr and entry_bytes.

Signed-off-by: Chris Kirby <ckirby@versity.com>
2026-04-13 16:46:24 -05:00
Chris Kirby
a374be3beb Fix trigger firing race in srch-safe-merge-pos
Because the srch triggers are inherently async to the test,
we can't be sure they won't fire prematurely just because a
compact worker started running at an inconvenient time.

Make the trigger arming silent to avoid spurious test failures.
Move the trigger arming closer to the point of interest to
increase the chances that we're actually testing what we want.

Signed-off-by: Chris Kirby <ckirby@versity.com>
2026-04-13 16:46:24 -05:00
Zach Brown
fc0fc1427f Merge pull request #296 from versity/auke/indx_key_delete
Fix indx delete using wrong xid, leaving orphans. && Add basic-xattr-indx tests.
2026-04-13 14:34:37 -07:00
Zach Brown
ec68845201 Merge pull request #289 from versity/auke/merge_read_item_stale_seq
Update seq when merging deltas from partial log merge.
2026-04-13 14:10:37 -07:00
Auke Kok
5e2009f939 Avoid double counting deltas from non-input finalized log trees.
Readers currently accumulate all finalized log tree deltas into
a single bucket for deciding whether they are already in fs_root
or not, but, finalized trees that aren't inputs to a current merge
will have higher seqs, and thus we may be double applying deltas
already merged into fs_root.

To distinguish, scoutfs_totl_merge_contribute() needs to know the
merge status item seq.  We change wkic's get_roots() from using the
SCOUTFS_NET_CMD_GET_ROOTS RPC to reading the superblock directly.
This is needed because totl merge resolution has to use the same data
as the btree roots it is operating on, thus we can't grab it from a
SCOUTFS_NET_CMD_GET_ROOTS packet - it likely is different.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-10 13:50:21 -07:00
Auke Kok
8bdc20af21 Rename/reword FINALIZED to MERGE_INPUT.
These mislabeled members and enums were clearly not describing
the actual data being handled and obfuscating the intent of
avoiding mixing merge input items with non-merge input items.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-10 13:50:21 -07:00
Auke Kok
857a39579e Clear roots when retrying due to stale btree blocks.
Before deltas were added this code path was correct, but with
deltas we can't just retry this without clearing &root, since
it would potentially double count.

The condition where this could happen is when there are deltas in
several finalized log trees, and we've made progress towards reading
some of them, and then encounter a stale btree block. The retry
would not clear the collected trees, apply the same delta as was
already applied before the retry, and thus double count.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-10 13:50:21 -07:00
Auke Kok
38d36c9f5c Update seq when merging deltas from partial log merge.
Two different clients can write delta's for totl indexes at the same
time, recording their changes. When merged, a reader should apply both
in order, and only once. To do so, the seq determines whether the delta
has been applied already.

The code fails to update the seq while walking the trees for deltas to
apply. Subsequently, when processing subsequent trees, it could
re-process deltas already applied. In case of a large negative delta
(e.g. removal of large amounts of files), the totl value could become
negative, resulting in quota lockout.

The fix is simple: advance the seq when reading partial delta merges
to avoid double counting.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-10 13:50:21 -07:00
Auke Kok
b724567b2a Add log_merge_force_partial trigger for testing partial merges.
Add a trigger that forces btree_merge() to return -ERANGE after
modifying a leaf's worth of items, causing many small partial merges
per merge cycle. This is used by tests to reliably reproduce races
that depend on partial merges splicing items into fs_root while
finalized logs still exist.

The trigger check lives inside btree_merge() where it can observe
actual item modification progress, rather than overriding the
caller's dirty byte limit argument which applies to the whole
writer context.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-10 12:25:30 -07:00
Auke Kok
add1da10dc Add test for stale seq in merge delta combining.
merge_read_item() fails to update found->seq when combining delta items
from multiple finalized log trees. Add a test case to replicate the
conditions of this issue.

Each of 5 mounts sets totl value 1 on 2500 shared keys, giving an
expected total of 5 per key.  Any total > 5 proves double-counting
from a stale seq.

The log_merge_force_partial trigger forces many partial merges per
cycle, creating the conditions where stale-seq items get spliced into
fs_root while finalized logs still exist.  Parallel readers on all
mounts race against this window to detect double-counted values.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-10 12:25:30 -07:00
Auke Kok
b9c49629a2 Add basic-xattr-indx tests.
We had no basic testing for `scoutfs read-xattr-index` whatsoever. This
adds your basic negative argument tests, lifecycle tests, the
deduplicated reads, and partial removal.

This exposes a bug in deletion where the indx entry isn't cleaned up
on inode delete.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-08 13:45:56 -07:00
Auke Kok
9737009437 Fix indx delete using wrong xid, leaving orphans.
During inode deletion, scoutfs_xattr_drop forgot to set the xid
of the xattr after calling parse_indx_key, which hardcodes xid=0, and it
is the callers' responsibility. delete_force then deletes the wrong
key, and returns no errors on nonexistant keys.

So now there is a pending deletion for a non-existant indx and an
orphan indx entry in the tree. Subsequent calls to `scoutfs
read-xattr-index` will thus return entries for deleted inodes.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-08 11:48:47 -07:00
Zach Brown
3d54ae03e6 Merge pull request #295 from versity/auke/xfs_lockdep_ignore
Avoid xfs lockdep false positive dmesg errors.
2026-04-03 09:46:44 -07:00
Auke Kok
e27ec0add6 Avoid xfs lockdep false positive dmesg errors.
This xfs lockdep stack trace has at least 2 variants around
fs_reclaim, so try and capture it not too precisely here.

We can remove "lockdep disabled" in the $re grep -v, because it
can affect both this and the kasan one.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-01 14:25:48 -07:00
Zach Brown
5457741672 Merge pull request #292 from versity/zab/v1.29
v1.29 Release
2026-03-25 22:36:28 -07:00
Zach Brown
4bd7a38b05 v1.29 Release
Finish the release notes for the 1.29 release.

Signed-off-by: Zach Brown <zab@versity.com>
v1.29
2026-03-25 16:33:31 -07:00
Zach Brown
087b2e85ab Merge pull request #291 from versity/auke/orphan-log-merge
Auke/orphan log merge
2026-03-25 16:26:24 -07:00
Auke Kok
8a730464ab Add orphan-log-trees test and reclaim_skip_finalize trigger
Add a reclaim_skip_finalize trigger that prevents reclaim from
setting FINALIZED on log_trees entries.  The test arms this trigger,
force-unmounts a client to create an orphan, and verifies the log
merge succeeds without timeout and the orphan reclaim message
appears in dmesg.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-03-25 10:39:40 -07:00
Auke Kok
daea8d5bc1 Reclaim orphaned log_trees entries from unmounted clients
An unfinalized log_trees entry whose rid is not in mounted_clients
is an orphan left behind by incomplete reclaim.  Previously this
permanently blocked log merges because the finalize loop treated it
as an active client that would never commit.

Call reclaim_open_log_tree for orphaned rids before starting a log
merge.  Once reclaimed, the existing merge and freeing paths include
them normally.

Also skip orphans in get_stable_trans_seq so their open transaction
doesn't artificially lower the stable sequence.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-03-25 06:47:22 -07:00
Zach Brown
1d60f684d2 Merge pull request #237 from versity/auke/hole_punch_ioctl_test
Punch Offline Ioctl, tests, scoutfs subcmd.
2026-03-19 13:51:44 -07:00
Zach Brown
a62708ac19 Merge pull request #286 from versity/auke/more-inode-deletion
Also use orphan scan wait code for remote unlink parts.
2026-03-16 14:33:20 -07:00
Auke Kok
16b1710541 Punch-offline tests.
Basic testing for the punch-offline ioctl code. The tests consist of a
bunch of negative testing to make sure things that are expressly not
allowed fail, followed by a bunch of known-expected outcome tests that
punches holes in several patterns, verifying them.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-03-13 15:45:52 -07:00
Auke Kok
440c3dc769 Add punch-offline scoutfs subcommand.
A minimal punch_offline ioctl wrapper. Argument style is adopted from
stage/release.

Following the syntax for the option of stage/release, this calls the
punch offline ioctl, punching any offline extent within the designated
range from offset with length.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-03-13 15:45:52 -07:00
Zach Brown
0fd172c5d9 Add punch_offline ioctl
Add an archive layer ioctl for converting offline extents into sparse
extents without relying on or modifying data_version.  This is helpful
when working with files with very large sparse regions.

Signed-off-by: Zach Brown <zab@versity.com>
Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-03-13 15:44:18 -07:00
Zach Brown
48c1f221b3 Merge pull request #285 from versity/auke/s-i-i-grep-awk-fix
Use awk matching for ino.
2026-03-13 13:51:39 -07:00
Zach Brown
34713f3559 Merge pull request #290 from versity/auke/dirent_zero_pad
Auke/dirent zero pad
2026-03-06 10:41:04 -08:00
Auke Kok
137abc1fe2 Zero scoutfs_data_extent_val padding.
The initialization here avoids clearing __pad[], which leaks
to disk. Use a struct initializer to avoid it.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-03-05 16:20:06 -08:00
Auke Kok
64fcbdc15e Zero out dirent padding to avoid leaking to disk.
This allocation here currently leaks through __pad[7] which
is written to disk. Use the initializer to enforce zeroing
the pad. The name member is written right after.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-03-05 16:20:06 -08:00
Zach Brown
d9c951ff48 Merge pull request #287 from versity/auke/misc_fixes
Unsorted misc. fixes for minor/cosmetic issues.
2026-03-02 10:12:26 -08:00
Auke Kok
eaae92d983 Don't send -EINVAL as u8, over the network.
The caller sends the return value of this inline as u8. If we return
-EINVAL, it maps to (234) which is outside of our enum range. Assume
this was meant to return SCOUTFS_NET_ERR_EINVAL which is a defined
constant.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-02-26 14:02:42 -05:00
Auke Kok
43f3dd7259 Invalid address check logic.
These boolean checks are all mutually exclusive, meaning this
check will always succeed due to the negative. Instead of && it
needs to use ||.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-02-26 14:02:42 -05:00
Auke Kok
7d96cf9b96 Remove copy/paste duplicate op flag check.
The exact 2 lines here are repeated. It suggests that there may
have been the intent of an additional check, but, there isn't
anything left from what I can see that needs checking here.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-02-26 14:02:41 -05:00
Auke Kok
03e22164db Return error on scoutfs_forest_setup().
This setup function always returned 0, even on error, causing
initialization to continue despite the error.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-02-26 14:02:41 -05:00
Zach Brown
e0948ec6de Merge pull request #281 from versity/auke/dotfull-file-seqres
Put `.full` file in $T_TMPDIR.
2026-02-26 09:15:22 -08:00
Auke Kok
d0c1c28438 Use awk matching for ino.
This test regularly fails here because the grep is greedy and can
match inodes ending in the same digits as the one we're looking for.
Make it use the same awk pattern used below.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-02-25 13:43:20 -05:00
Auke Kok
65808c2cb2 Also use orphan scan wait code for remote unlink parts.
The fix added in v1.26-17-gef0f6f8a does a good job of avoiding the
intermittent test failures for the part that it was added. The remote
unlink section could use it as well, as it suffers from the same
intermediate failures.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-02-24 14:12:03 -08:00
Zach Brown
73573d2c2b Merge pull request #283 from versity/auke/rever
Delete stray file from golden directory.
2026-02-20 10:12:21 -08:00
Auke Kok
f5db935afc Delete stray file from golden directory.
Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-02-11 14:05:32 -05:00
Zach Brown
831faff7d2 Merge pull request #282 from versity/zab/v1.28
v1.28 Release
2026-02-06 09:28:52 -08:00
Zach Brown
8dad826f88 v1.28 Release
Finish the release notes for the 1.28 release.

Signed-off-by: Zach Brown <zab@versity.com>
v1.28
2026-02-05 09:47:05 -08:00
Auke Kok
e2f3f2e060 Put .full file in $T_TMPDIR.
This file was put into $CWD by the test scripts for no real good
reason. I suppose somewhere $seqres was supposed to be set before
these writes happened. Just write them to the test temp folder for
good measure for now.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-02-02 14:37:40 -08:00
Zach Brown
3a05c69643 Merge pull request #279 from versity/auke/basic-acl-consistency
Auke/basic acl consistency (test/reproduction)
2026-02-02 10:32:30 -08:00
Auke Kok
533f309aec Switch to .get_inode_acl() to avoid rcu corruption.
In el9.6, the kernel VFS no longer goes through xattr handlers to
retreive ACLs, but instead calls the FS drivers' .get_{inode_}acl
method.  In the initial compat version we hooked up to .get_acl given
the identical name that was used in the past.

However, this results in caching issues, as was encountered by customers
and exposed in the added test case `basic-acl-consistency`. The result
is that some group ACL entries may appear randomly missing. Dropping
caches may temporarily fix the issue.

The root cause of the issue is that the VFS now has 2 separate paths to
retreive ACL's from the FS driver, and, they have conflicting
implications for caching. `.get_acl` is purely meant for filesystems
like overlay/ecryptfs where no caching should ever go on as they are
fully passthrough only. Filesystems with dentries (i.e. all normal
filesystems should not expose this interface, and instead expose the
.get_inode_acl method. And indeed, in introducing the new interface, the
upstream kernel converts all but a few fs's to use .get_inode_acl().

The functional change in the driver is to detach KC_GET_ACL_DENTRY and
introduce KC_GET_INODE_ACL to handle the new (and required) interface.
KC_SET_ACL_DENTRY is detached due to it being a different changeset in
the kernel and we should separate these for good measure now.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-01-30 11:31:43 -08:00
Auke Kok
0ef22b3c44 Add basic ACL consistency test case.
This test case is used to detect and reproduce a customer issue we're
seeing where the new .get_acl() method API and underlying changes in
el9_6+ are causing ACL cache fetching to return inconsistent results,
which shows as missing ACLs on directories.

This particular sequence is consistent enough that it warrants making
it into a specific test.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-01-22 12:23:38 -08:00
Auke Kok
85ffba5329 Update existing tests to use scratch helpers.
Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-01-20 12:35:43 -08:00
Auke Kok
553e6e909e Scratch mount test helpers.
Adds basic mkfs/mount/umount helpers that handle all the basics
for making, mounting and unmounting scratch devices. The mount/unmount
create "$T_MSCR", which lives in "$T_TMPDIR".

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-01-20 12:35:09 -08:00
Zach Brown
9b569415f2 Merge pull request #276 from versity/zab/v1.27
v1.27 Release
2026-01-15 19:36:38 -08:00