Compare commits

15 Commits
v1.26 ... main

Author SHA1 Message Date
Zach Brown
50bff13f21 Merge pull request #266 from versity/zab/increase_move_empty_budget
Increase server commit block budget for alloc move
2025-12-18 12:44:20 -08:00
Zach Brown
de70ca2372 Increase server commit block budget for alloc move
A few callers of alloc_move_empty in the server were providing a budget
that was too small.  Recent changes to extent_mod_blocks increased the
max budget that is necessary to move extents between btrees.  The
existing WAG of 100 was too small for trees of height 2 and 3.  This
caused looping in production.

We can increase the move budget to half the overall commit budget, which
leaves room for a height of around 7 each.  This is much greater than we
see in practice because the size of the per-mount btrees is effectively
limited by both watermarks and thresholds to commit and drain.

Signed-off-by: Zach Brown <zab@versity.com>
2025-12-17 14:22:04 -06:00
Zach Brown
5af1412d5f Merge pull request #270 from versity/auke/bdev_autoloading
Avoid block device autoloading warning.
2025-12-17 11:06:32 -08:00
Zach Brown
0a2b2ad409 Merge pull request #269 from versity/auke/tap_status_msg
Include t_fail status in tap output.
2025-12-17 11:04:00 -08:00
Auke Kok
6c4590a8a0 Avoid block device autoloading warning.
It's possible to trigger the block device autoloading mechanism
with a mknod()/stat(), and this mechanism has long been declared
obsolete, thus triggering a dmesg warning since el9_7, which then
fails the test. You may need to `rmmod loop` to reproduce.

Avoid this by not triggering a loop autoload: we just make a
different blockdev. Choosing `42` here should avoid any autoload
mechanism as this number is explicitly for demo drivers and should
never trigger an autoload.

We also just ignore the warning line in dmesg. Other tests might
still trigger it, as might background activity running during the
test.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2025-12-08 13:04:58 -08:00
Zach Brown
1768f69c3c Merge pull request #224 from versity/auke/renameat2-test-sub-dir
Use T_D0/1 instead of T_M0 here.
2025-12-08 10:05:46 -08:00
Zach Brown
dcb0fd5805 Merge pull request #268 from versity/auke/dont_use_bash_special_stdfiles
Avoid using bash special device nodes.
2025-12-08 09:47:19 -08:00
Auke Kok
660f874488 Use T_D0/1 instead of T_M0 here.
Use of T_M0 and variants should be reserved for e.g. scoutfs
<subcommand> -p <mountpoint> type of usages. Tests should create
individual content files in the assigned subdirectory.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2025-12-04 14:34:02 -05:00
Auke Kok
e1a6689a9b Include t_fail status in tap output.
The tap output file was not yet complete as it failed to include
the contents of `status.msg`. In a few cases, that would mean it
lacks important context.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2025-12-04 14:09:39 -05:00
Auke Kok
2884a92408 Avoid using bash special device nodes.
Bash has special handling for these standard IO files, but
there are cases where customers have special restrictions set
on them, likely to avoid leaking error data out of system logs
as part of IDS software.

In any case, we can just reopen existing file descriptors here
in both these cases to avoid this entirely. This will always
work.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2025-12-04 13:24:48 -05:00
Zach Brown
e194714004 Merge pull request #264 from versity/auke/findmnt_retval
Findmnt returns 1 when no matching entries found
2025-12-03 14:29:31 -08:00
Auke Kok
8bb2f83cf9 Findmnt returns 1 when no matching entries found
Our local fence script attempts to interpret errors executing `findmnt`
as critical errors, but the program exit code explicitly returns
EXIT_FAILURE when the total number of matching mount entries is zero.

This can happen if the mount disappeared while we're attempting to
fence the mount but the scoutfs sysfs files are still in place as
we read them. It's a small window, but it's a fork/exec plus full
parse of /etc/fstab, and a lot can happen in the 0.015s findmnt takes
on my system.

There are no exit codes from findmnt other than 0 and 1. At that
point, we can only assume that if the stdout is empty, the mount
isn't there anymore.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2025-12-02 12:55:11 -08:00
Zach Brown
6a9a6789d5 Merge pull request #267 from versity/clk/merge_enoent
Handle ENOENT when getting log merge status item
2025-12-02 09:34:28 -08:00
Chris Kirby
ee630b164f Handle ENOENT when getting log merge status item
Tests that cause client retries can fail with this error
from server_commit_log_merge():

error -2 committing log merge: getting merge status item

This can happen if the server has already committed and resolved
the log merge that is being retried. We can safely ignore ENOENT here
just like we do a few lines later.

Signed-off-by: Chris Kirby <ckirby@versity.com>
2025-12-01 08:58:24 -06:00
Zach Brown
1c7678b6f5 Merge pull request #263 from versity/zab/v1.26
v1.26 Release
2025-11-18 09:39:27 -08:00
7 changed files with 40 additions and 16 deletions

View File

@@ -1618,7 +1618,8 @@ static int server_get_log_trees(struct super_block *sb,
goto update;
}
-ret = alloc_move_empty(sb, &super->data_alloc, &lt.data_freed, 100);
+ret = alloc_move_empty(sb, &super->data_alloc, &lt.data_freed,
+COMMIT_HOLD_ALLOC_BUDGET / 2);
if (ret == -EINPROGRESS)
ret = 0;
if (ret < 0) {
@@ -1913,9 +1914,11 @@ static int reclaim_open_log_tree(struct super_block *sb, u64 rid)
scoutfs_alloc_splice_list(sb, &server->alloc, &server->wri, server->other_freed,
&lt.meta_avail)) ?:
(err_str = "empty data_avail",
-alloc_move_empty(sb, &super->data_alloc, &lt.data_avail, 100)) ?:
+alloc_move_empty(sb, &super->data_alloc, &lt.data_avail,
+COMMIT_HOLD_ALLOC_BUDGET / 2)) ?:
(err_str = "empty data_freed",
-alloc_move_empty(sb, &super->data_alloc, &lt.data_freed, 100));
+alloc_move_empty(sb, &super->data_alloc, &lt.data_freed,
+COMMIT_HOLD_ALLOC_BUDGET / 2));
mutex_unlock(&server->alloc_mutex);
/* only finalize, allowing merging, once the allocators are fully freed */
@@ -3036,7 +3039,13 @@ static int server_commit_log_merge(struct super_block *sb,
SCOUTFS_LOG_MERGE_STATUS_ZONE, 0, 0,
&stat, sizeof(stat));
if (ret < 0) {
-err_str = "getting merge status item";
+/*
+ * During a retransmission, it's possible that the server
+ * already committed and resolved this log merge. ENOENT
+ * is expected in that case.
+ */
+if (ret != -ENOENT)
+err_str = "getting merge status item";
goto out;
}

View File

@@ -9,7 +9,7 @@
echo "$0 running rid '$SCOUTFS_FENCED_REQ_RID' ip '$SCOUTFS_FENCED_REQ_IP' args '$@'"
echo_fail() {
-echo "$@" >> /dev/stderr
+echo "$@" >&2
exit 1
}
@@ -27,8 +27,7 @@ for fs in /sys/fs/scoutfs/*; do
nr="$(quiet_cat $fs/data_device_maj_min)"
[ ! -d "$fs" -o "$fs_rid" != "$rid" ] && continue
-mnt=$(findmnt -l -n -t scoutfs -o TARGET -S $nr) || \
-echo_fail "findmnt -t scoutfs -S $nr failed"
+mnt=$(findmnt -l -n -t scoutfs -o TARGET -S $nr)
[ -z "$mnt" ] && continue
if ! umount -qf "$mnt"; then
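
The change above stops treating a non-zero `findmnt` exit status as fatal and relies on empty output instead. A minimal sketch of that pattern follows; the helper name `mount_target_for` and the `8:0` device number are hypothetical and not part of the fence script.

```shell
#!/usr/bin/env bash

# Hypothetical helper: print the scoutfs mount target for a maj:min device.
# findmnt exits 1 when nothing matches, so the exit status is ignored here
# and only stdout decides whether a mount is still present.
mount_target_for() {
	local dev="$1"
	findmnt -l -n -t scoutfs -o TARGET -S "$dev" 2> /dev/null || true
}

# "8:0" is an illustrative device number, not taken from the script.
mnt="$(mount_target_for "8:0")"
if [ -z "$mnt" ]; then
	echo "no scoutfs mount for 8:0, nothing to unmount"
else
	echo "would unmount $mnt"
fi
```

The cost of this choice is that a genuine `findmnt` failure looks the same as a mount that has already gone away, which is the trade-off the commit message accepts.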

View File

@@ -170,6 +170,9 @@ t_filter_dmesg()
# some ci test guests are unresponsive
re="$re|longest quorum heartbeat .* delay"
+# creating block devices may trigger this
+re="$re|block device autoloading is deprecated and will be removed."
egrep -v "($re)" | \
ignore_harmless_unwind_kasan_stack_oob
}
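
The filter works by growing one alternation regex and dropping every line that matches it. A self-contained sketch of the idea, assuming a hypothetical `filter_harmless` function and made-up log lines (the real `t_filter_dmesg` carries many more patterns and a further filter stage):

```shell
#!/usr/bin/env bash

# Hypothetical stand-in for t_filter_dmesg: drop known-harmless lines.
filter_harmless() {
	local re="longest quorum heartbeat .* delay"
	re="$re|block device autoloading is deprecated and will be removed."
	# grep -E is used here in place of egrep; both take extended regexes
	grep -E -v "($re)"
}

# Made-up sample input: one harmless warning, one line that should survive.
printf '%s\n' \
	"loop: block device autoloading is deprecated and will be removed." \
	"scoutfs: real error worth reporting" | filter_harmless
```

Only the second sample line should be printed, since the first matches the added pattern.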

View File

@@ -43,9 +43,14 @@ t_tap_progress()
local testname=$1
local result=$2
+local stmsg=""
local diff=""
local dmsg=""
+if [[ -s $T_RESULTS/tmp/${testname}/status.msg ]]; then
+stmsg="1"
+fi
if [[ -s "$T_RESULTS/tmp/${testname}/dmesg.new" ]]; then
dmsg="1"
fi
@@ -61,6 +66,7 @@ t_tap_progress()
echo "# ${testname} ** skipped - permitted **"
else
echo "not ok ${i} - ${testname}"
case ${result} in
101)
echo "# ${testname} ** skipped **"
@@ -70,6 +76,13 @@ t_tap_progress()
;;
esac
+if [[ -n "${stmsg}" ]]; then
+echo "#"
+echo "# status:"
+echo "#"
+cat $T_RESULTS/tmp/${testname}/status.msg | sed 's/^/# - /'
+fi
if [[ -n "${diff}" ]]; then
echo "#"
echo "# diff:"
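
The added block turns each line of `status.msg` into a TAP comment line. A minimal sketch of that transformation on its own, assuming a hypothetical `tap_status_block` helper and a temporary file in place of `$T_RESULTS/tmp/${testname}/status.msg`:

```shell
#!/usr/bin/env bash

# Hypothetical helper: emit a status file as TAP comment lines.
# Prints nothing when the file is missing or empty.
tap_status_block() {
	local file="$1"
	[ -s "$file" ] || return 0
	echo "#"
	echo "# status:"
	echo "#"
	# prefix every line so TAP consumers treat it as a comment
	sed 's/^/# - /' "$file"
}

tmp="$(mktemp)"
printf 'first reason\nsecond reason\n' > "$tmp"
tap_status_block "$tmp"
rm -f "$tmp"
```

Passing the file to `sed` directly avoids the extra `cat` process; the output is the same either way.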

View File

@@ -72,7 +72,7 @@ touch $T_D0/dir/file
mkdir $T_D0/dir/dir
ln -s $T_D0/dir/file $T_D0/dir/symlink
mknod $T_D0/dir/char c 1 3 # null
-mknod $T_D0/dir/block b 7 0 # loop0
+mknod $T_D0/dir/block b 42 0 # SAMPLE block dev - nonexistant/demo use only number
for name in $(ls -UA $T_D0/dir | sort); do
ino=$(stat -c '%i' $T_D0/dir/$name)
$GRE $ino | filter_types

View File

@@ -8,19 +8,19 @@ t_require_mounts 2
echo "=== renameat2 noreplace flag test"
# give each mount their own dir (lock group) to minimize create contention
-mkdir $T_M0/dir0
-mkdir $T_M1/dir1
+mkdir $T_D0/dir0
+mkdir $T_D1/dir1
echo "=== run two asynchronous calls to renameat2 NOREPLACE"
for i in $(seq 0 100); do
# prepare inputs in isolation
-touch "$T_M0/dir0/old0"
-touch "$T_M1/dir1/old1"
+touch "$T_D0/dir0/old0"
+touch "$T_D1/dir1/old1"
# race doing noreplace renames, both can't succeed
-dumb_renameat2 -n "$T_M0/dir0/old0" "$T_M0/dir0/sharednew" 2> /dev/null &
+dumb_renameat2 -n "$T_D0/dir0/old0" "$T_D0/dir0/sharednew" 2> /dev/null &
pid0=$!
-dumb_renameat2 -n "$T_M1/dir1/old1" "$T_M1/dir0/sharednew" 2> /dev/null &
+dumb_renameat2 -n "$T_D1/dir1/old1" "$T_D1/dir0/sharednew" 2> /dev/null &
pid1=$!
wait $pid0
@@ -31,7 +31,7 @@ for i in $(seq 0 100); do
test "$rc0" == 0 -a "$rc1" == 0 && t_fail "both renames succeeded"
# blow away possible files for either race outcome
-rm -f "$T_M0/dir0/old0" "$T_M1/dir1/old1" "$T_M0/dir0/sharednew" "$T_M1/dir1/sharednew"
+rm -f "$T_D0/dir0/old0" "$T_D1/dir1/old1" "$T_D0/dir0/sharednew" "$T_D1/dir1/sharednew"
done
t_pass
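
The test's shape is: start two operations in the background, collect both exit codes with `wait`, and fail if both succeeded. The sketch below reproduces that shape without scoutfs. It substitutes `ln`, which also refuses to replace an existing target, for `dumb_renameat2 -n`, and uses a `mktemp -d` directory as a stand-in for a per-test directory such as `$T_D0`; the `race_once` function is hypothetical.

```shell
#!/usr/bin/env bash

# Race two no-replace operations on one target name and print how many
# of them succeeded. ln stands in for renameat2 with NOREPLACE here.
race_once() {
	local dir="$1"
	touch "$dir/old0" "$dir/old1"

	ln "$dir/old0" "$dir/sharednew" 2> /dev/null &
	local pid0=$!
	ln "$dir/old1" "$dir/sharednew" 2> /dev/null &
	local pid1=$!

	# wait on each pid separately so both exit codes are kept
	wait "$pid0"; local rc0=$?
	wait "$pid1"; local rc1=$?

	# clean up whichever files the race left behind
	rm -f "$dir/old0" "$dir/old1" "$dir/sharednew"
	echo $(( (rc0 == 0) + (rc1 == 0) ))
}

dir="$(mktemp -d)"	# stand-in for a per-test directory
for i in $(seq 1 20); do
	winners="$(race_once "$dir")"
	[ "$winners" -eq 1 ] || echo "round $i: $winners operations succeeded"
done
rm -rf "$dir"
```

Keeping all scratch files under one dedicated directory is what makes the blanket cleanup at the end safe, which is the same reason the tests work under `$T_D0` and `$T_D1` rather than the mount root.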

View File

@@ -7,7 +7,7 @@ message_output()
error_message()
{
-message_output "$@" >> /dev/stderr
+message_output "$@" >&2
}
error_exit()
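
The difference between the two redirections is that `>> /dev/stderr` opens a path, which can be refused on systems that restrict those device nodes, while `>&2` duplicates the file descriptor the process already holds. A small sketch showing that `>&2` still reaches whatever stderr currently points at; the `error_message` body here is a simplified stand-in, not the project's function:

```shell
#!/usr/bin/env bash

# Simplified stand-in: write a message to the already-open stderr
# descriptor rather than opening /dev/stderr by path.
error_message() {
	echo "error: $*" >&2
}

# Capture only stderr: fd 2 goes to the capture, fd 1 is discarded.
out="$(error_message "disk on fire" 2>&1 > /dev/null)"
echo "captured from stderr: $out"
```

Because no path is opened, this form does not depend on the permissions of `/dev/stderr` or of the file it resolves to.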