Don't overrun the block budget in server_log_merge_free_work().

This fixes a potential fence post failure like the following:

error: 1 holders exceeded alloc budget av: bef 7407 now 7392, fr: bef 8185 now 7672

The code is only accounting for the freed btree blocks, not the dirtying of other items. So it's possible to be at exactly (COMMIT_HOLD_ALLOC_BUDGET / 2), dirty some log btree blocks, loop again, then consume another (COMMIT_HOLD_ALLOC_BUDGET / 2) and blow past the total budget. In this example, we went over by 13 blocks.

By only consuming up to 1/8 of the budget on each loop, and committing when we have consumed 3/4 of the budget, we can avoid the fence post condition.

Signed-off-by: Chris Kirby <ckirby@versity.com>
2026-01-07 20:45:18 +00:00 · 2025-05-28 13:48:22 -05:00 · 2025-05-12 12:21:02 -07:00 · 2025-05-09 11:27:04 -07:00 · 2025-05-09 11:17:24 -07:00 · 2025-05-09 11:15:13 -07:00
6 changed files with 38 additions and 22 deletions
--- a/kmod/src/server.c
+++ b/kmod/src/server.c
@@ -2531,7 +2531,7 @@ static void server_log_merge_free_work(struct work_struct *work)

 		ret = scoutfs_btree_free_blocks(sb, &server->alloc,
 						&server->wri, &fr.key,
-						&fr.root, COMMIT_HOLD_ALLOC_BUDGET / 2);
+						&fr.root, COMMIT_HOLD_ALLOC_BUDGET / 8);
 		if (ret < 0) {
 			err_str = "freeing log btree";
 			break;
@@ -2550,7 +2550,7 @@ static void server_log_merge_free_work(struct work_struct *work)
 		/* freed blocks are in allocator, we *have* to update fr */
 		BUG_ON(ret < 0);

-		if (server_hold_alloc_used_since(sb, &hold) >= COMMIT_HOLD_ALLOC_BUDGET / 2) {
+		if (server_hold_alloc_used_since(sb, &hold) >= (COMMIT_HOLD_ALLOC_BUDGET * 3) / 4) {
 			mutex_unlock(&server->logs_mutex);
 			ret = server_apply_commit(sb, &hold, ret);
 			commit = false;
@@ -4153,7 +4153,7 @@ static void fence_pending_recov_worker(struct work_struct *work)
 	struct server_info *server = container_of(work, struct server_info,
 						  fence_pending_recov_work);
 	struct super_block *sb = server->sb;
-	union scoutfs_inet_addr addr;
+	union scoutfs_inet_addr addr = {{0,}};
 	u64 rid = 0;
 	int ret = 0;

--- a/tests/funcs/exec.sh
+++ b/tests/funcs/exec.sh
@@ -80,3 +80,15 @@ t_compare_output()
 {
 	"$@" >&7 2>&1
 }
+
+#
+# usually bash prints an annoying output message when jobs
+# are killed.  We can avoid that by redirecting stderr for
+# the bash process when it reaps the jobs that are killed.
+#
+t_silent_kill() {
+	exec {ERR}>&2 2>/dev/null
+	kill "$@"
+	wait "$@"
+	exec 2>&$ERR {ERR}>&-
+}
--- a/tests/run-tests.sh
+++ b/tests/run-tests.sh
@@ -532,12 +532,15 @@ for t in $tests; do
 	cmd rm -rf "$T_TMPDIR"
 	cmd mkdir -p "$T_TMPDIR"

-	# create a test name dir in the fs
+	# create a test name dir in the fs, clean up old data as needed
 	T_DS=""
 	for i in $(seq 0 $((T_NR_MOUNTS - 1))); do
 		dir="${T_M[$i]}/test/$test_name"

-		test $i == 0 && cmd mkdir -p "$dir"
+		test $i == 0 && (
+			test -d "$dir" && cmd rm -rf "$dir"
+			cmd mkdir -p "$dir"
+		)

 		eval T_D$i=$dir
 		T_D[$i]=$dir
--- a/tests/tests/enospc.sh
+++ b/tests/tests/enospc.sh
@@ -88,6 +88,11 @@ rm -rf "$SCR/xattrs"

 echo "== make sure we can create again"
 file="$SCR/file-after"
+C=120
+while (( C-- )); do
+	touch $file 2> /dev/null && break
+	sleep 1
+done
 touch $file
 setfattr -n user.scoutfs-enospc -v 1 "$file"
 sync
--- a/tests/tests/lock-recover-invalidate.sh
+++ b/tests/tests/lock-recover-invalidate.sh
@@ -38,6 +38,6 @@ while [ "$SECONDS" -lt "$END" ]; do
 done

 echo "== stopping background load"
-kill $load_pids
+t_silent_kill $load_pids

 t_pass
--- a/tests/tests/orphan-inodes.sh
+++ b/tests/tests/orphan-inodes.sh
@@ -5,18 +5,6 @@
 t_require_commands sleep touch sync stat handle_cat kill rm
 t_require_mounts 2

-#
-# usually bash prints an annoying output message when jobs
-# are killed.  We can avoid that by redirecting stderr for
-# the bash process when it reaps the jobs that are killed.
-#
-silent_kill() {
-	exec {ERR}>&2 2>/dev/null
-	kill "$@"
-	wait "$@"
-	exec 2>&$ERR {ERR}>&-
-}
-
 #
 # We don't have a great way to test that inode items still exist.   We
 # don't prevent opening handles with nlink 0 today, so we'll use that.
@@ -52,7 +40,7 @@ inode_exists $ino || echo "$ino didn't exist"

 echo "== orphan from failed evict deletion is picked up"
 # pending kill signal stops evict from getting locks and deleting
-silent_kill $pid
+t_silent_kill $pid
 t_set_sysfs_mount_option 0 orphan_scan_delay_ms 1000
 sleep 5
 inode_exists $ino && echo "$ino still exists"
@@ -70,7 +58,7 @@ for nr in $(t_fs_nrs); do
 	rm -f "$path"
 done
 sync
-silent_kill $pids
+t_silent_kill $pids
 for nr in $(t_fs_nrs); do
 	t_force_umount $nr
 done
@@ -82,7 +70,15 @@ done
 # wait for orphan scans to run
 t_set_all_sysfs_mount_options orphan_scan_delay_ms 1000
 # also have to wait for delayed log merge work from mount
-sleep 15
+C=120
+while (( C-- )); do
+	brk=1
+	for ino in $inos; do
+		inode_exists $ino && brk=0
+	done
+	test $brk -eq 1 && break
+	sleep 1
+done
 for ino in $inos; do
 	inode_exists $ino && echo "$ino still exists"
 done
@@ -131,7 +127,7 @@ while [ $SECONDS -lt $END ]; do
 	done

 	# trigger eviction deletion of each file in each mount
-	silent_kill $pids
+	t_silent_kill $pids

 	wait || t_fail "handle_fsetxattr failed"
Author	SHA1	Message	Date
Chris Kirby	253f049251	Don't overrun the block budget in server_log_merge_free_work(). This fixes a potential fence post failure like the following: error: 1 holders exceeded alloc budget av: bef 7407 now 7392, fr: bef 8185 now 7672 The code is only accounting for the freed btree blocks, not the dirtying of other items. So it's possible to be at exactly (COMMIT_HOLD_ALLOC_BUDGET / 2), dirty some log btree blocks, loop again, then consume another (COMMIT_HOLD_ALLOC_BUDGET / 2) and blow past the total budget. In this example, we went over by 13 blocks. By only consuming up to 1/8 of the budget on each loop, and committing when we have consumed 3/4 of the budget, we can avoid the fence post condition. Signed-off-by: Chris Kirby <ckirby@versity.com>	2025-05-28 13:48:22 -05:00
Zach Brown	7865ee9f54	Merge pull request #223 from versity/auke/el9_5_wmaybe-uninit Fix -Wmaybe-uninitalized since rhel9.5	2025-05-12 12:21:02 -07:00
Zach Brown	624eb128c6	Merge pull request #221 from versity/auke/enospc-test Give enospc test more time to commit unlink.	2025-05-09 11:27:04 -07:00
Zach Brown	091eb3b683	Merge pull request #219 from versity/auke/fix-tests-failing-dirty-test-dirs Fix test cases that don't run cleanly in a semi-dirty env.	2025-05-09 11:17:24 -07:00
Zach Brown	04e8cc6295	Merge pull request #220 from versity/auke/orphan-inodes Extend orphan-inodes timeout.	2025-05-09 11:15:13 -07:00
Zach Brown	0f6fdb3eb5	Merge pull request #222 from versity/auke/t_kill_silent Properly silently kill background tasks.	2025-05-09 11:11:24 -07:00
Auke Kok	2f48a606e8	Fix -Wmaybe-uninitalized since rhel9.5 Looks like the compiler isn't smart enough to understand the pass by pointer value, and we can initialize it here easily. make[1]: Entering directory '/usr/src/kernels/5.14.0-503.26.1.el9_5.x86_64' CC [M] /home/auke/scoutfs/kmod/src/server.o /home/auke/scoutfs/kmod/src/server.c: In function ‘fence_pending_recov_worker’: /home/auke/scoutfs/kmod/src/server.c:4170:23: error: ‘addr.v4.addr’ may be used uninitialized in this function [-Werror=maybe-uninitialized] 4170 \| ret = scoutfs_fence_start(sb, rid, le32_to_be32(addr.v4.addr), \| ^~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ 4171 \| SCOUTFS_FENCE_CLIENT_RECOVERY); \| ~~~~~~~~~~~~~~~~~~~~~~~~~~~~~~ cc1: all warnings being treated as errors There's still the obvious issue here that we'd intended to support ipv6 but just disregard that here. Signed-off-by: Auke Kok <auke.kok@versity.com>	2025-05-08 15:20:50 -07:00
Auke Kok	377e49caf1	Properly silently kill background tasks. Occasionally, we have some tests fail because these kills produce: tests/lock-recover-invalidate.sh: line 42: 9928 Terminated Even though we expected them to be silent. In these particular cases we already don't care about this output. We borrow the silent_kill() function from orphan-inodes and promote it to t_silent_kill() in funcs/exec.sh, and then use it everywhere where appropriate. Signed-off-by: Auke Kok <auke.kok@versity.com>	2025-05-08 12:03:04 -07:00
Auke Kok	d08eb66adc	Give enospc test more time to commit unlink. The current test sequence performs the unlink and immediately tests whether enough resources are available to create new files again, and this consistently fails. One of my crummy VMs takes a good 12 seconds before the `touch` actually succeeds. We care about the filesystem eventually returning from ENOSPC, and certainly we don't want it to take forever, but there is a period after our first ENOSPC error and cleanup that we expect ENOSPC to fail for a bit longer. Make the timeout 120s. As soon as the `touch` completes, exit the wait loop. Signed-off-by: Auke Kok <auke.kok@versity.com>	2025-05-08 11:40:13 -07:00
Auke Kok	1d0cde7cc3	Clean up old test data as needed. If run without `-m` (explicit mkfs) in subsequent testing, old test data files may break several tests. Most failures are -EEXIST, but there are some more subtle ones. This change erases any existing test dir as needed just before we run the tests, and avoids the issue entirely. I considered doing a `mv dir dir.$$ && rm -rf dir.$$ &` alternative solution but that likely will interfere disproportionally with tests that do disconnects and other thing that can be impacted by an unlink storm. This has an obvious performance aspect - tests will be a little slower to start on subsequent runs. In CI, this will effectively be a no-op though. Signed-off-by: Auke Kok <auke.kok@versity.com>	2025-05-08 10:10:01 -07:00
Auke Kok	138c7c6b49	Extend orphan-inodes timeout. This test regularly fails in CI when the 15 seconds elapses and the system still hasn't concluded the mount log merges and orphan inode scans needed to unlink the test files. Instead of just extending the timeout value, we test-and-retry for 120s. This hopefully is faster in most cases. My smallest VM needs about 6s-8s on average. Signed-off-by: Auke Kok <auke.kok@versity.com>	2025-05-08 09:56:45 -07:00