scoutfs_client_destroy nulled client->conn before scoutfs_net_free_conn
had a chance to drain the conn's workqueue. An in-flight proc_worker
running client_lock_recover dispatches scoutfs_lock_recover_request
synchronously, which in turn calls scoutfs_client_lock_recover_response.
That helper reads client->conn and hands it to scoutfs_net_response, so
a racing NULL made submit_send dereference conn->lock and trip a KASAN
null-ptr-deref followed by a GPF.
The race only became reachable in practice once reconnect started
draining pending client requests with -ECONNRESET, because the
farewell can now return while the server is still sending requests on
the re-established socket.
Reorder so scoutfs_net_free_conn runs first; its shutdown_worker drains
conn->workq before any memory is freed, then client->conn is nulled.
The original intent of nulling to catch buggy late callers is preserved.
Signed-off-by: Auke Kok <auke.kok@versity.com>
The srch compact and orphan scan workers called sync request RPCs that
would block indefinitely if the server stopped answering. Both workers
are idempotent and reschedule on error, so blocking forever buys
nothing compared to treating a stalled RPC as a failure and trying
again on the next tick.
Add scoutfs_net_sync_request_timeout, a bounded-wait variant that
returns -ETIMEDOUT if the response doesn't arrive in time. Response
state lives on a refcounted heap allocation rather than the caller's
stack so a late callback can't scribble into freed memory. On timeout
we race with an arriving response for the msend under conn->lock: if
find_request wins we queue_dead_free and drop the callback's ref;
otherwise we wait for the in-flight callback to complete before
returning.
Add _timeout typed wrappers for the four RPCs these workers use and
thread a 5-minute bound in from each worker. All other callers keep
the unbounded client_sync_request path with its reconnect retries.
Signed-off-by: Auke Kok <auke.kok@versity.com>
Previously, client_greeting spliced pending requests back onto send_queue
when reconnecting to a new server. Those requests carried state from
the old server (sequence numbers, log tree references, lock modes) that
was reclaimed at fence time, so resending against the new server was
incorrect.
Drain pending requests with -ECONNRESET at greeting time, mirroring the
forcing_unmount drain in the shutdown worker. Thread the lock pointer
through scoutfs_client_lock_request so the response callback can clear
request_pending and wake waiters on error; otherwise a lock_key_range
waiter would block forever because the new server's lock recovery only
reports granted modes, not pending requests.
Wrap the sync request senders in client_sync_request so userspace paths
(statfs, mkdir, sysfs volopt, resize ioctl, walk-inodes ioctl) retry
transparently across failover instead of surfacing a new -ECONNRESET
that callers never saw before.
Signed-off-by: Auke Kok <auke.kok@versity.com>
block_dirty_ref() skipped setting *ref_blkno when the block was
already dirty, leaving the caller with a stale value. Set it to 0
on the already-dirty fast path so callers do not try to free a
block that was not allocated.
Signed-off-by: Auke Kok <auke.kok@versity.com>
Add a WARN_ON_ONCE check that the freed list ref blkno matches the
block header blkno after dirtying alloc blocks. Also save and restore
freed.first_nr on the error path, and initialize av_old/fr_old to 0
so the diagnostic message has valid values.
Signed-off-by: Auke Kok <auke.kok@versity.com>
retry_forever() only checked scoutfs_forcing_unmount(), so a normal
unmount with a network error in the commit path would loop forever.
Also check scoutfs_unmounting() so the write worker can exit cleanly.
Signed-off-by: Auke Kok <auke.kok@versity.com>
During normal unmount, lock_invalidate_worker can hang in
scoutfs_trans_sync(sb, 1) because the trans commit path may
return network errors that cause an infinite retry loop.
Skip full lock_invalidate() during shutdown and unmount, and
extract lock_clear_coverage() to still clean up coverage items
in those paths and in scoutfs_lock_destroy(). Without this,
coverage items can remain attached to locks being freed.
Signed-off-by: Auke Kok <auke.kok@versity.com>
Replace unbounded wait_for_completion() in scoutfs_net_sync_request()
with a 60 second timeout loop that checks scoutfs_unmounting(). Cancel
the queued request before returning -ESHUTDOWN so that sync_response
cannot fire on freed stack memory after the caller returns.
Signed-off-by: Auke Kok <auke.kok@versity.com>
The "server error emptying freed" error was causing a
fence-and-reclaim test failure. In this case, the error
was -ENOLINK, which we should ignore for messaging purposes.
Signed-off-by: Chris Kirby <ckirby@versity.com>
Add unmounting checks to lock_wait_cond() and lock_key_range() so
that lock waiters wake up and new lock requests fail with -ESHUTDOWN
during unmount. Replace the unbounded wait_event() with a 60 second
timeout to prevent indefinite hangs. Relax the WARN_ON_ONCE at
lock_key_range entry to only warn when not unmounting, since late
lock attempts during shutdown are expected.
Signed-off-by: Auke Kok <auke.kok@versity.com>
Replace unbounded wait_for_completion() with a 120 second timeout
to prevent indefinite hangs during unmount if the server never
responds to the farewell request.
Signed-off-by: Auke Kok <auke.kok@versity.com>
I'm seeing consistent CPU soft lockups in block_free_work on my bare
metal system that VM instances don't hit. The bare metal machine has
far more memory available, so the block free work queue grows much
larger, and there is then so much queued work that it can take 30+
seconds to get through it all.
This is all with a debug kernel; a non-debug kernel would likely get
through the outstanding work here at a much faster rate.
Signed-off-by: Auke Kok <auke.kok@versity.com>
block_submit_bio will return -ENOLINK if called during a forced
shutdown; the bio is never submitted, and thus no completion callback
will fire to set BLOCK_BIT_ERROR. Any other task waiting for this
specific bp will end up waiting forever.
To fix, fall through to the existing block_end_io call on the
error path instead of returning directly. That means moving
the forcing_unmount check past the setup calls so block_end_io's
bookkeeping stays balanced. block_end_io then sets BLOCK_BIT_ERROR
and wakes up waiters just as it would on a failed async completion.
Signed-off-by: Auke Kok <auke.kok@versity.com>
scoutfs_quota_mod_rule calls scoutfs_item_create/delete, which use
the transaction allocator, but it never held the transaction. Without
the hold,
a concurrent transaction commit can call scoutfs_alloc_init to
reinitialize the allocator while dirty_alloc_blocks is in the middle
of setting up the freed list block. This overwrites alloc->freed with
the server's fresh (empty) state, causing a blkno mismatch BUG_ON
in list_block_add.
Reproduced by stressing concurrent quota add/del operations across
mounts. Crashdump analysis confirms dirty_list_block COW'd a freed
block (fr_old=9842, new blkno=9852) but by the time list_block_add
ran, freed.ref.blkno was 0 with first_nr=0 and total_nr=0: the freed
list head had been zeroed by a concurrent alloc_init.
Fix by adding scoutfs_hold_trans/scoutfs_release_trans around the
item modification in scoutfs_quota_mod_rule, preventing transaction
commit from racing with the allocator use.
Rename the 'unlock' label to 'release' since 'out' now directly
does the unlock. The unlock safely handles a NULL lock.
Signed-off-by: Auke Kok <auke.kok@versity.com>
Filter empty lines from dmesg so dmesg.new doesn't trigger a test
failure, without overly broad filtering. I don't want to process
dmesg too aggressively, so do this as late as possible.
The xfs lockdep patterns can leave a stray leading/trailing empty
line behind, causing a failure despite the explicit removal of the
lockdep false positive.
Signed-off-by: Auke Kok <auke.kok@versity.com>
This already caught xfs_nondir_ilock_class, but recent CI runs
have been hitting xfs_dir_ilock_class, too.
Signed-off-by: Auke Kok <auke.kok@versity.com>
The gcc version in el7 can't determine that scoutfs_block_check_stale
won't return ret = 0 when the input ret value is < 0, and it errors
out because we might call alloc_wpage with an uninitialized read_seq.
Initialize it to 0 to avoid the warning.
Signed-off-by: Auke Kok <auke.kok@versity.com>
Readers currently accumulate all finalized log tree deltas into a
single bucket when deciding whether they are already in fs_root, but
finalized trees that aren't inputs to a current merge will have
higher seqs, so we may be double applying deltas already merged into
fs_root.
To distinguish, scoutfs_totl_merge_contribute() needs to know the
merge status item seq. We change wkic's get_roots() from using the
SCOUTFS_NET_CMD_GET_ROOTS RPC to reading the superblock directly.
This is needed because totl merge resolution has to use the same data
as the btree roots it is operating on, so we can't grab it from a
SCOUTFS_NET_CMD_GET_ROOTS packet - it is likely different.
Signed-off-by: Auke Kok <auke.kok@versity.com>
These mislabeled members and enums clearly did not describe the
actual data being handled, and they obscured the intent of keeping
merge input items separate from non-merge input items.
Before deltas were added this code path was correct, but with
deltas we can't just retry this without clearing &root, since
it would potentially double count.
This can happen when there are deltas in several finalized log trees,
we've made progress towards reading some of them, and we then
encounter a stale btree block. The retry would not clear the
collected trees, would apply the same delta as was already applied
before the retry, and thus double count.
Signed-off-by: Auke Kok <auke.kok@versity.com>
Two different clients can write deltas for totl indexes at the same
time, recording their changes. When merged, a reader should apply
both in order, and only once; the seq determines whether a delta has
already been applied.
The code fails to update the seq while walking the trees for deltas
to apply, so when processing later trees it could re-process deltas
already applied. With a large negative delta (e.g. removal of large
numbers of files), the totl value could go negative, resulting in
quota lockout.
The fix is simple: advance the seq when reading partial delta merges
to avoid double counting.
Signed-off-by: Auke Kok <auke.kok@versity.com>
Add a trigger that forces btree_merge() to return -ERANGE after
modifying a leaf's worth of items, causing many small partial merges
per merge cycle. This is used by tests to reliably reproduce races
that depend on partial merges splicing items into fs_root while
finalized logs still exist.
The trigger check lives inside btree_merge() where it can observe
actual item modification progress, rather than overriding the
caller's dirty byte limit argument which applies to the whole
writer context.
Signed-off-by: Auke Kok <auke.kok@versity.com>
merge_read_item() fails to update found->seq when combining delta items
from multiple finalized log trees. Add a test case to replicate the
conditions of this issue.
Each of 5 mounts sets totl value 1 on 2500 shared keys, giving an
expected total of 5 per key. Any total > 5 proves double-counting
from a stale seq.
The log_merge_force_partial trigger forces many partial merges per
cycle, creating the conditions where stale-seq items get spliced into
fs_root while finalized logs still exist. Parallel readers on all
mounts race against this window to detect double-counted values.
Signed-off-by: Auke Kok <auke.kok@versity.com>
We had no basic testing for `scoutfs read-xattr-index` whatsoever.
This adds the basic negative-argument tests, lifecycle tests,
deduplicated reads, and partial removal.
This exposes a bug in deletion where the indx entry isn't cleaned up
on inode delete.
Signed-off-by: Auke Kok <auke.kok@versity.com>
During inode deletion, scoutfs_xattr_drop forgot to set the xid of
the xattr after calling parse_indx_key, which hardcodes xid=0 and
leaves it as the caller's responsibility. delete_force then deletes
the wrong key, and returns no error on nonexistent keys.
So now there is a pending deletion for a nonexistent indx and an
orphan indx entry in the tree. Subsequent calls to `scoutfs
read-xattr-index` will thus return entries for deleted inodes.
Signed-off-by: Auke Kok <auke.kok@versity.com>
This xfs lockdep stack trace has at least 2 variants around
fs_reclaim, so capture it loosely here rather than matching one
exact form.
We can remove "lockdep disabled" from the $re grep -v, because it
can affect both this pattern and the kasan one.
Signed-off-by: Auke Kok <auke.kok@versity.com>
Add a reclaim_skip_finalize trigger that prevents reclaim from
setting FINALIZED on log_trees entries. The test arms this trigger,
force-unmounts a client to create an orphan, and verifies the log
merge succeeds without timeout and the orphan reclaim message
appears in dmesg.
Signed-off-by: Auke Kok <auke.kok@versity.com>
An unfinalized log_trees entry whose rid is not in mounted_clients
is an orphan left behind by incomplete reclaim. Previously this
permanently blocked log merges because the finalize loop treated it
as an active client that would never commit.
Call reclaim_open_log_tree for orphaned rids before starting a log
merge. Once reclaimed, the existing merge and freeing paths include
them normally.
Also skip orphans in get_stable_trans_seq so their open transaction
doesn't artificially lower the stable sequence.
Signed-off-by: Auke Kok <auke.kok@versity.com>
Basic testing for the punch-offline ioctl code. The tests consist of
a set of negative tests to make sure things that are expressly not
allowed fail, followed by known-expected-outcome tests that punch
holes in several patterns and verify them.
Signed-off-by: Auke Kok <auke.kok@versity.com>
A minimal punch_offline ioctl wrapper. Argument style is adopted from
stage/release.
Following the option syntax of stage/release, this calls the
punch offline ioctl, punching any offline extent within the designated
range from offset with length.
Signed-off-by: Auke Kok <auke.kok@versity.com>
Add an archive layer ioctl for converting offline extents into sparse
extents without relying on or modifying data_version. This is helpful
when working with files with very large sparse regions.
Signed-off-by: Zach Brown <zab@versity.com>
Signed-off-by: Auke Kok <auke.kok@versity.com>
The initialization here fails to clear __pad[], which leaks stack
garbage to disk. Use a struct initializer to avoid it.
Signed-off-by: Auke Kok <auke.kok@versity.com>
This allocation here currently leaks through __pad[7] which
is written to disk. Use the initializer to enforce zeroing
the pad. The name member is written right after.
Signed-off-by: Auke Kok <auke.kok@versity.com>
The caller sends the return value of this inline as a u8. If we
return -EINVAL, it maps to 234, which is outside our enum range.
Assume this was meant to return SCOUTFS_NET_ERR_EINVAL, which is a
defined constant.
Signed-off-by: Auke Kok <auke.kok@versity.com>