Commit Graph

2231 Commits

Auke Kok
98b2fe2510 Switch to .iterate_shared
Since v4.6-rc3-29-g6192269444eb there has been a special readdir VFS
method that can be called for the same directory multiple times in
parallel, without any additional VFS locking. The VFS has provided a
WRAP_DIR_ITER() macro to re-wrap the method with extra locking, in case
the method wasn't safe for this.

With el10, the old .readdir method is now gone, and we have no choice
but to either use the wrapper, or just hook up our readdir() method to
the .iterate_shared op.

From what I can see, our implementation is safe to do this.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-29 17:48:01 -07:00
Auke Kok
e69d4426d8 generic_file_splice_read is removed
Based on my reading of the gfs2 driver, copy_splice_read appears to be
the safer choice over filemap_splice_read, since the latter may
potentially lead to cluster deadlocks.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-29 17:47:30 -07:00
Auke Kok
f8d40497bd Obsolete scoutfs_writepage
With the folio conversion, the kernel calls scoutfs_writepages() and
this becomes unused. It could be ported, but the helper function it
would call isn't exported anymore.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-29 17:47:30 -07:00
Auke Kok
5eaea548f8 Hook up buffer_migrate_folio
buffer_migrate_folio replaces what used to be the .migratepage()
method, since v5.19-rc3-395-g67235182a41c. This works together with the
dropped block_write_full_page(), allowing us to drop the .writepage()
method as long as we implement .writepages().

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-29 17:46:54 -07:00
Auke Kok
da6dff7336 Fix unlocked pt_excl in scoutfs_readahead
This caller of scoutfs_get_block is now actively used in el10 and
the WARN_ON_ONCE(!lock) in data.c:567 triggers. Add the
scoutfs_per_task_add_excl/del calls in scoutfs_readpage,
scoutfs_readpages, and scoutfs_readahead to register the cluster
lock for scoutfs_get_block_read.

Add unconditionally rather than guarded by the add_excl return,
since these methods can be reached reentrantly from a top-level
read that already added the entry. Skipping the I/O in that case
left BUG_ON(!list_empty(pages)) in scoutfs_readpages and the page
locked in scoutfs_readpage.

Move scoutfs_per_task_del before scoutfs_unlock to match the
ordering used by file.c read/write paths.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-29 13:55:55 -07:00
Auke Kok
788e12d6c8 Add sysfs default_groups usage
Since v5.1-rc3-29-gaa30f47cf666, and in el9, there are changes to reduce
the amount of boilerplate code needed to hook up lots of attribute files
using a .default_groups member. In el10, this is the required method as
.default_attrs has been removed. This touches every sysfs part that we
have.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-29 13:55:55 -07:00
Auke Kok
8bc8312441 set_blocksize() takes struct file argument
In v6.9-rc4-8-gead083aeeed9, this now takes a struct file argument,
adding to the ifdef salad we've got going on here.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-29 13:55:55 -07:00
Auke Kok
ac006094ff generic_fillattr() now wants the request_mask arg from caller
Since ~v6.5-rc1-95-g0d72b92883c6, generic_fillattr() asks us to pass
through the request_mask from the caller. This allows it to only
request a subset.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-29 13:55:55 -07:00
Auke Kok
800bced7d6 Shrinker API v4
Yet another major shrinker API evolution in v6.6-rc4-53-gc42d50aefd17.
The struct shrinker now has to be dynamically allocated. This is
purposely a backwards-incompatible break.

Collapse the previous KC_ALLOC_SHRINKER, KC_INIT_SHRINKER_FUNCS,
and KC_REGISTER_SHRINKER macros into a single KC_SETUP_SHRINKER
macro. The three operations have to happen in different orders on
different kernel APIs (the name is needed at alloc time on el10
and at register time on KC_SHRINKER_NAME kernels), so coupling
them keeps the ordering correct per kernel.

Add KC_SHRINKER_IS_NULL so callers can detect shrinker_alloc()
failure on el10 and return -ENOMEM. The macro compiles to a
constant 0 on older kernels where the shrinker is an embedded
struct that cannot fail allocation.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-29 13:55:55 -07:00
Auke Kok
cc9b3ae3a9 bio_add_page is now __must_check
The return type has always been int, so we just need to add return
value checking and do something with the result. We could return
-ENOMEM here as well; either way it falls all the way through no
matter what.

This is since v6.4-rc2-100-g83f2caaaf9cb.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-29 13:55:55 -07:00
Auke Kok
414b63004c Adjust for __assign_str() losing second argument
In v6.8-9146-gc759e609030c, the second argument for __assign_str() was
removed, as the second parameter is already derived from the __string()
definition and no longer needed. We have to do a little digging in
headers here to find the definition.

Note the missing `;` in a few places... it has to be added now.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-29 13:55:55 -07:00
Auke Kok
85997d0b80 RIP bd_inode
v6.9-rc4-29-g203c1ce0bb06 removes bd_inode. The canonical replacement is
bd_mapping->host, where applicable. We also have one use where we need
the mapping directly instead of the inode.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-29 13:55:55 -07:00
Auke Kok
a41f3050ae Fix compiler warnings for flex array definitions
Structs that end in a `val[0]` flex array member now make the compiler
balk, since technically the spec considers this unsanitary. As a
result, however, we can't memcpy to `struct->val`, since that decays to
a pointer and we would be writing something of a different length
(u8's in our case) into something of pointer size. So there we have to
do the opposite, and memcpy to &struct->val[0].

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-29 13:55:55 -07:00
Auke Kok
82b2684e9b unaligned.h moved from asm/ to linux/
In v6.12-rc1-3-g5f60d5f6bbc1, asm/unaligned.h did nothing but include
asm-generic/unaligned.h, which had been cleaned of architecture-specific
things. Everyone should now include linux/unaligned.h; the former
include was removed.

A quick peek at server.c shows that while it includes the header, it no
longer uses any function from it, so the include can simply be dropped.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-29 13:55:55 -07:00
Auke Kok
313945cbcf Use a/m/c_time accessor functions
In v6.6-rc5-1-g077c212f0344, one can no longer directly access the
inode's i_mtime, i_atime, etc. We have to go through static inline
accessor functions to get to them. The compat is matched closely to
mimic the new functions.

Further back, ctime accessors were added in v6.5-rc1-7-g9b6304c1d537,
and need to be applied as well.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-29 13:55:55 -07:00
Auke Kok
2b221c1ba7 prandom_bytes and family removed, switch to get_random_bytes variants
In v6.1-rc5-2-ge9a688bcb193, get_random_u32_below() becomes available and
can start replacing prandom_u32_max(). Switch to it where we can.

get_random_bytes() has been available since el7, so also replace
prandom_bytes() where we're using it.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-29 13:55:55 -07:00
Auke Kok
93442500a1 Avoid \Z negative pattern in test exclude list
In RHEL10, the grep version is bumped from 3.6 to 3.11, and grep
no longer recognizes the \Z character.

We have two options: we can either use `grep -P` to continue using
it, or, alternatively, we can choose a different `null` match to get
an effectively empty exclude list.

The latter seems easy enough: by default, we can just exclude
empty lines ("^$"), obtaining exactly the same behavior as before.
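The null-match exclude in action; the test names here are made up for
illustration:

```shell
# Excluding only empty lines ("^$") filters nothing out of a list of
# test names, giving an effectively empty exclude list.
printf 'test-a\ntest-b\n' | grep -v -E '^$'
```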

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-29 13:55:55 -07:00
Auke Kok
3616783836 mv overwrite error format changes in el10
This is somewhat cumbersome: we want to see the error message, but the
format changes enough to make this messy. We opt to change the golden
output to the new format, which only shows one of the arguments in its
error output (the thing that cannot be overwritten). We then add a
filter that uses sed patterns to rewrite the old output format into
exactly the new format, so this works everywhere again, without
changing or adding filters that obscure error messages.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-29 13:55:55 -07:00
Auke Kok
4184d7557c Account for difference in stat output format for device nodes
The new format in el10 has non-hex output, separated by a comma. Add the
additional filter string so this works as expected.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-29 13:55:55 -07:00
Auke Kok
a5780b7d00 Fix el10 not skipping the format-version-forward-back test
The logic only accounted for single-digit versions. With el10, that
breaks.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-29 13:55:55 -07:00
Auke Kok
ea9e4e9013 Stop using egrep
egrep is no longer in el10, so replace it with `grep -E` everywhere.
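For example, the mechanical replacement; the pattern shown is made up for
illustration:

```shell
# Before (egrep is removed in el10):  egrep 'foo|bar' files...
# After: the same extended regexp spelled with grep -E.
printf 'foo\nbaz\n' | grep -E 'foo|bar'
```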

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-29 13:55:55 -07:00
Zach Brown
af31b9f1e8 Merge pull request #306 from versity/zab/v1.30
v1.30 Release
2026-04-22 10:43:17 -07:00
Zach Brown
ad65116d8f v1.30 Release
Finish the release notes for the 1.30 release.

Signed-off-by: Zach Brown <zab@versity.com>
v1.30
2026-04-21 16:43:12 -07:00
Zach Brown
e20765a9c7 Merge pull request #300 from versity/auke/more_false_positive_failures
Auke/more false positive failures: xfs lockdep miss, newline
2026-04-17 09:17:50 -07:00
Zach Brown
066da5c2a2 Merge pull request #297 from versity/auke/quota_mod_trans_hold
Hold transaction in scoutfs_quota_mod_rule to prevent alloc corruption.
2026-04-17 09:16:41 -07:00
Auke Kok
7eacc7139c Hold transaction in scoutfs_quota_mod_rule to prevent alloc corruption.
scoutfs_quota_mod_rule calls scoutfs_item_create/delete which use
the transaction allocator but it never held it. Without the hold,
a concurrent transaction commit can call scoutfs_alloc_init to
reinitialize the allocator while dirty_alloc_blocks is in the middle
of setting up the freed list block. This overwrites alloc->freed with
the server's fresh (empty) state, causing a blkno mismatch BUG_ON
in list_block_add.

Reproduced by stressing concurrent quota add/del operations across
mounts. Crashdump analysis confirms dirty_list_block COW'd a freed
block (fr_old=9842, new blkno=9852) but by the time list_block_add
ran, freed.ref.blkno was 0 with first_nr=0 and total_nr=0: the freed
list head had been zeroed by a concurrent alloc_init.

Fix by adding scoutfs_hold_trans/scoutfs_release_trans around the
item modification in scoutfs_quota_mod_rule, preventing transaction
commit from racing with the allocator use.

Rename the 'unlock' label to 'release' since 'out' now directly
does the unlock. The unlock safely handles a NULL lock.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-16 16:20:47 -07:00
Auke Kok
9e3b01b3b4 Filter newlines out of dmesg.new
Rather than filtering empty lines from dmesg overly broadly, filter
them out of dmesg.new so it doesn't trigger a test failure. I don't
want to over-process dmesg, so do this as late as possible.

The xfs lockdep patterns can leave behind a leading/trailing empty
line, causing a failure despite the explicit removal of the lockdep
false positive.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-15 10:36:28 -07:00
Auke Kok
876c233f06 Ignore another xfs lockdep class
This already caught xfs_nondir_ilock_class, but recent CI runs
have been hitting xfs_dir_ilock_class, too.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-15 10:36:28 -07:00
Zach Brown
6aa5876c71 Merge pull request #301 from versity/auke/el7_uninit_read_seq
Squelch gcc uninitialized warning on el7
2026-04-15 09:58:23 -07:00
Auke Kok
7a9f9ec698 Squelch gcc uninitialized warning on el7
The gcc version in el7 can't determine that scoutfs_block_check_stale
won't return ret = 0 when the input ret value is < 0, and errors out
because we might call alloc_wpage with an uninitialized read_seq.
Initialize it to 0 to avoid the warning.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-14 15:09:20 -04:00
Zach Brown
fc0fc1427f Merge pull request #296 from versity/auke/indx_key_delete
Fix indx delete using wrong xid, leaving orphans. && Add basic-xattr-indx tests.
2026-04-13 14:34:37 -07:00
Zach Brown
ec68845201 Merge pull request #289 from versity/auke/merge_read_item_stale_seq
Update seq when merging deltas from partial log merge.
2026-04-13 14:10:37 -07:00
Auke Kok
5e2009f939 Avoid double counting deltas from non-input finalized log trees.
Readers currently accumulate all finalized log tree deltas into a
single bucket when deciding whether they are already in fs_root or
not. But finalized trees that aren't inputs to the current merge will
have higher seqs, so we may be double-applying deltas already merged
into fs_root.

To distinguish them, scoutfs_totl_merge_contribute() needs to know the
merge status item seq.  We change wkic's get_roots() from using the
SCOUTFS_NET_CMD_GET_ROOTS RPC to reading the superblock directly.
This is needed because totl merge resolution has to use the same data
as the btree roots it is operating on, so we can't grab it from a
SCOUTFS_NET_CMD_GET_ROOTS packet, which is likely different.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-10 13:50:21 -07:00
Auke Kok
8bdc20af21 Rename/reword FINALIZED to MERGE_INPUT.
These mislabeled members and enums clearly weren't describing the
actual data being handled, and were obfuscating the intent of keeping
merge input items separate from non-merge input items.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-10 13:50:21 -07:00
Auke Kok
857a39579e Clear roots when retrying due to stale btree blocks.
Before deltas were added this code path was correct, but with
deltas we can't just retry this without clearing &root, since
it would potentially double count.

The condition where this could happen is when there are deltas in
several finalized log trees, and we've made progress towards reading
some of them, and then encounter a stale btree block. The retry
would not clear the collected trees, apply the same delta as was
already applied before the retry, and thus double count.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-10 13:50:21 -07:00
Auke Kok
38d36c9f5c Update seq when merging deltas from partial log merge.
Two different clients can write deltas for totl indexes at the same
time, recording their changes. When merged, a reader should apply both
in order, and only once. To do so, the seq determines whether a delta
has already been applied.

The code fails to update the seq while walking the trees for deltas to
apply. When processing subsequent trees, it could then re-process
deltas already applied. With a large negative delta (e.g. removal of a
large number of files), the totl value could go negative, resulting in
quota lockout.

The fix is simple: advance the seq when reading partial delta merges
to avoid double counting.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-10 13:50:21 -07:00
Auke Kok
b724567b2a Add log_merge_force_partial trigger for testing partial merges.
Add a trigger that forces btree_merge() to return -ERANGE after
modifying a leaf's worth of items, causing many small partial merges
per merge cycle. This is used by tests to reliably reproduce races
that depend on partial merges splicing items into fs_root while
finalized logs still exist.

The trigger check lives inside btree_merge() where it can observe
actual item modification progress, rather than overriding the
caller's dirty byte limit argument which applies to the whole
writer context.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-10 12:25:30 -07:00
Auke Kok
add1da10dc Add test for stale seq in merge delta combining.
merge_read_item() fails to update found->seq when combining delta items
from multiple finalized log trees. Add a test case to replicate the
conditions of this issue.

Each of 5 mounts sets totl value 1 on 2500 shared keys, giving an
expected total of 5 per key.  Any total > 5 proves double-counting
from a stale seq.

The log_merge_force_partial trigger forces many partial merges per
cycle, creating the conditions where stale-seq items get spliced into
fs_root while finalized logs still exist.  Parallel readers on all
mounts race against this window to detect double-counted values.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-10 12:25:30 -07:00
Auke Kok
b9c49629a2 Add basic-xattr-indx tests.
We had no basic testing for `scoutfs read-xattr-index` whatsoever. This
adds the basic negative-argument tests, lifecycle tests, deduplicated
reads, and partial removal.

This exposes a bug in deletion where the indx entry isn't cleaned up
on inode delete.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-08 13:45:56 -07:00
Auke Kok
9737009437 Fix indx delete using wrong xid, leaving orphans.
During inode deletion, scoutfs_xattr_drop forgot to set the xid
of the xattr after calling parse_indx_key, which hardcodes xid=0 and
leaves setting it as the caller's responsibility. delete_force then
deletes the wrong key, and returns no error for nonexistent keys.

So now there is a pending deletion for a nonexistent indx and an
orphan indx entry in the tree. Subsequent calls to `scoutfs
read-xattr-index` will thus return entries for deleted inodes.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-08 11:48:47 -07:00
Zach Brown
3d54ae03e6 Merge pull request #295 from versity/auke/xfs_lockdep_ignore
Avoid xfs lockdep false positive dmesg errors.
2026-04-03 09:46:44 -07:00
Auke Kok
e27ec0add6 Avoid xfs lockdep false positive dmesg errors.
This xfs lockdep stack trace has at least two variants around
fs_reclaim, so capture it loosely rather than too precisely here.

We can remove "lockdep disabled" in the $re grep -v, because it
can affect both this and the kasan one.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-01 14:25:48 -07:00
Zach Brown
5457741672 Merge pull request #292 from versity/zab/v1.29
v1.29 Release
2026-03-25 22:36:28 -07:00
Zach Brown
4bd7a38b05 v1.29 Release
Finish the release notes for the 1.29 release.

Signed-off-by: Zach Brown <zab@versity.com>
v1.29
2026-03-25 16:33:31 -07:00
Zach Brown
087b2e85ab Merge pull request #291 from versity/auke/orphan-log-merge
Auke/orphan log merge
2026-03-25 16:26:24 -07:00
Auke Kok
8a730464ab Add orphan-log-trees test and reclaim_skip_finalize trigger
Add a reclaim_skip_finalize trigger that prevents reclaim from
setting FINALIZED on log_trees entries.  The test arms this trigger,
force-unmounts a client to create an orphan, and verifies the log
merge succeeds without timeout and the orphan reclaim message
appears in dmesg.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-03-25 10:39:40 -07:00
Auke Kok
daea8d5bc1 Reclaim orphaned log_trees entries from unmounted clients
An unfinalized log_trees entry whose rid is not in mounted_clients
is an orphan left behind by incomplete reclaim.  Previously this
permanently blocked log merges because the finalize loop treated it
as an active client that would never commit.

Call reclaim_open_log_tree for orphaned rids before starting a log
merge.  Once reclaimed, the existing merge and freeing paths include
them normally.

Also skip orphans in get_stable_trans_seq so their open transaction
doesn't artificially lower the stable sequence.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-03-25 06:47:22 -07:00
Zach Brown
1d60f684d2 Merge pull request #237 from versity/auke/hole_punch_ioctl_test
Punch Offline Ioctl, tests, scoutfs subcmd.
2026-03-19 13:51:44 -07:00
Zach Brown
a62708ac19 Merge pull request #286 from versity/auke/more-inode-deletion
Also use orphan scan wait code for remote unlink parts.
2026-03-16 14:33:20 -07:00
Auke Kok
16b1710541 Punch-offline tests.
Basic testing for the punch-offline ioctl code. The tests consist of a
bunch of negative tests to make sure things that are expressly not
allowed fail, followed by a bunch of known-expected-outcome tests that
punch holes in several patterns and verify them.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-03-13 15:45:52 -07:00