scoutfs

mirror of https://github.com/versity/scoutfs.git synced 2026-01-03 10:55:20 +00:00

Author	SHA1	Message	Date
Zach Brown	99a20bc383	Put scratch mount point in test tmp dirs Some tests had grown a bad pattern of making a mount point for the scratch mount in the root /mnt directory. Change them to use a mount point in their test's temp directory outside the testing fs. Signed-off-by: Zach Brown <zab@versity.com>	2023-04-17 12:47:50 -07:00
Zach Brown	18903ce500	Alphabetize command listing in scoutfs man page List the scoutfs utility commands in the man page in alphabetical order. Signed-off-by: Zach Brown <zab@versity.com>	2023-04-17 12:47:50 -07:00
Zach Brown	b76e22ffcf	Refactor user util functions for device size Split the existing device_size() into get_device_size() and limit_device_size(). An upcoming command wants to get the device size without applying limiting policy. Signed-off-by: Zach Brown <zab@versity.com>	2023-04-17 12:47:50 -07:00
Zach Brown	d6863d6832	Merge pull request #119 from versity/zab/inode_nsec Set sb->s_time_gran to support nsecs	2023-04-17 12:39:13 -07:00
Zach Brown	bb01a3990f	Set sb->s_time_gran to support nsecs We missed initializing sb->s_time_gran which controls how some parts of the kernel truncate the granularity of nsec in timespec. Some paths don't use it at all so time would be maintained at full precision. But other paths, particularly setattr_copy() from userspace and notify_change() from the kernel use it to truncate as times are set. Setting s_time_gran to 1 maintains full nsec precision. Signed-off-by: Zach Brown <zab@versity.com>	2023-03-24 10:50:34 -07:00
Zach Brown	409631ceb1	Merge pull request #117 from versity/zab/rename_into_root Zab/rename into root	2023-03-13 09:28:57 -07:00
Zach Brown	f1264c7e47	Add test to rename into root directory The ancestor tests in rename were preventing renaming into the root directory. Signed-off-by: Zach Brown <zab@versity.com>	2023-03-08 11:00:59 -08:00
Zach Brown	a61b8d9961	Fix renaming into root directory The VFS performs a lot of checks on renames before calling the fs method. We acquire locks and refresh inodes in the rename method so we have to duplicate a lot of the vfs checks. One of the checks involves loops with ancestors and subdirectories. We missed the case where the root directory is the destination and doesn't have any parent directories. The backref walker it calls returns -ENOENT instead of 0 with an empty set of parents and that error bubbled up to rename. The fix is to notice when we're asking for ancestors of the one directory that can't have ancestors and short circuit the test. Signed-off-by: Zach Brown <zab@versity.com>	2023-03-08 11:00:59 -08:00
Zach Brown	eac57a1f7a	Merge pull request #116 from versity/zab/v1.11 v1.11 Release	2023-02-02 12:02:45 -08:00
Zach Brown	5512d5c03e	v1.11 Release Finish the release notes for the 1.11 release. Signed-off-by: Zach Brown <zab@versity.com> v1.11	2023-02-02 11:00:38 -08:00
Zach Brown	8cf7be4651	Merge pull request #115 from versity/zab/utils_flush Zab/utils flush	2023-02-02 10:25:12 -08:00
Zach Brown	3363b4fb79	Flush device caches in buffered util cmds Add calls to our new device cache flushing helper in commands that use buffered reads. Signed-off-by: Zach Brown <zab@versity.com>	2023-01-18 10:52:02 -08:00
Zach Brown	ddb5cce2a5	Add quick utils flush_device helper Add a quick helper that just calls cache flushing ioctls on different kinds of files. Signed-off-by: Zach Brown <zab@versity.com>	2023-01-18 10:27:47 -08:00
Zach Brown	1b0e9c45f4	Merge pull request #114 from versity/zab/commit_lt_dirty Allow replaying srch file rotation	2023-01-17 16:07:13 -08:00
Zach Brown	2e2ccb6f61	Allow replaying srch file rotation When a client no longer needs to append to a srch file, for whatever reason, we move the reference from the log_trees item into a specific srch file btree item in the server's srch file tracking btree. Zeroing the log_trees item and inserting the server's btree item are done in a server commit and should be written atomically. But commit_log_trees had an error handling case that could leave the newly inserted item dirty in memory without zeroing the srch file reference in the existing log_trees item. Future attempts to rotate the file reference, perhaps by retrying the commit or by reclaiming the client's rid, would get EEXIST and fail. This fixes the error handling path to ensure that we'll keep the dirty srch file btree and log_trees item in sync. The desynced items can still exist in the world so we'll tolerate getting EEXIST on insertion. After enough time has passed, or if repair zeroed the duplicate reference, we could remove this special case from insertion. Signed-off-by: Zach Brown <zab@versity.com>	2023-01-17 14:33:27 -08:00
Zach Brown	01c8bba56d	Merge pull request #109 from versity/zab/server_statfs_stable_blocks Zab/server statfs stable blocks	2023-01-12 09:58:48 -08:00
Zach Brown	17cb1fe84b	Merge pull request #110 from versity/zab/partial_alloc_move Allow partial extent motion	2023-01-12 09:58:12 -08:00
Zach Brown	78ae87031b	Merge pull request #112 from versity/zab/tmpfile_umask Zab/tmpfile umask	2023-01-12 09:57:56 -08:00
Zach Brown	bf93ea73c4	Merge pull request #113 from versity/zab/move_blocks_loop_fixes Fix move_blocks loop exit conditions	2023-01-12 09:56:25 -08:00
Zach Brown	a23e7478a0	Fix move_blocks loop exit conditions The move_blocks ioctl intends to only move extents whose bytes fall inside i_size. This is easy except for a final extent that straddles an i_size that isn't aligned to 4K data blocks. The code that either checked for an extent being entirely past i_size or for limiting the number of blocks to move by i_size clumsily compared i_size offsets in bytes with extent counts in 4KB blocks. In just the right circumstances, probably with the help of a byte length to move that is much larger than i_size, the length calculation could result in trying to move 0 blocks. Once this hit the loop would keep finding that extent and calculating 0 blocks to move and would be stuck. We fix this by clamping the count of blocks in extents to move in terms of byte offsets at the start of the loop. This gets rid of the extra size checks and byte offset use in the loop. We also add a sanity check to make sure that we can't get stuck if, say, corruption resulted in an otherwise impossible zero length extent. Signed-off-by: Zach Brown <zab@versity.com>	2023-01-10 09:34:52 -08:00
Zach Brown	9ba2ee5c88	Add testing of O_TMPFILE umask There were kernels that didn't apply the current umask to inode modes created with O_TMPFILE without acls. Let's have a test running to make sure that we're not surprised if we come across one. Signed-off-by: Zach Brown <zab@versity.com>	2023-01-09 14:49:23 -08:00
Zach Brown	fe33a492c2	Make o_tmpfile test more generic The o_tmpfile test only did one thing, clean it up a bit so we can add more tests to the file. Signed-off-by: Zach Brown <zab@versity.com>	2023-01-09 10:14:40 -08:00
Zach Brown	77c0ff89fb	Rename stage-tmpfile to o_tmpfile We had a one-off test that was overly specific to staging from tmpfile. This renames it to a more generic test where we can add more tests of o_tmpfile in general. Signed-off-by: Zach Brown <zab@versity.com>	2023-01-09 10:07:15 -08:00
Zach Brown	7c2d83e2f8	Remove saved super block in scoutfs_sb_info Now that we've removed its users we can remove the global saved copy of the super block from scoutfs_sb_info. Signed-off-by: Zach Brown <zab@versity.com>	2023-01-06 11:15:45 -08:00
Zach Brown	40aa47c888	Have the server keep a private dirty super block As the server does its work its transactions modify a dirty super block in memory. This used the global super block in scoutfs_sb_info which was visible to everything, including the client. Move the dirty super block over to the private server info so that only the server can see it. This is mostly boring storage motion but we do change that the quorum code hands the server a static copy of the quorum config to use as it starts up before it reads the most recent super block. Signed-off-by: Zach Brown <zab@versity.com>	2023-01-06 11:15:45 -08:00
Zach Brown	c1bd7bcce5	Allow partial extent motion Refilling a client's data_avail is the only alloc_move call that doesn't try and limit the number of blocks that it dirties. If it doesn't find sufficiently large extents it can exhaust the server's alloc budget without hitting the target. It'll try to dirty blocks and return a hard error. This changes that behaviour to allow returning 0 if it moved any extents. Other callers can deal with partial progress as they already limit the blocks they dirty. This will also return ENOSPC if it hadn't moved anything just as the current code would. The result is that data fill can not necessarily hit the target. It might take multiple commits to fill the data_avail btree. Signed-off-by: Zach Brown <zab@versity.com>	2022-12-15 20:47:41 -08:00
Zach Brown	7720222588	Have statfs use unlocked stable roots The server's statfs request handler was intending to lock dirty structures as they were walked to get sums used for statfs fields. Other callers walk stable structures, though, so the summation calls had grown iteration over other structures that the server didn't know it had to lock. This meant that the server was walking unlocked dirty structures as they were being modified. The races are very tight, but it can result in request handling errors that shut down connections and IO errors from trying to read inconsistent refs as they were modified by the locked writer. We've built up infrastructure so the server can now walk stable structures just like the other callers. It will no longer wander into dirty blocks so it doesn't need to lock them and it will retry if its walk of stale data crosses a broken reference. Signed-off-by: Zach Brown <zab@versity.com>	2022-12-12 14:59:22 -08:00
Zach Brown	fff07ce19c	Use stale block read retrying helper Transition from manual checking for persistent ESTALE to the shared helper that we just added. This should not change behavior. Signed-off-by: Zach Brown <zab@versity.com>	2022-12-12 14:59:22 -08:00
Zach Brown	464de56d28	Add stale block read retrying helper Many readers had little implementations of the logic to decide to retry stale reads with different refs or decide that they're persistent and return hard errors. Let's move that into a small helper. Signed-off-by: Zach Brown <zab@versity.com>	2022-12-12 14:59:22 -08:00
Zach Brown	342c206550	Have scoutfs_forest_inode_count return stale reads scoutfs_forest_inode_count() assumed it was called with stable refs and would always translate ESTALE to EIO. Change it so that it passes ESTALE to the caller who is responsible for handling it. The server will use this to retry reading from stable supers that it's storing in memory. Signed-off-by: Zach Brown <zab@versity.com>	2022-12-12 14:59:22 -08:00
Zach Brown	fe4734d019	Save a full stable super in the server The server has a mechanism for tracking the last stable roots used by network rpcs. We expand it a bit to include the entire super so that we can add users in the server which want the last full stable super. We can still use the stable super to give out the stable roots. Signed-off-by: Zach Brown <zab@versity.com>	2022-12-12 14:59:22 -08:00
Zach Brown	b1a43bb312	Make quorum config use more precise The quorum code was using the copy of the super block in the sb info for its config. With that going away we make different users more carefully reference the config. The quorum agent has a copy that it reads on setup, the client rarely reads a copy when trying to connect, and the server uses its super. This is about data access isolation and should have no functional effect other than to cause more super reads. Signed-off-by: Zach Brown <zab@versity.com>	2022-12-12 14:59:22 -08:00
Zach Brown	929703213f	Add fsid sbi field A few paths throughout the code get the fsid for the current mount by using the copy of the super block that we store in the scoutfs_sb_info for the mount. We'd like to remove the super block from the sbi and it's cleaner to have a specific constant field for the fsid of the mount which will not change. Signed-off-by: Zach Brown <zab@versity.com>	2022-12-12 14:59:22 -08:00
Zach Brown	78279ffb4a	Merge pull request #108 from versity/zab/v1.10 v1.10 Release	2022-12-07 13:33:45 -08:00
Zach Brown	0b919e2ba7	v1.10 Release Finish the release notes for the 1.10 release. Signed-off-by: Zach Brown <zab@versity.com> v1.10	2022-12-07 12:30:17 -08:00
Zach Brown	bb5267f0c9	Merge pull request #107 from versity/zab/write_truncated_zero_tail Zab/write truncated zero tail	2022-12-06 11:31:52 -08:00
Zach Brown	6d4916954b	Add basic-truncate test Signed-off-by: Zach Brown <zab@versity.com>	2022-12-06 10:31:31 -08:00
Zach Brown	8e067b3d3f	Truncate dirties zero tail extension When we truncate away from a partial block we need to zero its tail that was past i_size and dirty it so that it's written. We missed the typical vfs boilerplate of calling block_truncate_page from setattr->set_size that does this. We need to be a little careful to pass our file lock down to get_block and then queue the inode for writeback so it's written out with the transaction. This follows the pattern in .write_end. Signed-off-by: Zach Brown <zab@versity.com>	2022-12-06 10:31:31 -08:00
Zach Brown	87500e8bb5	Merge pull request #106 from versity/zab/invalidation_dprune_iput Zab/invalidation dprune iput	2022-12-02 13:23:56 -08:00
Zach Brown	41174867ed	Add t_get_sysfs_mount_option test func Add a quick little function to get the value of a mount option. Signed-off-by: Zach Brown <zab@versity.com>	2022-12-02 12:28:13 -08:00
Zach Brown	276fbebdac	Avoid dput in lock invalidation The d_prune_aliases in lock invalidation was thought to be safe because the caller had an inode reference, surely it can't get into iput_final. I missed the fundamental dcache pattern that dput can ascend through parents and end up in inode eviction for entirely unrelated inodes. It's very easy for this to deadlock, imagine if nothing else that the inode invalidation is blocked on in dput->iput->evict->delete->lock is itself in the list of locks to invalidate in the caller. We fix this by always kicking off d_prune and dput into async work. This increases the chance that inodes will still be referenced after invalidation and prevent inline deletion. More deletions can be deferred until the orphan scanner finds them. It should be rare, though. We're still likely to put and drop invalidated inodes before a writer gets around to removing the final unlink and asking us for the omap that describes our cached inodes. To perform the d_prune in work we make it a behavioural flag and make our queued iputs a little more robust. We use much safer and understandable locking to cover the count and the new flags and we put the work in re-entrant work in their own workqueue instead of one work instance in the system_wq. Signed-off-by: Zach Brown <zab@versity.com>	2022-12-02 12:28:13 -08:00
Zach Brown	03df993e14	Merge pull request #105 from versity/zab/cw_item_vers Zab/cw item vers	2022-11-30 11:10:18 -08:00
Zach Brown	701f1a9538	Add test that checks duplicate meta_seq entries Add a quick test of the index items to make sure that rapid inode updates don't create duplicate meta_seq items. Signed-off-by: Zach Brown <zab@versity.com>	2022-11-15 13:26:32 -08:00
Zach Brown	71ed4512dc	Include primary lock write_seq for write_only vers FS items are deleted by logging a deletion item that has a greater item version than the item to delete. The versions are usually maintained by the write_seq of the exclusive write lock that protects the item. Any newer write hold will have a greater version than all previous write holds so any items created under the lock will have a greater vers than all previous items under the lock. All deletion items will be merged with the older item and both will be dropped. This doesn't work for concurrent write-only locks. The write-only locks match with each other so their write_seqs are assigned in the order that they are granted. That grant order can be mismatched with item creation order. We can get deletion items with lesser versions than the item to delete because of when each creation's write-only lock was granted. Write only locks are used to maintain consistency between concurrent writers and readers, not between writers. Consistency between writers is done with another primary write lock. For example, if you're writing seq items to a write-only region you need to have the write lock on the inode for the specific seq item you're writing. The fix, then, is to pass these primary write locks down to the item cache so that it can choose an item version that is the greatest amongst the transaction, the write-only lock, and the primary lock. This now ensures that the primary lock's increasing write_seq makes it down to the item, bringing item version ordering in line with exclusive holds of the primary lock. All of this to fix concurrent inode updates sometimes leaving behind duplicate meta_seq items because old seq item deletions ended up with older versions than the seq item they tried to delete, nullifying the deletion. Signed-off-by: Zach Brown <zab@versity.com>	2022-11-15 13:26:32 -08:00
Zach Brown	57dff347a6	Merge pull request #104 from versity/zab/v1.9 v1.9 Release	2022-10-29 17:41:51 -07:00
Zach Brown	fb7cb057c4	v1.9 Release Finish the release notes for the 1.9 release. Signed-off-by: Zach Brown <zab@versity.com> v1.9	2022-10-29 16:41:58 -07:00
Zach Brown	1b924c501e	Merge pull request #103 from versity/zab/verify_dentry_errors Zab/verify dentry errors	2022-10-27 16:15:53 -07:00
Zach Brown	aed4313995	Simplify dentry verification Now that we've removed the hash and pos from the dentry_info struct we can do without it. We can store the refresh gen in the d_fsdata pointer (sorry, 64bit only for now.. could allocate if we needed to.) This gets rid of the lock coverage spinlocks and puts a bit more pressure on lock lookup, which we already know we have to make more efficient. We can get rid of all the dentry info allocation calls. Now that we're not setting d_op as we allocate d_fsdata we put the ops on the super block so that we get d_revalidate called on all our dentries. We also are a bit more precise about the errors we can return from verification. If the target of a dentry link changes then we return -ESTALE rather than silently performing the caller's operation on another inode. Signed-off-by: Zach Brown <zab@versity.com>	2022-10-27 14:32:06 -07:00
Zach Brown	61d86f7718	Add scoutfs_lock_ino_refresh_gen Add a lock call to get the current refresh_gen of a held lock. If the lock doesn't exist or isn't readable then we return 0. This can be used to track lock coverage of structures without the overhead and lifetime binding of the lock coverage struct. Signed-off-by: Zach Brown <zab@versity.com>	2022-10-27 14:16:07 -07:00
Zach Brown	717b56698a	Remove __exit from scoutfs_sysfs_exit() scoutfs_sysfs_exit() is called during error handling in module init. When scoutfs is built-in (so, never.) the __exit section won't be loaded. Remove the __exit annotation so it's always available to be called. Signed-off-by: Zach Brown <zab@versity.com>	2022-10-26 16:42:27 -07:00
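
The move_blocks fix in commit a23e7478a0 describes clamping the blocks to move in terms of byte offsets before the loop starts, plus a sanity check against zero-length extents. The following is a minimal Python sketch of that idea only, not the scoutfs kernel code: the function names, the `(start_block, block_count)` extent representation, and the half-open block range are all assumptions made for illustration. Only the 4K data block size comes from the commit message.

```python
from typing import List, Tuple

BLOCK_SHIFT = 12  # 4K data blocks, per the commit message


def clamp_block_range(i_size: int, byte_off: int, byte_len: int) -> Tuple[int, int]:
    """Hypothetical helper: turn a byte range into a half-open block range
    [first, end_blk) that only covers blocks holding bytes inside i_size."""
    end = min(byte_off + byte_len, i_size)
    if byte_len <= 0 or byte_off >= end:
        return (0, 0)  # nothing inside i_size to move
    first = byte_off >> BLOCK_SHIFT
    last = (end - 1) >> BLOCK_SHIFT  # block holding the final byte
    return (first, last + 1)


def move_extents(extents: List[Tuple[int, int]], i_size: int,
                 byte_off: int, byte_len: int) -> List[Tuple[int, int]]:
    """Hypothetical loop: extents are (start_block, block_count) pairs.
    The range is clamped once, up front, so the loop body works purely in
    block units and never mixes bytes with blocks."""
    first, end_blk = clamp_block_range(i_size, byte_off, byte_len)
    moved = []
    for start, length in extents:
        if length <= 0:
            # sanity check: an impossible extent must not stall the loop
            raise ValueError("zero length extent")
        lo = max(start, first)
        hi = min(start + length, end_blk)
        if lo < hi:
            moved.append((lo, hi - lo))
    return moved
```

With an i_size of 5000 bytes and a much larger byte length, the final extent that straddles i_size is trimmed to the single block that still contains file data, rather than producing a zero-block move.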


1863 Commits