scoutfs

mirror of https://github.com/versity/scoutfs.git synced 2026-04-21 22:10:30 +00:00

Author	SHA1	Message	Date
Mark Fasheh	72a8e9e171	scoutfs: pull in some of ocfs2 stackglue Dlmglue is built on top of this. Bring in the portions we need which includes the stackglue API as well as most of the fs/dlm implementation. I left off the Ocfs2 specific version and connection handling. Also left out is the old Ocfs2 dlm support which we'll never want. Like dlmglue, we keep as much of the generic stackglue code intact here. This will make translating to/from upstream patches much easier. Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-08-23 21:40:20 -05:00
Mark Fasheh	960f8e08bb	scoutfs: copy in DLM_LVB_LEN from fs/ocfs2/dlm/dlmapi.h Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-08-23 19:06:18 -05:00
Mark Fasheh	114760365c	scoutfs: fix up ocfs2_log_dlm_error() We're still referencing some ocfs2 specific lock names here, take them out. Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-08-23 19:00:05 -05:00
Mark Fasheh	61499c5d30	scoutfs: pull in struct ocfs2_dlm_debug from fs/ocfs2/ocfs2.h We need this for the dlmglue global context. Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-08-23 18:52:49 -05:00
Mark Fasheh	1b59ed99fb	scoutfs: remove ocfs2_lock_res->l_type We don't need it - this is the only ocfs2-ism in struct ocfs2_lock_res. Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-08-23 18:51:38 -05:00
Mark Fasheh	bb100356d9	scoutfs: pull in some fields from ocfs2_super for dlmglue This is all the dlmglue global context needed. Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-08-23 18:37:06 -05:00
Mark Fasheh	1831014c24	scoutfs: remove usage of ocfs2_lock_type_string() This only leaked into the bast function. I retained the debug print - it'll be turned off in our build anyway, and that's what we'd want to do upstream. Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-08-23 18:16:00 -05:00
Mark Fasheh	13963d22e3	scoutfs: pull in OCFS2_LOCK_ID_MAX_LEN We need this for the lockres name. It also turns out to be the only thing we need from fs/ocfs2/ocfs2_lockid.h. Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-08-23 18:12:54 -05:00
Mark Fasheh	9bfb9c059d	scoutfs: copy struct ocfs2_lock_res Grab this from fs/ocfs2/ocfs2.h and put it in dlmglue.h. Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-08-23 18:07:52 -05:00
Mark Fasheh	99d00a5a2f	scoutfs: dlmglue needs to #include "dlmglue.h" Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-08-23 18:07:31 -05:00
Mark Fasheh	2142648906	scoutfs: include linux/dlm.h dlmglue needs this as we're no longer hooking it into the stackglue component. Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-08-23 17:59:10 -05:00
Mark Fasheh	498a2f3721	scoutfs: ifdef out usage of OCFS2_LOCK_TYPE_DENTRY Some of this leaks through even after the big #ifdef'ing - ocfs2 had to special case printing the name of dentry locks. We don't have such a need so it's easy to drop those calls. Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-08-23 17:57:34 -05:00
Mark Fasheh	bf6020c22b	scoutfs: hide lockdep_keys in dlmglue for now This belongs behind #ifdef CONFIG_DEBUG_LOCK_ALLOC in the upstream code too. Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-08-23 17:47:50 -05:00
Mark Fasheh	d4a89a5fbc	scoutfs: dlmglue ifdef out ocfs2_build_lock_name() This was missed in the initial #ifdef patch. Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-08-23 17:46:55 -05:00
Mark Fasheh	500baca533	scoutfs: wrap some mlog calls in dlmglue Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-08-23 17:15:23 -05:00
Mark Fasheh	eae932e0fe	scoutfs: dlmglue fix sched.h header Upstream moved linux/sched.h to linux/sched/signal.h. CentOS still uses the old header location. Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-08-23 16:05:34 -05:00
Mark Fasheh	bc2fef7fc8	scoutfs: ifdef out ocfs2 specific callbacks and functions We only want the generic stuff. Long term the Ocfs2 specific code would be what's left in fs/ocfs2/dlmglue.[ch]. Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-08-23 16:05:24 -05:00
Mark Fasheh	fc21a0253c	scoutfs: Hook dlmglue into our build system Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-08-23 15:54:08 -05:00
Mark Fasheh	f7e3f6f9e6	scoutfs: import fs/ocfs2/dlmglue.[ch] from Linux v4.13-rc6 Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-08-22 19:07:53 -05:00
Mark Fasheh	021404bb6a	scoutfs: remove inode ctime index Like the mtime index, this index is unused. Removing it is a near identical task. Running the same createmany test from our last patch gives us the following: $ createmany -o '/scoutfs/file_%lu' 10000000 total: 10000000 creates in 598.28 seconds: 16714.59 creates/second real 9m58.292s user 0m7.420s sys 5m44.632s So after both indices are gone, we go from a 12m56 run time to 9m58s, saving almost 3 minutes which translates into a total performance increase of about 23%. Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-08-22 15:59:13 -07:00
Mark Fasheh	d59367262d	scoutfs: remove inode mtime index This index is unused - we can gain some create performance by removing it. To verify this, I ran createmany for 10 million files: $ createmany -o '/scoutfs/file_%lu' 10000000 Before this patch: total: 10000000 creates in 776.54 seconds: 12877.56 creates/second real 12m56.557s user 0m7.861s sys 6m56.986s After this patch: total: 10000000 creates in 691.92 seconds: 14452.46 creates/second real 11m31.936s user 0m7.785s sys 6m19.328s So removing the index gained us about a minute and a half on the test or a 12% performance increase. Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-08-22 15:59:13 -07:00
Zach Brown	8135b18c76	scoutfs: start truncate from first block Truncation updates extents that intersect with the input range. It starts with the first block in the range and iterates until it has searched for all the extents that could cover the range. Extents are stored in items at their final block location so that we can use _next to find intersections. Truncation was searching for the next extent after the full extent that it was still searching for. That means it was starting the search at the last block in the extent, not the first. It would miss all the extents that didn't overlap with the last block it was searching for. This is fixed by searching from a temporary single block extent at the start of the search range. Signed-off-by: Zach Brown <zab@versity.com>	2017-08-17 15:29:08 -07:00
Mark Fasheh	d1ae486d83	scoutfs: provide ->llseek Without this we return -ESPIPE when a process tries to seek on a regular file. Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-08-14 19:57:13 -07:00
Zach Brown	07bbc418c3	scoutfs: merge offline extents Offline extents weren't being merged because they all had their physical blkno set to 0 and all the extent calculations didn't treat them specially. They would only merge if the physical blocks of two extents were contiguous. Instead of special casing offline extents everywhere we store them with a physical blkno set to the logical blk_off. This lets all the current extent calculations work as expected. Signed-off-by: Zach Brown <zab@versity.com>	2017-08-14 09:19:03 -07:00
Zach Brown	7cc09761f5	scoutfs: release item cleanup needs transaction Release tries to re-instate extents if it sees an error during release. Those item manipulations need to be covered by the transaction. Signed-off-by: Zach Brown <zab@versity.com>	2017-08-14 09:19:03 -07:00
Zach Brown	c7ad9fe772	scoutfs: make release block granular The existing release interface specified byte regions to release but that didn't match what the underlying file data mapping structure is capable of. What happens if you specify a single byte to release? Does it release the whole block? Does it release nothing? Does it return an error? By making the interface match the capability of the operation we make the functioning of the system that much more predictable. Callers are forced to think about implementing their desires in terms of block granular releasing. Signed-off-by: Zach Brown <zab@versity.com>	2017-08-14 09:19:03 -07:00
Zach Brown	87ab27beb1	scoutfs: add statfs network message The ->statfs method was still using the super_block in the super_info that was read during mount. This will get progressively more out of date. We add a network message to ask the server for the current fields that impact statfs. This is always racy and the fields are mostly nonsense, but we try our best. Signed-off-by: Zach Brown <zab@versity.com>	2017-08-11 10:43:35 -07:00
Zach Brown	ba7bde30fc	scoutfs: delete inode index items Delete inode index items when deleting all the items associated with an inode after it's been unlinked and had all its references dropped. The index items should always match the fields in the inode item so we read it to determine the index items that should be deleted, regardless of if we have the vfs inode cached or not. We take the opportunity to collapse the two callers of item deletion which looked up the inode into item deletion so that it can use the inode fields. The deletion of index items is partially verified by an inode index test in xfstests which makes sure that unlinked files are no longer present in the index. Signed-off-by: Zach Brown <zab@versity.com>	2017-08-11 10:13:56 -07:00
Zach Brown	3768e3c41c	scoutfs: don't add dirs to data_seq index Directories were getting added to the data_seq index. It might have looked like they weren't because their data_seqs were always 0 but when inodes are created they don't have 'have_item' set so all the fields are added regardless of their current value. We'd rather not have to wade through directories when looking for regular file data in the data_seq index so let's explicitly test for regular files when updating the data_seq index items. Signed-off-by: Zach Brown <zab@versity.com>	2017-08-11 10:13:56 -07:00
Zach Brown	1398b2316d	scoutfs: clean up racy inode index updates The updating of the inode index items was racy. It loaded the inode values, updated the items, loaded the fields again, and then stored the fields in the inode info. All without locking. Concurrent attempts could get the fields scrambled and racing with other paths that update the fields could get the items and inode info out of sync. This fixes up the two races by only reading the inode fields once and performing the multi-stage update under a mutex. We add a new lock to avoid ordering problems with trying to add an existing lock at these points in the locking hierarchy. We specifically use a mutex because the item functions can block. Now the inode index field update just has to safely race with concurrent access to the fields. This was found by generic/037 once getattr started refreshing the inode. It now passes again. Signed-off-by: Zach Brown <zab@versity.com>	2017-08-11 10:07:42 -07:00
Zach Brown	cdb58a967a	scoutfs: give module fs scoutfs alias Use MODULE_ALIAS_FS() to register the "scoutfs" fs alias so that modprobe can find the module if it's installed and visible to depmod. We don't yet have clever enough xfstests to mess around with modules. I manually verified this by installing the module in /lib/modules and trying mount -t scoutfs before and after the change. Signed-off-by: Zach Brown <zab@versity.com>	2017-08-10 18:07:26 -07:00
Zach Brown	c1b2ad9421	scoutfs: separate client and server net processing The networking code was really suffering by trying to combine the client and server processing paths into one file. The code can be a lot simpler by giving the client and server their own processing paths that take their different socket lifecycles into account. The client maintains a single connection. Blocked senders work on the socket under a sending mutex. The recv path runs in work that can be canceled after first shutting down the socket. A long running server work function acquires the listener lock, manages the listening socket, and accepts new sockets. Each accepted socket has a single recv work blocked waiting for requests. That then spawns concurrent processing work which sends replies under a sending mutex. All of this is torn down by shutting down sockets and canceling work which frees its context. All this restructuring makes it a lot easier to track what is happening in mount and unmount between the client and server. This fixes bugs where unmount was failing because the monolithic socket shutdown function was queueing other work while running while draining. Signed-off-by: Zach Brown <zab@versity.com>	2017-08-04 10:47:42 -07:00
Zach Brown	74a80b772e	scoutfs: add endian_swap.h Add a helper header for conversions between little and big endian. Signed-off-by: Zach Brown <zab@versity.com>	2017-08-04 10:44:06 -07:00
Zach Brown	b98f97e143	scoutfs: use hlist hash for data cursors The rhashtable API has changed over time. Continuing to use it means having to worry about maintaining different APIs in different kernel generations. We have a static pool of cursors so we don't need the flexibility of the resizable rhashtable. We can roll a simple array of hlist heads to use as a hash table. And finally, these cursors will probably disappear eventually anyway. Let's not invest too much in them. Signed-off-by: Zach Brown <zab@versity.com>	2017-08-04 10:44:06 -07:00
Zach Brown	9f4095bffb	scoutfs: break the build if we export raw types Raw [su]{8,16,32,64} types keep leaking into our exported headers where they break userspace builds. Make sure that we only use the exported __ types and add a check to break our build if we get it wrong. Signed-off-by: Zach Brown <zab@versity.com>	2017-08-04 10:37:49 -07:00
Zach Brown	cefe06af61	scoutfs: add git describe to built module It's handy to quickly find the git commit that built a given module. We add a MOD_INFO() tag for it so we can see it in modinfo on the built module. We add an ELF note that the kernel tracks in /sys/module/$m/notes/ when the module is loaded. Signed-off-by: Zach Brown <zab@versity.com>	2017-08-03 15:07:23 -07:00
Zach Brown	6d16034112	scoutfs: remove old dlm make -I We don't need arguments for a dlm build. Signed-off-by: Zach Brown <zab@versity.com>	2017-08-03 15:07:23 -07:00
Zach Brown	65c3ac5043	scoutfs: Add cluster locking to node/file ops This gives us cluster locking for the overwhelming majority of metadata ops that scoutfs supports. In particular, we can create and modify file metadata from one node and immediately see the changes reflected on another node. In addition to synchronization the cluster locks here are providing an I/O endpoint for our item cache, ensuring that it doesn't read stale items. Readdir and file read/write are notable exceptions - they require a more specific approach and will be implemented in a future patch. Signed-off-by: Mark Fasheh <mfasheh@versity.com> [fixed iget unlock and truncated commit message summary] Signed-off-by: Zach Brown <zab@versity.com>	2017-08-03 11:16:35 -07:00
Zach Brown	172cff5537	scoutfs: return -ENODATA from getxattr The conversion to the multi-item xattrs accidentally returned -EIO when an attribute wasn't found instead of -ENODATA. That broke a huge number of xfstests because ls can look up xattrs and return EIO. Signed-off-by: Zach Brown <zab@versity.com>	2017-08-02 11:16:12 -07:00
Mark Fasheh	325eadca9f	scoutfs: check for NULL lock in scoutfs_unlock This reduces the amount of duplicate code in callers and makes error handling easier. The alternative is to sprinkle the code with 'if (lock)' lines at the end of our functions. Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-07-27 12:33:21 -07:00
Mark Fasheh	4ff2148f10	scoutfs: Don't use stale root in get_manifest_refs get_manifest_refs was using the btree root in its stale copy of the super block. It is supposed to use the btree root that it was given by its caller who went to the trouble of finding a sufficiently current btree root. Signed-off-by: Mark Fasheh <mfasheh@versity.com> [zab: added commit message and fixed formatting] Signed-off-by: Zach Brown <zab@versity.com>	2017-07-27 12:32:05 -07:00
Mark Fasheh	a65b28d440	scoutfs: lock impossible ino group for listen lock Otherwise we get into a problem where the listen lock is conflicting with regular inode group requests. Since we never drop the listen lock and it (by design) blocks progress on another node, those inode group requests may hang. Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-07-19 19:04:41 -05:00
Mark Fasheh	2d11f08f5e	scoutfs: Remove unused functions, scoutfs_[un]lock_addr Signed-off-by: Mark Fasheh <mfasheh@versity.com>	2017-07-19 19:04:41 -05:00
Zach Brown	13ebd8d18c	scoutfs: don't use delayed downconvert work The delayed downconvert work wasn't being canceled on shutdown. 60s after unmount at least the net lock's timer would fire and crash trying to queue the delayed work on the destroyed workqueue. Proactively unlocking the locks isn't always beneficial to begin with. The relative costs of mispredicting the future are wildly different if we have to re-read item caches from segments or have to downconvert a blocking read lock. So we can just remove the delayed work to fix the bug and remove a moving piece that would need to be considered and tuned. There's still a race where we can get basts after destroying the workqueue but before we destroy the lockspace, we'll get there. Signed-off-by: Zach Brown <zab@versity.com>	2017-07-19 13:30:03 -07:00
Zach Brown	47b26d7888	scoutfs: add end to _item_delete Add the end argument to scoutfs_item_delete() to limit how many items it will read into the cache. Signed-off-by: Zach Brown <zab@versity.com>	2017-07-19 13:30:03 -07:00
Zach Brown	d5b4677e7f	scoutfs: add end to _dirty, _delete_many, _update These transformations are mechanical and there aren't many callers of these so we combine them into one commit. Signed-off-by: Zach Brown <zab@versity.com>	2017-07-19 13:30:03 -07:00
Zach Brown	d78ed098a7	scoutfs: add cache reading limit to _set_batch Add an end argument to _set_batch to specify the limit of items we'll read into the cache. And it turns out that the loop in _set_batch that meant to cache all the items covered by the batch didn't try hard enough. It would stop once the first key was covered but didn't make sure that the coverage extended to cover last. This can happen if segment boundaries happen to fall within the items that make up the batch. Fix it up while we're in here. Signed-off-by: Zach Brown <zab@versity.com>	2017-07-19 13:30:03 -07:00
Zach Brown	0b64a4c83f	scoutfs: lock inode index item iteration Add locks around inode index item iteration. This is tricky because the inode index items are enormous and we can't default to coarse locks that let it read and iterate over the entire key space. We use the manifest to find the next small fixed size region to lock and iterate from. Signed-off-by: Zach Brown <zab@versity.com>	2017-07-19 13:30:03 -07:00
Zach Brown	f611c769e2	scoutfs: add 'end' to item_next to limit reads Add an end key to the item_next calls to limit how many items will be read into the cache. Callers typically get this from the lock they hold that covers the iteration. We differentiate between iteration and caching so that a series of small iterations (listxattr on inodes, namespace walk in small dirs) can be satisfied by a single read of adjacent items from segments. Signed-off-by: Zach Brown <zab@versity.com>	2017-07-19 13:30:03 -07:00
Zach Brown	4f6f842efa	scoutfs: add inode index item locking Add a locking wrapper for the inode index items. It maps the index fields to a lock name for each index type. Signed-off-by: Zach Brown <zab@versity.com>	2017-07-19 13:30:03 -07:00
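
The two index-removal commits above (d59367262d and 021404bb6a) quote createmany timings and percentage gains. As a rough cross-check, the sketch below recomputes those percentages from the elapsed times given in the commit messages. The helper names are made up for illustration. It suggests that the "12%" figure corresponds to the throughput gain (creates/second), while the "about 23%" figure corresponds to the reduction in run time; the throughput gain for that same pair of runs works out closer to 30%.

```python
# Elapsed times (seconds) copied from the createmany runs quoted in
# commits d59367262d and 021404bb6a; 10,000,000 creates per run.
CREATES = 10_000_000
ELAPSED_SECONDS = {
    "mtime and ctime indexes present": 776.54,
    "mtime index removed": 691.92,
    "mtime and ctime indexes removed": 598.28,
}


def throughput_gain(before_s, after_s):
    """Fractional increase in creates/second when run time drops."""
    return before_s / after_s - 1.0


def runtime_reduction(before_s, after_s):
    """Fractional decrease in wall-clock run time."""
    return 1.0 - after_s / before_s


baseline = ELAPSED_SECONDS["mtime and ctime indexes present"]
for label, seconds in ELAPSED_SECONDS.items():
    print(
        f"{label}: {CREATES / seconds:.2f} creates/s, "
        f"throughput gain {throughput_gain(baseline, seconds):.1%}, "
        f"run time reduction {runtime_reduction(baseline, seconds):.1%}"
    )
```

Both ways of reporting are valid; the point is only that the two commit messages appear to use different bases, so the figures are not directly comparable.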


384 Commits
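
Commit 8135b18c76 in the log above describes a truncate bug: extents are keyed by their final block, a `_next` lookup returns the first item at or after a search key, and the search was being started from the last block of the range instead of the first. The following is a toy model of that description, not scoutfs code. The `(start, length)` tuples, `next_extent()`, and `intersecting()` are all hypothetical names chosen for illustration, and the extent list is assumed to be non-overlapping and sorted.

```python
from bisect import bisect_left


def last_block(extent):
    """Final block covered by a (start, length) extent."""
    start, length = extent
    return start + length - 1


def next_extent(extents, key):
    """Model of a _next lookup: first extent whose final block is >= key."""
    keys = [last_block(e) for e in extents]
    i = bisect_left(keys, key)
    return extents[i] if i < len(extents) else None


def intersecting(extents, first, last, search_from_start):
    """Collect extents that intersect blocks first..last.

    search_from_start=False models the described bug (the key is the last
    block of the range); True models the described fix (the key is a single
    block at the current start of the search range).
    """
    found = []
    pos = first
    while pos <= last:
        key = pos if search_from_start else last
        ext = next_extent(extents, key)
        if ext is None or ext[0] > last:
            break
        found.append(ext)
        pos = last_block(ext) + 1  # continue after the extent just found
    return found


extents = [(0, 4), (4, 4), (8, 4)]  # final blocks 3, 7 and 11
print(intersecting(extents, 2, 9, search_from_start=False))
print(intersecting(extents, 2, 9, search_from_start=True))
```

In this model, searching from the last block only finds the extent that covers block 9, while searching from the first block walks every extent that overlaps the range, which matches the behavior the commit message describes.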