mirror of
https://github.com/versity/scoutfs.git
synced 2026-01-10 13:47:27 +00:00
We were seeing a number of problems coming from races that allowed tasks in a mount to concurrently delete an inode's items. We could see error messages indicating that deletion failed with -ENOENT, we could see users of inodes behave erratically as inodes were deleted from under them, and we could see eventual server errors trying to merge overlapping data extents which were "freed" (added to transaction lists) multiple times.

This commit addresses the problems in one relatively large patch. While we could mechanically split up the fixes, they're all interdependent and splitting them up (bisecting through them) could cause failures that would be devilishly hard to diagnose.

First we stop allowing multiple cached vfs inodes. Multiple cached inodes were initially allowed to avoid deadlocks between lock invalidation and final inode deletion. We add a specific lookup that's used by invalidation which ignores any inodes which are in I_NEW or I_FREEING. Now that iget can wait on inode flags we call iget5_locked before acquiring the cluster lock. This ensures that we can only have one cached vfs inode for a given inode number in evict_inode trying to delete.

Now that we can only have one cached inode, we can rework the omap tracking to use _set and _clear instead of _inc and _put. This isn't strictly necessary but is a simplification and lets us issue warnings if we see that we ever try to set an inode number's bit on behalf of multiple cached inodes. We also add a _test helper.

Orphan scanning would try to perform deletion by instantiating a cached inode and then putting it, triggering eviction and final deletion. This was an attempt to simplify concurrency but ended up causing more problems. It no longer tries to interact with the inode cache at all and attempts to safely delete inode items directly. It uses the omap test to determine that it should skip an already cached inode.

We had attempted to forbid opening inodes by handle if they had an nlink of 0. Since we allowed multiple cached inodes for an inode number this was to prevent adding cached inodes that were being deleted. It was only performing the check on newly allocated inodes, though, so it could get a reference to the cached inode that the scanner had inserted for deleting. We're choosing to keep restricting opening by handle to only linked inodes so we also check existing inodes after they're refreshed.

We're left with a task evicting an inode and the orphan scanner racing to delete an inode's items. We move the work of determining if it's safe to delete out of scoutfs_omap_should_delete() and into try_delete_inode_items() which is called directly from eviction and scanning. This is mostly code motion but we do make three critical changes. We get rid of the goofy concurrent deletion detection in delete_inode_items() and instead use a bit in the lock data to serialize multiple attempts to delete an inode's items. We no longer assume that the inode must still be around because we were called from evict and specifically check that the inode item is still present for deleting. Finally, we use the omap test to discover that we shouldn't delete an inode that is locally cached (and would not be included in the omap response). We do all this under the inode write lock to serialize between mounts.

Signed-off-by: Zach Brown <zab@versity.com>
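The set/test/clear semantics and the three deletion checks described above can be illustrated with a small userspace sketch. This is not the scoutfs implementation: every name prefixed with sketch_ is hypothetical, the fixed-size bitmap stands in for the real omap structures, the "deleting" flag stands in for the bit in the lock data, and there is no real locking, so it only models the decision logic in a single thread.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>
#include <stdio.h>

/* Hypothetical: a tiny fixed inode-number space, one bit per inode. */
#define SKETCH_NR_INOS 1024

static uint64_t sketch_omap[SKETCH_NR_INOS / 64];
static int sketch_warnings;	/* counts double-set attempts */

/* Mark an inode number as locally cached; warn if it already was. */
static int sketch_omap_set(uint64_t ino)
{
	assert(ino < SKETCH_NR_INOS);
	uint64_t mask = 1ULL << (ino % 64);

	if (sketch_omap[ino / 64] & mask) {
		/* With one cached inode per number this should not happen. */
		sketch_warnings++;
		fprintf(stderr, "ino %llu already set\n",
			(unsigned long long)ino);
	}
	sketch_omap[ino / 64] |= mask;
	return 0;
}

/* Report whether an inode number is locally cached. */
static bool sketch_omap_test(uint64_t ino)
{
	assert(ino < SKETCH_NR_INOS);
	return (sketch_omap[ino / 64] >> (ino % 64)) & 1;
}

/* Drop the locally-cached mark for an inode number. */
static void sketch_omap_clear(uint64_t ino)
{
	assert(ino < SKETCH_NR_INOS);
	sketch_omap[ino / 64] &= ~(1ULL << (ino % 64));
}

/*
 * Model of the three checks: only one deleter at a time (the flag
 * stands in for the lock data bit), the inode item must still exist,
 * and the inode must not be locally cached.  Returns true and claims
 * the flag when the caller may go on to delete the items.
 */
static bool sketch_may_delete_items(uint64_t ino, bool item_present,
				    bool *deleting)
{
	if (*deleting || !item_present || sketch_omap_test(ino))
		return false;
	*deleting = true;
	return true;
}
```

In this model a scanner calling sketch_may_delete_items() for an inode whose bit is set is refused, and a second caller is refused for as long as the first still holds the deleting flag.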
24 lines
1014 B
C
#ifndef _SCOUTFS_OMAP_H_
#define _SCOUTFS_OMAP_H_

int scoutfs_omap_set(struct super_block *sb, u64 ino);
bool scoutfs_omap_test(struct super_block *sb, u64 ino);
void scoutfs_omap_clear(struct super_block *sb, u64 ino);
int scoutfs_omap_client_handle_request(struct super_block *sb, u64 id,
				       struct scoutfs_open_ino_map_args *args);
void scoutfs_omap_calc_group_nrs(u64 ino, u64 *group_nr, int *bit_nr);

int scoutfs_omap_add_rid(struct super_block *sb, u64 rid);
int scoutfs_omap_remove_rid(struct super_block *sb, u64 rid);
int scoutfs_omap_finished_recovery(struct super_block *sb);
int scoutfs_omap_server_handle_request(struct super_block *sb, u64 rid, u64 id,
				       struct scoutfs_open_ino_map_args *args);
int scoutfs_omap_server_handle_response(struct super_block *sb, u64 rid,
					struct scoutfs_open_ino_map *resp_map);
void scoutfs_omap_server_shutdown(struct super_block *sb);

int scoutfs_omap_setup(struct super_block *sb);
void scoutfs_omap_destroy(struct super_block *sb);

#endif
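
The signature of scoutfs_omap_calc_group_nrs() suggests that an inode number is split into a group number and a bit index within that group. A minimal sketch of one way such a split could work follows. The group size here (2048 bits, a shift of 11) is an assumption chosen for illustration and is not taken from this header, and the sketch_ names are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: 2048 inode bits per group; the real size may differ. */
#define SKETCH_GROUP_SHIFT 11
#define SKETCH_GROUP_BITS  (1 << SKETCH_GROUP_SHIFT)
#define SKETCH_GROUP_MASK  (SKETCH_GROUP_BITS - 1)

/* Split an inode number into its group and the bit within the group. */
static void sketch_calc_group_nrs(uint64_t ino, uint64_t *group_nr,
				  int *bit_nr)
{
	*group_nr = ino >> SKETCH_GROUP_SHIFT;
	*bit_nr = (int)(ino & SKETCH_GROUP_MASK);
}

/* Inverse mapping: rebuild the inode number from group and bit. */
static uint64_t sketch_group_ino(uint64_t group_nr, int bit_nr)
{
	assert(bit_nr >= 0 && bit_nr < SKETCH_GROUP_BITS);
	return (group_nr << SKETCH_GROUP_SHIFT) | (uint64_t)bit_nr;
}
```

With a power-of-two group size the split is a shift and a mask, and the two functions round-trip any inode number.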