Now that we support mmap writes, we can page fault and take the lock for
writes at any point in time. That means that - just like readdir - we can
no longer hold the lock across copy_to_user(), since it may also page fault
and thus deadlock.
We place a fixed array of 32 extent entries on the stack and use it to
shuffle out fiemap entries a batch at a time, taking the lock while
collecting and dropping it around fiemap_fill_next_extent().
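A minimal sketch of the batching pattern, with hypothetical locking and
collection helpers and a simplified stack entry struct; not the actual
scoutfs code:

    #define FIEMAP_STACK_ENTRIES 32

    struct fie_ent {
            u64 logical;
            u64 physical;
            u64 length;
            u32 flags;
    };

    static int fill_fiemap_batched(struct inode *inode,
                                   struct fiemap_extent_info *fieinfo,
                                   u64 start, u64 len)
    {
            struct fie_ent ents[FIEMAP_STACK_ENTRIES];
            u64 end = start + len;
            int nr, i, ret;

            for (;;) {
                    lock_extents(inode);            /* hypothetical */
                    nr = collect_extents(inode, start, end, ents,
                                         FIEMAP_STACK_ENTRIES);
                    unlock_extents(inode);          /* drop before copying out */
                    if (nr <= 0)
                            return nr;

                    for (i = 0; i < nr; i++) {
                            /* may page fault; safe now that the lock is dropped */
                            ret = fiemap_fill_next_extent(fieinfo,
                                                          ents[i].logical,
                                                          ents[i].physical,
                                                          ents[i].length,
                                                          ents[i].flags);
                            if (ret)                /* 1 == full, < 0 == error */
                                    return ret < 0 ? ret : 0;
                    }

                    start = ents[nr - 1].logical + ents[nr - 1].length;
            }
    }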
Signed-off-by: Auke Kok <auke.kok@versity.com>
We merely trace exit values and position, and ignore length.
Because vm_fault_t is __bitwise, sparse will loudly complain about
a plain cast to u32, so we need a __force cast (on el8). ret will be 512 in
normal cases.
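For example, in the tracepoint assignment (illustrative line, not the exact
code):

    __entry->ret = (__force u32)ret;   /* __force quiets sparse's __bitwise check */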
Signed-off-by: Auke Kok <auke.kok@versity.com>
Add support for writable MAP_SHARED mmap()ings. Avoid issues with late
writepage()s building transactions by doing the block_write_begin() work in
scoutfs_data_page_mkwrite(). Ensure the page is marked dirty and prepared
for write, then let the VM complete the write when the page is flushed or
invalidated.
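Roughly the shape of the mkwrite path, as a hedged sketch that assumes a
scoutfs_get_block() helper and leaves out the transaction and cluster lock
details:

    static vm_fault_t page_mkwrite_sketch(struct vm_fault *vmf)
    {
            struct page *page = vmf->page;
            struct inode *inode = file_inode(vmf->vma->vm_file);
            loff_t size, off;
            unsigned int len;
            int ret;

            sb_start_pagefault(inode->i_sb);
            lock_page(page);

            size = i_size_read(inode);
            off = page_offset(page);
            if (page->mapping != inode->i_mapping || off >= size) {
                    ret = -EFAULT;
                    goto out;
            }
            len = min_t(loff_t, PAGE_SIZE, size - off);

            /* block_write_begin()-style setup, done here instead of at writepage time */
            ret = __block_write_begin(page, off, len, scoutfs_get_block);
            if (ret == 0)
                    set_page_dirty(page);
    out:
            if (ret) {
                    unlock_page(page);
                    sb_end_pagefault(inode->i_sb);
                    return vmf_error(ret);
            }
            wait_for_stable_page(page);
            sb_end_pagefault(inode->i_sb);
            return VM_FAULT_LOCKED;
    }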
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
Signed-off-by: Auke Kok <auke.kok@versity.com>
As of v5.17-rc4-53-g3a3bae50af5d, we can no longer leave this method
unhooked, as the mm caller now calls it unconditionally. All in-kernel
filesystems were fixed in that change.
aops->invalidatepage was the old aops method that would free pages
with attached private data. It has been replaced by the new
invalidate_folio method. If the method is NULL, the memory becomes
orphaned. (v5.17-rc4-29-gf50015a596fa)
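A minimal sketch of hooking the new method, assuming buffer-head backed
pages so block_invalidate_folio() can free the private data; the real hook
depends on what we attach to the folio:

    static void invalidate_folio_sketch(struct folio *folio, size_t offset,
                                        size_t len)
    {
            /* free private data so the folio's memory isn't orphaned */
            block_invalidate_folio(folio, offset, len);
    }

    static const struct address_space_operations aops_sketch = {
            /* ... other methods ... */
            .invalidate_folio       = invalidate_folio_sketch,
    };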
Signed-off-by: Auke Kok <auke.kok@versity.com>
We use check_add_overflow(a, b, d) here to validate that (off, len)
pairs do not exceed the maximum value of the type. The kernel conveniently
has several macros that sort out the problems with signed and unsigned types.
However, we're not interested in purely seeing whether (a + b)
overflows: we're using this for (off, len) overflow checks, where the
bytes we read cover off through off + len - 1. We must therefore call
the check with (b) being "len - 1".
I've made sure that we don't accidentally fail when (len == 0) by
checking that condition beforehand in all cases, moving code around as
needed so that (len > 0) holds wherever we perform the check.
The check_add_overflow macro requires a (d) argument in which the result
of the addition is stored; the macro itself returns whether an overflow
occurred. We put a `tmp` variable of the correct type on the stack as
needed to make the checks work.
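A sketch of the resulting check, with illustrative names and assuming
len > 0 has already been guaranteed:

    static bool off_len_wraps(u64 off, u64 len)
    {
            u64 tmp;

            /* the bytes touched are off .. off + len - 1, so add len - 1 */
            return check_add_overflow(off, len - 1, &tmp);
    }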
The simple-release-extents test mistakenly relied on this buggy wrap code,
so it needs fixing. The move-blocks test also got it wrong.
Signed-off-by: Auke Kok <auke.kok@versity.com>
We consistently enter scoutfs_data_wait_check() with len == 0 from
scoutfs_aio_write(), which passes the i_size_read() value directly, and
any `echo >> $FILE` append reaches this path. This makes the wrapping
check fail, since `0 + (0 - 1)` wraps, triggering the WARN_ON_ONCE wrap
check, which also needs updating to allow certain operations on huge files.
More importantly, we can just omit all these checks when `len == 0` anyway,
since they should always succeed and should never require taking the
locks.
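The check paths now look roughly like this (names and error code are
illustrative):

    if (len == 0)
            return 0;       /* nothing to wait for; skip the locks and checks */

    if (WARN_ON_ONCE(check_add_overflow(off, len - 1, &tmp)))
            return -EINVAL;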
Signed-off-by: Auke Kok <auke.kok@versity.com>
Folios are the new data type used for passing pages. For now, the folios
we see only contain a single page. Future kernels will change that.
Signed-off-by: Auke Kok <auke.kok@versity.com>
The iter-based read/write calls can support splice on el9 if we hook up
these calls; otherwise splice stops working.
->write() is hooked up similarly to v3.15-rc4-330-g8d0207652cbe; ->read()
uses the generic implementation.
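The hookup amounts to roughly this in the file_operations; the iter method
names stand in for our existing implementations:

    .read_iter      = scoutfs_file_read_iter,       /* existing iter methods */
    .write_iter     = scoutfs_file_write_iter,
    .splice_read    = generic_file_splice_read,
    .splice_write   = iter_file_splice_write,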
Signed-off-by: Auke Kok <auke.kok@versity.com>
Add a bit to the private scoutfs inode flags which indicates that the
inode is in retention mode. The bit is visible through the _attr_x
interface. It can only be set on regular files and when set it prevents
modification to all but non-user xattrs. It can be cleared by root.
Signed-off-by: Zach Brown <zab@versity.com>
Initially setattr_more followed the general pattern where extent
manipulation might require multiple transactions if there are lots of
extent items to work with. The scoutfs_data_init_offline_extent()
function that creates an offline extent handled transactions itself.
But in this case the call only supports adding a single offline extent.
It will always use a small fixed amount of metadata and could be
combined with other metadata changes in one atomic transaction.
This changes scoutfs_data_init_offline_extent() to have the caller
handle transactions, inode updates, etc. This lets the caller perform
all the restore changes in one transaction. This interface change will
then be used as we add another caller that adds a single offline extent
in the same way.
Signed-off-by: Zach Brown <zab@versity.com>
scoutfs_data_wait_check_iter() was checking the contiguous region of the
file starting at its pos and extending for iov_iter_count() bytes. The
caller can do that with the previous _data_wait_check() method by
providing the same count that _check_iter() was using.
Signed-off-by: Zach Brown <zab@versity.com>
The aio_read and aio_write callbacks are no longer used by newer kernels,
which now use iter-based readers and writers.
We can avoid implementing plain .read and .write since an iter will be
generated for us automatically when needed.
We add a new data_wait_check_iter() function accordingly.
With these methods removed from the kernel, the el8 build no longer uses
the extended ops wrapper struct and is much closer to upstream. A lot of
methods move between inode_dir_operations and inode_file_operations etc.,
and things should look a bit more structured as a result.
Accordingly, we need a slightly different data_wait_check() that accounts
for the iter and offset properly.
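The iter variant is essentially a thin wrapper over the byte-count check,
sketched here with assumed signatures:

    static int data_wait_check_iter_sketch(struct inode *inode, loff_t pos,
                                           struct iov_iter *iter, int op)
    {
            /* check the contiguous region the iter will touch from pos */
            return scoutfs_data_wait_check(inode, pos, iov_iter_count(iter), op);
    }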
Signed-off-by: Auke Kok <auke.kok@versity.com>
.readpages is obsolete in el8 kernels. We implement the .readahead
method instead, which is passed a struct readahead_control. We use
the readahead_page(rac) accessor to retrieve pages one at a time from
the struct.
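The shape of the new method, sketched with a hypothetical per-page read
helper; readahead_page() hands back locked pages with a reference that we
must drop:

    static void readahead_sketch(struct readahead_control *rac)
    {
            struct page *page;

            while ((page = readahead_page(rac))) {
                    read_one_page(page);    /* hypothetical; unlocks the page */
                    put_page(page);
            }
    }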
Signed-off-by: Auke Kok <auke.kok@versity.com>
Since v4.6-rc3-27-g9902af79c01a, inode->i_mutex has been replaced
with ->i_rwsem. However, inode_lock() and related helpers have long
worked as intended and provide fully exclusive locking of the inode.
To avoid a name clash on pre-rhel8 kernels, we have to rename a
stack variable in `src/file.c`.
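On kernels that predate the helpers, call sites can stay uniform with a
small compat shim along these lines (the guard symbol is illustrative):

    #ifndef HAVE_INODE_LOCK                 /* illustrative compat guard */
    static inline void inode_lock(struct inode *inode)
    {
            mutex_lock(&inode->i_mutex);
    }

    static inline void inode_unlock(struct inode *inode)
    {
            mutex_unlock(&inode->i_mutex);
    }
    #endif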
Signed-off-by: Auke Kok <auke.kok@versity.com>
PAGE_CACHE_SIZE was previously defined to be equivalent to PAGE_SIZE.
This symbol was removed in v4.6-rc1-32-g1fa64f198b9f.
Signed-off-by: Auke Kok <auke.kok@versity.com>
Data preallocation attempts to allocate large aligned regions of
extents. It tried to fill the hole around a write offset that
didn't contain an extent. It missed the case where there can be
multiple extents between the start of the region and the hole.
It could try to overwrite these additional existing extents and writes
could return EINVAL.
We fix this by trimming the preallocation to start at the write offset
if there are any extents in the region before the write offset. The
data preallocation test output has to be updated now that allocation
extents won't grow towards the start of the region when there are
existing extents.
Signed-off-by: Zach Brown <zab@versity.com>
The move_blocks ioctl finds extents to move in the source file by
searching from the starting block offset of the region to move.
Logically, this is fine. After each extent item is deleted the next
search will find the next extent.
The problem is that deleted items still exist in the item cache. The
next iteration has to skip over all the deleted extents from the start
of the region. This is fine with large extents, but with heavily
fragmented extents this creates a huge amplification of the number of
items to traverse when moving the fragmented extents in a large file.
(It's not quite O(n^2)/2 for the total extents, since deleted items are
purged as we write out the dirty items in each transaction, but it's still
immense.)
The fix is to simply start searching for the next extent after the one
we just moved.
Signed-off-by: Zach Brown <zab@versity.com>
If the _contig_only option isn't set then we try to preallocate aligned
regions of files. The initial implementation naively only allowed one
preallocation attempt in each aligned region. If it got a small
allocation that didn't fill the region then every future allocation
in the region would be a single block.
This changes every preallocation in the region to attempt to fill the
hole in the region that iblock fell in. It uses an extra extent search
(item cache search) to try and avoid thousands of single block
allocations.
Signed-off-by: Zach Brown <zab@versity.com>
The move_blocks ioctl intends to only move extents whose bytes fall
inside i_size. This is easy except for a final extent that straddles an
i_size that isn't aligned to 4K data blocks.
The code that checked for an extent being entirely past i_size, and that
limited the number of blocks to move by i_size, clumsily compared i_size
offsets in bytes with extent counts in 4KB blocks. In just the right
circumstances, probably with the help of a byte length to move that is
much larger than i_size, the length calculation could come out as 0 blocks
to move. Once this hit, the loop would keep finding that extent,
calculating 0 blocks to move, and getting stuck.
We fix this by clamping the count of blocks in extents to move in terms
of byte offsets at the start of the loop. This gets rid of the extra
size checks and byte offset use in the loop. We also add a sanity check
to make sure that we can't get stuck if, say, corruption resulted in an
otherwise impossible zero length extent.
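Sketched in the simplest terms, with illustrative names and assuming 4KB
data blocks:

    u64 size_blocks = DIV_ROUND_UP(i_size_read(inode), 4096);

    if (iblock >= size_blocks)
            break;                          /* extent entirely past i_size */
    count = min_t(u64, count, size_blocks - iblock);
    if (WARN_ON_ONCE(count == 0))
            return -EIO;                    /* sanity: never loop moving 0 blocks */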
Signed-off-by: Zach Brown <zab@versity.com>
When we truncate away from a partial block we need to zero its tail that
was past i_size and dirty it so that it's written.
We missed the typical vfs boilerplate of calling block_truncate_page()
from setattr->set_size, which does this. We need to be a little careful
to pass our file lock down to get_block and then queue the inode for
writeback so it's written out with the transaction. This follows the
pattern in .write_end.
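The boilerplate looks roughly like this, with an assumed get_block and
writeback helper; block_truncate_page() zeroes from the given byte offset
to the end of that block and dirties the page:

    ret = block_truncate_page(inode->i_mapping, newsize, scoutfs_get_block);
    if (ret)
            goto out;

    /* make sure the dirtied partial block is written out with the transaction */
    scoutfs_inode_queue_writeback(inode);   /* hypothetical helper */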
Signed-off-by: Zach Brown <zab@versity.com>
Add mount options for the preallocation size and for whether preallocation
should be restricted to extending writes. Disabling the default
restriction to streaming writes lets preallocation fill aligned regions
of the preallocation size when they contain no extents.
Signed-off-by: Zach Brown <zab@versity.com>
We were seeing ABBA deadlocks on the dio_count wait and extent_sem
between fallocate and reads. It turns out that fallocate got lock
ordering wrong.
This brings fallocate in line with the rest of the adherents to the lock
hierarchy. Most importantly, the extent_sem is used after the
dio_count. While we're at it, we bring the i_mutex down to just before
the cluster lock for consistency.
Signed-off-by: Zach Brown <zab@versity.com>
This adds i_version to our inode and maintains it as we allocate, load,
modify, and store inodes. We set the flag in the superblock so
in-kernel users can use i_version to see changes in our inodes.
Signed-off-by: Zach Brown <zab@versity.com>
Changing the file size can change the file contents -- reads will
change when they stop returning data. fallocate can change the file
size, and if it does it should increment the data_version, just like
setattr does.
Signed-off-by: Zach Brown <zab@versity.com>
We're seeing errors during truncate that are surprising. Let's try and
recover from them and provide more info when they happen so that we can
dig deeper.
Signed-off-by: Zach Brown <zab@versity.com>
Returning ENOSPC is challenging because clients work from allocators
which hold only a fraction of the whole, and we use COW transactions,
so we need to be able to allocate in order to free. This adds support for
returning ENOSPC to client posix allocators as free space gets low.
For metadata, we reserve a number of free blocks for making progress
with client and server transactions which can free space. The server
sets the low flag in a client's allocator if we start to dip into
reserved blocks. In the client we add an argument to entering a
transaction which indicates if we're allocating new space (as opposed to
just modifying existing data or freeing). When an allocating
transaction runs low and the server low flag is set then we return
ENOSPC.
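The client-side decision boils down to something like this sketch, with
illustrative names:

    /* while entering a transaction */
    if (allocating && server_set_low_flag(sb) && meta_avail_running_low(sb))
            return -ENOSPC;         /* only new-space allocations see hard ENOSPC */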
Adding an argument to transaction holders and having them return ENOSPC
gave us the opportunity to clean the code up and make it a little clearer.
More work is done outside the wait_event function, and it now
specifically waits for a transaction to cycle when it forces a commit
rather than spinning until the transaction worker acquires the lock and
stops it.
For data the same pattern applies except there are no reserved blocks
and we don't COW data so it's a simple case of returning the hard ENOSPC
when the data allocator flag is set.
The server needs to consider the reserved count when refilling the
client's meta_avail allocator and when swapping between the two
meta_avail and meta_free allocators.
We add the reserved metadata block count to statfs_more so that df can
subtract it from the free meta blocks and make it clear when enospc is
going to be returned for metadata allocations.
We increase the minimum device size in mkfs so that small testing
devices provide sufficient reserved blocks.
And finally we add a little test that makes sure we can fill both
metadata and data to ENOSPC and then recover by deleting what we filled.
Signed-off-by: Zach Brown <zab@versity.com>
When move blocks is staging it requires an overlapping offline extent to
cover the entire region to move.
It performs the stage by modifying one extent at a time. If the source
extents are fragmented it modifies each of them in turn across the
region.
When looking for the extent to match the source extent, it searched from
the iblock of the start of the whole operation, not from the start of the
source extent it's matching. This meant it would find the first extent
it had just modified, which would no longer be offline, and would
return -EINVAL.
The fix is to have it search from the logical start of the extent it's
trying to match, not the start of the region.
Signed-off-by: Zach Brown <zab@versity.com>
Support O_TMPFILE: Create an unlinked file and put it on the orphan list.
If it ever gains a link, take it off the orphan list.
Change MOVE_BLOCKS ioctl to allow moving blocks into offline extent ranges.
Ioctl callers must set a new flag to enable this operation mode.
RH-compat: tmpfile support is actually backported by RH into the 3.10
kernel. We need to use some of their kabi-maintaining wrappers to use it:
use a struct inode_operations_wrapper instead of the base struct
inode_operations, and set the S_IOPS_WRAPPER flag in i_flags. This lets
RH's modified vfs_tmpfile() find our tmpfile fn pointer.
Add a test that tests both creating tmpfiles as well as moving their
contents into a destination file via MOVE_BLOCKS.
xfstests common/004 now runs because tmpfile is supported.
Signed-off-by: Andy Grover <agrover@versity.com>
Add a new, distinguishable return value (ENOBUFS) from the allocator for
when the transaction cannot allocate space. This doesn't mean the
filesystem is full -- opening a new transaction may allow forward progress.
Alter the fallocate and get_blocks code to check for this error value and
retry with a new transaction. Actual ENOSPC can still happen and is still
handled, of course.
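The retry pattern in those callers looks roughly like this, with
hypothetical transaction and counter helpers; ENOBUFS means this
transaction is out of room, not that the filesystem is full:

    retry:
            ret = scoutfs_hold_trans(sb);                   /* hypothetical */
            if (ret)
                    goto out;
            ret = alloc_blocks(inode, iblock, len);         /* hypothetical */
            scoutfs_release_trans(sb);
            if (ret == -ENOBUFS) {
                    scoutfs_inc_counter(sb, alloc_trans_retry);
                    goto retry;             /* the next transaction has fresh space */
            }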
Add counter called "alloc_trans_retry" and increment it from both spots.
Signed-off-by: Andy Grover <agrover@versity.com>
[zab@versity.com: fixed up write_begin error paths]
Add a relatively constrained ioctl that moves extents between regular
files. This is intended to be used by tasks which combine many existing
files into a much larger file without reading and writing all the file
contents.
Signed-off-by: Zach Brown <zab@versity.com>
With many concurrent writers we were seeing excessive forced commits
because the transaction thought the data allocator was running low. The
transaction was checking the raw total_len value in the data_avail
alloc_root for the number of free data blocks. But this read wasn't
locked, and allocators could completely remove a large free extent and
then re-insert a slightly smaller free extent as they perform their
allocation. The transaction could see a temporarily very small total_len
and trigger a commit.
Data allocations are serialized by a heavy mutex so we don't want to
have the reader try and use that to see a consistent total_len. Instead
we create a data allocator run-time struct that has a consistent
total_len that is updated after all the extent items are manipulated.
This also gives us a place to put the caller's cached extent so that it
can be included in the total_len, previously it wasn't included in the
free total that the transaction saw.
The file data allocator can then initialize and use this struct instead
of its raw use of the root and cached extent. Then the transaction can
sample its consistent total_len that reflects the root and cached
extent.
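Conceptually the run-time struct looks something like this; field and type
names are illustrative:

    struct data_alloc_runtime {
            struct mutex mutex;             /* serializes data allocations */
            struct scoutfs_alloc_root root; /* extent allocator root */
            struct extent cached;           /* the caller's cached extent */
            u64 total_len;                  /* root + cached, updated last so
                                               unlocked readers see a sane value */
    };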
A subtle detail is that fallocate can't use _free_data to return an
allocated extent on error to the avail pool. It instead frees into the
data_free pool like normal frees. It doesn't really matter that this
could prematurely drain the avail pool because it's in an error path.
Signed-off-by: Zach Brown <zab@versity.com>
Now that we have full precision extents a writer with i_mutex and a page
lock can be modifying large extent items which cover much of the
surrounding pages in the file. Readers can be in a different page with
only the page lock and try to work with extent items as the writer is
deleting and creating them.
We add a per-inode rwsem which just protects file extent item
manipulation. We try to acquire it as close to the item use as possible
in data.c which is the only place we work with file extent items.
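Usage follows the usual rwsem pattern around the extent item calls, sketched
with illustrative names:

    /* readers, e.g. get_block looking up a mapping */
    down_read(&si->extent_sem);
    ret = scoutfs_lookup_file_extent(inode, iblock, &ext);         /* hypothetical */
    up_read(&si->extent_sem);

    /* writers that delete and re-create extent items */
    down_write(&si->extent_sem);
    ret = scoutfs_modify_file_extents(inode, iblock, count);       /* hypothetical */
    up_write(&si->extent_sem);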
This stops rare read corruption we were seeing where get_block in a
reader was racing with extent item deletion in a stager at a further
offset in the file.
Signed-off-by: Zach Brown <zab@versity.com>
Previously we'd avoided full extents in file data mapping items because
we were deleting items from forest btrees directly. That created
deletion items for every version of file extents as they were modified.
Now we have the item cache which can remove deleted items from memory
when deletion items aren't necessary.
By layering file data extents on an extent layer, we can also transition
allocators to use extents and fix a lot of problems in the radix block
allocator.
Most of this change is churn from changing allocator function and struct
names.
File data extents no longer have to manage loading and storing from and
to packed extent items at a fixed granularity. All those loops are torn
out and data operations now call the extent layer with their callbacks
instead of calling its packed item extent functions. This now means
that fallocate and especially restoring offline extents can use larger
extents. Small file block allocation now comes from a cached extent
which reduces item calls for small file data streaming writes.
The big change in the server is to use more root structures to manage
recursive modification instead of relying on the allocator to notice and
do the right thing. The radix allocator tried to notice when it was
actively operating on a root that it was also using to allocate and free
metadata blocks. This resulted in a lot of bugs. Instead we now double
buffer the server's avail and freed roots so that the server fills and
drains the stable roots from the previous transaction. We also double
buffer the core fs metadata avail root so that we can increase the time
to reuse freed metadata blocks.
The server now only moves free extents into client allocators when they
fall below a low threshold. This reduces the shared modification of the
client's allocator roots which requires cold block reads on both the
client and server.
Signed-off-by: Zach Brown <zab@versity.com>
Use the new item cache for all the item work in the fs instead of
calling into the forest of btrees. Most of this is mechanical
conversion from the _forest calls to the _item calls. The item cache
no longer supports the kvec argument for describing values so all the
callers pass in the value pointer and length directly.
The item cache doesn't support saving items as they're deleted and later
restoring them from an error unwinding path. There were only two users
of this. Directory entries can easily guarantee that deletion won't
fail by dirtying the items first in the item cache. Xattr updates were
a little trickier. They can combine dirtying, creating, updating, and
deleting to atomically switch between items that describe different
versions of a multi-item value. This also fixed a bug in the srch
xattrs where replacing an xattr would create a new id for the xattr and
leave existing srch items referencing a now deleted id. Replacing now
reuses the old id.
And finally we add back in the locking and transaction item cache
integration.
Signed-off-by: Zach Brown <zab@versity.com>
Introduce different constants for small and large metadata block
sizes.
The small 4KB size is used for the super block, quorum blocks, and as
the granularity of file data block allocation. The larger 64KB size is
used for the radix, btree, and forest bloom metadata block structures.
The bulk of this is obvious transitions from the old single constant to
the appropriate new constant. But there are a few more involved
changes, though just barely.
The block crc calculation now needs the caller to pass in the size of
the block. The radix function to return free bytes instead returns free
blocks and the caller is responsible for knowing how big its managed
blocks are.
Signed-off-by: Zach Brown <zab@versity.com>
Add support for reporting errors to data waiters via a new
SCOUTFS_IOC_DATA_WAIT_ERR ioctl. This allows waiters to return an error
to readers when staging fails.
Signed-off-by: Benjamin LaHaise <bcrl@kvack.org>
[zab: renamed to data_wait_err, took ino arg]
Signed-off-by: Zach Brown <zab@versity.com>
File data allocations come from radix allocators which are populated by
the server before each client transaction. It's possible to fully
consume the data allocator within one transaction if the number of dirty
metadata blocks is kept low. This could result in premature ENOSPC.
This was happening to the archive-light-cycle test. If the transactions
performed by previous tests lined up just right then the creation of the
initial test files could see ENOSPC and cause all sorts of nonsense in
the rest of the test, culminating in cmp commands stuck in offline
waits.
This introduces high and low data allocator water marks for
transactions. The server tries to fill data allocators for each
transaction to the high water mark and the client forces the commit of a
transaction if its data allocator falls below the low water mark.
The archive-light-cycle test now passes easily and we see the
trans_commit_data_alloc_low counter increasing during the test.
Signed-off-by: Zach Brown <zab@versity.com>
An incorrect warning condition was added as fallocate was implemented.
It tried to warn against trying to read from the staging ioctl. But the
staging boolean is set on the inode when the staging ioctl has the inode
mutex. It protects against writes, but page reading doesn't use the
mutex. It's perfectly acceptable for reads to be attempted while the
staging ioctl is busy; we rely on that so a large read can consume data
as staging writes it.
The warning caused reads to fail while the stager ioctl was working.
Typically this would hit read-ahead and just force sync reads. But it
could hit sync reads and cause EIO.
Signed-off-by: Zach Brown <zab@versity.com>
We miscalculated the length of extents to create when initializing
offline extents for setattr_more. We were clamping the extent length in
each packed extent item by the full size of the offline extent, ignoring
the iblock position that we were starting from.
Signed-off-by: Zach Brown <zab@versity.com>
Don't return -ENOENT from fiemap on a file with no extents. The
operation is supposed to succeed with no extents.
Signed-off-by: Zach Brown <zab@versity.com>