scoutfs

mirror of https://github.com/versity/scoutfs.git synced 2026-01-06 12:06:26 +00:00

Author	SHA1	Message	Date
Auke Kok	8a4b0967cb	Add fiemap output through scoutfs util. There's filefrag already, and that works, but its output is very inconsistent between various OS release versions, and it has already meant that we needed to adjust tests to account for these little but insignificant changes. A lot more work than useful. It's even more changed in el9. This adds `scoutfs get-fiemap FILE` and prints out block extent info with flags that we care about as an abbreviated letter: U for Unwritten, L for Last, and O for Unknown (as in, "offline"). The -P/--physical and -L/--logical options turn off logical or physical offset display, in case you only want to see the offsets in either units. You can pass -b/--byte to display offsets and lengths in byte values. The block size will then be obtained from fstat() of the queried file (4096 for scoutfs). I've removed all uses of filefrag from our scoutfs tests. Xfstests still calls it but their internal diff takes care of that issue. Where needed and appropriate, the tests are adjusted so that the output of `scoutfs get-fiemap` is as close as it can be to what it used to be, so that reading the test results allows the quick view of what might have been going wrong. There are some output strings I have not bothered to update because there's no real value to updating every output string to match, and we just adjust the golden file accordingly. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-10-03 15:38:34 -07:00
Auke Kok	606c519e96	Simple-staging doesn't actually test overflow. This isn't a simple case where we can use u64_region_wraps because length is s32. Let's actually test an overflow case instead of a case that doesn't overflow, though. We still should properly add an overflow test here as well. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-10-03 12:41:05 -07:00
Auke Kok	7d0e7e29f8	Avoid integer wrapping pitfalls for (off, len) pairs. We use check_add_overflow(a, b, d) here to validate that (off, len) pairs do not exceed the max value type. The kernel conveniently has several macros to sort out the problems with signed or unsigned types. However, we're not interested in purely seeing whether (a + b) overflows, because we're using this for (off, len) overflow checks, where the bytes we read are from 0 to len -1. We must therefore call this check with (b) being "len - 1". I've made sure that we don't accidentally fail when (len == 0) in all cases by making sure we've already checked this condition before, and moving code around as needed to ensure that (len > 0) in all cases where we check. The macro check_add_overflow requires a (d) argument in which temporarily the result of the addition is stored and then checked to see if an overflow occurred. We put a `tmp` variable on the stack of the correct type as needed to make the checks function. simple-release-extents test mistakenly relied on this buggy wrap code, so it needs fixing. The move-blocks test also got it wrong. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-10-03 12:41:05 -07:00
Auke Kok	6d42d260cf	xargs option conflict now a warning in el9 The warnings thrown by el9's version of xargs are unexpected output and cause this test to fail. When using the -I option (replace) the -n 1 arguments are always assumed. In el7/8 no warnings were printed. We can just remove `-n 1` since the argument is never needed. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-10-03 12:41:05 -07:00
Auke Kok	b45fbe0bbb	Don't pass data version to attr_x unless the ioctl means to set it. The wrapper in setattr_more that translates the operations to attr_x needs to decide whether to ask attr_x to perform a change to any of the fields passed to it or not. For the date and size fields this is implicit - we always tell attr_x to change them. For any of the other fields, it should be explicit. The only field that is in the struct that this applies to is data_version. Because the data version field by default is zero, we use that as condition to decide whether to pass the data_version down to attr_x. Previously, the code would always pass a data_version=0 down to attr_x, triggering one of the validity checks, making it return -EINVAL. We add a simple test case to test for this issue. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-09-27 19:31:22 -04:00
Auke Kok	9d8ac2c7d7	Write to kmsg which test we're executing. This is done by xfstests and it's so much easier to follow what is going on from logs or e.g. serial console that I thought I should do this for scoutfs tests as well. It makes it so much easier to discern which test may have been cause for issues when running a bunch of tests and you're looking back at logs later. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-08-28 14:36:55 -07:00
Auke Kok	7b039a1d18	Add basic POSIX ACL tests. These are extremely limited and very quick basic ACL tests we can trivially do in under a second - purely basic functionality tests only. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-08-12 15:07:43 -04:00
Zach Brown	aeb1dbc5f5	Merge pull request #180 from versity/zab/cleanup_large_test_tmp_output Clean up large test files	2024-07-23 10:44:07 -07:00
Zach Brown	e20d3ae1e8	Clean up large test files The test harness provides a TMP directory for tests to use. It's badly named. It's meant to be more of a scratch directory that is not on the FS being tested. Tests use it both for small log files that give insight into the platform and for large generated files that are not worth saving. We want to save the directory after test runs to get at the log files, but we don't want to burn a ton of space also saving large generated files. This updates the handful of tests to remove their handful of files that are large enough to be a problem. With these out of the way we can save the tmp/ directory without its space consumption getting out of hand. Signed-off-by: Zach Brown <zab@versity.com>	2024-07-22 14:08:32 -07:00
Auke Kok	db445ce517	Fix the debug output of client-unmount-recovery The script really wants to print rid instead of pid. But in case of failure, we can just dump the arrays as well. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-07-22 14:02:53 -04:00
Auke Kok	7eaed848ed	Increase time measurement accuracy beyond whole seconds. We can rely on `bc` and `date` to record, manipulate and compare time data with nanosecond precision. This fixes timing issues on faster systems where this test completes a single pass of createmany in under 1.0 second, causing the math to always fail. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-07-12 15:28:28 -04:00
Zach Brown	8c06302984	Let run-tests specify mkfs format version Add a run-tests -V option that passes through the -V option to mkfs so that runs can specify the format version that the primary volume will have. This doesn't affect the scratch file system versions. Signed-off-by: Zach Brown <zab@versity.com>	2024-06-28 15:09:05 -07:00
Zach Brown	6a17dc335f	Add quota tests Signed-off-by: Zach Brown <zab@versity.com>	2024-06-28 15:09:05 -07:00
Zach Brown	442980f1c9	Add project ID tests Signed-off-by: Zach Brown <zab@versity.com>	2024-06-28 15:09:05 -07:00
Zach Brown	82c2d0b1d0	Add o_tmpfile_linkat test binary Add a test binary that uses o_tmpfile and linkat to create a file in a given dir. We have something similar, but it's weirdly specific to a given test. This is a simpler building block that could be used by more tests. Signed-off-by: Zach Brown <zab@versity.com>	2024-06-28 15:09:05 -07:00
Zach Brown	60ca950f42	Drop caches in totl test Now that the _READ_XATTR_TOTALS ioctl uses the weak item cache we have to drop caches before each attempt to read the xattrs that we just wrote and synced. Signed-off-by: Zach Brown <zab@versity.com>	2024-06-28 15:09:05 -07:00
Bryant G. Duffy-Ly	460f3ce503	Add unit tests for retention Signed-off-by: Bryant G. Duffy-Ly <bduffyly@versity.com> [zab@versity.com: refactored for retention, added test cases] Signed-off-by: Zach Brown <zab@versity.com>	2024-06-28 15:09:05 -07:00
Zach Brown	5a53e7144d	Add format-version back/forward compat test Signed-off-by: Zach Brown <zab@versity.com> Signed-off-by: Bryant G. Duffy-Ly <bduffyly@versity.com>	2024-06-28 14:53:49 -07:00
Zach Brown	a23877b150	Add fs test functions for mounted paths We have some fs functions which return info based on the test mount nr as the test has setup. This refactors those a bit to also provide some of the info when the caller has a path in a given mount. This will let tests work with scratch mounts a little more easily. Signed-off-by: Zach Brown <zab@versity.com> Signed-off-by: Bryant G. Duffy-Ly <bduffyly@versity.com>	2024-06-28 14:53:49 -07:00
Zach Brown	5ccdf3c9f0	Add T_MODULE for tests Signed-off-by: Zach Brown <zab@versity.com>	2024-06-28 14:53:49 -07:00
Zach Brown	b552406427	Ignore spurious KASAN unwind warning KASAN could raise a spurious warning if the unwinder started in code without ORC metadata and tried to access in the KASAN stack frame redzones. This was fixed upstream but we can rarely see it in older kernels. We can ignore these messages. Signed-off-by: Zach Brown <zab@versity.com>	2023-11-21 12:25:16 -08:00
Zach Brown	03ab5cedb6	clean up createmany-parallel-mounts test This test is trying to make sure that concurrent work isn't much, much, slower than individual work. It does this by timing creating a bunch of files in a dir on a mount and then timing doing the same in two mounts concurrently. But it messed up the concurrency pretty badly. It had the concurrent createmany tasks creating files with a full path. That means that every create is trying to read all the parent directories. The way inode number allocation works means that one of the mounts is likely to be getting a write lock that includes a shared parent. This created a ton of cluster lock contention between the two tasks. Then it didn't sync the creates between phases. It could be accidentally recording the time it took to write out the dirty single creates as time taken during the parallel creates. By syncing between phases and having the createmany tasks create files relative to their per-mount directories we actually perform concurrent work and test that we're not creating contention outside of the task load. This became a problem as we switched from loopback devices to device mapper devices. The loopback writers were using buffered writes so we were masking the io cost of constantly invalidating and refilling the item cache by turning the reads into memory copies out of the page cache. While we're in here we actually clean up the created files and then use t_fail to fail the test while the files still exist so they can be examined. Signed-off-by: Zach Brown <zab@versity.com>	2023-11-15 15:12:57 -08:00
Zach Brown	2b94cd6468	Add loop module kernel message filter Now that we're not setting up per-mount loopback devices we can not have the loop module loaded until tests are running. Signed-off-by: Zach Brown <zab@versity.com>	2023-11-15 13:39:38 -08:00
Zach Brown	5507ee5351	Use device-mapper for per-mount test devices We don't directly mount the underlying devices for each mount because the kernel notices multiple mounts and doesn't setup a new super block for each. Previously the script used loopback devices to create the local shared block construct 'cause it was easy. This introduced corruption of blocks that saw concurrent read and write IOs. The buffered kernel file IO paths that loopback eventually degrades into by default (via splice) could have buffered readers copying out of pages without the page lock while writers modified the page. This manifested as occasional crc failures of blocks that we knowingly issue concurrent reads and writes to from multiple mounts (the quorum and super blocks). This changes the script to use device-mapper linear passthrough devices. Their IOs don't hit a caching layer and don't provide an opportunity to corrupt blocks. Signed-off-by: Zach Brown <zab@versity.com>	2023-11-15 13:39:38 -08:00
Zach Brown	6daf24ff37	Extend hung task timeout for large-fragmented-free Our large fragmented free test creates pathological file extents which are as expensive as possible to free. We know that debugging kernels can take a long time to do this so we can extend the hung task timeout. Signed-off-by: Zach Brown <zab@versity.com>	2023-11-14 15:01:37 -08:00
Zach Brown	d94e49eb63	Fix quoted glob in srch-basic-functionality One of the phases of this test wanted to delete files but got the glob quoting wrong. This didn't matter for the original test but when we changed the test to use its own xattr name then those existing undeleted files got confused with other files in later phases of the test. This changes the test to delete the files with a more reliable find pattern instead of using shell glob expansion. Signed-off-by: Zach Brown <zab@versity.com>	2023-11-09 14:16:36 -08:00
Zach Brown	bf21699ad7	bulk_create_paths test tool takes xattr name Previously the bulk_create_paths test tool used the same xattr name for each category of xattrs it was creating. This created a problem where two tests got their xattrs confused with each other. The first test created a bunch of srch xattrs, failed, and didn't clean up after itself. The second test saw these search xattrs as its own and got very confused when there were far more srch xattrs than it thought it had created. This lets each test specify the srch xattr names that are created by bulk_create_paths so that tests can work with their xattrs independent of each other. Signed-off-by: Zach Brown <zab@versity.com>	2023-11-09 14:15:44 -08:00
Zach Brown	c7c67a173d	Specifically wait for compaction in srch test We just added a test to try and get srch compaction stuck by having an input file continue at a specific offset. To exercise the bug the test needs to perform 6 compactions. It needs to merge 4 sets of logs into 4 sorted files, it needs to make partial progress merging those 4 sorted files into another file, and then finally attempt to continue compacting from the partial progress offset. The first version of the test didn't necessarily ensure that these compactions happened. It created far too many log files then just waited for time to pass. If the host was slow then the mounts may not make it through the initial logs to try and compact the sorted files. The triggers wouldn't fire and the test would fail. These changes much more carefully orchestrate and watch the various steps of compaction to make sure that we trigger the bug. Signed-off-by: Zach Brown <zab@versity.com>	2023-11-09 14:13:13 -08:00
Zach Brown	21c070b42d	Add test for srch continuation safe pos errors Add a test for srch compaction getting stuck hitting errors continuing a partial operation. It ensures that a block has an encoded entry at the _SAFE_BYTES offset, that an operation stops precisely at that offset, and then watches for errors. Signed-off-by: Zach Brown <zab@versity.com>	2023-11-07 12:34:00 -08:00
Zach Brown	77fbf92968	Add t_trigger_set helper Add a helper to arm or disarm a trigger with a value argument. Signed-off-by: Zach Brown <zab@versity.com>	2023-11-07 12:12:10 -08:00
Auke Kok	707e1b2d59	Ensure dd creates the full 8K input test file. Without `iflag=fullblock` we encounter sporadic cases where the input file to the truncate test isn't fully written to 8K and ends up to be only 4K. The subsequent truncate tests then fail. We add a check to the input test file size just to be sure in the future. Signed-off-by: Auke Kok <auke.kok@versity.com>	2023-10-23 17:04:19 -04:00
Zach Brown	d71583bcf5	Merge pull request #134 from versity/auke/tests-add-bc Add `bc` to test requirement.	2023-10-16 15:12:22 -07:00
Zach Brown	bb835b948d	Merge pull request #138 from versity/auke/ignore-journald-rotate Filter out journald rotate messages.	2023-10-16 14:54:56 -07:00
Auke Kok	7ceb215c91	Filter out journald rotate messages. On el9 distros systemd-journald will log rotation events into kmsg. Since the default logs on VM images are transient only, they are rotated several times during a single test cycle, causing test failures. Signed-off-by: Auke Kok <auke.kok@versity.com>	2023-10-12 12:27:41 -04:00
Auke Kok	d4d2b0850b	Add `bc` to test requirement. Signed-off-by: Auke Kok <auke.kok@versity.com>	2023-10-12 12:21:29 -04:00
Zach Brown	cf05aefe50	t_quiet appends command output The t_quiet test command execution helper was constantly truncating the quiet.log with the output of each command. It was meant to show each command and its output as they're run. Signed-off-by: Zach Brown <zab@versity.com>	2023-10-11 14:50:04 -07:00
Auke Kok	0e1e55d25b	Ignore `last` flag output by filefrag. New versions of filefrag will output the presence of the `last` flag as well, but we don't care. Signed-off-by: Auke Kok <auke.kok@versity.com>	2023-10-09 15:35:40 -04:00
Auke Kok	a7704e0b56	Allow the kernel to return -ESTALE from orphan-inode test In newer kernels, we always get -ESTALE because the inode has been marked immediately as deleting. Since this is expected behavior we should not fail the test here on this error value. Signed-off-by: Auke Kok <auke.kok@versity.com>	2023-10-09 15:35:40 -04:00
Auke Kok	819df4be60	Skip userns based testing for RHEL8. In RHEL7, this was skipped automatically. In RHEL8, we don't support the needed passing through of the actual user namespace into our ACL set/get handlers. Once we get around v5.11 or so, the handlers are automatically passed the namespace. Until then, skip this test. Signed-off-by: Auke Kok <auke.kok@versity.com>	2023-10-09 15:35:40 -04:00
Auke Kok	11c041d2ea	New versions of getfattr will quote empty attr values. Instead of messing with quotes and using grep for the correct xattr name, directly query the value of the xattr being tested only, and compare that to the input. Side effect is that this is significantly simpler and faster. Signed-off-by: Auke Kok <auke.kok@versity.com>	2023-10-09 15:35:40 -04:00
Auke Kok	46e8dfe884	Account for coreutils using statx() call instead of stat() `stat` internally switched to using the new `statx` syscall, and this affects the output of perror() subsequently. This is the same error as before (and expected). Signed-off-by: Auke Kok <auke.kok@versity.com>	2023-10-09 15:35:40 -04:00
Auke Kok	a9beeaf5da	Account for e2fsprogs output format changes. The filefrag program in e2fsprogs-v1.42.10-10-g29758d2f now includes an extra flag, and changes how the `unknown` flag is output. We essentially adjust for this "new" golden value on the fly if we encounter it. We don't expect future changes to the output. Signed-off-by: Auke Kok <auke.kok@versity.com>	2023-10-09 15:35:40 -04:00
Auke Kok	205d8ebd4a	Account for quoting style changes in coreutils. In older versions of coreutils, quoted strings are occasionally output using utf-8 open/close single quotes. New versions of coreutils will exclusively use the ASCII single quote character "'" when the output is not a TTY - as is the case with all test scripts. We can avoid most of these problems by always setting LC_ALL=C in testing, however. Signed-off-by: Auke Kok <auke.kok@versity.com>	2023-10-09 15:35:40 -04:00
Auke Kok	e580f33f82	Ignore loop device resizing messages. These occasionally trigger during tests. Signed-off-by: Auke Kok <auke.kok@versity.com>	2023-10-09 15:35:40 -04:00
Zach Brown	14eddb6420	Remove seq test from fence-and-reclaim The fence-and-reclaim test has a little function that runs after fencing and recovery to make sure that all the mounts are operational again. The main thing it does is re-use the same locks across a lot of files to ensure that lock recovery didn't lose any locks that stop forward progress. But I also threw in a test of the committed_seq machinery, as a bit of belt and suspenders. The problem is the test is racy. It samples the seq after the write so the greatest seq it remembers can be after the write and will not be committed by the other nodes' reads. It being less than the committed_seq is a totally reasonable race. Which explains why this test has been rarely failing since it was written. There's no particular reason to test the committed_seq machinery here, so we can just remove that racy test. Signed-off-by: Zach Brown <zab@versity.com>	2023-10-09 10:56:15 -07:00
Zach Brown	55f9435fad	Fix partial preallocation when _contig_only = 0 Data preallocation attempts to allocate large aligned regions of extents. It tried to fill the hole around a write offset that didn't contain an extent. It missed the case where there can be multiple extents between the start of the region and the hole. It could try to overwrite these additional existing extents and writes could return EINVAL. We fix this by trimming the preallocation to start at the write offset if there are any extents in the region before the write offset. The data preallocation test output has to be updated now that allocation extents won't grow towards the start of the region when there are existing extents. Signed-off-by: Zach Brown <zab@versity.com>	2023-07-17 09:36:09 -07:00
Zach Brown	a9da27444f	Merge pull request #128 from versity/zab/prealloc_fragmentation Zab/prealloc fragmentation	2023-06-29 09:57:32 -07:00
Zach Brown	49fe89741d	Merge pull request #125 from versity/zab/get_referring_entries Zab/get referring entries	2023-06-29 09:57:06 -07:00
Zach Brown	564b942ead	Write test for hole filling noncontig prealloc Add a test which exercises filling holes in prealloc regions when the _contig_only prealloc option is not set. Signed-off-by: Zach Brown <zab@versity.com>	2023-06-28 16:16:04 -07:00
Zach Brown	89b238a5c4	Add more acceptable quorum delay during testing Loaded VMs can see a few more seconds delay. Signed-off-by: Zach Brown <zab@versity.com>	2023-06-16 09:38:58 -07:00
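
Note on commit 7d0e7e29f8: the (off, len) check it describes can be sketched in user space as below. This is only an illustration, not the scoutfs code. The function name `sketch_region_wraps` is hypothetical, and the GCC/Clang builtin `__builtin_add_overflow` stands in for the kernel's `check_add_overflow(a, b, d)` macro. The point it shows is that the last byte touched is `off + len - 1`, so the addition is checked with `len - 1` rather than `len`, and `len == 0` is handled before the check.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical user-space sketch, not the scoutfs implementation.
 * Returns true when the byte range [off, off + len - 1] cannot be
 * represented in a u64, i.e. the (off, len) pair wraps.
 */
static bool sketch_region_wraps(uint64_t off, uint64_t len)
{
	uint64_t tmp; /* receives the sum, as the (d) argument does in the kernel macro */

	/* An empty region touches no bytes; handle it before computing len - 1. */
	if (len == 0)
		return false;

	/* The last byte read is at off + (len - 1), so that is the sum to check. */
	return __builtin_add_overflow(off, len - 1, &tmp);
}
```

With this form a one-byte region ending exactly at `UINT64_MAX` is accepted, whereas a plain `off + len` check would reject it.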


236 Commits