scoutfs

mirror of https://github.com/versity/scoutfs.git synced 2026-01-04 03:14:02 +00:00

Author	SHA1	Message	Date
Zach Brown	624eb128c6	Merge pull request #221 from versity/auke/enospc-test Give enospc test more time to commit unlink.	2025-05-09 11:27:04 -07:00
Zach Brown	091eb3b683	Merge pull request #219 from versity/auke/fix-tests-failing-dirty-test-dirs Fix test cases that don't run cleanly in a semi-dirty env.	2025-05-09 11:17:24 -07:00
Zach Brown	04e8cc6295	Merge pull request #220 from versity/auke/orphan-inodes Extend orphan-inodes timeout.	2025-05-09 11:15:13 -07:00
Auke Kok	377e49caf1	Properly silently kill background tasks. Occasionally, we have some tests fail because these kills produce: tests/lock-recover-invalidate.sh: line 42: 9928 Terminated Even though we expected them to be silent. In these particular cases we already don't care about this output. We borrow the silent_kill() function from orphan-inodes and promote it to t_silent_kill() in funcs/exec.sh, and then use it everywhere where appropriate. Signed-off-by: Auke Kok <auke.kok@versity.com>	2025-05-08 12:03:04 -07:00
Auke Kok	d08eb66adc	Give enospc test more time to commit unlink. The current test sequence performs the unlink and immediately tests whether enough resources are available to create new files again, and this consistently fails. One of my crummy VMs takes a good 12 seconds before the `touch` actually succeeds. We care about the filesystem eventually returning from ENOSPC, and certainly we don't want it to take forever, but there is a period after our first ENOSPC error and cleanup that we expect ENOSPC to fail for a bit longer. Make the timeout 120s. As soon as the `touch` completes, exit the wait loop. Signed-off-by: Auke Kok <auke.kok@versity.com>	2025-05-08 11:40:13 -07:00
Auke Kok	1d0cde7cc3	Clean up old test data as needed. If run without `-m` (explicit mkfs) in subsequent testing, old test data files may break several tests. Most failures are -EEXIST, but there are some more subtle ones. This change erases any existing test dir as needed just before we run the tests, and avoids the issue entirely. I considered doing a `mv dir dir.$$ && rm -rf dir.$$ &` alternative solution but that likely will interfere disproportionately with tests that do disconnects and other things that can be impacted by an unlink storm. This has an obvious performance aspect - tests will be a little slower to start on subsequent runs. In CI, this will effectively be a no-op though. Signed-off-by: Auke Kok <auke.kok@versity.com>	2025-05-08 10:10:01 -07:00
Auke Kok	138c7c6b49	Extend orphan-inodes timeout. This test regularly fails in CI when the 15 seconds elapses and the system still hasn't concluded the mount log merges and orphan inode scans needed to unlink the test files. Instead of just extending the timeout value, we test-and-retry for 120s. This hopefully is faster in most cases. My smallest VM needs about 6s-8s on average. Signed-off-by: Auke Kok <auke.kok@versity.com>	2025-05-08 09:56:45 -07:00
Zach Brown	8aa1a98901	Merge pull request #210 from versity/auke/perf-irq-took-too-long Filter out perf `interrupt took too long` dmesg.	2025-04-30 10:04:00 -07:00
Auke Kok	24031cde1d	TAP formatted output. Stored as `results/scoutfs.tap`, this file contains TAP format 14 generated test results. Embedded in the output are some metadata so that these files can be aggregated and stored in a unique and deduplicating way, by using a generated UUID at the start of testing. The file itself also catches git ID, date, and kernel version, as well as the (possibly altered) test sequence used. Any test that has diff or dmesg output will be considered failed, and a copy of the relevant data is included as comments. Signed-off-by: Auke Kok <auke.kok@versity.com>	2025-04-15 12:02:41 -07:00
Auke Kok	1b47e9429e	Filter out perf `interrupt took too long` dmesg. Example: ``` [ 2469.638414] perf: interrupt took too long (2507 > 2500), lowering kernel.perf_event_max_sample_rate to 79000 ``` Signed-off-by: Auke Kok <auke.kok@versity.com>	2025-04-14 12:06:58 -07:00
Auke Kok	7ea084082d	Ignore pipefail alternative error when not a tty. This happens with the basic-truncate test, only. It's the only user of the `yes` program. The `yes` command normally fails gracefully under the usual runs that are attached to some terminal. But when the test script runs entirely under something else, it will throw a needless error message that pollutes the test output: `yes: standard output: Broken pipe` Adjust the redirect to omit all stderr for `yes` in this case. Signed-off-by: Auke Kok <auke.kok@versity.com>	2025-04-14 11:13:39 -07:00
Auke Kok	e59a5f8ebd	Readdir w/offset validation. Verify using xfs_io that readdir offsets match expected output. Signed-off-by: Auke Kok <auke.kok@versity.com>	2025-01-27 14:49:04 -05:00
Auke Kok	92f704d35a	Enable all xfstests mmap() tests. Now that all of these should be passing, we enable all mmap() tests in xfstests, and update the golden output with the new tests. Signed-off-by: Auke Kok <auke.kok@versity.com>	2025-01-23 14:28:40 -05:00
Auke Kok	311bf75902	Add mmap tests. Two test programs are added. The run time is about 1min on my el7 instance. The test script finishes up with a read/write mmap test on offline extents to verify the data wait paths in those functions. One program will perform vfs read/write and mmap read/write calls on the same file from across 5 threads (mounts) repeatedly. The goal is to assure there are no locking issues between read/write paths. The second test program performs consistency checking on a file that is repeatedly written/read using memory maps and normal reads and writes, and the content is verified after every operation. Signed-off-by: Auke Kok <auke.kok@versity.com>	2025-01-23 14:28:40 -05:00
Zach Brown	4a26059d00	Add lock-shrink-read-race test Add a quick test that races readers and shrinking to stress lock object refcount racing between concurrent lock request handling threads in the lock server. Signed-off-by: Zach Brown <zab@versity.com>	2024-10-31 15:35:11 -07:00
Auke Kok	fc7876e844	Allow certain tests to skip, but not fail exit condition. Previously, any t_skip would cause the final test result to be a failure because up until now no test should have been skipped. However, with format-version-forward-back not being compatible with el9, we are going to rely on el7/8 testing for that test solely, and therefore we have to allow skipping of this test on el9 and newer OS versions. We add `t_skip_permitted` to signal this from the test case to the run-tests.sh script. A new exit code is passed, and all accounting is updated to reflect that a test was skipped, but this was permitted. We modify format-version-forward-back to use this new exit path. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-10-03 15:38:34 -07:00
Auke Kok	5337b9e221	Ignore Process accounting resumed dmesg. I'm seeing more and more of these as audit is enabled in el8 and el9 images I am using for testing, and during ENOSPC tests this has a chance of triggering process accounting suspension, and subsequent resume. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-10-03 15:38:34 -07:00
Auke Kok	8a22bdd366	Ignore device mapper size change dmesg output. In v1.18-10-g5507ee5, we changed the test code away from loopback to device-mapper, which simplified our DUT setup code. However, this results in the occasional `device changed size` messages now being emitted by the `dm` driver instead of the `loop` kernel module. We have to additionally ignore these kernel messages from now as well. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-10-03 15:38:34 -07:00
Auke Kok	9335d2eb86	Don't --track when checking out a tag. I've pushed a tag/release to scoutfs-xfstests-dev instead of a full blown branch. This seems simpler and cleaner than using branches, because we're going to end up rebasing these things a lot. However, we can't --track tags, so, if the branch name passed to -x is actually a tag instead of a branch, we have to omit the --track option here. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-10-03 15:38:34 -07:00
Auke Kok	97b081de3f	Switch xfstests tag over in CI jobs using this marker file. CI testing needs to know which xfstests branch to use on all OSs. We can't just use the el9 xfstests branch on el9 only, because we need to run the same el9 xfstests on el8 and el7 as well, otherwise testing will just fail. So, we put a marker file in our git repo that tells us that we're not going to use the default `scoutfs` branch from scoutfs-xfstests-dev but our own special tag or branch. The CI job then should pass the proper -x {branch} flag to the run-tests.sh script. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-10-03 15:38:34 -07:00
Auke Kok	21b5032365	Add new xfstests that we won't support or don't pass The new version of xfstests adds a _lot_ more tests to our mix. Many of the new ones will auto enable or auto skip as needed. There are tests we can't or won't support that will be in future xfstests. Disable them now so we can avoid dealing with them later. Quite a few fall into "we don't support these types of mounting yet", mostly bind-mount or dm-mapper things. We disable all the swapfile tests flatout. A few tests fail on el7 but not el8/9 but we don't have a way to run them without failing yet, so disable them as well. Update golden with the proper new array of tests. This all requires the `auke/scoutfs-el9` branch in `versity/scoutfs-xfstests-dev`. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-10-03 15:38:34 -07:00
Auke Kok	4723f4f9ab	Disable format-version-forward-back test on el9+. Using t_skip, we just skip this test on el9. If we ever want to add a formatversion 2->3 test, perhaps we should just add a separate test script, instead of going over a static array. But let's not worry about this too much right now. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-10-03 15:38:34 -07:00
Auke Kok	0a8b3f4e94	Fix basic-posix-acl test output on el9 It turns out that on el9, `bash -c` prints out `bash: line 1: cd..` instead of `line 0:` on el7 or el8. So discard all the stderr from these `cd` lines entirely and just rely on the expected echo output to stdout. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-10-03 15:38:34 -07:00
Auke Kok	8a4b0967cb	Add fiemap output through scoutfs util. There's filefrag already, and that works, but its output is very inconsistent between various OS release versions, and it has already meant that we'd needed to adjust tests to account for these little but insignificant changes. A lot more work than useful. It's even more changed in el9. This adds `scoutfs get-fiemap FILE` and prints out block extent info with flags that we care about as an abbreviated letter: U for Unwritten, L for Last, and O for Unknown (as in, "offline"). The -P/--physical and -L/--logical options turn off logical or physical offset display, in case you only want to see the offsets in either units. You can pass -b/--byte to display offsets and lengths in byte values. The block size will then be obtained from fstat() of the queried file (4096 for scoutfs). I've removed all uses of filefrag from our scoutfs tests. Xfstests still calls it but their internal diff takes care of that issue. Where needed and appropriate, the tests are adjusted so that the output of `scoutfs get-fiemap` is as close as it can be to what it used to be, so that reading the test results allows the quick view of what might have been going wrong. There are some output strings I have not bothered to update because there's no real value to updating every output string to match, and we just adjust the golden file accordingly. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-10-03 15:38:34 -07:00
Auke Kok	606c519e96	Simple-staging doesn't actually test overflow. This isn't a simple case where we can use u64_region_wraps because length is s32. Let's actually test an overflow case instead of a case that doesn't overflow, though. We still should properly add an overflow test here as well. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-10-03 12:41:05 -07:00
Auke Kok	7d0e7e29f8	Avoid integer wrapping pitfalls for (off, len) pairs. We use check_add_overflow(a, b, d) here to validate that (off, len) pairs do not exceed the max value type. The kernel conveniently has several macros to sort out the problems with signed or unsigned types. However, we're not interested in purely seeing whether (a + b) overflows, because we're using this for (off, len) overflow checks, where the bytes we read are from 0 to len -1. We must therefore call this check with (b) being "len - 1". I've made sure that we don't accidentally fail when (len == 0) in all cases by making sure we've already checked this condition before, and moving code around as needed to ensure that (len > 0) in all cases where we check. The macro check_add_overflow requires a (d) argument in which temporarily the result of the addition is stored and then checked to see if an overflow occurred. We put a `tmp` variable on the stack of the correct type as needed to make the checks function. simple-release-extents test mistakenly relied on this buggy wrap code, so it needs fixing. The move-blocks test also got it wrong. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-10-03 12:41:05 -07:00
Auke Kok	6d42d260cf	xargs option conflict now a warning in el9 The warnings thrown by el9's version of xargs are unexpected output and cause this test to fail. When using the -I option (replace) the -n 1 arguments are always assumed. In el7/8 no warnings were printed. We can just remove `-n 1` since the argument is never needed. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-10-03 12:41:05 -07:00
Auke Kok	b45fbe0bbb	Don't pass data version to attr_x unless the ioctl means to set it. The wrapper in setattr_more that translates the operations to attr_x needs to decide whether to ask attr_x to perform a change to any of the fields passed to it or not. For the date and size fields this is implicit - we always tell attr_x to change them. For any of the other fields, it should be explicit. The only field that is in the struct that this applies to is data_version. Because the data version field by default is zero, we use that as condition to decide whether to pass the data_version down to attr_x. Previously, the code would always pass a data_version=0 down to attr_x, triggering one of the validity checks, making it return -EINVAL. We add a simple test case to test for this issue. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-09-27 19:31:22 -04:00
Auke Kok	9d8ac2c7d7	Write to kmsg which test we're executing. This is done by xfstests and it's so much easier to follow what is going on from logs or e.g. serial console that I thought I should do this for scoutfs tests as well. It makes it so much easier to discern which test may have been cause for issues when running a bunch of tests and you're looking back at logs later. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-08-28 14:36:55 -07:00
Auke Kok	7b039a1d18	Add basic POSIX ACL tests. These are extremely limited and very quick basic ACL tests we can trivially do in under a second - purely basic functionality tests only. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-08-12 15:07:43 -04:00
Zach Brown	aeb1dbc5f5	Merge pull request #180 from versity/zab/cleanup_large_test_tmp_output Clean up large test files	2024-07-23 10:44:07 -07:00
Zach Brown	e20d3ae1e8	Clean up large test files The test harness provides a TMP directory for tests to use. It's badly named. It's meant to be more of a scratch directory that is not on the FS being tested. Tests use it both for small log files that give insight into the platform and for large generated files that are not worth saving. We want to save the directory after test runs to get at the log files, but we don't want to burn a ton of space also saving large generated files This updates the handful of tests to remove their handful of files that are large enough to be a problem. With these out of the way we can save the tmp/ directory without its space consumption getting out of hand. Signed-off-by: Zach Brown <zab@versity.com>	2024-07-22 14:08:32 -07:00
Auke Kok	db445ce517	Fix the debug output of client-unmount-recovery The script really wants to print rid instead of pid. But in case of failure, we can just dump the arrays as well. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-07-22 14:02:53 -04:00
Auke Kok	7eaed848ed	Increase time measurement accuracy beyond whole seconds. We can rely on `bc` and `date` to record, manipulate and compare time data with nanosecond precision. This fixes timing issues on faster systems where this test completes a single pass of createmany in under 1.0 second, causing the math to always fail. Signed-off-by: Auke Kok <auke.kok@versity.com>	2024-07-12 15:28:28 -04:00
Zach Brown	8c06302984	Let run-tests specify mkfs format version Add a run-tests -V option that passes through the -V option to mkfs so that runs can specify the format version that the primary volume will have. This doesn't affect the scratch file system versions. Signed-off-by: Zach Brown <zab@versity.com>	2024-06-28 15:09:05 -07:00
Zach Brown	6a17dc335f	Add quota tests Signed-off-by: Zach Brown <zab@versity.com>	2024-06-28 15:09:05 -07:00
Zach Brown	442980f1c9	Add project ID tests Signed-off-by: Zach Brown <zab@versity.com>	2024-06-28 15:09:05 -07:00
Zach Brown	82c2d0b1d0	Add o_tmpfile_linkat test binary Add a test binary that uses o_tmpfile and linkat to create a file in a given dir. We have something similar, but it's weirdly specific to a given test. This is a simpler building block that could be used by more tests. Signed-off-by: Zach Brown <zab@versity.com>	2024-06-28 15:09:05 -07:00
Zach Brown	60ca950f42	Drop caches in totl test Now that the _READ_XATTR_TOTALS ioctl uses the weak item cache we have to drop caches before each attempt to read the xattrs that we just wrote and synced. Signed-off-by: Zach Brown <zab@versity.com>	2024-06-28 15:09:05 -07:00
Bryant G. Duffy-Ly	460f3ce503	Add unit tests for retention Signed-off-by: Bryant G. Duffy-Ly <bduffyly@versity.com> [zab@versity.com: refactored for retention, added test cases] Signed-off-by: Zach Brown <zab@versity.com>	2024-06-28 15:09:05 -07:00
Zach Brown	5a53e7144d	Add format-version back/forward compat test Signed-off-by: Zach Brown <zab@versity.com> Signed-off-by: Bryant G. Duffy-Ly <bduffyly@versity.com>	2024-06-28 14:53:49 -07:00
Zach Brown	a23877b150	Add fs test functions for mounted paths We have some fs functions which return info based on the test mount nr as the test has setup. This refactors those a bit to also provide some of the info when the caller has a path in a given mount. This will let tests work with scratch mounts a little more easily. Signed-off-by: Zach Brown <zab@versity.com> Signed-off-by: Bryant G. Duffy-Ly <bduffyly@versity.com>	2024-06-28 14:53:49 -07:00
Zach Brown	5ccdf3c9f0	Add T_MODULE for tests Signed-off-by: Zach Brown <zab@versity.com>	2024-06-28 14:53:49 -07:00
Zach Brown	b552406427	Ignore spurious KASAN unwind warning KASAN could raise a spurious warning if the unwinder started in code without ORC metadata and tried to access in the KASAN stack frame redzones. This was fixed upstream but we can rarely see it in older kernels. We can ignore these messages. Signed-off-by: Zach Brown <zab@versity.com>	2023-11-21 12:25:16 -08:00
Zach Brown	03ab5cedb6	clean up createmany-parallel-mounts test This test is trying to make sure that concurrent work isn't much, much, slower than individual work. It does this by timing creating a bunch of files in a dir on a mount and then timing doing the same in two mounts concurrently. But it messed up the concurrency pretty badly. It had the concurrent createmany tasks creating files with a full path. That means that every create is trying to read all the parent directories. The way inode number allocation works means that one of the mounts is likely to be getting a write lock that includes a shared parent. This created a ton of cluster lock contention between the two tasks. Then it didn't sync the creates between phases. It could be accidentally recording the time it took to write out the dirty single creates as time taken during the parallel creates. By syncing between phases and having the createmany tasks create files relative to their per-mount directories we actually perform concurrent work and test that we're not creating contention outside of the task load. This became a problem as we switched from loopback devices to device mapper devices. The loopback writers were using buffered writes so we were masking the io cost of constantly invalidating and refilling the item cache by turning the reads into memory copies out of the page cache. While we're in here we actually clean up the created files and then use t_fail to fail the test while the files still exist so they can be examined. Signed-off-by: Zach Brown <zab@versity.com>	2023-11-15 15:12:57 -08:00
Zach Brown	2b94cd6468	Add loop module kernel message filter Now that we're not setting up per-mount loopback devices we can not have the loop module loaded until tests are running. Signed-off-by: Zach Brown <zab@versity.com>	2023-11-15 13:39:38 -08:00
Zach Brown	5507ee5351	Use device-mapper for per-mount test devices We don't directly mount the underlying devices for each mount because the kernel notices multiple mounts and doesn't setup a new super block for each. Previously the script used loopback devices to create the local shared block construct 'cause it was easy. This introduced corruption of blocks that saw concurrent read and write IOs. The buffered kernel file IO paths that loopback eventually degrades into by default (via splice) could have buffered readers copying out of pages without the page lock while writers modified the page. This manifested as occasional crc failure of blocks that we knowingly issue concurrent reads and writes to from multiple mounts (the quorum and super blocks). This changes the script to use device-mapper linear passthrough devices. Their IOs don't hit a caching layer and don't provide an opportunity to corrupt blocks. Signed-off-by: Zach Brown <zab@versity.com>	2023-11-15 13:39:38 -08:00
Zach Brown	6daf24ff37	Extend hung task timeout for large-fragmented-free Our large fragmented free test creates pathologically file extents which are as expensive as possible to free. We know that debugging kernels can take a long time to do this so we can extend the hung task timeout. Signed-off-by: Zach Brown <zab@versity.com>	2023-11-14 15:01:37 -08:00
Zach Brown	d94e49eb63	Fix quoted glob in srch-basic-functionality One of the phases of this test wanted to delete files but got the glob quoting wrong. This didn't matter for the original test but when we changed the test to use its own xattr name then those existing undeleted files got confused with other files in later phases of the test. This changes the test to delete the files with a more reliable find pattern instead of using shell glob expansion. Signed-off-by: Zach Brown <zab@versity.com>	2023-11-09 14:16:36 -08:00
Zach Brown	bf21699ad7	bulk_create_paths test tool takes xattr name Previously the bulk_create_paths test tool used the same xattr name for each category of xattrs it was creating. This created a problem where two tests got their xattrs confused with each other. The first test created a bunch of srch xattrs, failed, and didn't clean up after itself. The second test saw these search xattrs as its own and got very confused when there were far more srch xattrs than it thought it had created. This lets each test specify the srch xattr names that are created by bulk_create_paths so that tests can work with their xattrs independent of each other. Signed-off-by: Zach Brown <zab@versity.com>	2023-11-09 14:15:44 -08:00
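
The (off, len) overflow check described in commit 7d0e7e29f8 can be sketched in userspace as follows. This is only an illustration under stated assumptions, not the scoutfs code: `region_fits_u64` is a hypothetical helper name, the types are fixed to `uint64_t`, and the GCC/Clang builtin `__builtin_add_overflow` stands in for the kernel's `check_add_overflow(a, b, d)` macro. It shows why the second argument is `len - 1`: the last byte touched is `off + len - 1`, so a region that ends exactly at the maximum offset should still be accepted.

```c
#include <stdbool.h>
#include <stdint.h>

/*
 * Hypothetical helper (not from the scoutfs tree): returns true when the
 * byte range [off, off + len - 1] is representable in a uint64_t.
 */
static bool region_fits_u64(uint64_t off, uint64_t len)
{
	uint64_t last; /* receives off + (len - 1), like the (d) argument */

	/*
	 * Assumption for this sketch: an empty region touches no bytes and is
	 * accepted. The commit message says real callers check len == 0
	 * before reaching the overflow test, so len - 1 never wraps there.
	 */
	if (len == 0)
		return true;

	/* The builtin returns true on overflow, so invert it. */
	return !__builtin_add_overflow(off, len - 1, &last);
}
```

With this shape, `region_fits_u64(UINT64_MAX, 1)` accepts a one-byte region at the very last offset, whereas a naive `off + len` test would reject it as an overflow.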


209 Commits