Adjusting hung_task_timeout_secs is still needed for this test to pass
with a debug kernel. But the logic belongs on the platform side.
Signed-off-by: Chris Kirby <ckirby@versity.com>
Now that all of these should be passing, we enable all mmap() tests in
xfstests, and update the golden output with the new tests.
Signed-off-by: Auke Kok <auke.kok@versity.com>
Two test programs are added. The run time is about 1min on my el7
instance.
The test script finishes up with a read/write mmap test on offline
extents to verify the data wait paths in those functions.
One program will perform vfs read/write and mmap read/write calls on
the same file from across 5 threads (mounts) repeatedly. The goal
is to ensure there are no locking issues between read/write paths.
The second test program performs consistency checking on a file that is
repeatedly written/read using memory maps and normal reads and writes,
and the content is verified after every operation.
Signed-off-by: Auke Kok <auke.kok@versity.com>
Add a quick test that races readers and shrinking to stress lock object
refcount racing between concurrent lock request handling threads in the
lock server.
Signed-off-by: Zach Brown <zab@versity.com>
The new version of xfstests adds a _lot_ more tests to our mix. Many
of the new ones will auto enable or auto skip as needed.
There are tests we can't or won't support that will be in future
xfstests. Disable them now so we can avoid dealing with them later.
Quite a few fall into "we don't support these types of mounting yet",
mostly bind-mount or dm-mapper things. We disable all the swapfile
tests flat out.
A few tests fail on el7 but not el8/9 but we don't have a way to run
them without failing yet, so disable them as well.
Update golden with the proper new array of tests. This all requires
the `auke/scoutfs-el9` branch in `versity/scoutfs-xfstests-dev`.
Signed-off-by: Auke Kok <auke.kok@versity.com>
It turns out that on el9, `bash -c` prints out `bash: line 1: cd..`
instead of `line 0:` on el7 or el8. So discard all the stderr from
these `cd` lines entirely and just rely on the expected echo
output to stdout.
Signed-off-by: Auke Kok <auke.kok@versity.com>
There's filefrag already, and it works, but its output is very
inconsistent between OS release versions, and we have already
needed to adjust tests to account for these small but insignificant
changes. That is a lot more work than is useful. The output has
changed even more in el9.
This adds `scoutfs get-fiemap FILE` and prints out block extent
info with flags that we care about as an abbreviated letter: U for
Unwritten, L for Last, and O for Unknown (as in, "offline").
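As a rough illustration of the flag abbreviations, the sketch below
maps the FIEMAP extent flag bits from <linux/fiemap.h> to the letters
named above. The `flag_letters` helper and the ordering of the letters
are assumptions made for the example; the actual `scoutfs get-fiemap`
output format may differ.

```python
# Flag bit values as defined in <linux/fiemap.h>.
FIEMAP_EXTENT_LAST = 0x0001
FIEMAP_EXTENT_UNKNOWN = 0x0002
FIEMAP_EXTENT_UNWRITTEN = 0x0800

# Hypothetical letter table; the ordering is an assumption.
FLAG_LETTERS = (
    (FIEMAP_EXTENT_UNWRITTEN, "U"),
    (FIEMAP_EXTENT_LAST, "L"),
    (FIEMAP_EXTENT_UNKNOWN, "O"),
)


def flag_letters(fe_flags):
    """Return the abbreviated letters for the extent flags we care about."""
    return "".join(letter for bit, letter in FLAG_LETTERS if fe_flags & bit)
```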
The -P/--physical and -L/--logical options turn off logical or physical
offset display, in case you only want to see one of the two offsets.
You can pass -b/--byte to display offsets and lengths in
byte values. The block size will then be obtained from fstat() of
the queried file (4096 for scoutfs).
I've removed all uses of filefrag from our scoutfs tests. Xfstests
still calls it but their internal diff takes care of that issue.
Where needed and appropriate, the tests are adjusted so that the output
of `scoutfs get-fiemap` is as close as it can be to what it used to be,
so that reading the test results allows a quick view of what might
have been going wrong.
There are some output strings I have not bothered to update because
there's no real value to updating every output string to match,
and we just adjust the golden file accordingly.
Signed-off-by: Auke Kok <auke.kok@versity.com>
This isn't a simple case where we can use u64_region_wraps because
length is s32.
Let's actually test an overflow case instead of a case that doesn't
overflow, though. We still should properly add an overflow test here as
well.
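To show why a signed length needs its own handling, here is a minimal
sketch of a wrap check for a u64 start and an s32 length. The
`s32_region_wraps` name and the choice to treat a negative length as
wrapping are assumptions for illustration, not the kernel code.

```python
U64_MAX = (1 << 64) - 1


def s32_region_wraps(start, length):
    """Return True if [start, start + length) does not fit in a u64.

    start is an unsigned 64-bit offset, length a signed 32-bit value.
    """
    if not 0 <= start <= U64_MAX:
        raise ValueError("start must fit in a u64")
    if length < 0:
        # Assumption: a negative length is rejected just like a wrap.
        return True
    if length == 0:
        return False
    # The last byte of the region is start + length - 1.
    return start + (length - 1) > U64_MAX
```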
Signed-off-by: Auke Kok <auke.kok@versity.com>
The wrapper in setattr_more that translates the operations to attr_x
needs to decide whether to ask attr_x to perform a change to any of
the fields passed to it or not. For the date and size fields this
is implicit - we always tell attr_x to change them. For any of the
other fields, it should be explicit.
The only field that is in the struct that this applies to is
data_version. Because the data version field by default is zero,
we use that as the condition to decide whether to pass the data_version
down to attr_x.
Previously, the code would always pass a data_version=0 down to attr_x,
triggering one of the validity checks, making it return -EINVAL. We
add a simple test case to test for this issue.
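The decision can be sketched as follows. The mask bit names, the
`SetattrMore` fields, and `attr_x_mask` are hypothetical stand-ins for
the example; only the rule itself (size and dates always, data_version
only when non-zero) comes from the description above.

```python
from dataclasses import dataclass

# Hypothetical mask bits; the real names and values are not shown here.
X_MASK_DATA_VERSION = 1 << 0
X_MASK_SIZE = 1 << 1
X_MASK_CTIME = 1 << 2


@dataclass
class SetattrMore:
    # Hypothetical subset of the setattr_more arguments.
    data_version: int = 0
    i_size: int = 0
    ctime_sec: int = 0


def attr_x_mask(sm):
    """Build the set of fields that attr_x is asked to change."""
    # Size and date fields are implicit: always ask attr_x to change them.
    mask = X_MASK_SIZE | X_MASK_CTIME
    # data_version defaults to zero, so only pass it down when it was set.
    if sm.data_version != 0:
        mask |= X_MASK_DATA_VERSION
    return mask
```

With this shape, a caller that leaves data_version at its default never
hands a zero data_version to the validity checks.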
Signed-off-by: Auke Kok <auke.kok@versity.com>
These are extremely limited and very quick basic ACL tests we can
trivially do in under a second - purely basic functionality tests only.
Signed-off-by: Auke Kok <auke.kok@versity.com>
The test harness provides a TMP directory for tests to use. It's badly
named. It's meant to be more of a scratch directory that is not on the
FS being tested.
Tests use it both for small log files that give insight into the
platform and for large generated files that are not worth saving. We
want to save the directory after test runs to get at the log files, but
we don't want to burn a ton of space also saving large generated files.
This updates the handful of tests to remove their handful of files that
are large enough to be a problem. With these out of the way we can save
the tmp/ directory without its space consumption getting out of hand.
Signed-off-by: Zach Brown <zab@versity.com>
Signed-off-by: Bryant G. Duffy-Ly <bduffyly@versity.com>
[zab@versity.com: refactored for retention, added test cases]
Signed-off-by: Zach Brown <zab@versity.com>
This test is trying to make sure that concurrent work isn't much, much,
slower than individual work. It does this by timing creating a bunch of
files in a dir on a mount and then timing doing the same in two mounts
concurrently. But it messed up the concurrency pretty badly.
It had the concurrent createmany tasks creating files with a full path.
That means that every create is trying to read all the parent
directories. The way inode number allocation works means that one of
the mounts is likely to be getting a write lock that includes a shared
parent. This created a ton of cluster lock contention between the two
tasks.
Then it didn't sync the creates between phases. It could be
accidentally recording the time it took to write out the dirty single
creates as time taken during the parallel creates.
By syncing between phases and having the createmany tasks create files
relative to their per-mount directories we actually perform concurrent
work and test that we're not creating contention outside of the task
load.
This became a problem as we switched from loopback devices to device
mapper devices. The loopback writers were using buffered writes so we
were masking the io cost of constantly invalidating and refilling the
item cache by turning the reads into memory copies out of the page
cache.
While we're in here we actually clean up the created files and then use
t_fail to fail the test while the files still exist so they can be
examined.
Signed-off-by: Zach Brown <zab@versity.com>
Our large fragmented free test creates pathological file extents which
are as expensive as possible to free. We know that debugging kernels
can take a long time to do this so we can extend the hung task timeout.
Signed-off-by: Zach Brown <zab@versity.com>
We just added a test to try and get srch compaction stuck by having an
input file continue at a specific offset. To exercise the bug the test
needs to perform 6 compactions. It needs to merge 4 sets of logs into 4
sorted files, it needs to make partial progress merging those 4 sorted
files into another file, and then finally attempt to continue compacting
from the partial progress offset.
The first version of the test didn't necessarily ensure that these
compactions happened. It created far too many log files then just
waited for time to pass. If the host was slow then the mounts may not
make it through the initial logs to try and compact the sorted files.
The triggers wouldn't fire and the test would fail.
These changes much more carefully orchestrate and watch the various
steps of compaction to make sure that we trigger the bug.
Signed-off-by: Zach Brown <zab@versity.com>
Add a test for srch compaction getting stuck hitting errors continuing a
partial operation. It ensures that a block has an encoded entry at
the _SAFE_BYTES offset, that an operation stops precisely at that
offset, and then watches for errors.
Signed-off-by: Zach Brown <zab@versity.com>
In RHEL7, this was skipped automatically. In RHEL8, we don't support
the needed passing through of the actual user namespace into our
ACL set/get handlers. Once we get to around v5.11 or so, the handlers
are automatically passed the namespace. Until then, skip this test.
Signed-off-by: Auke Kok <auke.kok@versity.com>
In older versions of coreutils, quoted strings are occasionally
output using utf-8 open/close single quotes.
New versions of coreutils will exclusively use the ASCII single quote
character "'" when the output is not a TTY - as is the case with
all test scripts.
We can avoid most of these problems by always setting LC_ALL=C in
testing, however.
Signed-off-by: Auke Kok <auke.kok@versity.com>
Data preallocation attempts to allocate large aligned regions of
extents. It tried to fill the hole around a write offset that
didn't contain an extent. It missed the case where there can be
multiple extents between the start of the region and the hole.
It could try to overwrite these additional existing extents and writes
could return EINVAL.
We fix this by trimming the preallocation to start at the write offset
if there are any extents in the region before the write offset. The
data preallocation test output has to be updated now that allocation
extents won't grow towards the start of the region when there are
existing extents.
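A minimal sketch of the trimming rule, working in block units. The
`prealloc_start` helper and the (start, length) tuple representation of
extents are assumptions for illustration and not the kernel's data
structures.

```python
def prealloc_start(region_start, write_off, extents):
    """Pick where preallocation may begin within an aligned region.

    extents is a list of (start, length) tuples for existing extents.
    """
    for start, length in extents:
        # Any existing extent that overlaps [region_start, write_off)
        # means we must not grow towards the start of the region.
        if start < write_off and start + length > region_start:
            return write_off
    # Nothing in the way: the allocation can cover the region's start.
    return region_start
```

For example, with a region starting at block 0, a write at block 10 and
existing extents at blocks 2 and 5, preallocation starts at block 10
rather than at block 0.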
Signed-off-by: Zach Brown <zab@versity.com>
Add a test which exercises filling holes in prealloc regions when the
_contig_only prealloc option is not set.
Signed-off-by: Zach Brown <zab@versity.com>
Update the quorum_heartbeat_timeout_ms test to also test the mount
option, not just updating the timeout via sysfs. This takes some
reworking as we have to avoid the active leader/server when setting the
timeout via the mount option. We also allow for a bit more slack around
comparing kernel sleeps and userspace wall clocks.
Signed-off-by: Zach Brown <zab@versity.com>
There were kernels that didn't apply the current umask to inode modes
created with O_TMPFILE without acls. Let's have a test running to make
sure that we're not surprised if we come across one.
Signed-off-by: Zach Brown <zab@versity.com>
We had a one-off test that was overly specific to staging from tmpfile.
This renames it to a more generic test where we can add more tests of
O_TMPFILE in general.
Signed-off-by: Zach Brown <zab@versity.com>
Add a quick test of the index items to make sure that rapid inode
updates don't create duplicate meta_seq items.
Signed-off-by: Zach Brown <zab@versity.com>
Add a test which gives the server a transaction with a free list block
that contains blknos that each dirty an individual btree block in the
global data free extent btree.
Signed-off-by: Zach Brown <zab@versity.com>
We're seeing some trouble with very specific race conditions. This
updates the orphan-inodes test to try and force final inode deletion
during eviction, the orphan scan worker, and opening inodes by handle to
all race and hit an inode number at the same time.
Signed-off-by: Zach Brown <zab@versity.com>
This unit test reproduces the race we have between
client and server doing lock recovery while farewell
is processed.
Signed-off-by: Bryant G. Duffy-Ly <bduffyly@versity.com>
We want to enable the test case for:
generic/023 - tests that renameat2 syscall exists
generic/024 - renameat2 with NOREPLACE flag
Move both generic/025 and 078 to the no run list so that
we can test the output that is reported when unsupported
flags are passed.
Example output:
generic/025 [not run] fs doesn't support RENAME_EXCHANGE
generic/078 [not run] fs doesn't support RENAME_WHITEOUT
Signed-off-by: Bryant G. Duffy-Ly <bduffyly@versity.com>
The goal of the test case is to have two mount points
with two async calls made to do renameat2. This allows
for two calls to race to call renameat2 RENAME_NOREPLACE.
When this happens you expect one of them to fail with a
-EEXIST. This would validate that the new flag works.
Essentially one of the two calls to renameat2 should hit the
new RENAME_NOREPLACE code and exit early.
Signed-off-by: Bryant G. Duffy-Ly <bduffyly@versity.com>
The current test case attempts to create a state to read
by calling setattr and getattr in an attempt to force block
cache reads. It so happens that this does not always force
cache block reads, which in rare cases causes this test case
to fail.
The new test case removes all the extra bouncing around of mount
points and we just directly call scoutfs df which will walk
everyone's allocators to summarize the block counts, which are
guaranteed to exist. Therefore, we do not have to create any sort
of state prior to trying to force a read.
Signed-off-by: Bryant G. Duffy-Ly <bduffyly@versity.com>
As we update xattrs we need to update any existing old items with the
contents of the new xattr that uses those items. The loop that updated
existing items only took the old xattr size into account and assumed
that the new xattr would use those items. If the new xattr size used
fewer parts then the attempt to update all the old parts that weren't
covered by the new size would go very wrong. The length of the region
in the new xattr would be negative so it'd try to use the max part
length. Worse, it'd copy these max part length regions outside the
input new xattr buffer. Typically this would land in addressable memory
and copy garbage into the unused old items before they were later
deleted.
However, it could access so far outside the input buffer that it could
cross a page boundary into inaccessible memory and fault. We saw this in
the field while trying to repeatedly incrementally shrink a large xattr.
This fixes the loop that updates overlapping items between the new and
old xattr to start with the smaller of their two item counts. Now it
will only update items that are actually used by both xattrs and will
only safely access the new xattr input buffer.
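The fixed loop bound can be sketched like this. MAX_PART_BYTES and the
helper names are hypothetical, and the real items also carry a header
that this simplified model ignores.

```python
MAX_PART_BYTES = 8  # hypothetical part size, chosen only for the example


def part_count(size):
    """Number of item parts needed to store size bytes."""
    return (size + MAX_PART_BYTES - 1) // MAX_PART_BYTES


def overlapping_updates(old_size, new_size):
    """Return (offset, length) regions of the new buffer that are copied
    into items that already exist for the old xattr."""
    # Only walk the parts that both the old and the new xattr use.
    shared = min(part_count(old_size), part_count(new_size))
    regions = []
    for i in range(shared):
        off = i * MAX_PART_BYTES
        # off < new_size always holds here, so the length stays positive
        # and the region never leaves the new input buffer.
        regions.append((off, min(new_size - off, MAX_PART_BYTES)))
    return regions
```

Shrinking from 30 bytes to 10 bytes now touches only the two parts the
new xattr uses; the remaining old parts are left for deletion.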
Signed-off-by: Zach Brown <zab@versity.com>
The k-way merge function at the core of the srch file entry merging had
some bookkeeping math (calculating number of parents) that couldn't
handle merging a single incoming entry stream, so it threw a warning and
returned an error. When refusing to handle that case, it was assuming
that the caller was trying to merge down a single log file which doesn't
make any sense.
But in the case of multiple small unsorted logs we can absolutely end up
with their entries stored in one sorted page. We have one sorted input
page that's merging multiple log files. The merge function is also the
path that writes to the output file so we absolutely need to handle this
case.
We more carefully calculate the number of parents, clamping it to one
parent when we'd otherwise get "(roundup(1) -> 1) - 1 == 0" when
calculating the number of parents from the number of inputs. We can
relax the warning and error to refuse to merge nothing.
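The clamped calculation can be sketched as below. We assume here that
the roundup is to a power of two, as in a binary tournament tree; the
function names are illustrative and not the kernel's.

```python
def roundup_pow_of_two(n):
    """Smallest power of two that is >= n, for n >= 1."""
    return 1 << (n - 1).bit_length()


def nr_parents(nr_inputs):
    """Number of parent nodes needed to merge nr_inputs entry streams."""
    if nr_inputs < 1:
        # The only case still refused: there is nothing to merge.
        raise ValueError("refusing to merge nothing")
    # A single input would give (roundup(1) -> 1) - 1 == 0, so clamp to 1.
    return max(roundup_pow_of_two(nr_inputs) - 1, 1)
```

A single sorted input page now gets one parent instead of tripping the
warning, while larger input counts are unchanged.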
The test triggers this case by putting single search entries in the log
files for mounts and unmounting them to force rotation of the mount log
files into mergeable rotated log files.
Signed-off-by: Zach Brown <zab@versity.com>