Commit Graph

1863 Commits

Author SHA1 Message Date
Zach Brown
c3c4b08038 v1.20 Release
Finish the release notes for the 1.20 release.

Signed-off-by: Zach Brown <zab@versity.com>
v1.20
2024-04-22 13:20:42 -07:00
Zach Brown
0519830229 Merge pull request #165 from versity/greg/kmod-uninstall-cleanup
More cleanly drive weak-modules on install/uninstall
2024-04-11 14:32:06 -07:00
Greg Cymbalski
4d6e1a14ae More safely install/uninstall with weak-modules
This addresses some minor issues with how we handle driving the
weak-modules infrastructure for handling running on kernels not
explicitly built for.

For one, we now drive weak-modules at install-time more explicitly (it
was adding symlinks for all modules into the right place for the running
kernel, whereas now it only handles that for scoutfs against all
installed kernels).

Also we no longer leave stale modules on the filesystem after an
uninstall/upgrade, similar to what's done for vsm's kmods right now.
RPM's pre/postinstall scriptlets are used to drive weak-modules to clean
things up.

Note that this (intentionally) does not (re)generate initrds of any
kind.

Finally, this was tested on both the native kernel version and on
updates that would need the migrated modules. As a result, installs are
a little quicker, the module still gets migrated successfully, and
uninstalls correctly remove (only) the packaged module.
2024-04-11 13:20:50 -07:00
Greg Cymbalski
fc3e061ea8 Merge pull request #164 from versity/greg/preserve-git-describe
Encode git info into spec to keep git info in final kmod
2024-03-29 13:48:33 -07:00
Greg Cymbalski
a4bc3fb27d Capture git info at spec creation time, pass into make 2024-02-05 15:44:10 -08:00
Zach Brown
67990a7007 Merge pull request #162 from versity/zab/v1.19
v1.19 Release
2024-01-30 15:46:49 -08:00
Zach Brown
ba819be8f9 v1.19 Release
Finish the release notes for the 1.19 release.

Signed-off-by: Zach Brown <zab@versity.com>
v1.19
2024-01-30 12:11:23 -08:00
Zach Brown
1b103184ca Merge pull request #161 from versity/zab/merge_timeout_option_fix
Correctly set the log_merge_wait_timeout_ms option
2024-01-30 12:07:10 -08:00
Zach Brown
c3890abd7b Correctly set the log_merge_wait_timeout_ms option
The initial code for setting the timeout used the wrong parsed variable.

Signed-off-by: Zach Brown <zab@versity.com>
2024-01-30 12:01:35 -08:00
Zach Brown
5ab38bfa48 Merge pull request #160 from versity/zab/log_merging_speedups
Zab/log merging speedups
2024-01-29 12:26:55 -08:00
Zach Brown
e9ad61b444 Delete multiple log trees items per server commit
server_log_merge_free_work() is responsible for freeing all the input
log trees for a log merge operation that has finished.  It looks for the
next item to free, frees the log btree it references, and then deletes
the item.  It was doing this with a full server commit for each item
which can take an agonizingly long time.

This changes it perform multiple deletions in a commit as long as
there's plenty of alloc space.  The moment the commit gets low it
applies the commit and opens a new one.  This sped up the deletion of a
few hundred thousand log tree items from taking hours to seconds.

Signed-off-by: Zach Brown <zab@versity.com>
2024-01-25 11:30:17 -08:00
Zach Brown
91bbf90f71 Don't pin input btrees when merging
The btree_merge code was pinning leaf blocks for all input btrees as it
iterated over them.  This doesn't work when there are a very large
number of input btrees.  It can run out of memory trying to hold a
reference to a 64KiB leaf block for each input root.

This reworks the btree merging code.  It reads a window of blocks from
all input trees to get a set of merged items.  It can take multiple
passes to complete the merge but by setting the merge window large
enough this overhead is reduced.  Merging now consumes a fixed amount of
memory rather than using memory proportional to the number of input
btrees.

Signed-off-by: Zach Brown <zab@versity.com>
2024-01-25 11:30:17 -08:00
Zach Brown
b5630f540d Add tracing of the log merge finalizing decision
Signed-off-by: Zach Brown <zab@versity.com>
2024-01-25 11:30:17 -08:00
Zach Brown
90a4c82363 Make log merge wait timeout tunable
Add a mount option for the amount of time that log merge creation can
wait before giving up.  We add some counters so we can see how often
the timeout is being hit and what the average successfull wait time is.

Signed-off-by: Zach Brown <zab@versity.com>
2024-01-25 11:25:56 -08:00
Zach Brown
f654fa0fda Send syncs once when starting to merge
The server sends sync requests to clients when it sees that they have
open log trees that need to be committed for log merging to proceed.

These are currently sent in the context of each client's get_log_trees
request, resulting in sync requests queued for one client from all
clients.  Depending on message delivery and commit latencies, this can
create a sync storm.

The server's sends are reliable and the open commits are marked with the
seq when they opened.  It's easy for us to record having sent syncs to
all open commits so that future attempts can be avoided.  Later open
commits will have higher seqs and will get a new round of syncs sent.

Signed-off-by: Zach Brown <zab@versity.com>
2024-01-25 11:25:20 -08:00
Zach Brown
50168a2d2a Check each client's last log item for stable seq
The server was checking all client log_trees items to search for the
lowest commit seq that was still open.  This can be expensive when there
are a lot of finalized log_trees items that won't have open seqs.  Only
the last log_trees item for each client rid can be open, and the items
are sorted by rid and nr, so we can easily only check the last item for
each client rid.

Signed-off-by: Zach Brown <zab@versity.com>
2024-01-25 11:24:50 -08:00
Zach Brown
3c0616524a Only search last log_trees per rid for finalizing
During get_log_trees the server checks log_trees items to see if it
should start a log merge operation.  It did this by iterating over all
log_trees items and there can be quite a lot of them.

It doesn't need to see all of the items.  It only needs to see the most
recent log_trees item for each mount.  That's enough to make the
decisions that start the log merging process.

Signed-off-by: Zach Brown <zab@versity.com>
2024-01-25 11:23:59 -08:00
Zach Brown
8d3e6883c6 Merge pull request #159 from versity/auke/trans_hold
Fix ret output for scoutfs_trans_hold trace pt.
2024-01-09 09:23:32 -08:00
Auke Kok
8747dae61c Fix ret output for scoutfs_trans_hold trace pt.
Signed-off-by: Auke Kok <auke.kok@versity.com>
2024-01-08 16:27:41 -08:00
Zach Brown
fffcf4a9bb Merge pull request #158 from versity/zab/kasan_stack_oob_get_reg
Ignore spurious KASAN unwind warning
2023-11-22 10:04:18 -08:00
Zach Brown
b552406427 Ignore spurious KASAN unwind warning
KASAN could raise a spurious warning if the unwinder started in code
without ORC metadata and tried to access in the KASAN stack frame
redzones.  This was fixed upstream but we can rarely see it in older
kernels.  We can ignore these messages.

Signed-off-by: Zach Brown <zab@versity.com>
2023-11-21 12:25:16 -08:00
Zach Brown
d812599e6b Merge pull request #157 from versity/zab/dmsetup_test_devices
Zab/dmsetup test devices
2023-11-21 10:13:02 -08:00
Zach Brown
03ab5cedb6 clean up createmany-parallel-mounts test
This test is trying to make sure that concurrent work isn't much, much,
slower than individual work.  It does this by timing creating a bunch of
files in a dir on a mount and then timing doing the same in two mounts
concurrently.  But it messed it up the concurrency pretty badly.

It had the concurrent createmany tasks creating files with a full path.
That means that every create is trying to read all the parent
directories.  The way inode number allocation works means that one of
the mounts is likely to be getting a write lock that includes a shared
parent.  This created a ton of cluster lock contention between the two
tasks.

Then it didn't sync the creates between phases.  It could be
accidentally recording the time it took to write out the dirty single
creates as time taken during the parallel creates.

By syncing between phases and having the createmany tasks create files
relative to their per-mount directories we actually perform concurrent
work and test that we're not creating contention outside of the task
load.

This became a problem as we switched from loopback devices to device
mapper devices.  The loopback writers were using buffered writes so we
were masking the io cost of constantly invalidating and refilling the
item cache by turning the reads into memory copies out of the page
cache.

While we're in here we actually clean up the created files and then use
t_fail to fail the test while the files still exist so they can be
examined.

Signed-off-by: Zach Brown <zab@versity.com>
2023-11-15 15:12:57 -08:00
Zach Brown
2b94cd6468 Add loop module kernel message filter
Now that we're not setting up per-mount loopback devices we can not have
the loop module loaded until tests are running.

Signed-off-by: Zach Brown <zab@versity.com>
2023-11-15 13:39:38 -08:00
Zach Brown
5507ee5351 Use device-mapper for per-mount test devices
We don't directly mount the underlying devices for each mount because
the kernel notices multiple mounts and doesn't setup a new super block
for each.

Previously the script used loopback devices to create the local shared
block construct 'cause it was easy.  This introduced corruption of
blocks that saw concurrent read and write IOs.  The buffered kernel file
IO paths that loopback eventually degrades into by default (via splice)
could have buffered readers copying out of pages without the page lock
while writers modified the page.  This manifest as occasional crc
failure of blocks that we knowingly issue concurrent reads and writes to
from multiple mounts (the quorum and super blocks).

This changes the script to use device-mapper linear passthrough devices.
Their IOs don't hit a caching layer and don't provide an opportunity to
corrupt blocks.

Signed-off-by: Zach Brown <zab@versity.com>
2023-11-15 13:39:38 -08:00
Zach Brown
1600a121d9 Merge pull request #156 from versity/zab/large_fragmented_free_hung_task
Extend hung task timeout for large-fragmented-free
2023-11-15 09:49:13 -08:00
Zach Brown
6daf24ff37 Extend hung task timeout for large-fragmented-free
Our large fragmented free test creates pathologically file extents which
are as expensive as possible to free.  We know that debugging kernels
can take a long time to do this so we can extend the hung task timeout.

Signed-off-by: Zach Brown <zab@versity.com>
2023-11-14 15:01:37 -08:00
Zach Brown
cd5d9ff3e0 Merge pull request #154 from versity/zab/srch_test_fixes
Zab/srch test fixes
2023-11-13 09:47:46 -08:00
Zach Brown
d94e49eb63 Fix quoted glob in srch-basic-functionality
One of the phases of this test wanted to delete files but got the glob
quoting wrong.  This didn't matter for the original test but when we
changed the test to use its own xattr name then those existing undeleted
files got confused with other files in later phases of the test.

This changes the test to delete the files with a more reliable find
pattern instead of using shell glob expansion.

Signed-off-by: Zach Brown <zab@versity.com>
2023-11-09 14:16:36 -08:00
Zach Brown
1dbe408539 Add tracing of srch compact struct communication
Signed-off-by: Zach Brown <zab@versity.com>
2023-11-09 14:16:33 -08:00
Zach Brown
bf21699ad7 bulk_create_paths test tool takes xattr name
Previously the bulk_create_paths test tool used the same xattr name for
each category of xattrs it was creating.

This created a problem where two tests got their xattrs confused with
each other.  The first test created a bunch of srch xattrs, failed, and
didn't clean up after itself.  The second test saw these search xattrs
as its own and got very confused when there were far more srch xattrs
than it thought it had created.

This lets each test specify the srch xattr names that are created by
bulk_create_paths so that tests can work with their xattrs independent
of each other.

Signed-off-by: Zach Brown <zab@versity.com>
2023-11-09 14:15:44 -08:00
Zach Brown
c7c67a173d Specifically wait for compaction in srch test
We just added a test to try and get srch compaction stuck by having an
input file continue at a specific offset.  To exercise the bug the test
needs to perform 6 compactions.  It needs to merge 4 sets of logs into 4
sorted files, it needs to make partial progress merging those 4 sorted
files into another file, and then finall attempt to continue compacting
from the partial progress offset.

The first version of the test didn't necessarily ensure that these
compactions happened.  It created far too many log files then just
waited for time to pass.  If the host was slow then the mounts may not
make it through the initial logs to try and compact the sorted files.
The triggers wouldn't fire and the test would fail.

These changes much more carefully orchestrate and watch the various
steps of compaction to make sure that we trigger the bug.

Signed-off-by: Zach Brown <zab@versity.com>
2023-11-09 14:13:13 -08:00
Zach Brown
0d10189f58 Make srch compact request delay tunable
Add a sysfs file for getting and setting the delay between srch
compaction requests from the client.  We'll use this in testing to
ensure compaction runs promptly.

Signed-off-by: Zach Brown <zab@versity.com>
2023-11-09 14:13:07 -08:00
Zach Brown
6b88f3268e Merge pull request #153 from versity/zab/v1.18
v1.18 Release
2023-11-08 10:57:56 -08:00
Zach Brown
4b2afa61b8 v1.18 Release
Finish the release notes for the 1.18 release.

Signed-off-by: Zach Brown <zab@versity.com>
v1.18
2023-11-07 16:01:59 -08:00
Zach Brown
222ba2cede Merge pull request #152 from versity/zab/stuck_srch_compact
Zab/stuck srch compact
2023-11-07 15:56:39 -08:00
Zach Brown
c7e97eeb1f Allow srch compaction from _SAFE_BYTES
Compacting sorted srch files can take multiple transactions because they
can be very large.  Each transaction resumes at a byte offset in a block
where the previous transaction stopped.

The resuming code tests that the byte offsets are sane but had a mistake
in testing the offset to skip to.  It returned an error if the
compaction resumed from the last possible safe offset for decoding
entries.

If a system is unlucky enough to have a compaction transaction stop at
just this offset then compaction stops making forward progress as each
attempt to resume returns an error.

The fix allows continuation from this last safe offset while returning
errors for attempts to continue *past* that offset.  This matches all
the encoding code which allows encoding the last entry in the block at
this offset.

Signed-off-by: Zach Brown <zab@versity.com>
2023-11-07 12:34:00 -08:00
Zach Brown
21c070b42d Add test for srch continutation safe pos errors
Add a test for srch compaction getting stuck hitting errors continuing a
partial operation.  It ensures that a block has an encoded entry at
the _SAFE_BYTES offset, that an operaton stops precisely at that
offset, and then watches for errors.

Signed-off-by: Zach Brown <zab@versity.com>
2023-11-07 12:34:00 -08:00
Zach Brown
77fbf92968 Add t_trigger_set helper
Add a helper to arm or disarm a trigger with a value argument.

Signed-off-by: Zach Brown <zab@versity.com>
2023-11-07 12:12:10 -08:00
Zach Brown
d5c699c3b4 Don't respond with ENOENT for no srch compaction
The srch compaction request building function and the srch compaction
worker both have logic to recognize a valid response with no input files
indicating that there's no work to do.  The server unfortunately
translated nr == 0 into ENOENT and send that error response to the
client.  This caused the client to increment error counters in the
common case when there's no compaction work to perform.  We'd like the
error counter to reflect actual errors, we're about to check it in a
test, so let's fix this up to the server sends a sucessful response with
nr == 0 to indicate that there's no work to do.

Signed-off-by: Zach Brown <zab@versity.com>
2023-11-07 10:30:38 -08:00
Zach Brown
b56b8e502c Merge pull request #145 from versity/zab/server_seqlock
Use seqlock instead of seqcount in server
2023-10-24 14:36:56 -07:00
Zach Brown
5ff372561d Merge pull request #146 from versity/auke/truncatedd
Ensure dd creates the full 8K input test file.
2023-10-24 10:10:11 -07:00
Zach Brown
bdecee5e5d Merge pull request #147 from versity/zab/v1.17
v1.17 Release
2023-10-24 09:52:36 -07:00
Zach Brown
a9281b75fa v1.17 Release
Finish the release notes for the 1.17 release.

Signed-off-by: Zach Brown <zab@versity.com>
v1.17
2023-10-23 14:20:13 -07:00
Auke Kok
707e1b2d59 Ensure dd creates the full 8K input test file.
Without `iflag=fullblock` we encounter sporadic cases where the
input file to the truncate test isn't fully written to 8K and ends
up to be only 4K. The subsequent truncate tests then fail.

We add a check to the input test file size just to be sure in the
future.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2023-10-23 17:04:19 -04:00
Zach Brown
006f429f72 Use seqlock instead of seqcount in server
The server had a few lower level seqcounts that it used to protect
state.  One user got it wrong by forgetting to disable pre-emption
around writers.  Debug kernels warned as write_seqcount_begin() was
called without preemption disabled.

We fix that user and make it easier to get right in the future by having
one higher level seqlock and using that consistently for seq read
begin/retry and write lock/unlock patterns.

Signed-off-by: Zach Brown <zab@versity.com>
2023-10-19 15:43:15 -07:00
Zach Brown
d71583bcf5 Merge pull request #134 from versity/auke/tests-add-bc
Add `bc` to test requirement.
2023-10-16 15:12:22 -07:00
Zach Brown
bb835b948d Merge pull request #138 from versity/auke/ignore-journald-rotate
Filter out journald rotate messages.
2023-10-16 14:54:56 -07:00
Zach Brown
bcdc4f5423 Merge pull request #143 from versity/zab/t_quiet_appends
t_quiet appends command output
2023-10-12 11:58:50 -07:00
Auke Kok
7ceb215c91 Filter out journald rotate messages.
On el9 distros systemd-journald will log rotation events into kmesg.
Since the default logs on VM images are transient only, they are
rotated several times during a single test cycle, causing test failures.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2023-10-12 12:27:41 -04:00