360 Commits

Author SHA1 Message Date
Auke Kok
f67462750b Add tcp_keepalive_timeout_ms option, change default to 60s
The default TCP keepalive value is currently 10s, resulting in clients
being disconnected after 10 seconds of not replying to a TCP keepalive
packet. These keepalive values are reasonable most of the time, but
we've seen client disconnects where this timeout has been exceeded,
resulting in fencing. The cause for this is unknown at this time, but it
is suspected that network intermissions are happening.

This change adds a configurable value for this specific client socket
timeout. It enforces that its value is above UNRESPONSIVE_PROBES, whose
value remains unchanged.

The default value of 10000ms (10s) is changed to 60s. This is the value
we're assuming is much better suited for customers and has been briefly
trialed, showing that it may help to avoid network level interruptions
better.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2025-10-28 18:45:43 -04:00
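
As a userspace illustration of the knobs involved in the commit above, the sketch below spreads a total keepalive timeout in milliseconds over the standard Linux TCP keepalive socket options. The function name `sketch_set_keepalive_timeout_ms`, the probe count of 3, and the way the timeout is divided are assumptions made for this example; the actual change applies to the kernel-side client socket, and its constants are not reproduced here.

```c
#include <netinet/in.h>
#include <netinet/tcp.h>
#include <sys/socket.h>

/* Hypothetical probe count; stands in for the module's UNRESPONSIVE_PROBES. */
#define SKETCH_PROBES 3

/*
 * Spread timeout_ms over one idle period plus SKETCH_PROBES probe intervals.
 * Returns 0 on success, -1 on a too-small timeout or a setsockopt() failure.
 */
int sketch_set_keepalive_timeout_ms(int fd, unsigned int timeout_ms)
{
	int on = 1;
	int cnt = SKETCH_PROBES;
	int secs;

	/* Reject timeouts too short to give each interval a whole second. */
	if (timeout_ms < (SKETCH_PROBES + 1) * 1000u)
		return -1;

	/* Idle time and probe interval share the same length in this sketch. */
	secs = (int)(timeout_ms / 1000u / (SKETCH_PROBES + 1));

	if (setsockopt(fd, SOL_SOCKET, SO_KEEPALIVE, &on, sizeof(on)) < 0 ||
	    setsockopt(fd, IPPROTO_TCP, TCP_KEEPIDLE, &secs, sizeof(secs)) < 0 ||
	    setsockopt(fd, IPPROTO_TCP, TCP_KEEPINTVL, &secs, sizeof(secs)) < 0 ||
	    setsockopt(fd, IPPROTO_TCP, TCP_KEEPCNT, &cnt, sizeof(cnt)) < 0 ||
	    setsockopt(fd, IPPROTO_TCP, TCP_USER_TIMEOUT, &timeout_ms,
		       sizeof(timeout_ms)) < 0)
		return -1;

	return 0;
}
```

With a 60000ms timeout this sketch would use a 15 second idle period and three probes 15 seconds apart, so an unresponsive peer is given roughly 60 seconds before the connection is dropped.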
Auke Kok
70bd936213 Ignore sparse error about stat.h on el8.
On el8, sparse is at 0.6.4 in epel-release, but it fails with:
```
[SP src/util.c]
src/util.c: note: in included file (through /usr/include/sys/stat.h):
/usr/include/bits/statx.h:30:6: error: not a function <noident>
/usr/include/bits/statx.h:30:6: error: bad constant expression type
```

This is due to us needing O_DIRECT from <fcntl.h>, so we set _GNU_SOURCE
before including it, but this causes (through _USE_GNU in sys/stat.h)
statx.h to be included, and that has __has_include, and sparse is too
dumb to understand it.

Just shut it up.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2025-10-06 12:27:25 -05:00
Zach Brown
d0cf026298 Require sparse, and filter kernel sparse output
Fail the build if we don't check with sparse in both the kernel and
userspace utils.  Add a filtering wrapper to the kernel build so that we
have a place to filter out uninteresting errors from kernel sources that
we're building against.

Signed-off-by: Zach Brown <zab@versity.com>
2025-10-03 09:35:36 -07:00
Zach Brown
c6dab3c306 Remove wordexp expansion of utils path argument
scoutfs cli commands were using a helper that tried to perform word
expansion on the path argument.  This was done with the intent of
providing the convenience of shell expansion (env vars, ~) within the
cli command argument.

But it breaks paths that accidentally have their file names match the
syntax that wordexp supports.   "[ ]" tripped up files in the wild.

We don't need to provide shell expansion functionality in our argument
parsing.  The shell can do that.  The cli must pass the arguments
straight through, no parsing at all.

Signed-off-by: Zach Brown <zab@versity.com>
2025-02-18 11:55:37 -08:00
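
To illustrate why word expansion is risky for path arguments, the sketch below counts how many words `wordexp()` produces for a single argument. The helper name `sketch_wordexp_word_count` is made up for this example, and it shows the simpler failure of a path containing a space rather than the "[ ]" case the commit mentions.

```c
#include <stddef.h>
#include <wordexp.h>

/*
 * Return how many words wordexp() produces for one path argument, or -1
 * if expansion fails.  A command that passes the argument straight
 * through always treats it as exactly one path.
 */
int sketch_wordexp_word_count(const char *path)
{
	wordexp_t we;
	int count;

	/* WRDE_NOCMD refuses command substitution in the argument. */
	if (wordexp(path, &we, WRDE_NOCMD) != 0)
		return -1;

	count = (int)we.we_wordc;
	wordfree(&we);

	return count;
}
```

A file named `my file.txt` is split into two words here, so a caller that only looked at the first word would operate on the wrong path.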
Zach Brown
295f751aed Add test_bit to utils bitmap
Add test_bit() to the trivial utils bitmap.c implementation.

Signed-off-by: Zach Brown <zab@versity.com>
2025-01-22 09:58:58 -08:00
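
A trivial bitmap `test_bit()` of the kind described above might look like the following sketch. The `sketch_` names are placeholders and the layout (an array of `unsigned long` words, least significant bit first) is an assumption modeled on the kernel's bitmap helpers, not a copy of the utils bitmap.c code.

```c
#include <limits.h>
#include <stdbool.h>

#define SKETCH_BITS_PER_LONG (sizeof(unsigned long) * CHAR_BIT)

/* Set bit nr in an array of unsigned long words. */
void sketch_set_bit(unsigned long *bits, unsigned long nr)
{
	bits[nr / SKETCH_BITS_PER_LONG] |= 1UL << (nr % SKETCH_BITS_PER_LONG);
}

/* Return true if bit nr is set. */
bool sketch_test_bit(const unsigned long *bits, unsigned long nr)
{
	return (bits[nr / SKETCH_BITS_PER_LONG] >>
		(nr % SKETCH_BITS_PER_LONG)) & 1UL;
}
```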
Zach Brown
7f6032d9b4 Add lk rbtree wrapper
Import the kernel's rbtree implementation with a wrapper so we can use
it from userspace.

Signed-off-by: Zach Brown <zab@versity.com>
2025-01-22 09:58:49 -08:00
Zach Brown
7e3a6537ec Add userspace version of our dirent name hash
Signed-off-by: Zach Brown <zab@versity.com>
2025-01-22 09:58:41 -08:00
Zach Brown
49b7b70438 Add userspace version of our mode to type
Signed-off-by: Zach Brown <zab@versity.com>
2025-01-22 09:58:31 -08:00
Zach Brown
de0fdd1f9f Promote userspace btree block initialization
Signed-off-by: Zach Brown <zab@versity.com>
2025-01-22 09:58:23 -08:00
Zach Brown
a6d7de3c00 Add fls64() alias for userspace flsll()
Signed-off-by: Zach Brown <zab@versity.com>
2025-01-22 09:58:16 -08:00
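
For reference, `fls64()` returns the 1-based position of the most significant set bit, or 0 when no bit is set. The sketch below shows one way to get that behavior with a GCC/Clang builtin; the name `sketch_fls64` and the use of `__builtin_clzll()` are assumptions for illustration, since the commit itself aliases an existing `flsll()`.

```c
#include <stdint.h>

/*
 * Find last (most significant) set bit, counting from 1.
 * Returns 0 for an input of 0, because __builtin_clzll(0) is undefined.
 */
int sketch_fls64(uint64_t x)
{
	return x ? 64 - __builtin_clzll(x) : 0;
}
```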
Zach Brown
2c2c127c5e Add put_unaligned_leXX() for userspace
Signed-off-by: Zach Brown <zab@versity.com>
2025-01-22 09:58:10 -08:00
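
The `put_unaligned_leXX()` helpers store a value in little-endian byte order at an address that may not be naturally aligned. A minimal 32-bit sketch, assuming byte-wise stores and using a placeholder name, could look like this:

```c
#include <stdint.h>

/*
 * Store val at p in little-endian order one byte at a time, so the
 * store works regardless of alignment or host endianness.
 */
void sketch_put_unaligned_le32(uint32_t val, void *p)
{
	uint8_t *b = p;

	b[0] = (uint8_t)(val);
	b[1] = (uint8_t)(val >> 8);
	b[2] = (uint8_t)(val >> 16);
	b[3] = (uint8_t)(val >> 24);
}
```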
Zach Brown
9491c784e7 Add srch_encode_entry() for userspace utils
Signed-off-by: Zach Brown <zab@versity.com>
2025-01-22 09:57:56 -08:00
Zach Brown
c3b30930fa Add bloom filter index calc for userspace utils
Signed-off-by: Zach Brown <zab@versity.com>
2025-01-22 09:57:46 -08:00
Zach Brown
e7e46a80e6 Add userspace NSEC_PER_SEC
Signed-off-by: Zach Brown <zab@versity.com>
2025-01-22 09:57:39 -08:00
Zach Brown
1ddf752f42 Import a few more functions to our list.h
Import a few more functions from the kernel's list.h into our imported
copy.

Signed-off-by: Zach Brown <zab@versity.com>
2025-01-22 09:57:29 -08:00
Zach Brown
14b65c6360 Fix printing alloc list block extents
The list alloc blocks have an array of blknos that are offset by a start
field in the block header.  The print code wasn't using that and was
always referencing the beginning of the array, which could miss blocks.

Signed-off-by: Zach Brown <zab@versity.com>
2025-01-22 09:57:21 -08:00
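
The bug fixed above can be pictured with a small sketch. The structure below is hypothetical (the real on-disk layout, field names, array size, and endian handling are not shown in this log); it only illustrates reading valid entries starting at the header's start offset rather than at index 0.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_LIST_NR 8

/* Hypothetical stand-in for an alloc list block. */
struct sketch_alloc_list_block {
	uint32_t start;		/* index of the first valid blkno */
	uint32_t nr;		/* number of valid blknos */
	uint64_t blknos[SKETCH_LIST_NR];
};

/*
 * Copy the valid blknos into out, honoring the start offset.
 * Returns the number of entries copied, or 0 if the header is invalid.
 */
size_t sketch_list_blknos(const struct sketch_alloc_list_block *lb,
			  uint64_t *out, size_t out_len)
{
	size_t i;

	if (lb->start > SKETCH_LIST_NR || lb->nr > SKETCH_LIST_NR - lb->start)
		return 0;

	for (i = 0; i < lb->nr && i < out_len; i++)
		out[i] = lb->blknos[lb->start + i];	/* not blknos[i] */

	return i;
}
```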
Auke Kok
8a4b0967cb Add fiemap output through scoutfs util.
There's filefrag already, and that works, but its output is very
inconsistent between various OS release versions, and it has already
meant that we've needed to adjust tests to account for these small
but insignificant changes. A lot more work than useful. It has changed
even more in el9.

This adds `scoutfs get-fiemap FILE` and prints out block extent
info with flags that we care about as an abbreviated letter: U for
Unwritten, L for Last, and O for Unknown (as in, "offline").

The -P/--physical and -L/--logical options turn off logical or physical
offset display, in case you only want to see the offsets in either
units. You can pass -b/--byte to display offsets and lengths in
byte values. The block size will then be obtained from fstat() of
the queried file (4096 for scoutfs).

I've removed all uses of filefrag from our scoutfs tests. Xfstests
still calls it but their internal diff takes care of that issue.

Where needed and appropriate, the tests are adjusted so that the output
of `scoutfs get-fiemap` is as close as it can be to what it used to be,
so that reading the test results allows a quick view of what might
have been going wrong.

There are some output strings I have not bothered to update because
there's no real value to updating every output string to match,
and we just adjust the golden file accordingly.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2024-10-03 15:38:34 -07:00
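
The flag abbreviations described above map onto the kernel's FIEMAP extent flags. The sketch below shows one way that mapping could be written; the helper name and the order in which letters are emitted are assumptions, not the command's actual output format.

```c
#include <linux/fiemap.h>
#include <stdint.h>

/*
 * Write the abbreviated flag letters for one extent into out, which
 * must hold at least 4 bytes: U (unwritten), L (last), O (unknown,
 * as in "offline"), followed by a terminating NUL.
 */
void sketch_fiemap_flag_letters(uint32_t flags, char out[4])
{
	int i = 0;

	if (flags & FIEMAP_EXTENT_UNWRITTEN)
		out[i++] = 'U';
	if (flags & FIEMAP_EXTENT_LAST)
		out[i++] = 'L';
	if (flags & FIEMAP_EXTENT_UNKNOWN)
		out[i++] = 'O';
	out[i] = '\0';
}
```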
Auke Kok
ac00f5cedb Free after getline(), even if fail, and catch eof() on el9
getline() allocates the space for the return value even if there is an
error, so when it returns an error, we still have to free() it.

In el9, when reading stdin we will get errno=0 returned (no error) when
we hit the end of stdin. This behavior is different from el7/8. We don't
want to throw an error here to avoid failing the test, since it doesn't.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2024-10-03 12:41:05 -07:00
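
The pattern described above, freeing the `getline()` buffer on every exit path and treating end of input as a normal condition, can be sketched as follows. The function name is a placeholder, and using `ferror()` to tell a read error apart from end of input is one possible approach rather than what the commit necessarily does.

```c
#define _POSIX_C_SOURCE 200809L
#include <stdio.h>
#include <stdlib.h>
#include <sys/types.h>

/* Count the lines in f.  Returns the count, or -1 on a read error. */
long sketch_count_lines(FILE *f)
{
	char *line = NULL;
	size_t size = 0;
	long count = 0;
	long ret;

	while (getline(&line, &size, f) != -1)
		count++;

	/* getline() returned -1: end of input unless the stream has an error. */
	ret = ferror(f) ? -1 : count;

	/* The buffer may have been allocated even when the last call failed. */
	free(line);

	return ret;
}
```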
Auke Kok
00ebe92186 Add stddef.h to util.h to avoid duplicate offsetof() def.
In el9 releases, our includes declare offsetof() before our header
chain includes stddef.h, which doesn't properly check if offsetof
is already defined, leading to a redefinition. Just include stddef
at all times here.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2024-10-03 12:41:05 -07:00
Auke Kok
570c05898c Correct endian conversion length (blkno is le64)
Trivial correction of wrong bitlength conversion.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2024-10-03 12:41:05 -07:00
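
To show why the conversion width matters for a 64-bit block number, the sketch below decodes the same little-endian bytes with a 64-bit and a 32-bit conversion. The helper names are placeholders written for this example; they are not the macros used in the source tree.

```c
#include <stdint.h>

/* Decode a little-endian 64-bit value from its on-disk bytes. */
uint64_t sketch_le64_to_cpu(const uint8_t b[8])
{
	uint64_t v = 0;
	int i;

	for (i = 7; i >= 0; i--)
		v = (v << 8) | b[i];

	return v;
}

/* Decode only the first four bytes, as a too-narrow conversion would. */
uint32_t sketch_le32_to_cpu(const uint8_t b[4])
{
	return (uint32_t)b[0] | ((uint32_t)b[1] << 8) |
	       ((uint32_t)b[2] << 16) | ((uint32_t)b[3] << 24);
}
```

For a blkno at or above 2^32 the 32-bit conversion silently drops the high half, which is the kind of error that goes unnoticed on small devices.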
Auke Kok
3b8d2eab8e Sparse fix for epel 0.6.4 sparse - redefines
We should rely on sparse from epel to do automated sparse checking and
not a git tag. But the 0.6.4 build currently fails on sparse/gcc
redefines.

This magic Awk script from Zach processes sparse and gcc internal defines
and leaves the one intact that sparse doesn't have.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2024-09-27 15:37:47 -04:00
Auke Kok
267c1cc2d5 Check meta flags bit set/unset for devices.
This extra check ensures the passed meta device and data device
are indeed what they should be, and protects against unwanted
swapping or repeated duplicate device arguments.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2024-07-12 15:22:45 -04:00
Zach Brown
1bc83e9e2d Add indx xattr tag support to utils
Signed-off-by: Zach Brown <zab@versity.com>
2024-06-28 15:09:05 -07:00
Zach Brown
e0bb6ca481 Add quota support to utils
Add scoutfs cli commands for managing quotas and add its persistent
structures to the print command.

Signed-off-by: Zach Brown <zab@versity.com>
2024-06-28 15:09:05 -07:00
Zach Brown
4a8240748e Add project ID support
Add support for project IDs.  They're managed through the _attr_x
interfaces and are inherited from the parent directory during creation.

Signed-off-by: Zach Brown <zab@versity.com>
2024-06-28 15:09:05 -07:00
Zach Brown
fb5331a1d9 Add inode retention bit
Add a bit to the private scoutfs inode flags which indicates that the
inode is in retention mode.  The bit is visible through the _attr_x
interface.  It can only be set on regular files and when set it prevents
modification to all but non-user xattrs.  It can be cleared by root.

Signed-off-by: Zach Brown <zab@versity.com>
2024-06-28 15:09:05 -07:00
Zach Brown
de304628ea Add attr_x commands and documentation to utils
Signed-off-by: Zach Brown <zab@versity.com>
2024-06-28 14:53:49 -07:00
Bryant G. Duffy-Ly
9ba4271c26 Add new max format version of 2
We're about to add new format structures so increment the max version to
2.  Future commits will add the features before we release version 2 in
the wild.

Signed-off-by: Zach Brown <zab@zabbo.net>
2024-06-28 14:53:49 -07:00
Bryant G. Duffy-Ly
90cfaf17d1 Initial support for different inode sizes
We're about to increase the inode size and increment the format version.
Inode reading and writing has to handle different valid inode sizes as
allowed by the format version.   This is the initial skeletal work that
later patches which really increase the inode size will further refine
to add the specific known sizes and format versions.

Signed-off-by: Bryant G. Duffy-Ly <bduffyly@versity.com>
[zab@versity.com: reworded description, reworked to use _within]
Signed-off-by: Zach Brown <zab@versity.com>
2024-06-28 14:53:49 -07:00
Zach Brown
d6642da44d Prevent downgrade of format version
Don't let change-format-version decrease the format version.  It doesn't
have the machinery to go back and migrate newer structures to older
structures that would be compatible with code expecting the older
version.

Signed-off-by: Bryant G. Duffy-Ly <bduffyly@versity.com>
[zab@versity.com: split from initial patch with other changes]
Signed-off-by: Zach Brown <zab@versity.com>
2024-06-25 15:11:20 -07:00
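
The downgrade guard described above amounts to a simple comparison before any structures are rewritten. A minimal sketch, with a hypothetical function name and the assumption that requesting the current version is accepted as a no-op:

```c
#include <errno.h>
#include <stdint.h>

/*
 * Validate a requested format version change.  Returns 0 if the change
 * is allowed, or -EINVAL if it would decrease the version.
 */
int sketch_check_format_version_change(uint64_t cur_vers, uint64_t new_vers)
{
	/* There is no machinery to migrate newer structures back to older ones. */
	if (new_vers < cur_vers)
		return -EINVAL;

	return 0;
}
```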
Zach Brown
6db69b7a4f Set root inode crtime in mkfs
When we added the crtime creation timestamp to the inode we forgot to
update mkfs to set the crtime of the root inode.

Signed-off-by: Zach Brown <zab@versity.com>
2024-06-25 15:11:18 -07:00
Zach Brown
90a4c82363 Make log merge wait timeout tunable
Add a mount option for the amount of time that log merge creation can
wait before giving up.  We add some counters so we can see how often
the timeout is being hit and what the average successful wait time is.

Signed-off-by: Zach Brown <zab@versity.com>
2024-01-25 11:25:56 -08:00
Ben McClelland
d2c2fece2a Add rpm spec file support for el8 builds
The rpmbuild support files no longer define the previously used kernel
module macros. This carves out the differences between el7 and el8 with
conditionals based on the distro we are building for.

Signed-off-by: Ben McClelland <ben.mcclelland@versity.com>
2023-10-09 15:35:40 -04:00
Zach Brown
2279e9657f Add get_referring_entries scoutfs command
Add a cli command for the get_referring_entries ioctl.

Signed-off-by: Zach Brown <zab@versity.com>
2023-06-14 14:12:10 -07:00
Zach Brown
912906f050 Make quorum heartbeat timeout tunable
Add mount and sysfs options for changing the quorum heartbeat timeout.
This allows setting a longer delay in taking over for failed hosts that
has a greater chance of surviving temporary non-fatal delays.

We also double the existing default timeout to 10s which is still
reasonably responsive.

Signed-off-by: Zach Brown <zab@versity.com>
2023-05-17 14:44:27 -07:00
Zach Brown
e7bd1b45dc Add prepare-empty-data-device scoutfs command
Add a command for writing a super block to a new data device after
reading the metadata device to ensure that there's no existing
data on the old data device.

Signed-off-by: Zach Brown <zab@versity.com>
2023-04-17 12:47:50 -07:00
Zach Brown
18903ce500 Alphabetize command listing in scoutfs man page
List the scoutfs utility commands in the man page in alphabetical order.

Signed-off-by: Zach Brown <zab@versity.com>
2023-04-17 12:47:50 -07:00
Zach Brown
b76e22ffcf Refactor user util functions for device size
Split the existing device_size() into get_device_size() and
limit_device_size().  An upcoming command wants to get the device size
without applying limiting policy.

Signed-off-by: Zach Brown <zab@versity.com>
2023-04-17 12:47:50 -07:00
Zach Brown
3363b4fb79 Flush device caches in buffered util cmds
Add calls to our new device cache flushing helper in commands that use
buffered reads.

Signed-off-by: Zach Brown <zab@versity.com>
2023-01-18 10:52:02 -08:00
Zach Brown
ddb5cce2a5 Add quick utils flush_device helper
Add a quick helper that just calls cache flushing ioctls on different
kinds of files.

Signed-off-by: Zach Brown <zab@versity.com>
2023-01-18 10:27:47 -08:00
Zach Brown
ef2daf8857 Make data preallocation tunable
Make mount options for the size of preallocation and whether or not it
should be restricted to extending writes.  Disabling the default
restriction to streaming writes lets it preallocate in aligned regions
of the preallocation size when they contain no extents.

Signed-off-by: Zach Brown <zab@versity.com>
2022-10-14 14:03:35 -07:00
Zach Brown
29538a9f45 Add POSIX ACL support
Add support for the POSIX ACLs as described in acl(5).  Support is
enabled by default and can be explicitly enabled or disabled with the
acl or noacl mount options, respectively.

Signed-off-by: Zach Brown <zab@versity.com>
2022-09-28 10:36:10 -07:00
Zach Brown
49df98f5a8 Add skip-likely-huge print option
Add an option to skip printing structures that are likely to be so huge
that the print output becomes completely unwieldy on large systems.

Signed-off-by: Zach Brown <zab@versity.com>
2022-07-06 15:07:57 -07:00
Zach Brown
26ae9c6e04 Verify local unmount testing fence script
The fence script we use for our single node multi-mount tests only knows
how to fence by using forced unmount to destroy a mount.  As of now, the
tests only generate failing nodes that need to be fenced by using forced
unmount as well.  This results in the awkward situation where the
testing fence script doesn't have anything to do because the mount is
already gone.

When the test fence script has nothing to do we might not notice if it
isn't run.  This adds explicit verification to the fencing tests that
the script was really run.  It adds per-invocation logging to the fence
script and the test makes sure that it was run.

While we're at it, we take the opportunity to tidy up some of the
scripting around this.  We use a sysfs file with the data device
major:minor numbers so that the fencing script can find and unmount
mounts without having to ask them for their rid.  They may not be
operational.

Signed-off-by: Zach Brown <zab@versity.com>
2022-03-28 14:52:08 -07:00
Zach Brown
a67ea30bb7 Add orphan_scan_delay_ms mount option
Add a mount option to set the delay between scanning of the orphan list.
The sysfs file for the option is writable so this option can be set at
run time.

Signed-off-by: Zach Brown <zab@versity.com>
2022-03-10 11:43:11 -08:00
Zach Brown
ae08a797ae Clean quorum and format change command docs
The man pages and inline help blurbs for the recently added format
version and quorum config commands incorrectly described the device
arguments which are needed.

Signed-off-by: Zach Brown <zab@versity.com>
2022-02-08 11:23:27 -08:00
Zach Brown
e067961714 Add get-allocated-inos scoutfs command
Add the get-allocated-inos scoutfs command which wraps the
GET_ALLOCATED_INOS ioctl.   It'll be used by tests to find items
associated with an inode instead of trying to open the inode by a
constructed handle after it was unlinked.

Signed-off-by: Zach Brown <zab@versity.com>
2022-01-24 09:40:08 -08:00
Zach Brown
813ce24d79 Move local-force-unmount test script into tests/
The local-force-unmount fenced fencing script only works when all the
mounts are on the local host and it uses force unmount.   It is only
used in our specific local testing scripts.  Packaging it as an example
led people to believe that it could be used to cobble together a
multi-host testing network, however temporary.

Move it from being in utils and packaged to being private to our tests so
that it doesn't present an attractive nuisance.

Signed-off-by: Zach Brown <zab@versity.com>
2022-01-19 11:33:34 -08:00
Zach Brown
89ca903c41 Print log trees get/commit seqs
Back when we added the get/commit transaction sequence numbers to the
log_trees we forgot to add them to the scoutfs print output.

Signed-off-by: Zach Brown <zab@versity.com>
2022-01-19 09:21:02 -08:00
Zach Brown
8bc1ee8346 Add change-quorum-config command
Add a command to change the quorum config which starts by only supporting
updating the super block while the file system is offline.

Signed-off-by: Zach Brown <zab@versity.com>
2021-11-24 15:41:04 -08:00