Compare commits

27 Commits

Author SHA1 Message Date
Auke Kok c137637007 We can't cross-mount ipv4/ipv6.
I thought this would just work between ipv4 and ipv6 based quorum
members, but it turns out it will only work one way by default. While
we could make this work (multiple sockets, special sockopts) it is
highly unlikely and very undesirable.

It feels much stronger to just disallow it explicitly and reject mixed
v4/v6 configurations outright (mkfs/change-quorum, and mount) to avoid
this. I can't imagine this doing any good for users' fencing setups.

The test cases added validate the 2 easy userspace checks. The mount
check isn't easily testable because we disallow userspace from creating
such a failure path.

One additional test section tests the migration path from v4->v6->v4
so there's at least some test that checks that this actually works.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-05-14 14:43:00 -07:00
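The userspace side of the check described in this commit can be sketched as follows. This is a minimal illustration, not the code from the scoutfs utility: the helper names `addr_family()` and `check_single_family()` are hypothetical, and a NULL entry stands in for an empty quorum slot.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>
#include <sys/socket.h>

/* Return AF_INET or AF_INET6 for a literal address, AF_UNSPEC if it does not parse. */
static int addr_family(const char *str)
{
	unsigned char buf[sizeof(struct in6_addr)];

	if (inet_pton(AF_INET, str, buf) == 1)
		return AF_INET;
	if (inet_pton(AF_INET6, str, buf) == 1)
		return AF_INET6;
	return AF_UNSPEC;
}

/*
 * Hypothetical helper: return 0 if every populated slot uses the same
 * address family, -1 if the families are mixed or an address is invalid.
 * NULL entries model empty slots and are skipped.
 */
static int check_single_family(const char *const *slots, size_t nr)
{
	int family = AF_UNSPEC;
	size_t i;

	for (i = 0; i < nr; i++) {
		int f;

		if (!slots[i])
			continue;

		f = addr_family(slots[i]);
		if (f == AF_UNSPEC)
			return -1;

		if (family == AF_UNSPEC)
			family = f;	/* first populated slot decides the family */
		else if (f != family)
			return -1;	/* mixed v4/v6 is rejected outright */
	}

	return 0;
}
```

The first populated slot fixes the family and any later slot that disagrees fails the whole configuration, which matches the "reject outright" behavior the message describes.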
Auke Kok 2c46e37543 Account for ipv6 in kernel_get{sock,peer}name compat.
These 2 kernel wrapper functions need to be able to properly handle
the ipv6 addrlen, instead of returning -EAFNOSUPPORT(-97).

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-05-06 15:17:26 -07:00
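The length check these wrappers need can be sketched in userspace as below. The function name `check_reported_addrlen()` is hypothetical; it only mirrors the idea that a successful call must report either the IPv4 or the IPv6 sockaddr size.

```c
#include <assert.h>
#include <errno.h>
#include <netinet/in.h>
#include <sys/socket.h>

/*
 * Hypothetical helper: given the return value and address length from a
 * getsockname()/getpeername() style call, pass errors through and accept
 * only the two address sizes we know how to handle.
 */
static int check_reported_addrlen(int ret, socklen_t addrlen)
{
	if (ret < 0)
		return ret;

	if (addrlen != sizeof(struct sockaddr_in) &&
	    addrlen != sizeof(struct sockaddr_in6))
		return -EAFNOSUPPORT;

	return 0;
}
```

The caller is assumed to pass a buffer large enough for either family, such as a `struct sockaddr_storage`.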
Auke Kok 61ece361a7 Add IPv6 support to the kernel module.
This adds IPv6 support to the kernel module side.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-05-06 15:17:26 -07:00
Auke Kok afba683be6 Enable ipv6 in testing.
Instead of using 127.0.0.1, we initialize the quorum slots to ::1,
enabling all ipv6 support.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-05-06 15:17:26 -07:00
Auke Kok a95996e932 Add ipv6 support to scoutfs userspace utility.
This change adds ipv6 support to various scoutfs sub-commands, allowing
users to mkfs, print and change-quorum-config using ipv6 addresses, and
modifies the outputs.

Any ipv6 address/port is displayed as [::1]:5000 to comply with the
related RFCs. Input strings remain consistent because the quorum config
input value is already comma-separated, which poses no issues.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-05-06 15:17:26 -07:00
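The bracketed output format can be sketched with standard socket calls. This is an illustration only, not the scoutfs utility's code: `parse_addr()` and `format_addr_port()` are hypothetical names.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>
#include <sys/socket.h>

/* Hypothetical helper: fill ss from a literal v4 or v6 address and a port. */
static int parse_addr(const char *str, uint16_t port, struct sockaddr_storage *ss)
{
	struct in_addr a4;
	struct in6_addr a6;

	memset(ss, 0, sizeof(*ss));

	if (inet_pton(AF_INET, str, &a4) == 1) {
		struct sockaddr_in *sin = (struct sockaddr_in *)ss;

		sin->sin_family = AF_INET;
		sin->sin_addr = a4;
		sin->sin_port = htons(port);
		return 0;
	}
	if (inet_pton(AF_INET6, str, &a6) == 1) {
		struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)ss;

		sin6->sin6_family = AF_INET6;
		sin6->sin6_addr = a6;
		sin6->sin6_port = htons(port);
		return 0;
	}
	return -1;
}

/* Format as "a.b.c.d:port" for IPv4 or "[addr]:port" for IPv6. */
static int format_addr_port(const struct sockaddr_storage *ss, char *buf, size_t len)
{
	char host[INET6_ADDRSTRLEN];
	int n;

	if (ss->ss_family == AF_INET) {
		const struct sockaddr_in *sin = (const struct sockaddr_in *)ss;

		if (!inet_ntop(AF_INET, &sin->sin_addr, host, sizeof(host)))
			return -1;
		n = snprintf(buf, len, "%s:%u", host, (unsigned)ntohs(sin->sin_port));
	} else if (ss->ss_family == AF_INET6) {
		const struct sockaddr_in6 *sin6 = (const struct sockaddr_in6 *)ss;

		if (!inet_ntop(AF_INET6, &sin6->sin6_addr, host, sizeof(host)))
			return -1;
		/* brackets keep the address colons apart from the port colon */
		n = snprintf(buf, len, "[%s]:%u", host, (unsigned)ntohs(sin6->sin6_port));
	} else {
		return -1;
	}

	return (n < 0 || (size_t)n >= len) ? -1 : 0;
}
```

Parsing "::1" with port 5000 and formatting it again yields the `[::1]:5000` form quoted in the commit message.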
Auke Kok 6ff9fd39fa Don't stack alloc struct scoutfs_quorum_block_event old
The size of this thing is well over 1kb, and on several supported
distributions the compiler errors out because this particular
function's stack frame exceeds 2k, which is excessive,
even for a function that isn't called regularly.

We can allocate the thing in one go if we smartly allocate this
as an array of (an array of structs) which allows us to index
it as a 2d array as before, taking away some of the additional
complexities.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-05-06 15:17:26 -07:00
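The "array of (an array of structs)" trick can be sketched in userspace C. The struct layout, `MAX_SLOTS`, and the helper name are stand-ins chosen for illustration, not the real scoutfs definitions.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

#define MAX_SLOTS 15	/* hypothetical stand-in for SCOUTFS_QUORUM_MAX_SLOTS */
#define NR_OLD 2

/* stand-in for struct scoutfs_quorum_block_event */
struct block_event {
	uint64_t rid;
	uint64_t term;
	uint64_t write_nr;
};

/* one row of the table: NR_OLD events for a single slot */
typedef struct block_event event_row[NR_OLD];

/*
 * A single zeroed allocation holding MAX_SLOTS rows.  Because the
 * returned pointer has type "pointer to row", old[i][j] indexes it
 * exactly like the former on-stack 2d array.
 */
static event_row *alloc_old_events(void)
{
	return calloc(MAX_SLOTS, sizeof(event_row));
}
```

In the kernel, kcalloc() or kzalloc() would return zeroed memory in one call and could replace the kmalloc() plus memset() pair; that is a suggestion, not what the commit does.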
Zach Brown fece0a9372 Merge pull request #310 from versity/zab/v1.31
v1.31 Release
2026-05-06 10:37:07 -07:00
Zach Brown aa432727f2 v1.31 Release
Finish the release notes for the 1.31 release.

Signed-off-by: Zach Brown <zab@versity.com>
2026-05-05 14:29:18 -07:00
Zach Brown ceebadd139 Merge pull request #308 from versity/auke/totl-delta-repair
totl key repair
2026-05-05 13:05:57 -07:00
Zach Brown 4b4ddc9ded Merge pull request #298 from versity/auke/double_unlock_dw_truncate
Fix double unlock in scoutfs_setattr data_wait error path
2026-05-04 09:52:29 -07:00
Zach Brown 94d3ece590 Merge pull request #299 from versity/auke/cond_resched_block_free
Add cond_resched in block_free_work
2026-05-04 09:49:43 -07:00
Auke Kok 6d5517614b Fix double unlock in scoutfs_setattr data_wait error path
When scoutfs_setattr truncates a file with offline extents, it unlocks
the inode lock before calling scoutfs_data_wait to wait for the data
to be staged. If data_wait returns any error, the code jumps to 'goto
out' which calls scoutfs_unlock again, thus double-unlocking the lock.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-05-04 09:48:54 -07:00
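The shape of the bug and of the fix can be modeled with a toy lock that counts unlock calls. Everything here is hypothetical scaffolding; the only assumption carried over from the real code is that the unlock helper tolerates a NULL lock, which the `lock = NULL` fix relies on.

```c
#include <assert.h>
#include <stddef.h>

/* toy lock that records how often it was released */
struct toy_lock {
	int held;
	int unlock_calls;
};

/* assumed to ignore NULL, like the helper the fix depends on */
static void toy_unlock(struct toy_lock *lock)
{
	if (!lock)
		return;
	lock->held--;
	lock->unlock_calls++;
}

/* wait_err models the return value of the data wait */
static int toy_setattr(struct toy_lock *lk, int wait_err)
{
	struct toy_lock *lock = lk;
	int ret;

	lock->held++;

	/* drop the lock before waiting for data to be staged */
	toy_unlock(lock);
	lock = NULL;	/* the fix: the shared exit path must not unlock again */

	ret = wait_err;
	if (ret < 0)
		goto out;

	/* success: take the lock again and carry on */
	lock = lk;
	lock->held++;
out:
	toy_unlock(lock);
	return ret;
}
```

Without the `lock = NULL` assignment the error path would reach `out:` with a stale pointer and release the lock a second time, driving `held` negative.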
Auke Kok 10279d0b23 Add test exercising the totl delta inject ioctl.
Skew a totl twice, restore it, and intersperse setfattr/unlink to
exercise both injected and naturally-produced deltas.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-05-04 09:43:01 -07:00
Zach Brown 443c34309f Merge pull request #303 from versity/auke/clang_build_werr
3 minor clang things
2026-05-04 09:42:43 -07:00
Auke Kok 5c81a979d5 Add SCOUTFS_IOC_INJECT_TOTL_DELTA ioctl.
Inject a signed (total, count) delta at a totl key.  No validity
checking.  Requires CAP_SYS_ADMIN.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-05-04 09:42:42 -07:00
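A signed delta carried in unsigned 64-bit fields works because of two's complement wraparound. The sketch below assumes that deltas are combined by plain unsigned addition; that is an assumption about the item delta machinery, and `struct totl_val` and `apply_delta()` are hypothetical names.

```c
#include <assert.h>
#include <stdint.h>

/* stand-in for the (total, count) pair stored at a totl key */
struct totl_val {
	uint64_t total;
	uint64_t count;
};

/*
 * Apply a signed delta by casting to unsigned and adding.  A negative
 * delta becomes a large unsigned value whose addition wraps around to
 * the same result as a subtraction.
 */
static void apply_delta(struct totl_val *val, int64_t total, int64_t count)
{
	val->total += (uint64_t)total;
	val->count += (uint64_t)count;
}
```

Since the ioctl performs no validity checking, a delta larger than the stored value would wrap the total rather than fail.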
Zach Brown ec38b6e1c8 Merge pull request #305 from versity/auke/block_submit_bio_err
Set BLOCK_BIT_ERROR on bio submit failure during forced unmount
2026-05-04 09:35:43 -07:00
Zach Brown 8e0066b231 Merge pull request #309 from versity/auke/quota_invalidate_race
fix and test - quota invalidate race
2026-05-04 09:34:26 -07:00
Zach Brown a0fda5b735 Merge pull request #307 from versity/zab/next_merge_range_zero
Search all merge range items for next
2026-05-04 09:29:54 -07:00
Auke Kok fc56a69d8f Add quota invalidate race regression test
Run concurrent quota add/del on one mount against rapid file
creation and deletion on both mounts to exercise the race fixed
in the previous commit.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-05-02 13:19:31 -07:00
Auke Kok c8bc42ccdb Fix quota invalidate race with concurrent ruleset read
A quota check holds the quota cluster lock for READ and marks the
cached ruleset EBUSY while loading rules.  A quota mod on the same
mount holds the lock for WRITE (compatible with the local READ)
and calls scoutfs_quota_invalidate(), tripping
BUG_ON(rs == ERR_PTR(-EBUSY)).

Make invalidate skip EBUSY so the reader's claim is preserved, and
have scoutfs_quota_mod_rule wait for the reader to finish before
calling invalidate.  Without the wait, the in-flight reader would
publish its stale ruleset after invalidate runs, leaving the cache
stale until the next invalidation.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-05-02 13:19:31 -07:00
Zach Brown 4db0a48fe4 Search all merge range items for next
When searching for the next least merge range we need to sweep all the
stored items because they're interleaved with respect to key sorting
because we've clobbered the zone.

To search all of them we need to start from 0, not from the caller's
start key after setting the zone.  If the caller happens to provide a
start key with a small zone but large other fields (totl keys with
sufficiently large identifiers) we can miss ranges.

Signed-off-by: Zach Brown <zab@zabbo.net>
2026-04-29 10:17:38 -07:00
Auke Kok ac1ab8e87f Add cond_resched in block_free_work
I'm seeing consistent CPU soft lockups in block_free_work on
my bare metal system that aren't reached by VM instances. The
reason is that the bare metal machine has a ton more memory
available, causing the block free work queue to grow much
larger, and then it has so much work that it can take 30+
seconds to get through it all.

This is all with a debug kernel. A non debug kernel will likely
zoom through the outstanding work here at a much faster rate.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-22 13:39:32 -07:00
Zach Brown af31b9f1e8 Merge pull request #306 from versity/zab/v1.30
v1.30 Release
2026-04-22 10:43:17 -07:00
Auke Kok 8bfd35db0b Set BLOCK_BIT_ERROR on bio submit failure during forced unmount
block_submit_bio returns -ENOLINK if called during a forced
shutdown. The bio is never submitted, and thus no completion callback
will fire to set BLOCK_BIT_ERROR. Any other task waiting for this
specific bp will end up waiting forever.

To fix, fall through to the existing block_end_io call on the
error path instead of returning directly.  That means moving
the forcing_unmount check past the setup calls so block_end_io's
bookkeeping stays balanced. block_end_io then sets BLOCK_BIT_ERROR
and wakes up waiters just as it would on a failed async completion.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-20 17:01:12 -07:00
Auke Kok 019125d86d Don't swallow invalid message error
A malformed message encountered here increases the counter, but doesn't
tear down the connection because of the nested for loops. The comments
indicate that that is the expected behavior - a misbehaving client
should not be tolerated.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-15 17:02:40 -07:00
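Why a `break` can swallow the error in nested loops is easy to show in isolation. This sketch is a simplified model, not the receive worker itself: the buffer layout, the negative-value convention for a malformed message, and the reset of `ret` at the top of the outer loop are all assumptions made for illustration.

```c
#include <assert.h>
#include <errno.h>

#define NR_MSGS 2	/* hypothetical: messages per received buffer */

/*
 * Walk buffers of messages.  A negative value models a malformed
 * message.  With use_goto == 0 the inner break only leaves the inner
 * loop, so the outer loop keeps going and the error is lost.
 */
static int recv_sketch(const int (*bufs)[NR_MSGS], int nr_bufs,
		       int use_goto, int *handled)
{
	int ret = 0;
	int i, j;

	*handled = 0;

	for (i = 0; i < nr_bufs; i++) {
		ret = 0;	/* stands in for a successful receive of the next buffer */

		for (j = 0; j < NR_MSGS; j++) {
			if (bufs[i][j] < 0) {
				ret = -EBADMSG;
				if (use_goto)
					goto out;	/* leaves both loops */
				break;			/* leaves only the inner loop */
			}
			(*handled)++;
		}
	}
out:
	return ret;
}
```

With `break`, processing continues past the malformed message and the function returns success; with `goto out`, the error reaches the caller, which can then tear down the connection.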
Auke Kok 347e27acec Fix leak in client side lock invalidation
Clang's scan-build found this leak, which happens when we get an
invalidation for a lock we no longer have. Free ireq to fix.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-15 16:35:10 -07:00
Auke Kok 3ce5d47f2c Initialize resp_data to silence clang uninitialized warning
Clang flow analysis flags resp_data in process_response as possibly
uninitialized when find_request returns NULL.

  kmod/src/net.c:533:6: error: variable 'resp_data' is used uninitialized
  whenever 'if' condition is false [-Werror,-Wsometimes-uninitialized]

In practice the read is harmless because resp_func stays NULL in that
path and call_resp_func only dereferences resp_data when resp_func is
non-NULL. Initialize at declaration.

Signed-off-by: Auke Kok <auke.kok@versity.com>
2026-04-15 14:06:46 -07:00
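The pattern that triggers the warning, and the one-line fix, can be reduced to a small sketch. The function and type names below are simplified stand-ins for the ones mentioned in the message, not the real signatures.

```c
#include <assert.h>
#include <stddef.h>

typedef int (*resp_func_t)(void *data);

/* example response handler: read back the int that data points at */
static int read_int(void *data)
{
	return *(int *)data;
}

/* only touches data when a handler was actually found */
static int call_resp_func(resp_func_t func, void *data)
{
	return func ? func(data) : 0;
}

/* pending == NULL models a request lookup that finds nothing */
static int process_response_sketch(int *pending)
{
	resp_func_t resp_func = NULL;
	void *resp_data = NULL;	/* initialized at declaration, as in the fix */

	if (pending) {
		resp_func = read_int;
		resp_data = pending;
	}

	/* resp_data is passed on both paths, which is what the analysis flags */
	return call_resp_func(resp_func, resp_data);
}
```

Leaving `resp_data` without an initializer would behave the same at runtime here, since the callee ignores it when `resp_func` is NULL, but the compiler cannot prove that across the call.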
32 changed files with 808 additions and 212 deletions
+15
@@ -1,6 +1,21 @@
Versity ScoutFS Release Notes
=============================
---
v1.31
\
*May 5, 2026*
Fix race between modifying quota rules and internal reading of the rules
that tripped an assertion.
Fix a bug that could skip merging totl items under specific heavy write
loads. This could lead to merged totl items incorrectly tracking the
sum of all the contributing totl xattrs.
Fix many small low risk bugs in error paths that were found with code
analysis and testing.
---
v1.30
\
+13 -3
@@ -218,6 +218,7 @@ static void block_free_work(struct work_struct *work)
 llist_for_each_entry_safe(bp, tmp, deleted, free_node) {
 block_free(sb, bp);
+cond_resched();
 }
 }
@@ -467,9 +468,6 @@ static int block_submit_bio(struct super_block *sb, struct block_private *bp,
 sector_t sector;
 int ret = 0;
-if (scoutfs_forcing_unmount(sb))
-return -ENOLINK;
 sector = bp->bl.blkno << (SCOUTFS_BLOCK_LG_SHIFT - 9);
 WARN_ON_ONCE(bp->bl.blkno == U64_MAX);
@@ -480,6 +478,17 @@ static int block_submit_bio(struct super_block *sb, struct block_private *bp,
 set_bit(BLOCK_BIT_IO_BUSY, &bp->bits);
 block_get(bp);
+/*
+ * A second thread may already be waiting on this block's completion
+ * after this thread won the race to submit the block. We exit through
+ * the block_end_io error path which sets BLOCK_BIT_ERROR and assures
+ * that other callers in the waitq get woken up.
+ */
+if (scoutfs_forcing_unmount(sb)) {
+ret = -ENOLINK;
+goto end_io;
+}
 blk_start_plug(&plug);
 for (off = 0; off < SCOUTFS_BLOCK_LG_SIZE; off += PAGE_SIZE) {
@@ -517,6 +526,7 @@ static int block_submit_bio(struct super_block *sb, struct block_private *bp,
 blk_finish_plug(&plug);
+end_io:
 /* let racing end_io know we're done */
 block_end_io(sb, opf, bp, ret);
+1 -1
@@ -479,7 +479,7 @@ static void scoutfs_client_connect_worker(struct work_struct *work)
 struct scoutfs_sb_info *sbi = SCOUTFS_SB(sb);
 struct scoutfs_mount_options opts;
 struct scoutfs_net_greeting greet;
-struct sockaddr_in sin;
+struct sockaddr_storage sin;
 bool am_quorum;
 int ret;
+13 -7
@@ -25,6 +25,7 @@
 #include "sysfs.h"
 #include "server.h"
 #include "fence.h"
+#include "net.h"
 /*
  * Fencing ensures that a given mount can no longer write to the
@@ -79,7 +80,7 @@ struct pending_fence {
 struct timer_list timer;
 ktime_t start_kt;
-__be32 ipv4_addr;
+union scoutfs_inet_addr addr;
 bool fenced;
 bool error;
 int reason;
@@ -171,14 +172,19 @@ static ssize_t error_store(struct kobject *kobj, struct kobj_attribute *attr, co
 }
 SCOUTFS_ATTR_RW(error);
-static ssize_t ipv4_addr_show(struct kobject *kobj,
+static ssize_t inet_addr_show(struct kobject *kobj,
 struct kobj_attribute *attr, char *buf)
 {
 DECLARE_FENCE_FROM_KOBJ(fence, kobj);
+struct sockaddr_storage sin;
-return snprintf(buf, PAGE_SIZE, "%pI4", &fence->ipv4_addr);
+memset(&sin, 0, sizeof(struct sockaddr_storage));
+scoutfs_addr_to_sin(&sin, &fence->addr);
+return snprintf(buf, PAGE_SIZE, "%pISc", SIN_ARG(&sin));
 }
-SCOUTFS_ATTR_RO(ipv4_addr);
+SCOUTFS_ATTR_RO(inet_addr);
 static ssize_t reason_show(struct kobject *kobj, struct kobj_attribute *attr,
 char *buf)
@@ -212,7 +218,7 @@ static struct attribute *fence_attrs[] = {
 SCOUTFS_ATTR_PTR(elapsed_secs),
 SCOUTFS_ATTR_PTR(fenced),
 SCOUTFS_ATTR_PTR(error),
-SCOUTFS_ATTR_PTR(ipv4_addr),
+SCOUTFS_ATTR_PTR(inet_addr),
 SCOUTFS_ATTR_PTR(reason),
 SCOUTFS_ATTR_PTR(rid),
 NULL,
@@ -232,7 +238,7 @@ static void fence_timeout(struct timer_list *timer)
 wake_up(&fi->waitq);
 }
-int scoutfs_fence_start(struct super_block *sb, u64 rid, __be32 ipv4_addr, int reason)
+int scoutfs_fence_start(struct super_block *sb, u64 rid, union scoutfs_inet_addr *addr, int reason)
 {
 DECLARE_FENCE_INFO(sb, fi);
 struct pending_fence *fence;
@@ -248,7 +254,7 @@ int scoutfs_fence_start(struct super_block *sb, u64 rid, __be32 ipv4_addr, int r
 scoutfs_sysfs_init_attrs(sb, &fence->ssa);
 fence->start_kt = ktime_get();
-fence->ipv4_addr = ipv4_addr;
+memcpy(&fence->addr, addr, sizeof(union scoutfs_inet_addr));
 fence->fenced = false;
 fence->error = false;
 fence->reason = reason;
+1 -1
@@ -7,7 +7,7 @@ enum {
 SCOUTFS_FENCE_QUORUM_BLOCK_LEADER,
 };
-int scoutfs_fence_start(struct super_block *sb, u64 rid, __be32 ipv4_addr, int reason);
+int scoutfs_fence_start(struct super_block *sb, u64 rid, union scoutfs_inet_addr *addr, int reason);
 int scoutfs_fence_next(struct super_block *sb, u64 *rid, int *reason, bool *error);
 int scoutfs_fence_reason_pending(struct super_block *sb, int reason);
 int scoutfs_fence_free(struct super_block *sb, u64 rid);
+1
@@ -549,6 +549,7 @@ retry:
 goto out;
 if (scoutfs_data_wait_found(&dw)) {
 scoutfs_unlock(sb, lock, SCOUTFS_LOCK_WRITE);
+lock = NULL;
 /* XXX callee locks instead? */
 inode_unlock(inode);
+39
@@ -1739,6 +1739,43 @@ out:
 return ret;
 }
+static long scoutfs_ioc_inject_totl_delta(struct file *file, unsigned long arg)
+{
+struct super_block *sb = file_inode(file)->i_sb;
+struct scoutfs_ioctl_inject_totl_delta __user *uitd = (void __user *)arg;
+struct scoutfs_ioctl_inject_totl_delta itd;
+struct scoutfs_xattr_totl_val tval;
+struct scoutfs_lock *lock = NULL;
+struct scoutfs_key key;
+int ret;
+if (!capable(CAP_SYS_ADMIN))
+return -EPERM;
+if (copy_from_user(&itd, uitd, sizeof(itd)))
+return -EFAULT;
+scoutfs_xattr_init_totl_key(&key, itd.name);
+tval.total = cpu_to_le64((u64)itd.total);
+tval.count = cpu_to_le64((u64)itd.count);
+ret = scoutfs_lock_xattr_totl(sb, SCOUTFS_LOCK_WRITE_ONLY, 0, &lock);
+if (ret < 0)
+goto out;
+ret = scoutfs_hold_trans(sb, true);
+if (ret < 0)
+goto unlock;
+ret = scoutfs_item_delta(sb, &key, &tval, sizeof(tval), lock);
+scoutfs_release_trans(sb);
+unlock:
+scoutfs_unlock(sb, lock, SCOUTFS_LOCK_WRITE_ONLY);
+out:
+return ret;
+}
 long scoutfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 {
 switch (cmd) {
@@ -1790,6 +1827,8 @@ long scoutfs_ioctl(struct file *file, unsigned int cmd, unsigned long arg)
 return scoutfs_ioc_read_xattr_index(file, arg);
 case SCOUTFS_IOC_PUNCH_OFFLINE:
 return scoutfs_ioc_punch_offline(file, arg);
+case SCOUTFS_IOC_INJECT_TOTL_DELTA:
+return scoutfs_ioc_inject_totl_delta(file, arg);
 }
 return -ENOTTY;
+13
@@ -876,4 +876,17 @@ struct scoutfs_ioctl_punch_offline {
 #define SCOUTFS_IOC_PUNCH_OFFLINE \
 _IOW(SCOUTFS_IOCTL_MAGIC, 24, struct scoutfs_ioctl_punch_offline)
+/*
+ * Inject a signed (total, count) delta at the totl key @name (a, b, c
+ * match the trailing dotted u64s of a totl xattr name).
+ */
+struct scoutfs_ioctl_inject_totl_delta {
+__u64 name[SCOUTFS_IOCTL_XATTR_TOTAL_NAME_NR];
+__s64 total;
+__s64 count;
+};
+#define SCOUTFS_IOC_INJECT_TOTL_DELTA \
+_IOW(SCOUTFS_IOCTL_MAGIC, 25, struct scoutfs_ioctl_inject_totl_delta)
 #endif
+8 -4
@@ -195,9 +195,11 @@ struct kc_shrinker_wrapper {
 #include <linux/inet.h>
 static inline int kc_kernel_getsockname(struct socket *sock, struct sockaddr *addr)
 {
-int addrlen = sizeof(struct sockaddr_in);
+int addrlen = sizeof(struct sockaddr_storage);
 int ret = kernel_getsockname(sock, addr, &addrlen);
-if (ret == 0 && addrlen != sizeof(struct sockaddr_in))
+if (ret == 0 && (!(
+(addrlen == sizeof(struct sockaddr_in)) ||
+(addrlen == sizeof(struct sockaddr_in6)))))
 return -EAFNOSUPPORT;
 else if (ret < 0)
 return ret;
@@ -206,9 +208,11 @@ static inline int kc_kernel_getsockname(struct socket *sock, struct sockaddr *ad
 }
 static inline int kc_kernel_getpeername(struct socket *sock, struct sockaddr *addr)
 {
-int addrlen = sizeof(struct sockaddr_in);
+int addrlen = sizeof(struct sockaddr_storage);
 int ret = kernel_getpeername(sock, addr, &addrlen);
-if (ret == 0 && addrlen != sizeof(struct sockaddr_in))
+if (ret == 0 && (!(
+(addrlen == sizeof(struct sockaddr_in)) ||
+(addrlen == sizeof(struct sockaddr_in6)))))
 return -EAFNOSUPPORT;
 else if (ret < 0)
 return ret;
+1
@@ -813,6 +813,7 @@ int scoutfs_lock_invalidate_request(struct super_block *sb, u64 net_id,
 out:
 if (!lock) {
+kfree(ireq);
 ret = scoutfs_client_lock_response(sb, net_id, nl);
 BUG_ON(ret); /* lock server doesn't fence timed out client requests */
 }
+27 -15
@@ -525,7 +525,7 @@ static int process_response(struct scoutfs_net_connection *conn,
 struct super_block *sb = conn->sb;
 struct message_send *msend;
 scoutfs_net_response_t resp_func = NULL;
-void *resp_data;
+void *resp_data = NULL;
 spin_lock(&conn->lock);
@@ -804,7 +804,7 @@ static void scoutfs_net_recv_worker(struct work_struct *work)
 if (invalid_message(conn, nh)) {
 scoutfs_inc_counter(sb, net_recv_invalid_message);
 ret = -EBADMSG;
-break;
+goto out;
 }
 data_len = le16_to_cpu(nh->data_len);
@@ -1218,7 +1218,8 @@ static void scoutfs_net_connect_worker(struct work_struct *work)
 trace_scoutfs_net_connect_work_enter(sb, 0, 0);
-ret = kc_sock_create_kern(AF_INET, SOCK_STREAM, IPPROTO_TCP, &sock);
+ret = kc_sock_create_kern(conn->connect_sin.ss_family,
+SOCK_STREAM, IPPROTO_TCP, &sock);
 if (ret)
 goto out;
@@ -1239,7 +1240,9 @@ static void scoutfs_net_connect_worker(struct work_struct *work)
 trace_scoutfs_conn_connect_start(conn);
 ret = kernel_connect(sock, (struct sockaddr *)&conn->connect_sin,
-sizeof(struct sockaddr_in), 0);
+conn->connect_sin.ss_family == AF_INET ?
+sizeof(struct sockaddr_in) : sizeof(struct sockaddr_in6),
+0);
 if (ret)
 goto out;
@@ -1281,6 +1284,13 @@ static bool empty_accepted_list(struct scoutfs_net_connection *conn)
 return empty;
 }
+/*
+ * sockaddr_storage wraps both _in and _in6, which have _port always
+ * __be16 at the same offset, and we only need to test whether it's
+ * zero.
+ */
+#define sockaddr_port_is_nonzero(sin) ((sin).__data[0] || (sin).__data[1])
 /*
  * Safely shut down an active connection. This can be triggered by
  * errors in workers or by an external call to free the connection. The
@@ -1304,7 +1314,7 @@ static void scoutfs_net_shutdown_worker(struct work_struct *work)
 trace_scoutfs_conn_shutdown_start(conn);
 /* connected and accepted conns print a message */
-if (conn->peername.sin_port != 0)
+if (sockaddr_port_is_nonzero(conn->peername))
 scoutfs_info(sb, "%s "SIN_FMT" -> "SIN_FMT,
 conn->listening_conn ? "server closing" :
 "client disconnected",
@@ -1434,6 +1444,7 @@ static void scoutfs_net_reconn_free_worker(struct work_struct *work)
 DEFINE_CONN_FROM_WORK(conn, work, reconn_free_dwork.work);
 struct super_block *sb = conn->sb;
 struct scoutfs_net_connection *acc;
+union scoutfs_inet_addr addr;
 unsigned long now = jiffies;
 unsigned long deadline = 0;
 bool requeue = false;
@@ -1454,8 +1465,9 @@ restart:
 if (!test_conn_fl(conn, shutting_down)) {
 scoutfs_info(sb, "client "SIN_FMT" reconnect timed out, fencing",
 SIN_ARG(&acc->last_peername));
+scoutfs_sin_to_addr(&addr, &acc->last_peername);
 ret = scoutfs_fence_start(sb, acc->rid,
-acc->last_peername.sin_addr.s_addr,
+&addr,
 SCOUTFS_FENCE_CLIENT_RECONNECT);
 if (ret) {
 scoutfs_err(sb, "client fence returned err %d, shutting down server",
@@ -1538,9 +1550,9 @@ scoutfs_net_alloc_conn(struct super_block *sb,
 conn->req_funcs = req_funcs;
 spin_lock_init(&conn->lock);
 init_waitqueue_head(&conn->waitq);
-conn->sockname.sin_family = AF_INET;
-conn->peername.sin_family = AF_INET;
-conn->last_peername.sin_family = AF_INET;
+conn->sockname.ss_family = AF_UNSPEC;
+conn->peername.ss_family = AF_UNSPEC;
+conn->last_peername.ss_family = AF_UNSPEC;
 INIT_LIST_HEAD(&conn->accepted_head);
 INIT_LIST_HEAD(&conn->accepted_list);
 conn->next_send_seq = 1;
@@ -1619,7 +1631,7 @@ void scoutfs_net_free_conn(struct super_block *sb,
  */
 int scoutfs_net_bind(struct super_block *sb,
 struct scoutfs_net_connection *conn,
-struct sockaddr_in *sin)
+struct sockaddr_storage *sin)
 {
 struct socket *sock = NULL;
 int addrlen;
@@ -1630,7 +1642,7 @@ int scoutfs_net_bind(struct super_block *sb,
 if (WARN_ON_ONCE(conn->sock))
 return -EINVAL;
-ret = kc_sock_create_kern(AF_INET, SOCK_STREAM, IPPROTO_TCP, &sock);
+ret = kc_sock_create_kern(sin->ss_family, SOCK_STREAM, IPPROTO_TCP, &sock);
 if (ret)
 goto out;
@@ -1642,7 +1654,7 @@ int scoutfs_net_bind(struct super_block *sb,
 if (ret)
 goto out;
-addrlen = sizeof(struct sockaddr_in);
+addrlen = sin->ss_family == AF_INET ? sizeof(struct sockaddr_in) : sizeof(struct sockaddr_in6);
 ret = kernel_bind(sock, (struct sockaddr *)sin, addrlen);
 if (ret)
 goto out;
@@ -1658,7 +1670,7 @@ int scoutfs_net_bind(struct super_block *sb,
 ret = 0;
 conn->sock = sock;
-*sin = conn->sockname;
+sin = (struct sockaddr_storage *)&conn->sockname;
 out:
 if (ret < 0 && sock)
@@ -1693,7 +1705,7 @@ static bool connect_result(struct scoutfs_net_connection *conn, int *error)
 done = true;
 *error = 0;
 } else if (test_conn_fl(conn, shutting_down) ||
-conn->connect_sin.sin_family == 0) {
+conn->connect_sin.ss_family == AF_UNSPEC) {
 done = true;
 *error = -ESHUTDOWN;
 }
@@ -1714,7 +1726,7 @@ static bool connect_result(struct scoutfs_net_connection *conn, int *error)
  */
 int scoutfs_net_connect(struct super_block *sb,
 struct scoutfs_net_connection *conn,
-struct sockaddr_in *sin, unsigned long timeout_ms)
+struct sockaddr_storage *sin, unsigned long timeout_ms)
 {
 int ret = 0;
+38 -21
@@ -49,15 +49,15 @@ struct scoutfs_net_connection {
 unsigned long flags; /* CONN_FL_* bitmask */
 unsigned long reconn_deadline;
-struct sockaddr_in connect_sin;
+struct sockaddr_storage connect_sin;
 unsigned long connect_timeout_ms;
 struct socket *sock;
 u64 rid;
 u64 greeting_id;
-struct sockaddr_in sockname;
-struct sockaddr_in peername;
-struct sockaddr_in last_peername;
+struct sockaddr_storage sockname;
+struct sockaddr_storage peername;
+struct sockaddr_storage last_peername;
 struct list_head accepted_head;
 struct scoutfs_net_connection *listening_conn;
@@ -99,27 +99,44 @@ enum conn_flags {
 CONN_FL_reconn_freeing = (1UL << 6), /* waiting done, setter frees */
 };
-#define SIN_FMT "%pIS:%u"
-#define SIN_ARG(sin) sin, be16_to_cpu((sin)->sin_port)
+#define SIN_FMT "%pISpc"
+#define SIN_ARG(sin) sin
-static inline void scoutfs_addr_to_sin(struct sockaddr_in *sin,
+static inline void scoutfs_addr_to_sin(struct sockaddr_storage *sin,
 union scoutfs_inet_addr *addr)
 {
-BUG_ON(addr->v4.family != cpu_to_le16(SCOUTFS_AF_IPV4));
-sin->sin_family = AF_INET;
-sin->sin_addr.s_addr = cpu_to_be32(le32_to_cpu(addr->v4.addr));
-sin->sin_port = cpu_to_be16(le16_to_cpu(addr->v4.port));
+if (addr->v4.family == cpu_to_le16(SCOUTFS_AF_IPV4)) {
+struct sockaddr_in *sin4 = (struct sockaddr_in *)sin;
+memset(sin, 0, sizeof(struct sockaddr_storage));
+sin4->sin_family = AF_INET;
+sin4->sin_addr.s_addr = cpu_to_be32(le32_to_cpu(addr->v4.addr));
+sin4->sin_port = cpu_to_be16(le16_to_cpu(addr->v4.port));
+} else if (addr->v6.family == cpu_to_le16(SCOUTFS_AF_IPV6)) {
+struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)sin;
+memset(sin, 0, sizeof(struct sockaddr_storage));
+sin6->sin6_family = AF_INET6;
+memcpy(&sin6->sin6_addr.in6_u.u6_addr8, &addr->v6.addr, 16);
+sin6->sin6_port = cpu_to_be16(le16_to_cpu(addr->v6.port));
+} else
+BUG();
 }
-static inline void scoutfs_sin_to_addr(union scoutfs_inet_addr *addr, struct sockaddr_in *sin)
+static inline void scoutfs_sin_to_addr(union scoutfs_inet_addr *addr, struct sockaddr_storage *sin)
 {
-BUG_ON(sin->sin_family != AF_INET);
-memset(addr, 0, sizeof(union scoutfs_inet_addr));
-addr->v4.family = cpu_to_le16(SCOUTFS_AF_IPV4);
-addr->v4.addr = be32_to_le32(sin->sin_addr.s_addr);
-addr->v4.port = be16_to_le16(sin->sin_port);
+if (sin->ss_family == AF_INET) {
+struct sockaddr_in *sin4 = (struct sockaddr_in *)sin;
+memset(addr, 0, sizeof(union scoutfs_inet_addr));
+addr->v4.family = cpu_to_le16(SCOUTFS_AF_IPV4);
+addr->v4.addr = be32_to_le32(sin4->sin_addr.s_addr);
+addr->v4.port = be16_to_le16(sin4->sin_port);
+} else if (sin->ss_family == AF_INET6) {
+struct sockaddr_in6 *sin6 = (struct sockaddr_in6 *)sin;
+memset(addr, 0, sizeof(union scoutfs_inet_addr));
+addr->v6.family = cpu_to_le16(SCOUTFS_AF_IPV6);
+memcpy(&addr->v6.addr, &sin6->sin6_addr.in6_u.u6_addr8, 16);
+addr->v6.port = be16_to_le16(sin6->sin6_port);
+} else
+BUG();
 }
 struct scoutfs_net_connection *
@@ -130,10 +147,10 @@ scoutfs_net_alloc_conn(struct super_block *sb,
 u64 scoutfs_net_client_rid(struct scoutfs_net_connection *conn);
 int scoutfs_net_connect(struct super_block *sb,
 struct scoutfs_net_connection *conn,
-struct sockaddr_in *sin, unsigned long timeout_ms);
+struct sockaddr_storage *sin, unsigned long timeout_ms);
 int scoutfs_net_bind(struct super_block *sb,
 struct scoutfs_net_connection *conn,
-struct sockaddr_in *sin);
+struct sockaddr_storage *sin);
 void scoutfs_net_listen(struct super_block *sb,
 struct scoutfs_net_connection *conn);
 int scoutfs_net_submit_request(struct super_block *sb,
+137 -43
@@ -145,14 +145,26 @@ struct quorum_info {
#define DECLARE_QUORUM_INFO_KOBJ(kobj, name) \
DECLARE_QUORUM_INFO(SCOUTFS_SYSFS_ATTRS_SB(kobj), name)
static bool quorum_slot_present(struct scoutfs_quorum_config *qconf, int i)
static bool quorum_slot_ipv4(struct scoutfs_quorum_config *qconf, int i)
{
BUG_ON(i < 0 || i > SCOUTFS_QUORUM_MAX_SLOTS);
return qconf->slots[i].addr.v4.family == cpu_to_le16(SCOUTFS_AF_IPV4);
}
static void quorum_slot_sin(struct scoutfs_quorum_config *qconf, int i, struct sockaddr_in *sin)
static bool quorum_slot_ipv6(struct scoutfs_quorum_config *qconf, int i)
{
BUG_ON(i < 0 || i > SCOUTFS_QUORUM_MAX_SLOTS);
return qconf->slots[i].addr.v6.family == cpu_to_le16(SCOUTFS_AF_IPV6);
}
static bool quorum_slot_present(struct scoutfs_quorum_config *qconf, int i)
{
return quorum_slot_ipv4(qconf, i) || quorum_slot_ipv6(qconf, i);
}
static void quorum_slot_sin(struct scoutfs_quorum_config *qconf, int i, struct sockaddr_storage *sin)
{
BUG_ON(i < 0 || i >= SCOUTFS_QUORUM_MAX_SLOTS);
@@ -179,11 +191,18 @@ static int create_socket(struct super_block *sb)
{
DECLARE_QUORUM_INFO(sb, qinf);
struct socket *sock = NULL;
struct sockaddr_in sin;
struct sockaddr_storage sin;
struct scoutfs_quorum_slot slot = qinf->qconf.slots[qinf->our_quorum_slot_nr];
int addrlen;
int ret;
ret = kc_sock_create_kern(PF_INET, SOCK_DGRAM, IPPROTO_UDP, &sock);
if (le16_to_cpu(slot.addr.v4.family) == SCOUTFS_AF_IPV4)
ret = kc_sock_create_kern(PF_INET, SOCK_DGRAM, IPPROTO_UDP, &sock);
else if (le16_to_cpu(slot.addr.v6.family) == SCOUTFS_AF_IPV6)
ret = kc_sock_create_kern(PF_INET6, SOCK_DGRAM, IPPROTO_UDP, &sock);
else
BUG();
if (ret) {
scoutfs_err(sb, "quorum couldn't create udp socket: %d", ret);
goto out;
@@ -192,9 +211,9 @@ static int create_socket(struct super_block *sb)
/* rather fail and retry than block waiting for free */
sock->sk->sk_allocation = GFP_ATOMIC;
addrlen = (le16_to_cpu(slot.addr.v4.family) == SCOUTFS_AF_IPV4) ?
sizeof(struct sockaddr_in) : sizeof(struct sockaddr_in6);
quorum_slot_sin(&qinf->qconf, qinf->our_quorum_slot_nr, &sin);
addrlen = sizeof(sin);
ret = kernel_bind(sock, (struct sockaddr *)&sin, addrlen);
if (ret) {
scoutfs_err(sb, "quorum failed to bind udp socket to "SIN_FMT": %d",
@@ -241,7 +260,7 @@ static int send_msg_members(struct super_block *sb, int type, u64 term, int only
.iov_base = &qmes,
.iov_len = sizeof(qmes),
};
struct sockaddr_in sin;
struct sockaddr_storage sin;
struct msghdr mh = {
.msg_flags = MSG_DONTWAIT | MSG_NOSIGNAL,
.msg_name = &sin,
@@ -542,10 +561,11 @@ int scoutfs_quorum_fence_leaders(struct super_block *sb, struct scoutfs_quorum_c
u64 term)
{
#define NR_OLD 2
struct scoutfs_quorum_block_event old[SCOUTFS_QUORUM_MAX_SLOTS][NR_OLD] = {{{0,}}};
struct scoutfs_quorum_block_event (*old)[NR_OLD];
struct scoutfs_sb_info *sbi = SCOUTFS_SB(sb);
struct scoutfs_quorum_block blk;
struct sockaddr_in sin;
struct sockaddr_storage sin;
union scoutfs_inet_addr addr;
const __le64 lefsid = cpu_to_le64(sbi->fsid);
const u64 rid = sbi->rid;
bool fence_started = false;
@@ -558,13 +578,20 @@ int scoutfs_quorum_fence_leaders(struct super_block *sb, struct scoutfs_quorum_c
BUILD_BUG_ON(SCOUTFS_QUORUM_BLOCKS < SCOUTFS_QUORUM_MAX_SLOTS);
old = kmalloc(NR_OLD * SCOUTFS_QUORUM_MAX_SLOTS * sizeof(struct scoutfs_quorum_block_event), GFP_KERNEL);
if (!old) {
ret = -ENOMEM;
goto out;
}
memset(old, 0, NR_OLD * SCOUTFS_QUORUM_MAX_SLOTS * sizeof(struct scoutfs_quorum_block_event));
for (i = 0; i < SCOUTFS_QUORUM_MAX_SLOTS; i++) {
if (!quorum_slot_present(qconf, i))
continue;
ret = read_quorum_block(sb, SCOUTFS_QUORUM_BLKNO + i, &blk, false);
if (ret < 0)
goto out;
goto out_free;
/* elected leader still running */
if (le64_to_cpu(blk.events[SCOUTFS_QUORUM_EVENT_ELECT].term) >
@@ -598,14 +625,17 @@ int scoutfs_quorum_fence_leaders(struct super_block *sb, struct scoutfs_quorum_c
scoutfs_info(sb, "fencing previous leader "SCSBF" at term %llu in slot %u with address "SIN_FMT,
SCSB_LEFR_ARGS(lefsid, fence_rid),
le64_to_cpu(old[i][j].term), i, SIN_ARG(&sin));
ret = scoutfs_fence_start(sb, le64_to_cpu(fence_rid), sin.sin_addr.s_addr,
scoutfs_sin_to_addr(&addr, &sin);
ret = scoutfs_fence_start(sb, le64_to_cpu(fence_rid), &addr,
SCOUTFS_FENCE_QUORUM_BLOCK_LEADER);
if (ret < 0)
goto out;
goto out_free;
fence_started = true;
}
}
out_free:
kfree(old);
out:
err = scoutfs_fence_wait_fenced(sb, msecs_to_jiffies(SCOUTFS_QUORUM_FENCE_TO_MS));
if (ret == 0)
@@ -708,7 +738,7 @@ static void scoutfs_quorum_worker(struct work_struct *work)
struct quorum_info *qinf = container_of(work, struct quorum_info, work);
struct scoutfs_mount_options opts;
struct super_block *sb = qinf->sb;
struct sockaddr_in unused;
struct sockaddr_storage unused;
struct quorum_host_msg msg;
struct quorum_status qst = {0,};
struct hb_recording hbr;
@@ -990,7 +1020,7 @@ out:
* leader with the greatest elected term. If we get it wrong the
* connection will timeout and the client will try again.
*/
int scoutfs_quorum_server_sin(struct super_block *sb, struct sockaddr_in *sin)
int scoutfs_quorum_server_sin(struct super_block *sb, struct sockaddr_storage *sin)
{
struct scoutfs_super_block *super = NULL;
struct scoutfs_quorum_block blk;
@@ -1049,7 +1079,7 @@ u8 scoutfs_quorum_votes_needed(struct super_block *sb)
return qinf->votes_needed;
}
void scoutfs_quorum_slot_sin(struct scoutfs_quorum_config *qconf, int i, struct sockaddr_in *sin)
void scoutfs_quorum_slot_sin(struct scoutfs_quorum_config *qconf, int i, struct sockaddr_storage *sin)
{
return quorum_slot_sin(qconf, i, sin);
}
@@ -1208,8 +1238,13 @@ static int verify_quorum_slots(struct super_block *sb, struct quorum_info *qinf,
struct scoutfs_quorum_config *qconf)
{
char slots[(SCOUTFS_QUORUM_MAX_SLOTS * 3) + 1];
struct sockaddr_in other;
struct sockaddr_in sin;
struct sockaddr_storage other;
struct sockaddr_storage sin;
struct sockaddr_in *sin4;
struct sockaddr_in *other4;
struct sockaddr_in6 *sin6;
struct sockaddr_in6 *other6;
__le16 family = cpu_to_le16(SCOUTFS_AF_NONE);
int found = 0;
int ret;
int i;
@@ -1220,35 +1255,94 @@ static int verify_quorum_slots(struct super_block *sb, struct quorum_info *qinf,
if (!quorum_slot_present(qconf, i))
continue;
scoutfs_quorum_slot_sin(qconf, i, &sin);
if (!valid_ipv4_unicast(sin.sin_addr.s_addr)) {
scoutfs_err(sb, "quorum slot #%d has invalid ipv4 unicast address: "SIN_FMT,
i, SIN_ARG(&sin));
return -EINVAL;
}
if (!valid_ipv4_port(sin.sin_port)) {
scoutfs_err(sb, "quorum slot #%d has invalid ipv4 port number:"SIN_FMT,
i, SIN_ARG(&sin));
return -EINVAL;
}
for (j = i + 1; j < SCOUTFS_QUORUM_MAX_SLOTS; j++) {
if (!quorum_slot_present(qconf, j))
continue;
scoutfs_quorum_slot_sin(qconf, j, &other);
if (sin.sin_addr.s_addr == other.sin_addr.s_addr &&
sin.sin_port == other.sin_port) {
scoutfs_err(sb, "quorum slots #%u and #%u have the same address: "SIN_FMT,
i, j, SIN_ARG(&sin));
if (quorum_slot_ipv4(qconf, i)) {
if (family == cpu_to_le16(SCOUTFS_AF_NONE)) {
family = cpu_to_le16(SCOUTFS_AF_IPV4);
} else if (family != cpu_to_le16(SCOUTFS_AF_IPV4)) {
scoutfs_err(sb, "quorum slot #%d is IPv4 but earlier slots are IPv6; mixed IPv4/IPv6 quorum is not supported",
i);
return -EINVAL;
}
}
found++;
scoutfs_quorum_slot_sin(qconf, i, &sin);
sin4 = (struct sockaddr_in *)&sin;
if (!valid_ipv4_unicast(sin4->sin_addr.s_addr)) {
scoutfs_err(sb, "quorum slot #%d has invalid ipv4 unicast address: "SIN_FMT,
i, SIN_ARG(&sin));
return -EINVAL;
}
if (!valid_ipv4_port(sin4->sin_port)) {
scoutfs_err(sb, "quorum slot #%d has invalid ipv4 port number:"SIN_FMT,
i, SIN_ARG(&sin));
return -EINVAL;
}
for (j = i + 1; j < SCOUTFS_QUORUM_MAX_SLOTS; j++) {
if (!quorum_slot_ipv4(qconf, j))
continue;
scoutfs_quorum_slot_sin(qconf, j, &other);
other4 = (struct sockaddr_in *)&other;
if (sin4->sin_addr.s_addr == other4->sin_addr.s_addr &&
sin4->sin_port == other4->sin_port) {
scoutfs_err(sb, "quorum slots #%u and #%u have the same address: "SIN_FMT,
i, j, SIN_ARG(&sin));
return -EINVAL;
}
}
found++;
} else if (quorum_slot_ipv6(qconf, i)) {
if (family == cpu_to_le16(SCOUTFS_AF_NONE)) {
family = cpu_to_le16(SCOUTFS_AF_IPV6);
} else if (family != cpu_to_le16(SCOUTFS_AF_IPV6)) {
scoutfs_err(sb, "quorum slot #%d is IPv6 but earlier slots are IPv4; mixed IPv4/IPv6 quorum is not supported",
i);
return -EINVAL;
}
quorum_slot_sin(qconf, i, &sin);
sin6 = (struct sockaddr_in6 *)&sin;
if ((sin6->sin6_addr.in6_u.u6_addr32[0] == 0) && (sin6->sin6_addr.in6_u.u6_addr32[1] == 0) &&
(sin6->sin6_addr.in6_u.u6_addr32[2] == 0) && (sin6->sin6_addr.in6_u.u6_addr32[3] == 0)) {
scoutfs_err(sb, "quorum slot #%d has unspecified ipv6 address: "SIN_FMT,
i, SIN_ARG(&sin));
return -EINVAL;
}
if (sin6->sin6_addr.in6_u.u6_addr8[0] == 0xff) {
scoutfs_err(sb, "quorum slot #%d has multicast ipv6 address: "SIN_FMT,
i, SIN_ARG(&sin));
return -EINVAL;
}
if (!valid_ipv4_port(sin6->sin6_port)) {
scoutfs_err(sb, "quorum slot #%d has invalid ipv6 port number: "SIN_FMT,
i, SIN_ARG(&sin));
return -EINVAL;
}
for (j = i + 1; j < SCOUTFS_QUORUM_MAX_SLOTS; j++) {
if (!quorum_slot_ipv6(qconf, j))
continue;
quorum_slot_sin(qconf, j, &other);
other6 = (struct sockaddr_in6 *)&other;
if ((ipv6_addr_equal(&sin6->sin6_addr, &other6->sin6_addr)) &&
(sin6->sin6_port == other6->sin6_port)) {
scoutfs_err(sb, "quorum slots #%u and #%u have the same address: "SIN_FMT,
i, j, SIN_ARG(&sin));
return -EINVAL;
}
}
found++;
}
}
if (found == 0) {
+2 -2
@@ -1,11 +1,11 @@
#ifndef _SCOUTFS_QUORUM_H_
#define _SCOUTFS_QUORUM_H_
int scoutfs_quorum_server_sin(struct super_block *sb, struct sockaddr_in *sin);
int scoutfs_quorum_server_sin(struct super_block *sb, struct sockaddr_storage *sin);
u8 scoutfs_quorum_votes_needed(struct super_block *sb);
void scoutfs_quorum_slot_sin(struct scoutfs_quorum_config *qconf, int i,
struct sockaddr_in *sin);
struct sockaddr_storage *sin);
int scoutfs_quorum_fence_leaders(struct super_block *sb, struct scoutfs_quorum_config *qconf,
u64 term);
+13 -10
@@ -1114,6 +1114,7 @@ int scoutfs_quota_mod_rule(struct super_block *sb, bool is_add,
goto release;
}
wait_event(qtinf->waitq, !ruleset_is_busy(qtinf));
scoutfs_quota_invalidate(sb);
ret = 0;
@@ -1142,12 +1143,17 @@ void scoutfs_quota_get_lock_range(struct scoutfs_key *start, struct scoutfs_key
}
/*
* This is called during cluster lock invalidation to indicate that the
* ruleset is no longer protected by cluster locking and might have been
* modified. We mark the ruleset invalid and free it once all readers
* drain. The next check will acquire the cluster lock and read the
* rules. Because this is called during invalidation this is serialized
* with write holders of cluster locks so we can never see -EBUSY here.
* Mark the cached ruleset invalid and free the previous one once readers
* drain. Called from cluster lock invalidation and from quota rule
* modification.
*
* Cluster lock invalidation runs only after the lock layer has drained
* local READ users. Since EBUSY is set only while a reader holds READ,
* the reader has already published by the time we run.
*
* Quota rule modification waits on the waitq for any in-flight reader
* to publish before calling here, so the next check rebuilds against
* the newly written rules rather than the reader's stale result.
*/
void scoutfs_quota_invalidate(struct super_block *sb)
{
@@ -1161,13 +1167,10 @@ void scoutfs_quota_invalidate(struct super_block *sb)
spin_lock(&qtinf->lock);
rs = rcu_dereference_protected(qtinf->ruleset, lockdep_is_held(&qtinf->lock));
if (rs != ERR_PTR(-EINVAL))
if (rs == ERR_PTR(-ENOENT) || !IS_ERR(rs))
rcu_assign_pointer(qtinf->ruleset, ERR_PTR(-EINVAL));
spin_unlock(&qtinf->lock);
/* cluster locking should have prevented this */
BUG_ON(rs == ERR_PTR(-EBUSY));
if (!IS_ERR(rs))
call_rcu(&rs->rcu, free_ruleset_rcu);
+21 -19
@@ -1355,35 +1355,37 @@ DEFINE_EVENT(scoutfs_lock_class, scoutfs_lock_shrink,
);
DECLARE_EVENT_CLASS(scoutfs_net_class,
TP_PROTO(struct super_block *sb, struct sockaddr_in *name,
struct sockaddr_in *peer, struct scoutfs_net_header *nh),
TP_PROTO(struct super_block *sb, struct sockaddr_storage *name,
struct sockaddr_storage *peer, struct scoutfs_net_header *nh),
TP_ARGS(sb, name, peer, nh),
TP_STRUCT__entry(
SCSB_TRACE_FIELDS
si4_trace_define(name)
si4_trace_define(peer)
__field_struct(struct sockaddr_storage, name)
__field_struct(struct sockaddr_storage, peer)
snh_trace_define(nh)
),
TP_fast_assign(
SCSB_TRACE_ASSIGN(sb);
si4_trace_assign(name, name);
si4_trace_assign(peer, peer);
memcpy(&__entry->name, name, sizeof(struct sockaddr_storage));
memcpy(&__entry->peer, peer, sizeof(struct sockaddr_storage));
snh_trace_assign(nh, nh);
),
TP_printk(SCSBF" name "SI4_FMT" peer "SI4_FMT" nh "SNH_FMT,
SCSB_TRACE_ARGS, si4_trace_args(name), si4_trace_args(peer),
TP_printk(SCSBF" name "SIN_FMT" peer "SIN_FMT" nh "SNH_FMT,
SCSB_TRACE_ARGS,
&__entry->name,
&__entry->peer,
snh_trace_args(nh))
);
DEFINE_EVENT(scoutfs_net_class, scoutfs_net_send_message,
TP_PROTO(struct super_block *sb, struct sockaddr_in *name,
struct sockaddr_in *peer, struct scoutfs_net_header *nh),
TP_PROTO(struct super_block *sb, struct sockaddr_storage *name,
struct sockaddr_storage *peer, struct scoutfs_net_header *nh),
TP_ARGS(sb, name, peer, nh)
);
DEFINE_EVENT(scoutfs_net_class, scoutfs_net_recv_message,
TP_PROTO(struct super_block *sb, struct sockaddr_in *name,
struct sockaddr_in *peer, struct scoutfs_net_header *nh),
TP_PROTO(struct super_block *sb, struct sockaddr_storage *name,
struct sockaddr_storage *peer, struct scoutfs_net_header *nh),
TP_ARGS(sb, name, peer, nh)
);
@@ -1416,8 +1418,8 @@ DECLARE_EVENT_CLASS(scoutfs_net_conn_class,
__field(void *, sock)
__field(__u64, c_rid)
__field(__u64, greeting_id)
si4_trace_define(sockname)
si4_trace_define(peername)
__field_struct(struct sockaddr_storage, sockname)
__field_struct(struct sockaddr_storage, peername)
__field(unsigned char, e_accepted_head)
__field(void *, listening_conn)
__field(unsigned char, e_accepted_list)
@@ -1435,8 +1437,8 @@ DECLARE_EVENT_CLASS(scoutfs_net_conn_class,
__entry->sock = conn->sock;
__entry->c_rid = conn->rid;
__entry->greeting_id = conn->greeting_id;
si4_trace_assign(sockname, &conn->sockname);
si4_trace_assign(peername, &conn->peername);
memcpy(&__entry->sockname, &conn->sockname, sizeof(struct sockaddr_storage));
memcpy(&__entry->peername, &conn->peername, sizeof(struct sockaddr_storage));
__entry->e_accepted_head = !!list_empty(&conn->accepted_head);
__entry->listening_conn = conn->listening_conn;
__entry->e_accepted_list = !!list_empty(&conn->accepted_list);
@@ -1446,7 +1448,7 @@ DECLARE_EVENT_CLASS(scoutfs_net_conn_class,
__entry->e_resend_queue = !!list_empty(&conn->resend_queue);
__entry->recv_seq = atomic64_read(&conn->recv_seq);
),
TP_printk(SCSBF" flags %s rc_dl %lu cto %lu sk %p rid %llu grid %llu sn "SI4_FMT" pn "SI4_FMT" eah %u lc %p eal %u nss %llu nsi %llu esq %u erq %u rs %llu",
TP_printk(SCSBF" flags %s rc_dl %lu cto %lu sk %p rid %llu grid %llu sn "SIN_FMT" pn "SIN_FMT" eah %u lc %p eal %u nss %llu nsi %llu esq %u erq %u rs %llu",
SCSB_TRACE_ARGS,
print_conn_flags(__entry->flags),
__entry->reconn_deadline,
@@ -1454,8 +1456,8 @@ DECLARE_EVENT_CLASS(scoutfs_net_conn_class,
__entry->sock,
__entry->c_rid,
__entry->greeting_id,
si4_trace_args(sockname),
si4_trace_args(peername),
&__entry->sockname,
&__entry->peername,
__entry->e_accepted_head,
__entry->listening_conn,
__entry->e_accepted_list,
+5 -6
@@ -1077,8 +1077,7 @@ static int next_log_merge_range(struct super_block *sb, struct scoutfs_btree_roo
struct scoutfs_key key;
int ret;
key = *start;
key.sk_zone = SCOUTFS_LOG_MERGE_RANGE_ZONE;
init_log_merge_key(&key, SCOUTFS_LOG_MERGE_RANGE_ZONE, 0, 0);
scoutfs_key_set_ones(&rng->start);
do {
@@ -3640,7 +3639,7 @@ static bool invalid_mounted_client_item(struct scoutfs_btree_item_ref *iref)
* it's acceptable to see -EEXIST.
*/
static int insert_mounted_client(struct super_block *sb, u64 rid, u64 gr_flags,
struct sockaddr_in *sin)
struct sockaddr_storage *sin)
{
DECLARE_SERVER_INFO(sb, server);
struct scoutfs_super_block *super = DIRTY_SUPER_SB(sb);
@@ -4393,7 +4392,7 @@ static void fence_pending_recov_worker(struct work_struct *work)
break;
}
ret = scoutfs_fence_start(sb, rid, le32_to_be32(addr.v4.addr),
ret = scoutfs_fence_start(sb, rid, &addr,
SCOUTFS_FENCE_CLIENT_RECOVERY);
if (ret < 0) {
scoutfs_err(sb, "fence returned err %d, shutting down server", ret);
@@ -4544,7 +4543,7 @@ static void scoutfs_server_worker(struct work_struct *work)
struct scoutfs_net_connection *conn = NULL;
struct scoutfs_mount_options opts;
DECLARE_WAIT_QUEUE_HEAD(waitq);
struct sockaddr_in sin;
struct sockaddr_storage sin;
bool alloc_init = false;
u64 max_seq;
int ret;
@@ -4553,7 +4552,7 @@ static void scoutfs_server_worker(struct work_struct *work)
scoutfs_options_read(sb, &opts);
scoutfs_quorum_slot_sin(&server->qconf, opts.quorum_slot_nr, &sin);
scoutfs_info(sb, "server starting at "SIN_FMT, SIN_ARG(&sin));
scoutfs_info(sb, "server starting at "SIN_FMT, &sin);
scoutfs_block_writer_init(sb, &server->wri);
server->finalize_sent_seq = 0;
-21
@@ -1,27 +1,6 @@
#ifndef _SCOUTFS_SERVER_H_
#define _SCOUTFS_SERVER_H_
#define SI4_FMT "%u.%u.%u.%u:%u"
#define si4_trace_define(name) \
__field(__u32, name##_addr) \
__field(__u16, name##_port)
#define si4_trace_assign(name, sin) \
do { \
__typeof__(sin) _sin = (sin); \
\
__entry->name##_addr = be32_to_cpu(_sin->sin_addr.s_addr); \
__entry->name##_port = be16_to_cpu(_sin->sin_port); \
} while(0)
#define si4_trace_args(name) \
(__entry->name##_addr >> 24), \
(__entry->name##_addr >> 16) & 255, \
(__entry->name##_addr >> 8) & 255, \
__entry->name##_addr & 255, \
__entry->name##_port
#define SNH_FMT \
"seq %llu recv_seq %llu id %llu data_len %u cmd %u flags 0x%x error %u"
#define SNH_ARG(nh) \
+1
@@ -12,3 +12,4 @@ src/o_tmpfile_umask
src/o_tmpfile_linkat
src/mmap_stress
src/mmap_validate
src/totl-delta-inject
+2 -1
@@ -15,7 +15,8 @@ BIN := src/createmany \
src/o_tmpfile_umask \
src/o_tmpfile_linkat \
src/mmap_stress \
src/mmap_validate
src/mmap_validate \
src/totl-delta-inject
DEPS := $(wildcard src/*.d)
+7
@@ -0,0 +1,7 @@
== mkfs rejects mixed v4/v6 quorum
rc: 64
== mkfs all-v4, mount three members, cross-mount signature visible
== change-quorum-config rejects mixed v4/v6 quorum
rc: 64
== switch v4 -> v6, signature survives, cross-mount write again
== switch v6 -> v4, signatures survive
+6
@@ -0,0 +1,6 @@
== setup
== concurrent quota mod and check across mounts
== verify quota rules are consistent after race
== verify file creation still works under quota
file visible on mount 1
== cleanup
+10
@@ -0,0 +1,10 @@
== setup three files contributing to totl 8888.0.0
== merge baseline into fs_root
8888.0.0 = 42, 3
== inject (+128, +2) unbalances totl 8888.0.0
8888.0.0 = 170, 5
== unlink f3 (value 32) produces a -32/-1 delta
8888.0.0 = 138, 4
== inject (-128, -2) restores accounting for the remaining files
8888.0.0 = 10, 2
== cleanup
+1 -1
@@ -383,7 +383,7 @@ fi
quo=""
if [ -n "$T_MKFS" ]; then
for i in $(seq -0 $((T_QUORUM - 1))); do
quo="$quo -Q $i,127.0.0.1,$((T_TEST_PORT + i))"
quo="$quo -Q $i,::1,$((T_TEST_PORT + i))"
done
msg "making new filesystem with $T_QUORUM quorum members"
+3
@@ -1,6 +1,7 @@
export-get-name-parent.sh
basic-block-counts.sh
basic-bad-mounts.sh
basic-inetaddr.sh
basic-posix-acl.sh
basic-acl-consistency.sh
inode-items-updated.sh
@@ -29,6 +30,8 @@ totl-xattr-tag.sh
basic-xattr-indx.sh
quota.sh
totl-merge-read.sh
quota-invalidate-race.sh
totl-delta-inject.sh
lock-refleak.sh
lock-shrink-consistency.sh
lock-shrink-read-race.sh
+121
@@ -0,0 +1,121 @@
/*
* Test helper that calls SCOUTFS_IOC_INJECT_TOTL_DELTA to seed
* arbitrary totl deltas.
*
* Copyright (C) 2026 Versity Software, Inc. All rights reserved.
*
* This program is free software; you can redistribute it and/or
* modify it under the terms of the GNU General Public
* License v2 as published by the Free Software Foundation.
*/
#ifndef _GNU_SOURCE
#define _GNU_SOURCE
#endif
#include <unistd.h>
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <stdint.h>
#include <inttypes.h>
#include <fcntl.h>
#include <errno.h>
#include <sys/ioctl.h>
#include <linux/types.h>
#include "ioctl.h"
static void usage(const char *prog)
{
fprintf(stderr,
"Usage: %s <mountpoint> <a>.<b>.<c> <total> <count>\n",
prog);
exit(2);
}
static int parse_s64(const char *s, int64_t *out)
{
char *end;
int64_t v;
errno = 0;
v = strtoll(s, &end, 0);
if (errno || *end != '\0' || end == s)
return -1;
*out = v;
return 0;
}
/*
* Parse "<a>.<b>.<c>" into abc[0..2] (skxt_a, skxt_b, skxt_c). Each
* component must be a non-empty unsigned base-0 integer.
*/
static int parse_dotted_name(const char *s, uint64_t abc[3])
{
const char *p = s;
char *end;
int i;
for (i = 0; i < 3; i++) {
if (*p == '\0' || *p == '.')
return -1;
errno = 0;
abc[i] = strtoull(p, &end, 0);
if (errno || end == p)
return -1;
if (i < 2) {
if (*end != '.')
return -1;
p = end + 1;
} else {
if (*end != '\0')
return -1;
}
}
return 0;
}
int main(int argc, char **argv)
{
struct scoutfs_ioctl_inject_totl_delta itd = {{0,}};
uint64_t abc[3];
int64_t total, count;
int fd;
int ret;
if (argc != 5)
usage(argv[0]);
if (parse_dotted_name(argv[2], abc) ||
parse_s64(argv[3], &total) ||
parse_s64(argv[4], &count)) {
fprintf(stderr, "could not parse arguments\n");
usage(argv[0]);
}
itd.name[0] = abc[0];
itd.name[1] = abc[1];
itd.name[2] = abc[2];
itd.total = total;
itd.count = count;
fd = open(argv[1], O_RDONLY | O_DIRECTORY);
if (fd < 0) {
fprintf(stderr, "open(%s): %s\n", argv[1], strerror(errno));
return 1;
}
ret = ioctl(fd, SCOUTFS_IOC_INJECT_TOTL_DELTA, &itd);
if (ret < 0) {
fprintf(stderr,
"INJECT_TOTL_DELTA(%" PRIu64 ".%" PRIu64 ".%" PRIu64
", total=%" PRId64 ", count=%" PRId64 "): %s\n",
abc[0], abc[1], abc[2], total, count, strerror(errno));
close(fd);
return 1;
}
close(fd);
return 0;
}
+78
@@ -0,0 +1,78 @@
#
# Test that mkfs and change-quorum-config reject mixed ipv4/ipv6 quorum
# configs and that users can migrate from ipv4 to ipv6 and back.
#
t_require_commands dmsetup blockdev cmp
P0=$T_SCRATCH_PORT
P1=$((T_SCRATCH_PORT + 1))
P2=$((T_SCRATCH_PORT + 2))
SIG=$T_TMP.sig
seq 1 4096 > "$SIG"
trap '
umount $T_TMPDIR/m0 $T_TMPDIR/m1 $T_TMPDIR/m2 2>/dev/null
dmsetup remove _bia_m0 _bia_m1 _bia_m2 _bia_d0 _bia_d1 _bia_d2 2>/dev/null
' EXIT
mkdir -p "$T_TMPDIR/m0" "$T_TMPDIR/m1" "$T_TMPDIR/m2"
for nv in "m0 $T_EX_META_DEV" "m1 $T_EX_META_DEV" "m2 $T_EX_META_DEV" \
"d0 $T_EX_DATA_DEV" "d1 $T_EX_DATA_DEV" "d2 $T_EX_DATA_DEV"; do
set -- $nv
t_quiet dmsetup create _bia_$1 --table "0 $(blockdev --getsz $2) linear $2 0"
done
mnt() {
mount -t scoutfs \
-o metadev_path=/dev/mapper/_bia_m$1,quorum_slot_nr=$1 \
/dev/mapper/_bia_d$1 "$T_TMPDIR/m$1"
}
mount_all() {
mnt 0 &
mnt 1 &
mnt 2 &
wait
}
umount_all() {
umount $T_TMPDIR/m0 &
umount $T_TMPDIR/m1 &
umount $T_TMPDIR/m2 &
wait
}
verify() {
cmp -s "$SIG" "$T_TMPDIR/m0/sig" &&
cmp -s "$SIG" "$T_TMPDIR/m1/sig" &&
cmp -s "$SIG" "$T_TMPDIR/m2/sig" || t_fail "$1"
}
echo "== mkfs rejects mixed v4/v6 quorum"
t_rc scoutfs mkfs -f -Q 0,127.0.0.1,$P0 -Q 1,::1,$P1 -Q 2,127.0.0.1,$P2 /dev/mapper/_bia_m0 /dev/mapper/_bia_d0
echo "== mkfs all-v4, mount three members, cross-mount signature visible"
t_quiet scoutfs mkfs -f -Q 0,127.0.0.1,$P0 -Q 1,127.0.0.1,$P1 -Q 2,127.0.0.1,$P2 /dev/mapper/_bia_m0 /dev/mapper/_bia_d0
mount_all
cp "$SIG" "$T_TMPDIR/m0/sig"
verify "v4 initial"
umount_all
echo "== change-quorum-config rejects mixed v4/v6 quorum"
t_rc scoutfs change-quorum-config --offline -Q 0,127.0.0.1,$P0 -Q 1,::1,$P1 -Q 2,127.0.0.1,$P2 /dev/mapper/_bia_m0
echo "== switch v4 -> v6, signature survives, cross-mount write again"
t_quiet scoutfs change-quorum-config --offline -Q 0,::1,$P0 -Q 1,::1,$P1 -Q 2,::1,$P2 /dev/mapper/_bia_m0
mount_all
verify "after v4->v6"
cp "$SIG" "$T_TMPDIR/m1/sig-v6"
cmp -s "$SIG" "$T_TMPDIR/m0/sig-v6" || t_fail "v6 cross-mount write not visible on m0"
cmp -s "$SIG" "$T_TMPDIR/m2/sig-v6" || t_fail "v6 cross-mount write not visible on m2"
umount_all
echo "== switch v6 -> v4, signatures survive"
t_quiet scoutfs change-quorum-config --offline -Q 0,127.0.0.1,$P0 -Q 1,127.0.0.1,$P1 -Q 2,127.0.0.1,$P2 /dev/mapper/_bia_m0
mount_all
verify "after v6->v4"
cmp -s "$SIG" "$T_TMPDIR/m0/sig-v6" || t_fail "after v6->v4 sig-v6 lost"
umount_all
t_pass
+70
@@ -0,0 +1,70 @@
#
# Regression for the BUG_ON in scoutfs_quota_invalidate when a concurrent
# ruleset read on one mount races with a quota rule modification.
#
t_require_mounts 2
TEST_UID=22222
SET_UID="--ruid=$TEST_UID --euid=$TEST_UID"
echo "== setup"
mkdir -p "$T_D0/dir"
chown --quiet $TEST_UID "$T_D0/dir"
# totl xattr gives quota checks something to consult
setfattr -n scoutfs.totl.test.1.1.1 -v 1 "$T_D0/dir"
echo "== concurrent quota mod and check across mounts"
(
for i in $(seq 1 20); do
scoutfs quota-add -p "$T_M0" \
-r "1 1,L,- 1,L,- $i,L,- I 999999 -" 2>/dev/null
scoutfs quota-del -p "$T_M0" \
-r "1 1,L,- 1,L,- $i,L,- I 999999 -" 2>/dev/null
done
) &
MOD_PID=$!
# same mount as the mod: races local read against invalidate
(
for i in $(seq 1 50); do
setpriv $SET_UID touch "$T_D0/dir/race0_$i" 2>/dev/null
rm -f "$T_D0/dir/race0_$i"
done
) &
CHECK0_PID=$!
# other mount: drives cross-node lock traffic
(
for i in $(seq 1 50); do
setpriv $SET_UID touch "$T_D1/dir/race1_$i" 2>/dev/null
rm -f "$T_D1/dir/race1_$i"
done
) &
CHECK1_PID=$!
t_quiet wait $MOD_PID
t_quiet wait $CHECK0_PID
t_quiet wait $CHECK1_PID
echo "== verify quota rules are consistent after race"
scoutfs quota-wipe -p "$T_M0"
scoutfs quota-list -p "$T_M0"
echo "== verify file creation still works under quota"
scoutfs quota-add -p "$T_M0" -r "1 1,L,- 1,L,- 1,L,- I 999999 -"
sync
echo 1 > $(t_debugfs_path)/drop_weak_item_cache
echo 1 > $(t_debugfs_path)/drop_quota_check_cache
setpriv $SET_UID touch "$T_D0/dir/verify_file"
test -f "$T_D1/dir/verify_file" && echo "file visible on mount 1"
rm -f "$T_D0/dir/verify_file"
scoutfs quota-wipe -p "$T_M0"
echo "== cleanup"
setfattr -x scoutfs.totl.test.1.1.1 "$T_D0/dir"
rm -rf "$T_D0/dir"
t_pass
+43
@@ -0,0 +1,43 @@
#
# Exercise the SCOUTFS_IOC_INJECT_TOTL_DELTA ioctl that injects totl
# deltas directly via totl-delta-inject(1).
#
t_require_commands setfattr scoutfs sync rm touch totl-delta-inject
# force a log merge then read-xattr-totals filtered to our own keys
read_totals()
{
t_force_log_merge
sync
echo 1 > $(t_debugfs_path)/drop_weak_item_cache
scoutfs read-xattr-totals -p "$T_M0" | \
grep -E '^8888\.' || true
}
echo "== setup three files contributing to totl 8888.0.0"
touch "$T_D0/f1" "$T_D0/f2" "$T_D0/f3"
setfattr -n scoutfs.totl.inj.8888.0.0 -v 2 "$T_D0/f1"
setfattr -n scoutfs.totl.inj.8888.0.0 -v 8 "$T_D0/f2"
setfattr -n scoutfs.totl.inj.8888.0.0 -v 32 "$T_D0/f3"
echo "== merge baseline into fs_root"
read_totals
echo "== inject (+128, +2) unbalances totl 8888.0.0"
totl-delta-inject "$T_M0" 8888.0.0 128 2
read_totals
echo "== unlink f3 (value 32) produces a -32/-1 delta"
rm -f "$T_D0/f3"
read_totals
echo "== inject (-128, -2) restores accounting for the remaining files"
totl-delta-inject "$T_M0" 8888.0.0 -128 -2
read_totals
echo "== cleanup"
rm -f "$T_D0/f1" "$T_D0/f2"
read_totals
t_pass
+22 -11
@@ -160,15 +160,16 @@ int parse_timespec(char *str, struct timespec *ts)
* Parse a quorum slot specification string "NR,ADDR,PORT" into its
* component parts. We use sscanf to both parse the leading NR and
* trailing PORT integers, and to pull out the inner ADDR string which
* is then parsed to make sure that it's a valid unicast ipv4 address.
* is then parsed to make sure that it's a valid unicast ip address.
* We require that all components be specified, and sscanf will check
* this by the number of matches it returns.
*/
int parse_quorum_slot(struct scoutfs_quorum_slot *slot, char *arg)
{
#define ADDR_CHARS 45 /* max ipv6 */
char addr[ADDR_CHARS + 1] = {'\0',};
#define ADDR_CHARS 45 /* (INET6_ADDRSTRLEN - 1) */
char addr[INET6_ADDRSTRLEN] = {'\0',};
struct in_addr in;
struct in6_addr in6;
int port;
int parsed;
int nr;
@@ -206,15 +207,25 @@ int parse_quorum_slot(struct scoutfs_quorum_slot *slot, char *arg)
return -EINVAL;
}
if (inet_aton(addr, &in) == 0 || htonl(in.s_addr) == 0 ||
htonl(in.s_addr) == UINT_MAX) {
printf("invalid ipv4 address '%s' in quorum slot '%s'\n",
addr, arg);
return -EINVAL;
if (inet_pton(AF_INET, addr, &in) == 1) {
if (htonl(in.s_addr) == 0 || htonl(in.s_addr) == UINT_MAX) {
printf("invalid ipv4 address '%s' in quorum slot '%s'\n",
addr, arg);
return -EINVAL;
}
slot->addr.v4.family = cpu_to_le16(SCOUTFS_AF_IPV4);
slot->addr.v4.addr = cpu_to_le32(htonl(in.s_addr));
slot->addr.v4.port = cpu_to_le16(port);
} else if (inet_pton(AF_INET6, addr, &in6) == 1) {
if (IN6_IS_ADDR_UNSPECIFIED(&in6) || IN6_IS_ADDR_MULTICAST(&in6)) {
printf("invalid ipv6 address '%s' in quorum slot '%s'\n",
addr, arg);
return -EINVAL;
}
slot->addr.v6.family = cpu_to_le16(SCOUTFS_AF_IPV6);
memcpy(slot->addr.v6.addr, &in6, 16);
slot->addr.v6.port = cpu_to_le16(port);
} else {
printf("invalid address '%s' in quorum slot '%s'\n",
addr, arg);
return -EINVAL;
}
slot->addr.v4.family = cpu_to_le16(SCOUTFS_AF_IPV4);
slot->addr.v4.addr = cpu_to_le32(htonl(in.s_addr));
slot->addr.v4.port = cpu_to_le16(port);
return nr;
}
+40 -17
@@ -28,6 +28,7 @@
#include "srch.h"
#include "leaf_item_hash.h"
#include "dev.h"
#include "quorum.h"
static void print_block_header(struct scoutfs_block_header *hdr, int size)
{
@@ -400,12 +401,20 @@ static int print_mounted_client_entry(struct scoutfs_key *key, u64 seq, u8 flags
{
struct scoutfs_mounted_client_btree_val *mcv = val;
struct in_addr in;
char ip6addr[INET6_ADDRSTRLEN];
memset(&in, 0, sizeof(in));
in.s_addr = htonl(le32_to_cpu(mcv->addr.v4.addr));
if (mcv->addr.v4.family == cpu_to_le16(SCOUTFS_AF_IPV4)) {
in.s_addr = htonl(le32_to_cpu(mcv->addr.v4.addr));
printf(" rid %016llx ipv4_addr %s flags 0x%x\n",
le64_to_cpu(key->skmc_rid), inet_ntoa(in), mcv->flags);
printf(" rid %016llx ipv4_addr %s flags 0x%x\n",
le64_to_cpu(key->skmc_rid), inet_ntoa(in), mcv->flags);
} else if (mcv->addr.v6.family == cpu_to_le16(SCOUTFS_AF_IPV6)) {
printf(" rid %016llx ipv6_addr %s flags 0x%x\n",
le64_to_cpu(key->skmc_rid),
inet_ntop(AF_INET6, mcv->addr.v6.addr, ip6addr, INET6_ADDRSTRLEN),
mcv->flags);
}
return 0;
}
@@ -891,26 +900,40 @@ static int print_btree_leaf_items(int fd, struct scoutfs_super_block *super,
static char *alloc_addr_str(union scoutfs_inet_addr *ia)
{
struct in_addr addr;
char ip6addr[INET6_ADDRSTRLEN];
char *quad;
char *str;
int len;
memset(&addr, 0, sizeof(addr));
addr.s_addr = htonl(le32_to_cpu(ia->v4.addr));
quad = inet_ntoa(addr);
if (quad == NULL)
return NULL;
if (le16_to_cpu(ia->v4.family) == SCOUTFS_AF_IPV4) {
memset(&addr, 0, sizeof(addr));
addr.s_addr = htonl(le32_to_cpu(ia->v4.addr));
quad = inet_ntoa(addr);
if (quad == NULL)
return NULL;
len = snprintf(NULL, 0, "%s:%u", quad, le16_to_cpu(ia->v4.port));
if (len < 1 || len > 22)
return NULL;
len = snprintf(NULL, 0, "%s:%u", quad, le16_to_cpu(ia->v4.port));
if (len < 1 || len > 22)
return NULL;
len++; /* null */
str = malloc(len);
if (!str)
return NULL;
len++; /* null */
str = malloc(len);
if (!str)
return NULL;
snprintf(str, len, "%s:%u", quad, le16_to_cpu(ia->v4.port));
snprintf(str, len, "%s:%u", quad, le16_to_cpu(ia->v4.port));
} else if (le16_to_cpu(ia->v6.family) == SCOUTFS_AF_IPV6) {
if (inet_ntop(AF_INET6, ia->v6.addr, ip6addr, INET6_ADDRSTRLEN) == NULL)
return NULL;
len = strlen(ip6addr) + 9; /* "[]:\0" (4) plus max strlen(u16) (5) */
str = malloc(len);
if (!str)
return NULL;
snprintf(str, len, "[%s]:%u", ip6addr, le16_to_cpu(ia->v6.port));
} else
return NULL;
return str;
}
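The two output shapes produced by alloc_addr_str, `a.b.c.d:port` for IPv4 and `[addr]:port` for IPv6, can be sketched with a single `inet_ntop()` call. This is an illustrative sketch only: `format_addr_port` is a hypothetical helper, it writes into a caller-supplied buffer instead of allocating, and it takes a standard `AF_INET`/`AF_INET6` family rather than the on-disk scoutfs family values.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stdint.h>
#include <stdio.h>
#include <sys/socket.h>

/*
 * Format addr/port into buf as "a.b.c.d:port" for AF_INET or
 * "[v6]:port" for AF_INET6.  addr points at a struct in_addr or a
 * struct in6_addr in network byte order; port is in host byte order.
 * Returns 0 on success, -1 on an unknown family or a short buffer.
 */
static int format_addr_port(int family, const void *addr, uint16_t port,
			    char *buf, size_t len)
{
	char ip[INET6_ADDRSTRLEN];
	int ret;

	assert(addr != NULL && buf != NULL);

	if (family != AF_INET && family != AF_INET6)
		return -1;
	if (inet_ntop(family, addr, ip, sizeof(ip)) == NULL)
		return -1;

	/* brackets keep the port separator distinct from the v6 colons */
	if (family == AF_INET6)
		ret = snprintf(buf, len, "[%s]:%u", ip, (unsigned int)port);
	else
		ret = snprintf(buf, len, "%s:%u", ip, (unsigned int)port);

	return (ret < 0 || (size_t)ret >= len) ? -1 : 0;
}
```

Checking the `snprintf()` return value against the buffer length replaces the hand-computed length arithmetic, at the cost of requiring the caller to size the buffer.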
@@ -1026,7 +1049,7 @@ static void print_super_block(struct scoutfs_super_block *super, u64 blkno)
printf(" quorum config version %llu\n",
le64_to_cpu(super->qconf.version));
for (i = 0; i < array_size(super->qconf.slots); i++) {
if (super->qconf.slots[i].addr.v4.family != cpu_to_le16(SCOUTFS_AF_IPV4))
if (!quorum_slot_present(super, i))
continue;
addr = alloc_addr_str(&super->qconf.slots[i].addr);
+56 -29
@@ -10,7 +10,8 @@
bool quorum_slot_present(struct scoutfs_super_block *super, int i)
{
return super->qconf.slots[i].addr.v4.family == cpu_to_le16(SCOUTFS_AF_IPV4);
return ((super->qconf.slots[i].addr.v4.family == cpu_to_le16(SCOUTFS_AF_IPV4)) ||
(super->qconf.slots[i].addr.v6.family == cpu_to_le16(SCOUTFS_AF_IPV6)));
}
bool valid_quorum_slots(struct scoutfs_quorum_slot *slots)
@@ -18,35 +19,57 @@ bool valid_quorum_slots(struct scoutfs_quorum_slot *slots)
struct in_addr in;
bool valid = true;
char *addr;
char ip6addr[INET6_ADDRSTRLEN];
__le16 family = cpu_to_le16(SCOUTFS_AF_NONE);
int i;
int j;
for (i = 0; i < SCOUTFS_QUORUM_MAX_SLOTS; i++) {
if (slots[i].addr.v4.family == cpu_to_le16(SCOUTFS_AF_NONE))
continue;
if (slots[i].addr.v4.family == cpu_to_le16(SCOUTFS_AF_IPV4)) {
if (family == cpu_to_le16(SCOUTFS_AF_NONE)) {
family = cpu_to_le16(SCOUTFS_AF_IPV4);
} else if (family != cpu_to_le16(SCOUTFS_AF_IPV4)) {
fprintf(stderr, "quorum slot nr %u is IPv4 but earlier slots are IPv6; mixed IPv4/IPv6 quorum is not supported\n",
i);
valid = false;
}
if (slots[i].addr.v4.family != cpu_to_le16(SCOUTFS_AF_IPV4)) {
for (j = i + 1; j < SCOUTFS_QUORUM_MAX_SLOTS; j++) {
if (slots[i].addr.v4.addr == slots[j].addr.v4.addr &&
slots[i].addr.v4.port == slots[j].addr.v4.port) {
in.s_addr =
htonl(le32_to_cpu(slots[i].addr.v4.addr));
addr = inet_ntoa(in);
fprintf(stderr, "quorum slot nr %u and %u have the same address %s:%u\n",
i, j, addr,
le16_to_cpu(slots[i].addr.v4.port));
valid = false;
}
}
} else if (slots[i].addr.v6.family == cpu_to_le16(SCOUTFS_AF_IPV6)) {
if (family == cpu_to_le16(SCOUTFS_AF_NONE)) {
family = cpu_to_le16(SCOUTFS_AF_IPV6);
} else if (family != cpu_to_le16(SCOUTFS_AF_IPV6)) {
fprintf(stderr, "quorum slot nr %u is IPv6 but earlier slots are IPv4; mixed IPv4/IPv6 quorum is not supported\n",
i);
valid = false;
}
for (j = i + 1; j < SCOUTFS_QUORUM_MAX_SLOTS; j++) {
if ((IN6_ARE_ADDR_EQUAL(slots[i].addr.v6.addr, slots[j].addr.v6.addr)) &&
(slots[i].addr.v6.port == slots[j].addr.v6.port)) {
fprintf(stderr, "quorum slot nr %u and %u have the same address [%s]:%u\n",
i, j,
inet_ntop(AF_INET6, slots[i].addr.v6.addr, ip6addr, INET6_ADDRSTRLEN),
le16_to_cpu(slots[i].addr.v6.port));
valid = false;
}
}
} else if (slots[i].addr.v6.family != cpu_to_le16(SCOUTFS_AF_NONE)) {
fprintf(stderr, "quorum slot nr %u has invalid family %u\n",
i, le16_to_cpu(slots[i].addr.v4.family));
valid = false;
}
for (j = i + 1; j < SCOUTFS_QUORUM_MAX_SLOTS; j++) {
if (slots[i].addr.v4.family != cpu_to_le16(SCOUTFS_AF_IPV4))
continue;
if (slots[i].addr.v4.addr == slots[j].addr.v4.addr &&
slots[i].addr.v4.port == slots[j].addr.v4.port) {
in.s_addr =
htonl(le32_to_cpu(slots[i].addr.v4.addr));
addr = inet_ntoa(in);
fprintf(stderr, "quorum slot nr %u and %u have the same address %s:%u\n",
i, j, addr,
le16_to_cpu(slots[i].addr.v4.port));
valid = false;
}
}
}
return valid;
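The mixed-family rule enforced in valid_quorum_slots reduces to remembering the family of the first populated slot and comparing every later slot against it. The sketch below shows just that rule, under assumptions: `enum slot_family` and `first_mixed_slot` are hypothetical stand-ins, not the scoutfs on-disk family values or any scoutfs function.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical family tags for this sketch; not the scoutfs format values. */
enum slot_family { FAM_NONE = 0, FAM_IPV4, FAM_IPV6 };

/*
 * Return the index of the first slot whose family differs from the
 * first populated slot, or -1 if all populated slots agree.  Empty
 * slots (FAM_NONE) are skipped.
 */
static int first_mixed_slot(const enum slot_family *slots, size_t nr)
{
	enum slot_family seen = FAM_NONE;
	size_t i;

	assert(slots != NULL || nr == 0);

	for (i = 0; i < nr; i++) {
		if (slots[i] == FAM_NONE)
			continue;
		if (seen == FAM_NONE)
			seen = slots[i];	/* first populated slot sets the family */
		else if (slots[i] != seen)
			return (int)i;		/* mixed v4/v6 configuration */
	}

	return -1;
}
```

Returning the offending index lets the caller name the slot in its error message, in the same spirit as the "quorum slot nr %u" messages in the diff.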
@@ -61,19 +84,23 @@ void print_quorum_slots(struct scoutfs_quorum_slot *slots, int nr, char *indent)
{
struct scoutfs_quorum_slot *sl;
struct in_addr in;
char ip6addr[INET6_ADDRSTRLEN];
bool first = true;
int i;
for (i = 0, sl = slots; i < SCOUTFS_QUORUM_MAX_SLOTS; i++, sl++) {
if (sl->addr.v4.family == cpu_to_le16(SCOUTFS_AF_IPV4)) {
in.s_addr = htonl(le32_to_cpu(sl->addr.v4.addr));
printf("%s%u: %s:%u\n", first ? "" : indent,
i, inet_ntoa(in), le16_to_cpu(sl->addr.v4.port));
if (sl->addr.v4.family != cpu_to_le16(SCOUTFS_AF_IPV4))
continue;
in.s_addr = htonl(le32_to_cpu(sl->addr.v4.addr));
printf("%s%u: %s:%u\n", first ? "" : indent,
i, inet_ntoa(in), le16_to_cpu(sl->addr.v4.port));
first = false;
first = false;
} else if (sl->addr.v6.family == cpu_to_le16(SCOUTFS_AF_IPV6)) {
printf("%s%u: [%s]:%u\n", first ? "" : indent, i,
inet_ntop(AF_INET6, sl->addr.v6.addr, ip6addr, INET6_ADDRSTRLEN),
le16_to_cpu(sl->addr.v6.port));
first = false;
}
}
}