Chris Lu 35fe3c801b feat(nfs): UDP MOUNT v3 responder + real-Linux e2e mount harness (#9267)
* feat(nfs): add UDP MOUNT v3 responder

The upstream willscott/go-nfs library only serves the MOUNT protocol
over TCP. Linux's mount.nfs and the in-kernel NFS client default
mountproto to UDP in many configurations, so against a stock weed nfs
deployment the kernel queries portmap for "MOUNT v3 UDP", gets port=0
("not registered"), and either falls back inconsistently or fails with
EPROTONOSUPPORT, which the user sees as the "requested NFS version or
transport protocol is not supported" error reported in #9263. The user
then has to add `mountproto=tcp` or `mountport=2049` to the mount options
to force TCP just for the MOUNT phase.

Add a small UDP responder that speaks just enough of MOUNT v3 to handle
the procedures the kernel actually invokes during mount setup and
teardown: NULL, MNT, and UMNT. The wire layout for MNT mirrors
handler.go's TCP path so both transports produce the same root
filehandle and the same auth flavor list for the same export. Other
v3 procedures (DUMP, EXPORT, UMNTALL) cleanly return PROC_UNAVAIL.
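
The procedure split described above can be sketched as a small dispatch
table. This is an illustration only, not the responder's actual code: the
procedure numbers come from RFC 1813 and the accept_stat values from
RFC 5531, while the helper name acceptStatFor is hypothetical.

```go
package main

import "fmt"

// MOUNT v3 procedure numbers (RFC 1813).
const (
	mountProcNull    = 0
	mountProcMnt     = 1
	mountProcDump    = 2
	mountProcUmnt    = 3
	mountProcUmntAll = 4
	mountProcExport  = 5
)

// RPC accept_stat values used here (RFC 5531).
const (
	acceptSuccess     = 0
	acceptProcUnavail = 3
)

// acceptStatFor is a hypothetical helper: it returns the accept_stat a
// responder that only implements NULL, MNT and UMNT would reply with.
func acceptStatFor(proc uint32) uint32 {
	switch proc {
	case mountProcNull, mountProcMnt, mountProcUmnt:
		return acceptSuccess
	default:
		// DUMP, EXPORT, UMNTALL and unknown procedures are rejected.
		return acceptProcUnavail
	}
}

func main() {
	for proc := uint32(mountProcNull); proc <= mountProcExport; proc++ {
		fmt.Printf("proc %d -> accept_stat %d\n", proc, acceptStatFor(proc))
	}
}
```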

This commit only adds the responder; portmap-advertise and Server.Start
wire-up follow in subsequent commits so each step stays independently
reviewable.

References: RFC 1813 §5 (NFSv3/MOUNTv3), RFC 5531 (RPC). Existing
constants and parseRPCCall / encodeAcceptedReply helpers from
portmap.go are reused so behaviour stays consistent across both UDP
listening goroutines.

* feat(nfs): advertise UDP MOUNT v3 in the portmap responder

The portmap responder advertised TCP-only entries because go-nfs only
serves TCP, but with the new UDP MOUNT responder in place we can now
honestly advertise MOUNT v3 over UDP as well. Linux clients whose
default mountproto is UDP query portmap during mount setup; if the
answer is "not registered" some kernels translate the result to
EPROTONOSUPPORT instead of falling back to TCP, which is exactly the
failure pattern reported in #9263.

Add the entry, refresh the doc comment, and extend the existing
GETPORT and DUMP unit tests so a regression that drops the entry shows
up at unit-test granularity rather than only in an end-to-end mount.
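
A minimal sketch of the GETPORT lookup this commit describes, assuming the
advertised entries live in a map keyed by (program, version, protocol).
The type and function names are hypothetical; the program numbers (100003
for NFS, 100005 for MOUNT) and protocol numbers (6 for TCP, 17 for UDP)
are the standard ones.

```go
package main

import "fmt"

const (
	progNFS   = 100003 // NFS RPC program number
	progMount = 100005 // MOUNT RPC program number
	protoTCP  = 6      // IPPROTO_TCP
	protoUDP  = 17     // IPPROTO_UDP
)

// mapping is a hypothetical key type for one portmap entry.
type mapping struct{ prog, vers, prot uint32 }

// advertised builds the three entries named in the commit message,
// assuming all of them share the NFS listener's port.
func advertised(port uint32) map[mapping]uint32 {
	return map[mapping]uint32{
		{progNFS, 3, protoTCP}:   port,
		{progMount, 3, protoTCP}: port,
		{progMount, 3, protoUDP}: port,
	}
}

// getPort answers a GETPORT query; 0 means "not registered".
func getPort(table map[mapping]uint32, prog, vers, prot uint32) uint32 {
	return table[mapping{prog, vers, prot}]
}

func main() {
	table := advertised(2049)
	fmt.Println("MOUNT v3 udp:", getPort(table, progMount, 3, protoUDP))
	fmt.Println("NFS v3 udp:  ", getPort(table, progNFS, 3, protoUDP))
}
```

A unit test in this style would fail as soon as the MOUNT v3 UDP entry is
dropped, because the lookup falls back to 0.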

* feat(nfs): start UDP MOUNT v3 responder alongside the TCP NFS listener

Plug the new mountUDPServer into Server.Start so it comes up on the
same bind/port as the TCP NFS listener. It is started before portmap so
that a fast client's portmap query never returns a UDP MOUNT entry the
responder isn't answering yet, and it is shut down via the same defer
chain so that a portmap or listener startup failure doesn't leave the
UDP responder dangling.
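
The ordering and cleanup rule can be illustrated with a generic sketch.
It does not reproduce Server.Start; the component type, startAll, and
simulate are hypothetical names, and the real code uses a defer chain
rather than an explicit slice.

```go
package main

import (
	"errors"
	"fmt"
)

// component is a hypothetical stand-in for a listener or responder.
type component struct {
	name  string
	start func() error
	stop  func()
}

// startAll starts components in order. If one fails, everything that
// already started is stopped in reverse order before returning.
func startAll(components []component) (func(), error) {
	var started []component
	stopAll := func() {
		for i := len(started) - 1; i >= 0; i-- {
			started[i].stop()
		}
	}
	for _, c := range components {
		if err := c.start(); err != nil {
			stopAll()
			return nil, fmt.Errorf("start %s: %w", c.name, err)
		}
		started = append(started, c)
	}
	return stopAll, nil
}

// simulate starts the UDP responder before portmap and records events;
// failAt names the component whose start should fail ("" for none).
func simulate(failAt string) []string {
	var events []string
	mk := func(name string) component {
		return component{
			name: name,
			start: func() error {
				if name == failAt {
					return errors.New("simulated bind failure")
				}
				events = append(events, "start "+name)
				return nil
			},
			stop: func() { events = append(events, "stop "+name) },
		}
	}
	if _, err := startAll([]component{mk("mount-udp"), mk("portmap")}); err != nil {
		events = append(events, "startup failed")
	}
	return events
}

func main() {
	fmt.Println(simulate(""))
	fmt.Println(simulate("portmap"))
}
```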

The portmap startup log now reflects all three advertised entries
(NFS v3 tcp, MOUNT v3 tcp, MOUNT v3 udp) so operators can confirm at a
glance that the UDP MOUNT path is up.

Verified end-to-end: built a Linux/arm64 binary, ran weed nfs in a
container with -portmap.bind, and mounted from another container using
both the user-reported failing setup from #9263 (vers=3 + tcp without
mountport) and an explicit mountproto=udp to force the new code path.
The trace `mount.nfs: trying ... prog 100005 vers 3 prot UDP port 2049`
now leads to a successful mount instead of EPROTONOSUPPORT.

* docs(nfs): note that the plain mount form works on UDP-default clients

With UDP MOUNT v3 now served alongside TCP, the only path that ever
required mountproto=tcp / mountport=2049 — clients whose default
mountproto is UDP — works against the plain mount example. Update the
startup mount hint and the `weed nfs` long help so users don't go
hunting for a mount-option workaround that no longer applies.

The "without -portmap.bind" branch is unchanged: that path still has
to bypass portmap entirely because there is no portmap responder for
the kernel to query.

* test(nfs): add kernel-mount e2e tests under test/nfs

The existing test/nfs/ harness boots a real master + volume + filer +
weed nfs subprocess stack and drives it via go-nfs-client. That covers
protocol behaviour from a Go client's perspective, but any encoding bug
that only shows up when a real Linux kernel parses the wire bytes stays
invisible: both ends of the test use the same RPC library, so identical
bugs round-trip cleanly. The two NFS issues hit recently had exactly that
shape (NFSv4 mis-routed to v3 SETATTR in #9262, and the missing UDP MOUNT
v3 responder) and only surfaced with a real client.

Add three end-to-end tests that mount the harness's running NFS server
through the in-tree Linux client:

  - TestKernelMountV3TCP: NFSv3 + MOUNT v3 over TCP (baseline).
  - TestKernelMountV3MountProtoUDP: NFSv3 over TCP, MOUNT v3 over UDP
    only — regression test for the new UDP MOUNT v3 responder.
  - TestKernelMountV4RejectsCleanly: vers=4 against the v3-only server,
    asserting the kernel surfaces a protocol/version-level error rather
    than a generic "mount system call failed" — regression test for the
    PROG_MISMATCH path from #9262.

The tests pass explicit port=/mountport= mount options so the kernel
never queries portmap, which means the harness doesn't need to bind
the privileged port 111 and won't collide with a system rpcbind on a
shared CI runner. They t.Skip cleanly when the host isn't Linux, when
mount.nfs isn't installed, or when the test process isn't running as
root.

Run locally with:

	cd test/nfs
	sudo go test -v -run TestKernelMount ./...

CI wiring follows in the next commit.

* ci(nfs): run kernel-mount e2e tests in nfs-tests workflow

Wire the new TestKernelMount* tests from test/nfs into the existing
NFS workflow:

  - Existing protocol-layer step now skips '^TestKernelMount' so a
    "skipped because not root" line doesn't appear on every run.
  - New "Install kernel NFS client" step pulls nfs-common (mount.nfs +
    helpers) and netbase (/etc/protocols, which mount.nfs's protocol-
    name lookups need to resolve `tcp`/`udp`).
  - New privileged step runs only the kernel-mount tests under sudo,
    preserving PATH and pointing GOMODCACHE/GOCACHE at the user's
    caches so the second `go test` invocation reuses already-built
    test binaries instead of redownloading modules under root.

The summary block now lists the three kernel-mount cases explicitly
so a regression in either the #9262 fix or this PR's UDP MOUNT change
is traceable from the workflow run page.
2026-04-28 14:06:35 -07:00

SeaweedFS NFS Integration Tests

End-to-end tests that boot a real SeaweedFS cluster (master + volume + filer) plus the experimental weed nfs frontend and drive it through the NFSv3 wire protocol. The tests talk to the server over TCP using github.com/willscott/go-nfs-client, which means they do not need a kernel NFS mount, privileged ports, or any platform-specific tooling.

Prerequisites

  1. Build the weed binary:
    cd ../../weed
    go build -o weed .
    
  2. Go 1.24 or later.

Running the tests

# Build weed and run everything
make test

# Verbose output, keeps the subprocess stdout
make test-verbose

# Skip integration tests — useful when iterating on the framework itself
make test-short

# Run a single test
go test -v -run TestNfsBasicReadWrite ./...

Every test starts its own cluster on random loopback ports, so runs are isolated and can execute in parallel.

Layout

  • framework.go — launches weed master, weed volume, weed filer, and weed nfs as subprocesses, waits for each to accept TCP, and exposes a Mount() helper that returns an nfsclient.Target.
  • basic_test.go — covers the most common NFS operations:
    • Read/write round-trip (TestNfsBasicReadWrite)
    • Mkdir / ReadDirPlus / RmDir (TestNfsMkdirAndRmdir)
    • Nested directory + leaf file (TestNfsNestedDirectories)
    • Rename preserves content (TestNfsRenamePreservesContent)
    • Overwrite shrinks file size (TestNfsOverwriteShrinksFile)
    • Large binary file round-trip (TestNfsLargeFile)
    • Arbitrary binary and empty files (TestNfsBinaryAndEmptyFiles)
    • Symlink + Readlink (TestNfsSymlinkRoundTrip)
    • ReadDirPlus ordering sanity (TestNfsReadDirPlusOrdering)
    • Remove on missing path errors cleanly (TestNfsRemoveMissingFailsCleanly)
    • FSINFO advertises non-zero limits (TestNfsFSInfoReturnsSaneLimits)
    • Sequential append writes concatenate (TestNfsAppendIsSequential)
    • ReadDir after remove (TestNfsReadDirAfterRemove)

Debugging a failing test

Keep the cluster temp dir for inspection:

config := DefaultTestConfig()
config.SkipCleanup = true

Enable subprocess stdout/stderr:

config := DefaultTestConfig()
config.EnableDebug = true

Or run with -v, which flips EnableDebug automatically via testing.Verbose().

Notes

  • The NFS server binds to 127.0.0.1 with -ip.bind=127.0.0.1 and exports /nfs_export. The test framework pre-creates that directory via the filer's HTTP API before starting the NFS server — the NFS server requires its export root to exist in the filer's namespace with a real entry, and the filer's synthetic / root does not match the Name=="/" check the NFS server performs during ensureIndexedEntry.
  • Ports are allocated dynamically. Each test run opens a short-lived listener on 127.0.0.1:0, reads back the assigned port, closes the listener, and hands the port to weed master/volume/filer/nfs. There is a tiny race window between close and reopen that has not been a problem in practice but is worth remembering if you see a "bind: address already in use" failure.
  • All four weed components are started with explicit -port.grpc=... flags. Without them, the gRPC port defaults to -port + 10000, which exceeds the maximum port number 65535 whenever the HTTP port lands above 55535, as ports from the macOS ephemeral range routinely do.
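
The port notes above can be sketched in a few lines of Go. This is an illustration under stated assumptions, not the framework's code: freePort and defaultGrpcPort are hypothetical names, and the sketch simply allocates a second free port for gRPC instead of relying on the +10000 default.

```go
package main

import (
	"fmt"
	"net"
)

const (
	maxPort    = 65535
	grpcOffset = 10000 // default gRPC port is the HTTP port plus this offset
)

// freePort asks the kernel for an unused loopback port, then releases it.
// Another process can grab the port between Close and the later bind,
// which is the small race window mentioned in the notes.
func freePort() (int, error) {
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		return 0, err
	}
	defer l.Close()
	return l.Addr().(*net.TCPAddr).Port, nil
}

// defaultGrpcPort reports the implicit gRPC port for an HTTP port and
// whether it is still a valid port number.
func defaultGrpcPort(httpPort int) (port int, ok bool) {
	port = httpPort + grpcOffset
	return port, port <= maxPort
}

func main() {
	httpPort, err := freePort()
	if err != nil {
		fmt.Println("allocate HTTP port:", err)
		return
	}
	grpcPort, err := freePort()
	if err != nil {
		fmt.Println("allocate gRPC port:", err)
		return
	}
	fmt.Printf("-port=%d -port.grpc=%d\n", httpPort, grpcPort)
	if p, ok := defaultGrpcPort(httpPort); !ok {
		fmt.Printf("the default gRPC port %d would be out of range\n", p)
	}
}
```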