Chris Lu a8ba9d106e peer chunk sharing 7/8: tryPeerRead read-path hook (#9136)
* mount: batched announcer + pooled peer conns for mount-to-mount RPCs

* peer_announcer.go: non-blocking EnqueueAnnounce + ticker flush that
  groups fids by HRW owner and fans out one ChunkAnnounce per owner in
  parallel. announcedAt is pruned at 2× TTL so it stays bounded.
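
  A minimal sketch of that enqueue/flush split, assuming a buffered
  channel drained once per tick. Only EnqueueAnnounce, the HRW
  grouping, and the one-ChunkAnnounce-per-owner fan-out come from this
  commit; field names and flush mechanics are illustrative, and the
  announcedAt dedupe/pruning is omitted:

    package mount

    import (
      "sync"
      "time"
    )

    // Illustrative fields; the real announcer's layout may differ.
    type PeerAnnouncer struct {
      queue         chan string
      flushInterval time.Duration
      stopCh        chan struct{}
      hrwOwner      func(fid string) string           // HRW owner address for a fid
      send          func(owner string, fids []string) // one batched ChunkAnnounce RPC
    }

    // EnqueueAnnounce never blocks the caller: if the buffer is full
    // the announce is dropped instead of stalling the read path.
    func (a *PeerAnnouncer) EnqueueAnnounce(fid string) {
      select {
      case a.queue <- fid:
      default:
      }
    }

    func (a *PeerAnnouncer) flushLoop() {
      ticker := time.NewTicker(a.flushInterval)
      defer ticker.Stop()
      for {
        select {
        case <-a.stopCh:
          return
        case <-ticker.C:
          // Drain everything enqueued since the last tick, grouped by owner.
          byOwner := map[string][]string{}
        drain:
          for {
            select {
            case fid := <-a.queue:
              owner := a.hrwOwner(fid)
              byOwner[owner] = append(byOwner[owner], fid)
            default:
              break drain
            }
          }
          // One ChunkAnnounce per owner, fanned out in parallel.
          var wg sync.WaitGroup
          for owner, fids := range byOwner {
            wg.Add(1)
            go func(owner string, fids []string) {
              defer wg.Done()
              a.send(owner, fids)
            }(owner, fids)
          }
          wg.Wait()
        }
      }
    }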

* peer_dialer.go: PeerConnPool caches one grpc.ClientConn per peer
  address; the announcer and (next PR) the fetcher share it so
  steady-state owner RPCs skip the handshake cost entirely. Bounded
  at 4096 cached entries; shutdown conns are transparently replaced.
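
  A minimal sketch of the pool's Get path, assuming grpc-go's
  NewClient with plaintext credentials (the real dial options and
  eviction details are not in this diff; the 4096 bound and the
  shutdown-replacement behavior are):

    package mount

    import (
      "sync"

      "google.golang.org/grpc"
      "google.golang.org/grpc/connectivity"
      "google.golang.org/grpc/credentials/insecure"
    )

    type PeerConnPool struct {
      mu    sync.Mutex
      conns map[string]*grpc.ClientConn
      limit int // 4096 in this commit
    }

    func (p *PeerConnPool) Get(addr string) (*grpc.ClientConn, error) {
      p.mu.Lock()
      defer p.mu.Unlock()
      if conn, ok := p.conns[addr]; ok {
        if conn.GetState() != connectivity.Shutdown {
          return conn, nil // steady state: reuse, no new handshake
        }
        delete(p.conns, addr) // transparently replace a shutdown conn
      }
      conn, err := grpc.NewClient(addr, grpc.WithTransportCredentials(insecure.NewCredentials()))
      if err != nil {
        return nil, err
      }
      if len(p.conns) < p.limit {
        p.conns[addr] = conn // cache up to the bound; beyond it the conn is one-shot
      }
      return conn, nil
    }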

* WFS starts both alongside the gRPC server and stops them on unmount.

* mount: wire tryPeerRead via FetchChunk streaming gRPC

Replaces the HTTP GET byte-transfer path with a gRPC server-stream
FetchChunk call. Same fall-through semantics: any failure drops
through to entryChunkGroup.ReadDataAt, so reads are never slower than
the status quo.
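
A consumer of a server stream like this typically drains it into one
buffer; the message and field names below are illustrative stand-ins,
since the p2p proto itself is not part of this diff:

  package mount

  import "io"

  // Hypothetical stream shape standing in for the generated client.
  type fetchChunkStream interface {
    Recv() (*chunkPiece, error)
  }

  type chunkPiece struct{ Data []byte }

  // assembleStream drains one FetchChunk reply stream into a buffer.
  // Any mid-stream error is surfaced so the caller can drop through
  // to entryChunkGroup.ReadDataAt.
  func assembleStream(stream fetchChunkStream) ([]byte, error) {
    var buf []byte
    for {
      piece, err := stream.Recv()
      if err == io.EOF {
        return buf, nil // clean end of stream: chunk fully assembled
      }
      if err != nil {
        return nil, err
      }
      buf = append(buf, piece.Data...)
    }
  }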

* peer_fetcher.go: tryPeerRead resolves the offset to a leaf chunk
  (flattening manifests), asks the HRW owner for holders via
  ChunkLookup, then opens FetchChunk on each holder in LRU order
  (PR #5) until one succeeds. Assembled bytes are verified against
  FileChunk.ETag end-to-end — the peer is still treated as
  untrusted. Reuses the shared PeerConnPool from PR #6 for all
  outbound gRPC.
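
  The holder walk, as a self-contained sketch with stubbed hooks. The
  flow (HRW owner, ChunkLookup for holders, FetchChunk per holder in
  LRU order, ETag verify, self-owned short-circuit) is from this
  commit; every identifier below is an illustrative stand-in:

    package mount

    import "fmt"

    type peerFetcherSketch struct {
      selfAddr    string
      hrwOwner    func(fid string) string
      chunkLookup func(owner, fid string) ([]string, error) // holders, LRU-ordered
      fetchChunk  func(holder, fid string) ([]byte, error)  // server-stream FetchChunk
      etagMatches func(data []byte, etag string) bool       // FileChunk.ETag check
    }

    func (f *peerFetcherSketch) tryPeerRead(fid, etag string) ([]byte, error) {
      owner := f.hrwOwner(fid)
      if owner == f.selfAddr {
        // Self-owned fid: SelfAddr() keeps us from dialing ourselves.
        return nil, fmt.Errorf("fid %s is self-owned", fid)
      }
      holders, err := f.chunkLookup(owner, fid)
      if err != nil {
        return nil, err
      }
      for _, holder := range holders {
        data, err := f.fetchChunk(holder, fid)
        if err != nil {
          continue // unreachable holder: try the next one in LRU order
        }
        if !f.etagMatches(data, etag) {
          continue // peer is untrusted: bad bytes count as a miss
        }
        return data, nil
      }
      return nil, fmt.Errorf("no holder served fid %s", fid)
    }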

* peer_grpc.go: expose SelfAddr() so the fetcher can avoid dialing
  itself on a self-owned fid.

* filehandle_read.go: tryPeerRead slot between tryRDMARead and
  entryChunkGroup.ReadDataAt. Gated by option.PeerEnabled and the
  presence of peerGrpcServer (the single identity test).

Read ordering with the feature enabled is now:
   local cache -> RDMA sidecar -> peer mount (gRPC stream) -> volume server
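
The gate and ordering as a self-contained sketch with stub hooks (the
real filehandle_read.go signatures differ; only the tier order and
the PeerEnabled/peerGrpcServer gate are from this commit):

  package mount

  type readFn func(buff []byte, offset int64) (n int, ok bool)

  // Stub handle; the real FileHandle carries far more state.
  type sketchHandle struct {
    tryLocalCache readFn // local chunk cache
    tryRDMARead   readFn // RDMA sidecar
    tryPeerRead   readFn // peer mount via FetchChunk stream
    readVolume    func(buff []byte, offset int64) (int, error)
    peerEnabled   bool // option.PeerEnabled
    peerServerUp  bool // peerGrpcServer != nil: the single identity test
  }

  func (h *sketchHandle) readAt(buff []byte, offset int64) (int, error) {
    if n, ok := h.tryLocalCache(buff, offset); ok {
      return n, nil
    }
    if n, ok := h.tryRDMARead(buff, offset); ok {
      return n, nil
    }
    if h.peerEnabled && h.peerServerUp {
      if n, ok := h.tryPeerRead(buff, offset); ok {
        return n, nil
      }
      // any peer failure drops through: never slower than status quo
    }
    return h.readVolume(buff, offset) // entryChunkGroup.ReadDataAt
  }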

One port, one identity, one connection pool; no more HTTP
byte-transfer side channel.

* test(fuse_p2p): end-to-end CI test for peer chunk sharing

Adds a FUSE-backed integration test that proves mount B can satisfy a
read from mount A's chunk cache instead of the volume tier.

Layout (modelled on test/fuse_dlm):

  test/fuse_p2p/framework_test.go        — cluster harness (1 master,
                                           1 volume, 1 filer, N mounts,
                                           all with -peer.enable)
  test/fuse_p2p/peer_chunk_sharing_test.go
                                         — writer-reader scenario

The test (TestPeerChunkSharing_ReadersPullFromPeerCache):

  1. Starts 3 mounts. Three is the sweet spot: with 2 mounts, the HRW
     owner of a chunk is the reading mount itself ~50% of the time
     (the peer path short-circuits); with 3+ that drops to ≤ 1/3, so a
     k-chunk file is entirely self-owned with probability at most
     (1/3)^k, and a multi-chunk file almost certainly exercises the
     remote-owner fan-out.
  2. Mount 0 writes a ~8 MiB file, then reads it back through its own
     FUSE to warm its chunk cache.
  3. Waits for seed convergence (one full MountList refresh) plus an
     announcer flush cycle, so chunk-holder entries have reached each
     HRW owner.
  4. Mount 1 reads the same file.
  5. Verifies byte-for-byte equality AND greps mount 1's log for
     "peer read successful" (see the sketch after this list). Content
     matching alone is not proof, since the volume fallback would also
     succeed; the log marker is what distinguishes p2p from fallback.
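
A sketch of that decisive check, with a hypothetical helper name and
log-path plumbing (only the "peer read successful" marker and the
content-equality step are from the test itself):

  package fuse_p2p

  import (
    "bytes"
    "os"
    "path/filepath"
    "testing"
  )

  // assertServedByPeer is illustrative; the real scenario drives the
  // harness from framework_test.go.
  func assertServedByPeer(t *testing.T, mountPath, logPath, name string, want []byte) {
    got, err := os.ReadFile(filepath.Join(mountPath, name))
    if err != nil {
      t.Fatalf("read through mount 1: %v", err)
    }
    if !bytes.Equal(got, want) {
      t.Fatal("content mismatch")
    }
    // Equality alone could come from the volume fallback; the log
    // marker proves the bytes crossed the peer path.
    logData, err := os.ReadFile(logPath)
    if err != nil {
      t.Fatalf("read mount 1 log: %v", err)
    }
    if !bytes.Contains(logData, []byte("peer read successful")) {
      t.Fatal(`no "peer read successful" marker: read fell back to the volume tier`)
    }
  }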

Workflow .github/workflows/fuse-p2p-integration.yml triggers on any
change to mount/filer peer code, the p2p protos, or the test itself.
Failure artifacts (server + mount logs) are uploaded for 3 days.

Mounts run with -v=4 so the tryPeerRead success/failure glog messages
land in the log file the test greps.