seaweedfs

mirror of https://github.com/seaweedfs/seaweedfs.git synced 2026-05-21 17:21:34 +00:00

Files

Chris Lu af8d4e00ee fix(ec_mount): reject 0-byte .ecx and aggregate cross-disk failures (#9542 )

* fix(ec_mount): reject 0-byte .ecx and aggregate cross-disk failures

MountEcShards's per-disk loop bailed on the first disk returning a
non-ENOENT error, and NewEcVolume wrapped its ENOENT with %v so the
caller's `err == os.ErrNotExist` check never matched. On a multi-disk
volume server where ec.balance / ec.rebuild had distributed shards
across sibling disks while the matching .ecx never arrived, the mount
loop bailed after disk 0 with "cannot open ec volume index" and the
operator never saw that the rest of the disks were also empty. The
companion failure mode is a 0-byte .ecx stub left by EC distribute's
writeToFile after a mid-stream copy failure: Stat() succeeds, treating
the stub as a valid index, and downstream mount work proceeds against
an empty file.

Wrap the ec-volume open errors with %w, treat a 0-byte .ecx as
os.ErrNotExist (in NewEcVolume, findEcxIdxDirForVolume, and
HasEcxFileOnDisk), and have MountEcShards collect per-disk failures
before returning a single aggregated error. The "no .ecx anywhere"
case gets a distinct error so the orchestrator can re-copy the index
from a healthy replica rather than retry against the same broken
state.

* fix(ec_reconcile): indexEcxOwners also rejects 0-byte .ecx stubs

findEcxIdxDirForVolume already skipped 0-byte .ecx during MountEcShards,
but indexEcxOwners (used by reconcileEcShardsAcrossDisks at startup)
still recorded the first .ecx by name only. On a store where one disk
holds a 0-byte stub left by a failed EC distribute and a sibling disk
holds the real index, the stub would win the owner selection — and
NewEcVolume's new size check would then refuse to load against it,
leaving the orphan shards unloaded even though a valid index exists.

Mirror the size check from findEcxIdxDirForVolume: skip directory
entries whose .ecx Info() reports size 0 or whose Info() call fails.

* fix(ec_mount): accept 0-byte .ecx as valid empty index

The previous commit treated a 0-byte .ecx in NewEcVolume as
os.ErrNotExist, on the assumption that any empty .ecx was a stub left
by a failed copy stream. That broke the legitimate empty-volume case:
when an EC volume's source .idx has no live entries (e.g. all needles
deleted before WriteSortedFileFromIdx), the sorted .ecx is genuinely
0 bytes and must mount. The integration test
TestEcShardsToVolumeMissingShardAndNoLiveEntries fails with
"MountEcShards: no .ecx index found on any local disk" because the
mount path now refuses the legitimate empty index.

A 0-byte .ecx left by a failed copy stream is indistinguishable from
the legitimate empty case by file size alone. Preventing stub files
from being written is the receiver-side cleanup in writeToFile's job
(the companion EC distribute PR), not NewEcVolume's at mount time.

The cross-disk lookup helpers (findEcxIdxDirForVolume, HasEcxFileOnDisk,
indexEcxOwners) keep their size > 0 preference: when a real .ecx
exists on a sibling disk alongside a stub, we still want to route
mounts and reconcile at the real one. If no non-zero .ecx exists
anywhere, the per-disk fallback in MountEcShards can still open the
0-byte .ecx and the volume mounts.

Replace TestMountEcShards_ZeroByteEcxOnlyDisk with
TestMountEcShards_EmptyEcxMountsSuccessfully, which pins the
empty-volume invariant.

2026-05-18 15:00:33 -07:00