seaweedfs/weed/worker
Chris Lu 628363c4a6 fix(erasure_coding): surface replica delete failures from EC task (#9184) (#9187)
* test(erasure_coding): reproduce #9184 deleteOriginalVolume swallowing errors

ErasureCodingTask.deleteOriginalVolume logs a warning when any replica
VolumeDelete fails and then returns nil, so the EC task reports
success to the admin even when a source replica survives. The
surviving replica lets a later detection scan re-propose the same
volume, and the retried encoding then triggers the mounted-shard
truncation that #9184 also describes.

Reproducer: wire one reachable replica (delete succeeds) and one
unreachable replica (delete fails), then assert that the function
currently returns nil. After the fix, the function must surface the
replica failure so the task is retried rather than marked done, at
which point this assertion needs to be inverted.

* fix(erasure_coding): surface replica delete failures from EC task

ErasureCodingTask.deleteOriginalVolume previously logged a warning
and returned nil when any VolumeDelete against a source replica
failed. The EC task therefore reported overall success to the admin
even when a source replica stayed on disk, which let a later
detection scan propose a duplicate EC encoding of the same volume.
The retry then walked the ReceiveFile path against servers that
already had mounted EC shards for the volume, truncating the live
shard files in place (the other half of #9184).

This change returns an error describing the per-replica failures
after the best-effort delete pass, so the task is marked failed
instead of silently moving on. Successful deletes are still applied
(per-replica progress is preserved); only the final return changes.

When combined with the ReceiveFile mount-safety check, a stuck
original replica now produces loud, actionable failures instead of
silent corruption.

Tests:
- TestDeleteOriginalVolumeSurfacesReplicaFailures: asserts an error
  is returned and names the unreachable replica, while the reachable
  replica still gets deleted.
- TestDeleteOriginalVolumeSucceedsWhenAllReplicasReachable: pins the
  happy path.
2026-04-22 16:02:51 -07:00