mirror of
https://github.com/seaweedfs/seaweedfs.git
synced 2026-06-09 18:32:43 +00:00
4bf27278fa
* operation: bound upload retries and honor context cancellation

retriedUploadData hardcoded 3 attempts and an uninterruptible backoff sleep. A synchronous replica write to a dead host therefore paid the full dial timeout three times over before failing.

Add UploadOption.MaxAttempts (<=0 keeps the default of 3) so callers can cap attempts, and make the loop return as soon as the context is cancelled so an abandoned upload unwinds instead of retrying.

* topology: fail replica writes fast when a replica is unreachable

DistributedOperation already returns on the first error, but a single dead replica is itself the slow result: its goroutine retries the upload three times through the dial timeout (~30s) before any error surfaces, stalling the originating client write the whole time.

Make the replica write a single attempt (MaxAttempts=1) so a dead replica fails after one dial timeout instead of three, and thread a context into DistributedOperation that is cancelled once the outcome is decided, so a healthy replica is no longer held hostage by one stalled in a dial. The originating client write is what retries.

* topology: keep replica deletes off the client request context

ReplicatedDelete runs after the local needle is already deleted. Driving the replica deletes off r.Context() means a client disconnect cancels them and orphans needles on the replicas, so use a background context.

* operation, topology: trim comments on the replica fail-fast path