seaweedfs/weed/operation
Chris Lu f643893891 fix(master): shed assign load when volume growth is already in flight (#10121)
Under a herd of concurrent assigns with no writable volume, Assign spun
PickForWrite for the full 10s timeout, pinning a goroutine per request and
starving the master of the cycles it needs to process growth and answer
heartbeats. When growth is the relevant remedy and already in flight, stop
spinning: if free space exists, shed with a fast retryable error so clients
back off and retry once growth lands; if the cluster is out of space, fail fast
with the real out-of-space error instead of masking it as retryable.

The gRPC shed uses ResourceExhausted, not Unavailable: operation.Assign retries
it, but the client connection layer doesn't treat it as a dead channel, so a
per-request shed across a herd doesn't tear down the shared master connection
and cancel every other in-flight assign. The HTTP dirAssignHandler sheds with
503 + Retry-After.
2026-06-26 14:23:40 -07:00