mirror of
https://github.com/seaweedfs/seaweedfs.git
synced 2026-05-17 23:31:31 +00:00
After rejoin, the shipper is configured but no I/O triggers Ship(), so the shipper stays Disconnected and the core stays at awaiting_shipper_connected indefinitely.

Fix: observePrimaryShipperConnectivity now calls TryReconnectShippers when ShipperConfigured=true but ShipperConnected=false. This triggers the full reconnect protocol (dial + handshake + bounded catch-up) proactively, bringing the replica current without waiting for I/O.

Option B approach: uses the same reconnect path as Barrier(), not a fake write or bare dial probe. CatchUpTo(headLSN) replays any retained WAL entries, bringing the replica fully current.

New methods:
- WALShipper.TryReconnect(): full reconnect without foreground I/O
- ShipperGroup.TryReconnectAll(): probes all disconnected shippers
- BlockVol.TryReconnectShippers(): volume-level entry point

Also fix pre-existing test expectation: engine now emits start_recovery_task on primary assignment with replicas.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
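
The proactive reconnect described in this message can be illustrated with a minimal, self-contained Go sketch. It is not the code from this commit: the types `Shipper`, `ShipperGroup`, and `Volume`, the `observeConnectivity` function, and the injected `dial` hook are hypothetical stand-ins, and the handshake and WAL replay are simulated. Only the overall shape (an observer that triggers dial, handshake, and catch-up to the head LSN when a shipper is configured but not connected) follows the description above.

```go
package main

import (
	"errors"
	"fmt"
)

// errDialFailed is a hypothetical error used to simulate an unreachable replica.
var errDialFailed = errors.New("dial failed")

// Shipper is a hypothetical stand-in for a WAL shipper.
// The real WALShipper in SeaweedFS has different fields and signatures.
type Shipper struct {
	Name       string
	Connected  bool
	AppliedLSN uint64       // highest LSN the replica is assumed to have applied
	dial       func() error // injected so the sketch runs without a network
}

// TryReconnect performs the reconnect steps without any foreground I/O:
// dial, (elided) handshake, then catch-up to headLSN.
func (s *Shipper) TryReconnect(headLSN uint64) error {
	if s.Connected {
		return nil
	}
	if err := s.dial(); err != nil {
		return err
	}
	// Handshake elided. Catch-up is simulated by advancing AppliedLSN;
	// a real implementation would replay retained WAL entries.
	if s.AppliedLSN < headLSN {
		s.AppliedLSN = headLSN
	}
	s.Connected = true
	return nil
}

// ShipperGroup holds all shippers of one volume.
type ShipperGroup struct {
	Shippers []*Shipper
}

// TryReconnectAll probes every disconnected shipper and returns how many
// reconnected successfully.
func (g *ShipperGroup) TryReconnectAll(headLSN uint64) int {
	n := 0
	for _, s := range g.Shippers {
		if !s.Connected && s.TryReconnect(headLSN) == nil {
			n++
		}
	}
	return n
}

// Volume is a hypothetical stand-in for the block volume.
type Volume struct {
	HeadLSN uint64
	Group   ShipperGroup
}

func (v *Volume) ShipperConfigured() bool { return len(v.Group.Shippers) > 0 }

func (v *Volume) ShipperConnected() bool {
	for _, s := range v.Group.Shippers {
		if !s.Connected {
			return false
		}
	}
	return true
}

// TryReconnectShippers is the volume-level entry point.
func (v *Volume) TryReconnectShippers() int {
	return v.Group.TryReconnectAll(v.HeadLSN)
}

// observeConnectivity reconnects proactively when shippers are configured
// but not connected, and reports the resulting connectivity.
func observeConnectivity(v *Volume) bool {
	if v.ShipperConfigured() && !v.ShipperConnected() {
		v.TryReconnectShippers()
	}
	return v.ShipperConnected()
}

func main() {
	replica := &Shipper{Name: "replica-1", AppliedLSN: 40, dial: func() error { return nil }}
	vol := &Volume{HeadLSN: 100, Group: ShipperGroup{Shippers: []*Shipper{replica}}}
	connected := observeConnectivity(vol)
	fmt.Printf("connected=%v appliedLSN=%d\n", connected, replica.AppliedLSN)
}
```

In this sketch a failed dial leaves the shipper disconnected, so the next observation pass simply retries. Whether the real observer retries on the same schedule is not stated in the message.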