seaweedfs

mirror of https://github.com/seaweedfs/seaweedfs.git synced 2026-07-29 19:43:19 +00:00

Author	SHA1	Message	Date
pingqiuandClaude Opus 4.6	a79cba0be7	fix: PlanRebuild targetLSN=0 when replica is degraded (CommittedLSN fallback) Root cause: StatusSnapshot().CommittedLSN reports 0 in sync_all mode when the replica shipper has no flushed progress (NeedsRebuild state). This is correct for lineage-safe committed boundary, but PlanRebuild uses CommittedLSN as RebuildTargetLSN. With target=0, shouldStartSessionCommand rejects the StartRebuildCommand, and the rebuild IO never executes. Fix: PlanRebuild falls back to HeadLSN when CommittedLSN is 0. The primary's WAL head IS the data boundary the replica needs to reach. The fact that no replica has confirmed durability is exactly why we're rebuilding. Also adds command type logging to coreApplyAndLog so tester can verify which commands are actually emitted vs silently dropped. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 15:35:31 -07:00
pingqiuandClaude Opus 4.6	bc767eb9d2	fix: rebuild correctness — single completion, fail-closed acks, diagnostic logging Three correctness fixes for the remote rebuild path: 1. No double completion: for remote rebuilds, OnRebuildCompleted skips RebuildCommitted since ObserveReplicaRebuildSessionAck already emitted SessionCompleted on the accepted ack. One rebuild = one completion event. 2. SessionAckFailed with rejected observation: if OnAck rejects the failed ack (stale session), don't use the sentinel errRebuildAckFailed. Return a regular error so ExecutePendingRebuild emits the fallback SessionFailed. No path leaves the engine session hanging. 3. Diagnostic logging in ExecutePendingRebuild: log the replicaID and targetLSN on both nil-return (TakeRebuild mismatch) and successful take paths. Also log the pending store in runRebuild with replicaID, targetLSN, and IO type. This makes the TakeRebuild seam diagnosable on hardware without rebuilding the engine package. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 15:25:26 -07:00
pingqiuandClaude Opus 4.6	df69c83f41	feat: RemoteRebuildIO — primary coordinates rebuild, replica installs Replace the broken primary-local rebuild executor with RemoteRebuildIO, a server-side engine.RebuildIO implementation that coordinates remotely. The primary sends SessionControlV2 (with RebuildAddr trailer) to the replica's control channel; the replica starts a local rebuild session and auto-connects to the primary's rebuild server for the base lane. Single rebuild route: ALL core-present rebuilds use RemoteRebuildIO. The entire command chain is preserved unchanged: PlanRebuild → pending → RebuildStarted → StartRebuildCommand → ExecutePendingRebuild → RemoteRebuildIO.TransferFullBase Key changes: - SessionControlMsg v2: optional RebuildAddr trailer (len-based decode) - ReplicaRebuilding shipper state: session-gated live WAL lane - RemoteRebuildIO: dials replica ctrl, sends session control, reads acks - Ack forwarding through ObserveReplicaRebuildSessionAck (pins/watchdog) - Completion proof from replica's achievedLSN, not primary's local vol - Transport failures emit SessionFailed (no double-emit on ack failures) - Progress ack rejection fails closed (stale session = abort) - Replica auto-starts base lane client on v2 session control State transitions: NeedsRebuild → [accepted ack] → Rebuilding → [completed] → InSync Rebuilding → [failed/EOF] → NeedsRebuild → [next probe] → retry Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 15:04:22 -07:00
pingqiuandClaude Opus 4.6	befe049b09	refactor: unified primary onboarding + rebuild execution wiring Replace three bypass mechanisms with one unified model. When the probe returns ProbeRebuildRequired, the host now starts the rebuild through the existing recovery manager (StartRecoveryTask), which resolves the rebuild address, plans the rebuild, and executes via the v2bridge executor — the same path as master-driven RoleRebuilding. New per-replica probe API: - WALShipper.ProbeReconnect() → ReplicaProbeResult with typed outcome - ShipperGroup.ProbeReconnectAll() → []ReplicaProbeResult - BlockVol.ProbeReplicaOnboarding() / IsClosed() Host-side wiring: - handleReplicaProbeResult routes outcomes: KeepUp → ShipperConnectedObserved CatchUp → ShipperConnectedObserved (recovery manager handles session) Rebuild → NeedsRebuildObserved + StartRecoveryTask (executes rebuild) TemporaryFailure → no-op - lastAssignmentsForPath reconstructs assignment for recovery manager - onPrimaryRosterChanged probes all replicas (defined, called from watchdog) - observePrimaryShipperConnectivity uses probe API Probe fires via syncProtocolExecutionState immediately after assignment processing — same heartbeat cycle, no timer delay. Deleted: startDirectRebuild, resolveCtrlAddrForShipper, TryReconnect/TryReconnectAll/TryReconnectShippers. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 02:33:07 -07:00
pingqiuandClaude Opus 4.6	d6bc7516f1	feat: primary-direct rebuild — start rebuild session on NeedsRebuild When proactive reconnect finds WAL gap exceeds retained range: 1. Emit per-replica NeedsRebuildObserved to engine (with ReplicaID) 2. Resolve replica ctrl address from shipper group 3. Start direct rebuild session: send sessionControl(start_rebuild) to replica's ctrl channel, stream base blocks, emit RebuildStarted The primary drives the rebuild directly without master round-trip. The master sees the result via heartbeat projection (needs_rebuild → rebuilding → healthy). This matches V2 authority: master owns identity, primary owns data-control recovery. Added WALShipper.CtrlAddr() getter for address resolution. resolveCtrlAddrForShipper maps data address to ctrl address via shipper group (works for RF=2 and RF=3+). startDirectRebuild runs in a goroutine: dials replica ctrl, sends start_rebuild, waits for accepted ack, serves base blocks, emits RebuildStarted to engine on success. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 01:04:00 -07:00
pingqiuandClaude Opus 4.6	8b469cf70b	fix: revert Bridge 2, fix Bridge 1 with per-replica identity Revert detectAndEnqueueRebuildFromHeartbeat (Bridge 2) — master should not drive rebuild assignments from heartbeat. The primary owns data-control recovery per the V2 authority split. Fix Bridge 1: NeedsRebuildObserved now carries per-replica identity. resolveReplicaIDForShipper maps shipper DataAddr to ReplicaID via the shipper group (works for RF=2 and RF=3+). The engine receives the specific replica that needs rebuild, not a volume-level broadcast. Primary-direct rebuild: the primary detects which replica needs rebuild and will drive the session directly. The master learns about it via subsequent heartbeat projection (needs_rebuild → rebuilding → healthy). No master round-trip needed for the rebuild decision. Added WALShipper.DataAddr() getter for address resolution. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 00:55:50 -07:00
pingqiuandClaude Opus 4.6	f90ccf5bfd	fix: proactive shipper reconnect on rejoin (Bug 5) After rejoin, the shipper is configured but no I/O triggers Ship(), so the shipper stays Disconnected and the core stays at awaiting_shipper_connected indefinitely. Fix: observePrimaryShipperConnectivity now calls TryReconnectShippers when ShipperConfigured=true but ShipperConnected=false. This triggers the full reconnect protocol (dial + handshake + bounded catch-up) proactively, bringing the replica current without waiting for I/O. Option B approach: uses the same reconnect path as Barrier() — not a fake write or bare dial probe. CatchUpTo(headLSN) replays any retained WAL entries, bringing the replica fully current. New methods: - WALShipper.TryReconnect(): full reconnect without foreground I/O - ShipperGroup.TryReconnectAll(): probes all disconnected shippers - BlockVol.TryReconnectShippers(): volume-level entry point Also fix pre-existing test expectation: engine now emits start_recovery_task on primary assignment with replicas. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-09 00:14:46 -07:00
pingqiuandClaude Opus 4.6	53246d2780	fix: recover TOCTOU + WAL pressure edge case tests Fix recover path TOCTOU: re-Lookup after AddReplica so the primary refresh assignment includes the freshly added replica addresses. Previously, Lookup (copy) was called before AddReplica modified the registry, so entry.Replicas was empty → primary got replicas=0 → shipper never configured. Add 2 WAL pressure edge case tests: - ShipperCatchUpOrEscalate: 64KB WAL, 200 writes, aggressive flusher. Proves no hang/deadlock/corruption. Shipper either keeps up or correctly escalates to NeedsRebuild. - RebuildWithPinWhilePrimaryWrites: rebuild session active while primary writes 7600+ blocks in 2s. Proves primary never freezes — rebuild pin is on replica only, primary WAL recycles freely. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 23:56:26 -07:00
pingqiuandClaude Opus 4.6	e0116fc631	fix: three hardware blockers — WAL retention + registry race + shutdown beat All 43 actions pass on m01/m02 hardware. Auto-failover PASS. dd_write: 30s → 123ms. Post-failover write: 33,621 IOPS. 1. WAL retention: remove keepup retention floor (MinShippedLSN). WAL cannot be pinned during sustained async writes — any pin strategy either fills WAL (blocking writes) or over-recycles (breaking catch-up). Flusher recycles freely. Future LBA map will provide catch-up without WAL retention. MinShippedLSN on ShipperGroup retained as diagnostic surface. 2. Registry stale-cleanup race: add RegisteredAt grace period. Race: master registers volume → next VS heartbeat arrives before VS discovers the volume → stale cleanup deletes the entry → failover finds 0 entries. Fix: skip stale cleanup for entries registered within 30s (> 2 heartbeat intervals). 2 new tests: grace protects new entry, old entry still cleaned. 3. Shutdown heartbeat: VS disconnect heartbeat no longer claims block inventory authority. Previously, the shutdown beat's empty inventory triggered stale cleanup, deleting the entry before failover could use it. Scenario fix: recovery-baseline-failover.yaml now kills the correct node (discovered primary, not hardcoded), connects to the correct new primary for post-failover verification. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 22:59:46 -07:00
pingqiuandClaude Opus 4.6	39f1232fe2	feat: validation matrix closure — Rebuild Ready 12/12, Restore Ready 10/10 Close all Rebuild Ready and Restore Ready matrix gaps. V2 Ready at 10/14 (2 partial, 2 missing — honest assessment). New tests (tester-written): - R1: syncAck-driven trigger via protocol engine decision - R3: stale replica restart beyond WAL → rebuild converges - R5: connection drop mid-base → cancel → fresh rebuild converges - R10: failover-rejoin with forced WAL recycling, strict rebuild assert - R11: divergent replica full overwrite convergence - R12: crash mid-rebuild → fresh session converges (not resume) - S2: corrupt WAL entry + corrupt base block both rejected - S5: snapshot-tail rebuild (base + WAL tail replay) - S7: crash between base install and tail replay - S8: snapshot under concurrent writes - V5: rebuild complete without DurableLSN blocks publish_healthy - V9: mixed replica health aggregate projection - V14: negative fail-closed matrix (epoch, kind, stale) Bug fix: StartRebuildSession now clears stale dirty map + resets WAL + updates checkpoint AFTER safety check but BEFORE session.Start(). Fixes stale extent data shadowing rebuild base blocks on reopened replicas. Cleanup: remove 14 obsolete design docs (migration batches, old WAL-v2 specs, simulator goals) — all superseded by current protocol docs. 34 component tests + 8 protocol engine tests + server tests all pass. 1GB CRC validation passes in 19s. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 16:31:55 -07:00
pingqiuandClaude Opus 4.6	59a36013d4	feat: rebuild hardening A1-A5 + session-controlled execution path A1 Engine kind-routing fix: SessionProgressObserved/Completed/Failed now respect active session Kind. Rebuild progress no longer leaks into catch-up aggregate. sessionKindMismatch guard + observeRebuildProgress helper. 2 regression tests lock kind isolation. A2 Retention pin: Rebuild session ack drives progress-based WAL retention floor. Pin installed at base_lsn on accepted, advances with wal_applied_lsn, released on completed/failed/cancelled. rebuildProgressPinFloor returns min across all active replicas. Retention pin test: 100 blocks fill WAL, 5 flusher cycles with 20 pinned rebuild entries — all verified correct. A3 Progress ack emission: Automatic sessionAck(running/base_complete/completed/failed) emitted from rebuild session lifecycle transitions. sessionAckLocked builds ack under session lock. emitRebuildSessionAck callback wired through SetOnRebuildSessionAck on BlockVol. ObserveReplicaRebuildSessionAck maps acks to core engine events. WireLocalReplicaRebuildSessionAcks bridges local callback to server. 5 server tests proving ack→core, pin advance, pin cleanup. A4 Deadline/timeout: rebuildAckWatch watchdog: armed on accepted/running/base_complete, refreshed on each ack, cleared on completed/failed. Timeout cancels local session + clears pin + fail-closes. 2 tests: timeout→fail-close, progress→refresh. A5 Session-controlled execution path: v2bridge.Executor.TransferFullBase now uses session-controlled loop: beginControlledFullBase → real sessionControl over TCP → transferExtentToSession via RebuildTransportClient → PrepareFullBaseRebuild → TryCompleteRebuildSession. ReplicaReceiver control channel handles MsgSessionControl alongside MsgBarrierReq. Session acks written back on same TCP connection. RebuildSessionBase request type separates new per-block stream from legacy raw extent stream. Full-base cleanup deferred until success. Deadlock fix: ApplyBaseBlock releases session lock before ioMu. Hydration skip for full-base sessions. 23 rebuild component tests (all pass): 11 kernel correctness, 8 transport/runtime, 3 scenario-scale, including 1GB primary-initiated with CRC validation. 29 files changed, ~2500 insertions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-08 14:39:11 -07:00
pingqiuandClaude Opus 4.6	342f8baa69	feat: rebuild transport wiring — session control + base block streaming Wire protocol messages and transport handlers for the rebuild MVP: Protocol messages (rebuild_transport.go): - SessionControlMsg: epoch, sessionID, command, baseLSN, targetLSN, snapshotID. Encode/Decode with fixed 37-byte wire format. - SessionAckMsg: epoch, sessionID, phase, walAppliedLSN, baseComplete, achievedLSN. Encode/Decode with fixed 34-byte wire format. - MsgSessionControl (0x10) and MsgSessionAck (0x11) on control channel. - SendSessionControl/SendSessionAck convenience functions. Transport handlers: - RebuildTransportServer: primary-side, streams all extent blocks as MsgRebuildExtent frames (reusing existing rebuild message type), ends with MsgRebuildDone. - RebuildTransportClient: replica-side, receives base blocks and routes through vol.ApplyRebuildSessionBaseBlock, marks base complete on MsgRebuildDone. 4 transport tests: - SessionControl wire round-trip - SessionAck wire round-trip - BaseBlockStreaming: full TCP loop, 1024 blocks streamed and verified - SessionControlOverTCP: real TCP send/receive with accepted ack Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 14:57:43 -07:00
pingqiuandClaude Opus 4.6	49845dd509	feat: server-layer rebuild session skeleton — host routing for MVP Add BlockService replica-side rebuild routing API that bridges transport/host layer to BlockVol session surface: StartReplicaRebuildSession(path, config) ApplyReplicaRebuildWALEntry(path, sessionID, entry) ApplyReplicaRebuildBaseBlock(path, sessionID, lba, data) MarkReplicaRebuildBaseComplete(path, sessionID, totalBlocks) TryCompleteReplicaRebuildSession(path, sessionID) CancelReplicaRebuildSession(path, sessionID, reason) ReplicaRebuildSession(path) → snapshot Each method does one thing: validate → WithVolume → delegate to BlockVol. No wire decoding, no protocol decisions, no state invention. Transport wiring (sessionControl/walData/sessionData handlers) is the next step. 2 focused tests: skeleton routes correctly, stale session ID rejected. Updated v2-rebuild-mvp-session-protocol.md with server skeleton section. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 14:53:32 -07:00
pingqiuandClaude Opus 4.6	d2d57851b0	feat: rebuild MVP — dual-lane session with bitmap protection Rebuild session protocol implementation for v2-rebuild-mvp-session-protocol.md. New files: - rebuild_bitmap.go: RebuildBitmap — session-scoped dense bitset for WAL-applied LBA tracking. MarkApplied on local WAL write (not receive). ShouldApplyBase returns false for WAL-covered LBAs (WAL always wins). - rebuild_session.go: RebuildSession — replica-side two-line rebuild. WAL lane (ApplyWALEntry) + base lane (ApplyBaseBlock) with bitmap conflict resolution. TryComplete requires BOTH base_complete AND wal_applied_lsn >= target_lsn. Volume-level control surface: StartRebuildSession, ApplyRebuildSessionWALEntry/BaseBlock, MarkRebuildSessionBaseComplete, TryCompleteRebuildSession, CancelRebuildSession, ActiveRebuildSession. - rebuild_mvp_test.go: 4 correctness tests — base+WAL converge, WAL-applied never overwritten by base, bitmap set on applied not received, control surface start/supersede/complete. - rebuild_transport_test.go: 2 transport-level tests — two-line with real WAL shipping, live writes during base copy with bitmap conflict. Design docs: - v2-rebuild-mvp-session-protocol.md: MVP spec with message set, apply rules, completion/failure/crash rules, test matrix - v2-sync-recovery-protocol.md: full protocol context (keepup/catchup/ rebuild unified design, primary decision logic, two-line model) - v2-session-protocol-shape.md: protocol shape overview Protocol engine (reference, not production): - sw-block/protocol/: 7-event engine with ~300 lines, 13 tests 6 rebuild tests pass, all existing component tests pass. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-07 14:30:34 -07:00
pingqiuandClaude Opus 4.6	55013e103b	feat: Phase 20 Stage 0+1 closure — bootstrap + sustained workload on hardware Stage 0 (bootstrap closure): PASS on m01/m02 - create RF=2 sync_all → 10s shipper wait → 4k fsync → publish_healthy - Proves: BarrierAccepted observation, ShipperConnected, DurableLSN > 0 Stage 1 (sustained workload): 32/33 actions PASS - bootstrap → fio 10s randwrite → dd_write 1M×2 fsync → data checksum - Remaining: auto-failover promotion (separate issue) Key fixes: - BarrierAccepted callback: SyncCache success → core DurableLSN update - BarrierRejected callback: barrier failures surface to core with reason - Shipper state callback for new volumes (not just startup volumes) - CatchUpTo ctrl conn reset: prevents stale control channel after recovery - CP13-6 max-bytes budget suspended: uses replicaFlushedLSN which can't advance without barrier; kills healthy shippers during async writes. Will be replaced by v2 negotiated sync/recovery protocol. - Barrier diagnostic logging: start/fail/success with reason and LSN - Scenario restructured: Stage 0 (bootstrap-closure) + Stage 1 (failover) - dd_write: sync_mode param + real stderr capture - sw-test-runner suite command: deploy once, run N scenarios - WAL size plumbing: proto + API + handler (forward-compatible) Known: 6 blockvol/server test failures from Barrier() path change (bounded catch-up in Barrier). Need test updates to match new semantics. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 19:55:12 -07:00
pingqiuandClaude Opus 4.6	44103a1bd7	feat: Phase 20 acceptance fixes + sw-test-runner suite mode Acceptance rows closed: - WriteLBA/SyncCache contract: code comments document write-back vs durability fence semantics - RF=2 stable identity: v2bridge always uses SetReplicaAddrs (preserves ServerID); blockcmd dispatcher also fixed to use setupPrimaryReplicationMulti; test asserts exact expected ReplicaID="vs-2" (not just non-empty) - Tests treating WriteLBA as commit: replica_read_test rewritten with SyncCache as durability fence - publish_healthy contract: 3 gate tests with hard assertions including gate 3 (PrimaryShipperConnected) - SetReplicaAddr deprecation warning added - WALShipper.ReplicaID() getter added for identity verification Test runner enhancements: - sw-test-runner suite command: build → deploy → run N scenarios in one invocation with --skip-deploy support - Suite YAML definitions for T6 Stage 0 and Stage 1 - deploy action: kill stale processes, clean dirs, cross-compile, upload - run-phase20-t6.ps1 PowerShell script (deprecated by suite command) Engine/runtime fixes: - Recovery executor nil-safety improvements - Recovery bundle BuildRecoveryBundle defensive checks - ShipperGroup MinReplicaFlushedLSNAll surface Docs: acceptance checklist refined, test matrix updated, T6 runbook, engine maintainer tutorial, design README updated. 26 files changed, ~1600 insertions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 11:30:54 -07:00
pingqiuandClaude Opus 4.6	275c3ee1c7	docs: Phase 20 acceptance checklist — architect-refined signoff matrix Tighten acceptance matrix with explicit per-boundary rows, signoff reading split into hard blockers vs product hardening, and clear rule: architecture-complete ≠ product-complete. 6 hard blockers before T6/T7: 1. WriteLBA/SyncCache/sync_all contract closure 2. Fresh replica bounded catch-up before live tail 3. Timeout/retention-loss classification for catch-up 4. publish_healthy alignment with one protocol contract 5. RF=2 stable identity on all shipping paths 6. Test audit for incorrect WriteLBA==commit assumptions Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 00:12:32 -07:00
pingqiuandClaude Opus 4.6	58aa842802	docs: Phase 20 product acceptance checklist 7-area acceptance matrix mapping current state vs product requirements: write/durability contract, fresh replica bootstrap, host observation completeness, serving/publish alignment, snapshot/rebuild convergence, adapter consistency, test contract alignment. Each item marked with: current state, required for product, blocks T6/T7, best test level. Priority ordered into must-close-before-Stage-1, should-close-before-Stage-2, and can-close-after-T6/T7. Key diagnosis: architecture-complete, execution-incomplete. The engine thinks like a product; the data plane still behaves partly like a prototype. The gap is end-to-end contract closure. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-06 00:05:22 -07:00
pingqiuandClaude Opus 4.6	d1a16fac03	feat: protocol-aware execution wave — phase gate for live WAL shipping Add host-side protocol state seam that derives per-replica execution state from V2 sender/session snapshots and blocks live-tail WAL shipping while an active recovery session is in progress. New file: weed/server/block_protocol_state.go - replicaProtocolExecutionState derived from engine snapshots - LiveEligible=false during active catch-up/rebuild sessions - bindProtocolExecutionPolicy wires policy into BlockVol - syncProtocolExecutionState called after assignments + core events Data plane changes: - WALShipper.Ship() checks liveShippingPolicy before dial/send - BlockVol.SetLiveShippingPolicy persists across shipper group rebuilds - ShipperGroup propagates policy to all shippers Design contract: sw-block/design/v2-protocol-aware-execution.md Scope: WAL-first rollout only. Prevents illegal live-tail delivery during active recovery. Does not change snapshot/build behavior or move backlog. Next wave: bounded WAL catch-up under same contract. Tests: 4 unit/component tests for phase gate behavior, plus bootstrap seam tests that confirmed the two pre-existing bugs locally. 13 files changed, 900 insertions, 69 deletions. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 23:47:07 -07:00
pingqiuandClaude Opus 4.6	f8e8c2c4d1	docs: fix Phase 20 test count — 48 not 49 Verified by counting: T1(5) + T2(12) + T3(8) + T4(8) + T5(13) + Proto(2) = 48. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 21:06:00 -07:00
pingqiuandClaude Opus 4.6	c7dd90c623	docs: Phase 20 test matrix — update with Tier 1 results + full roster status Update coverage reading to reflect 49 tests (6 new component tests). Add full roster status table with per-item strong/bounded/missing marking and mapped test function names. Unit+component: 32 of 33 items strong (T4-C7 NVMe bounded). Integration: 6 of 10 missing (Tier 2 next). Hardware: 4 of 4 missing (T6/T7 staged plan). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 19:42:13 -07:00
pingqiuandClaude Opus 4.6	6bf9a6c283	test: Phase 20 Tier 1 component tests — wiring proof for CI/CD 6 new component tests closing gaps identified in the test matrix audit: P20-T4-C3: Missing projection with active V2 core fails closed - v2Core != nil, no projection cached → gate with "missing_engine_projection" P20-T4-C6: Gate actually removes iSCSI target (enforcement) - real TargetServer → HasTarget(iqn)==true before gate - gate → HasTarget(iqn)==false (DisconnectVolume called) - ungate → HasTarget(iqn)==true (AddVolume restores) P20-T5-C3: FailoverDiagnosticSnapshot carries both mode fields - register volume with EngineProjectionMode + ClusterReplicationMode - trigger pending rebuild → volume appears in diagnostic - diagnostic entry carries both modes from registry lookup P20-T3-C5: V2PromotionMode diagnostic tri-state - disabled / placeholder_fail_closed / transport_ready - all three configurations produce correct diagnostic value P20-T1-C3: EngineProjectionMode proto round-trip - set value survives InfoMessageToProto → InfoMessageFromProto - empty value produces nil proto field (presence semantics) P20-T4-C8: ActivationGated proto round-trip - gated=true + reason survives round-trip - not-gated produces no spurious reason Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 19:39:00 -07:00
pingqiuandClaude Opus 4.6	1c7154a11a	docs: Phase 20 test matrix — gap inventory + component test specs Add detailed coverage mapping of 43 existing tests against the test roster. Identify 7 missing component tests and 3 missing integration tests with concrete scenarios, file placement, and must-prove criteria. Key finding: every tester-found bug during T1-T5 was a wiring bug caught by reviewing the production path, not by unit tests on pure logic. This confirms component tests are the highest-value gap for CI/CD protection. Priority order: Tier 1 (7 component tests, do now), Tier 2 (3 integration tests, do before hardware), Tier 3 (4 hardware scenarios, T6/T7). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 19:33:55 -07:00
pingqiuandClaude Opus 4.6	3e6155c18e	docs: Phase 20 T5 — wire ClusterReplicationMode into diagnostic surface Add ClusterReplicationMode and EngineProjectionMode to FailoverVolumeState so each volume in the failover diagnostic carries its cluster/engine mode at diagnosis time. FailoverDiagnosticSnapshot() enriches volume entries by looking up the registry entry for each volume. This covers both the block volume API (GET /block/volume/{name}) and the failover diagnostic snapshot surface. Update phase doc to reflect actual exposure paths. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 18:56:40 -07:00
pingqiuandClaude Opus 4.6	ceb68cc66b	fix: Phase 20 T5 — RF2 missing replica degraded + transport signal + API surface Fix three tester findings on T5: 1. RF2 with missing replicas now reports "degraded" instead of "no_replicas". Only RF=1 with no replicas returns "no_replicas". Missing replica in an RF2 set is a degraded cluster state. 2. TransportDegraded signal now incorporated: if master-observed transport is degraded, ClusterReplicationMode is at least "degraded" regardless of individual replica health. 3. API surface exposure: EngineProjectionMode and ClusterReplicationMode now appear on blockapi.VolumeInfo and are populated in entryToVolumeInfo(). Operators can consume both through GET /block/volume/{name} with distinct JSON field names. 12 tests: keepup, catching_up, stale degraded, LSN gap needs_rebuild, rebuilding role, RF1 no_replicas, RF2 missing degraded, transport degraded, distinctness, heartbeat update, worst dominates, API surface distinct naming. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 18:49:37 -07:00
pingqiuandClaude Opus 4.6	013f3e7ccb	feat: Phase 20 T5 — ClusterReplicationMode on master Add ClusterReplicationMode as a distinct master-owned cluster-level replication health judgment, computed from multi-replica facts: replica LSN lag, heartbeat freshness, role state. Monotonic: worst replica state dominates. Modes: "no_replicas" (RF=1), "keepup" (all healthy), "catching_up" (replica behind but recoverable), "degraded" (stale heartbeat or barrier failure), "needs_rebuild" (unrecoverable gap or rebuilding role). Distinct from EngineProjectionMode (VS-local engine truth) and VolumeMode (legacy). They answer different questions, live in different fields, have different names. Tests explicitly prove the two can differ without conflict. Computed in recomputeReplicaState() alongside existing VolumeMode. Updated on every heartbeat that touches the entry. 9 tests: keepup, catching_up, stale degraded, LSN gap needs_rebuild, rebuilding role, no_replicas, distinctness from EngineProjectionMode, heartbeat-driven update, worst-replica-dominates (RF3). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 18:41:44 -07:00
pingqiuandClaude Opus 4.6	9cead1b502	fix: Phase 20 T4 — fail-closed on missing projection + NVMe gate Fix two tester findings: 1. Missing engine projection now fails closed: if v2Core is active but CoreProjection(path) is missing, gate locally with reason "missing_engine_projection". Mirrors T2's fail-closed posture. Only skips enforcement when V2 core is entirely absent. 2. NVMe/TCP now gated alongside iSCSI: gateServing() calls both targetServer.DisconnectVolume() and nvmeServer.RemoveVolume(). ungateServing() re-registers with both iSCSI and NVMe. A gated volume is unreachable through all frontend paths. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 18:29:58 -07:00
pingqiuandClaude Opus 4.6	46f72572c5	fix: Phase 20 T4 — real serving enforcement + wire propagation + runtime ungate Fix three tester findings on T4 activation gate: 1. Real serving enforcement: evaluateActivationGate now calls gateServing() → DisconnectVolume(iqn) on gate (terminates active iSCSI sessions, removes volume from target). ungateServing() → AddVolume(iqn, adapter) on clear (re-registers volume). This is actual serving enforcement, not just bookkeeping. 2. Wire propagation: add activation_gated (field 25) and activation_gate_reason (field 26) to proto BlockVolumeInfoMessage. Add generated Go fields + getters. Add proto conversion in InfoMessageToProto/InfoMessageFromProto. Gate state now rides the real VS→master heartbeat wire. 3. Runtime ungate: evaluateActivationGate() now also runs in applyCoreEvent() (the observation-driven path), not just applyCoreAssignmentEvent(). Recovery/catch-up completion that transitions the projection to publish_healthy/replica_ready now clears the gate and re-registers the volume automatically. ClearActivationGate() remains as an explicit override for edge cases but is no longer the primary ungate path. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 18:22:49 -07:00
pingqiuandClaude Opus 4.6	a27569358b	feat: Phase 20 T4 — local activation gate on promoted primary After assignment executes through V2 core, evaluateActivationGate() checks the resulting projection locally. If mode is degraded, needs_rebuild, bootstrap_pending, or allocated_only, the volume is gated from serving. Gate is enforced immediately after assignment, before the next heartbeat round-trip. Gate cleared only when projection reaches publish_healthy or replica_ready. IsActivationGated() provides the query surface for iSCSI/NVMe adapter enforcement. Heartbeat carries ActivationGated and ActivationGateReason fields so master can observe the gated state (report path, not enforcement path). activationGated map on BlockService tracks per-volume gate state. Initialized in constructor. Test helper updated to include it. 6 tests: degraded gates, needs_rebuild gates, healthy clears gate, gate enforced before heartbeat, recovery re-enables, assignment with degraded projection triggers gate. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 18:13:20 -07:00
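The gate rule in commit a27569358b (gate on degraded, needs_rebuild, bootstrap_pending, or allocated_only; clear only on publish_healthy or replica_ready) might look roughly like the following Go sketch. The `gateTracker` type and `Evaluate` method are hypothetical names, and two behaviours are assumptions: the projection mode is reused as the gate reason, and any mode not named in the message leaves the gate state unchanged.

```go
package main

import "fmt"

// gateTracker is a hypothetical stand-in for the per-volume gate map
// the commit message describes; it maps volume path to gate reason.
type gateTracker struct {
	gated map[string]string
}

func newGateTracker() *gateTracker {
	return &gateTracker{gated: make(map[string]string)}
}

// Evaluate applies a projection mode to a volume and reports whether the
// volume is gated afterwards.
func (g *gateTracker) Evaluate(path, mode string) bool {
	switch mode {
	case "degraded", "needs_rebuild", "bootstrap_pending", "allocated_only":
		// Gate immediately; the mode doubles as the reason (assumption).
		g.gated[path] = mode
	case "publish_healthy", "replica_ready":
		// Only these two modes clear the gate.
		delete(g.gated, path)
	}
	// Any other mode leaves the current gate state untouched (assumption).
	_, ok := g.gated[path]
	return ok
}

func main() {
	g := newGateTracker()
	fmt.Println(g.Evaluate("/vol/a", "degraded"))
	fmt.Println(g.Evaluate("/vol/a", "publish_healthy"))
}
```

Keeping the decision in one switch makes it easy to check that no mode clears the gate by accident, which matches the fail-closed posture the later T4 fixes tighten further.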
pingqiuandClaude Opus 4.6	f825f08680	fix: Phase 20 T3 — correct V2 promotion observability to tri-state mode Replace misleading V2PromotionEnabled/V2PromotionReady booleans with single V2PromotionMode string: "disabled", "placeholder_fail_closed", or "transport_ready". Previous V2PromotionReady was true whenever any querier was installed, including the placeholder that always returns error. Now the diagnostic accurately distinguishes placeholder (fail-closed until proto regen) from real gRPC transport. blockV2EvidenceTransport bool on MasterServer tracks whether the real transport querier is installed. Currently always false (placeholder). Set to true only when real gRPC querier replaces the placeholder after proto regen. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 16:29:12 -07:00
pingqiuandClaude Opus 4.6	2b97cd04b8	fix: Phase 20 T3 — add V2 promotion observability to FailoverDiagnostic FailoverDiagnostic now carries V2PromotionEnabled and V2PromotionReady fields. MasterServer.FailoverDiagnosticSnapshot() enriches the failover state diagnostic with rollout gate visibility so operators can confirm whether the master is on V1, V2, or V2-fail-closed-placeholder mode. Update phase-20.md: document default=false rollout policy (safe default until proto regen enables evidence RPC, then flip to default true). Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 16:27:02 -07:00
pingqiuandClaude Opus 4.6	43016e6645	fix: Phase 20 T3 — production wiring + fail-closed on partial evidence Wire V2 promotion into production binary: - Add --block.v2Promotion CLI flag on weed master (default false) - MasterOption.BlockV2Promotion → NewMasterServer wires flag + querier - defaultBlockVSQueryEvidence placeholder (returns explicit error until proto regen on M01 enables gRPC evidence RPC) Fix three fail-closed violations found by tester: 1. blockV2Promotion=true + nil querier now fails closed with explicit log instead of silently falling back to V1 2. Partial evidence (any candidate query failed) now fails closed — unreachable candidate may be the most durable, promoting from incomplete evidence violates durability-first ordering 3. Clear EngineProjectionMode in applyPromotionLocked (already in previous commit, verified in tests here) 2 new tests: NilQuerier_FailsClosed, PartialEvidenceFailure_FailsClosed. Total T3 tests: 7, all pass. Existing V1 failover tests unaffected. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 16:23:35 -07:00
pingqiuandClaude Opus 4.6	59b2e2d8f9	feat: Phase 20 T3 — durability-first V2 promotion in real failover path Wire V2 promotion into the real master failover decision path: promoteReplica() now dispatches to promoteReplicaV2() when blockV2Promotion flag is true. V2 path queries each candidate for fresh evidence via pluggable BlockPromotionEvidenceQuerier, selects by CommittedLSN (durability-first), and fail-closes when no eligible candidate exists. No silent fallback to V1. Feature flag: blockV2Promotion bool on MasterServer. When false, existing promoteReplicaV1() (health-score-first) is used unchanged. Flag is explicit and observable, not a hidden rescue path. Registry: add PromoteReplicaByServer() for V2 path where master already knows the winner. Clear stale EngineProjectionMode in applyPromotionLocked (complements T1 turnover fix). T2 fix: fail-closed when V2 core projection is absent — Eligible=false with reason "missing_engine_projection". CommittedLSN from core used unconditionally (no WALHeadLSN overstatement). 5 T3 integration tests: higher CommittedLSN wins, all-ineligible fail-closed, evidence-failure fail-closed, flag-off uses legacy, epoch bump + assignment enqueue only after selection. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 16:15:54 -07:00
pingqiuandClaude Opus 4.6	1ca13143b6	feat: Phase 20 T2 — promotion evidence semantics + selection substrate VS-side evidence handler (QueryBlockPromotionEvidence) reads live blockvol.Status() + V2 core projection at call time. Fail-closed: no core projection → ineligible with reason "missing_engine_projection". Engine CommittedLSN used unconditionally when core present (no WALHeadLSN overstatement). Eligibility owned by local V2 engine, not master. Master-side selection (selectDurabilityFirstCandidate): durability-first ordering by CommittedLSN, tie-break WALHeadLSN then HealthScore. All ineligible → fail-closed, no promotion. Pluggable querier (BlockPromotionEvidenceQuerier) for T3 wiring. Proto messages added to volume_server.proto. gRPC transport binding pending proto regen on M01 — this commit delivers evidence semantics and selection substrate, not full end-to-end RPC closure. Phase 20 doc updated with T2-T5 reviewer packs and cross-task guardrails. 13 tests: live facts, core projection mode, fail-closed no-core, 4 gated modes, missing volume, epoch mismatch, CommittedLSN ordering, WALHeadLSN tie-break, HealthScore tie-break, all-ineligible, mixed collection. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 16:10:57 -07:00
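The selection order stated in commit 1ca13143b6 (CommittedLSN first, then WALHeadLSN, then HealthScore, with fail-closed when every candidate is ineligible) can be sketched in Go as below. The `Evidence` struct, its field types, and the function names are illustrative assumptions, not the signatures of `selectDurabilityFirstCandidate` in the repository; treating a higher HealthScore as better is also an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

// Evidence is a hypothetical view of one candidate's promotion evidence.
type Evidence struct {
	Server       string
	Eligible     bool
	CommittedLSN uint64
	WALHeadLSN   uint64
	HealthScore  float64
}

// better reports whether a should be preferred over b:
// CommittedLSN first, then WALHeadLSN, then HealthScore.
func better(a, b Evidence) bool {
	if a.CommittedLSN != b.CommittedLSN {
		return a.CommittedLSN > b.CommittedLSN
	}
	if a.WALHeadLSN != b.WALHeadLSN {
		return a.WALHeadLSN > b.WALHeadLSN
	}
	return a.HealthScore > b.HealthScore
}

// selectCandidate picks the most durable eligible candidate.
// It returns an error instead of a fallback when none is eligible.
func selectCandidate(evs []Evidence) (Evidence, error) {
	var best *Evidence
	for i := range evs {
		e := &evs[i]
		if !e.Eligible {
			continue
		}
		if best == nil || better(*e, *best) {
			best = e
		}
	}
	if best == nil {
		return Evidence{}, errors.New("no eligible promotion candidate")
	}
	return *best, nil
}

func main() {
	winner, err := selectCandidate([]Evidence{
		{Server: "vs1", Eligible: true, CommittedLSN: 90, WALHeadLSN: 120, HealthScore: 1.0},
		{Server: "vs2", Eligible: true, CommittedLSN: 100, WALHeadLSN: 100, HealthScore: 0.5},
	})
	fmt.Println(winner.Server, err)
}
```

In the usage above, vs2 would be preferred even though vs1 has a longer WAL and a better health score, because the comparison never reaches the tie-breakers when CommittedLSN differs.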
pingqiuandClaude Opus 4.6	85dad8e0c9	feat: Phase 20 T1 — EngineProjectionMode in heartbeat Add engine_projection_mode as a distinct proto/wire/registry field that carries pure V2 engine-derived local projection mode from VS to master. Reads ONLY from CoreProjection — no ad-hoc fallback. Separate from existing VolumeMode: EngineProjectionMode is VS-local V2 engine truth, VolumeMode is the existing field that conflates V2 and V1 paths. Both exist during transition; only EngineProjectionMode is V2-authoritative. Clears stale value on primary turnover: when a newly promoted primary heartbeats without the field, the old primary's projection is not preserved (prevents synthetic master-side truth). 5 focused tests: propagation, distinctness (hard assertion), backward compat preservation, turnover-clears, turnover-with-field. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 15:45:26 -07:00
pingqiuandClaude Opus 4.6	044a6d770b	feat: Phase 19 — bounded working RF2 block path Live HTTP evidence transport, continuous Loop2 service, bounded auto failover trigger, runtime-managed frontend export, bounded replica repair, end-to-end RF2 handoff with continued I/O on new primary, bounded operator HTTP surface, and CSI V2 runtime backend adapter. 11 new proof tests covering the full M6-M10 chain plus CSI create/ lookup/publish through the V2 runtime path. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 15:12:00 -07:00
pingqiuandClaude Opus 4.6	5aedada53a	feat: Phase 18 M3 — replicated data continuity closure M3 milestone: write → Loop2 observe → failover → readback verify. Continuity runtime (continuity_runtime.go): - ExecuteReplicatedContinuity: composes mirror write + sync + Loop2 observation + failover + readback verify into one bounded path - ReplicatedContinuityResult: captures pre-failover Loop2 snapshot, failover result, selected primary, readback length, data match Runtime manager extensions: - Local node registry for write/readback during continuity verification - RegisterNode now stores node reference for local I/O access Tests prove two paths: - Happy: write on source → failover → promoted node reads correct data - Gated: degraded peer → failover gate stops → continuity reports failure Phase 18 docs: M3 delivered, M4 next. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 14:10:05 -07:00
pingqiuandClaude Opus 4.6	cae07c0bf1	feat: Phase 18 M2 — active Loop 2 replication runtime M2 milestone: bounded summary-driven active Loop 2 runtime. Loop 2 runtime session (loop2_runtime.go): - Loop2RuntimeSession: primary-led active observation of replica set - ObserveOnce: collects replica summaries via transport seam, evaluates runtime mode (keepup / catching_up / degraded / needs_rebuild) - Fail-closed severity escalation: mode only degrades, never reverts - Detection: epoch mismatch, barrier failure, peer behind primary, recovery in progress, needs_rebuild sticky Runtime manager integration: - NewLoop2RuntimeSession, ObserveLoop2, LastLoop2Snapshot, Loop2Snapshot - Runtime manager now retains active Loop 2 snapshots alongside failover Tests prove three paths: - healthy replica set → keepup - peer behind → catching_up - peer needs_rebuild → needs_rebuild (fail-closed) Phase 18 docs updated: M2 delivered, M3 next. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 14:02:13 -07:00
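The "mode only degrades, never reverts" escalation mentioned in commit cae07c0bf1 amounts to a sticky maximum over observations within a session. A minimal Go sketch follows; the `session` type, the `observe` method, and the numeric severity ranks are assumptions made for illustration and are not taken from loop2_runtime.go.

```go
package main

import "fmt"

// severity ranks the runtime modes named in the commit message.
// The ranks themselves are an assumed ordering.
var severity = map[string]int{
	"keepup":        0,
	"catching_up":   1,
	"degraded":      2,
	"needs_rebuild": 3,
}

// session is a hypothetical holder for the current runtime mode.
type session struct {
	mode string
}

// observe folds one observation into the session. The stored mode moves
// only toward higher severity. A mode missing from the severity map ranks
// as 0 in this sketch and therefore never escalates the session.
func (s *session) observe(observed string) string {
	if severity[observed] > severity[s.mode] {
		s.mode = observed
	}
	return s.mode
}

func main() {
	s := &session{mode: "keepup"}
	fmt.Println(s.observe("degraded"))
	fmt.Println(s.observe("keepup"))
	fmt.Println(s.observe("needs_rebuild"))
}
```

Under this rule a later healthy observation does not lower the session's mode, so recovering to keepup would require a new session or an explicit reset outside this sketch.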
pingqiuandClaude Opus 4.6	b82df09856	feat: Phase 18 M1 — transport-backed RF2 failover runtime M1 milestone: failover evidence crosses transport/session seam. Adapter seam (failover_adapter.go): - FailoverEvidenceAdapter: query-side (promotion evidence + replica summary) - FailoverTakeoverAdapter: execution-side (prepare + gate) - FailoverTarget: binds NodeID + both adapters - NewInProcessFailoverTarget: factory for in-process case Transport seam (failover_evidence_transport.go): - FailoverEvidenceTransport: request/response interface with nodeID routing - FailoverEvidenceHandler: server-side registration - InMemoryFailoverEvidenceTransport: first transport impl (in-memory) - NewHybridInProcessFailoverTarget: transport-backed evidence + local takeover Runtime manager (runtime_manager.go): - InProcessRuntimeManager: participant registry + ExecuteFailover entry point - Persisted failover snapshots/results per-volume and global-last All failover paths (session/driver/manager) now go through adapter seam. Old FailoverParticipant preserved as compatibility wrapper only. Phase 18 docs: phase-18.md (M1-M5 structure), log, decisions. Design docs updated: kernel-closure-review, claim-and-evidence ledger. Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 13:57:08 -07:00
pingqiuandClaude Opus 4.6	b8c6944e3f	feat: V2 MVP milestone — masterv2 + volumev2 + in-process failover V2 runtime packages: - sw-block/runtime/masterv2: identity authority (desired state, heartbeat handling, promotion arbitration via SelectPromotionCandidate) - sw-block/runtime/volumev2: per-volume micro-cluster shell (node, orchestrator, control session, iSCSI frontend, takeover gate, failover session + driver, replica summary reconstruction) - sw-block/runtime/purev2: RF1 execution shell (engine + store + dispatcher + local boundary observations) - sw-block/runtime/protocolv2: three-channel separation (heartbeat/assignment/query + replica summary) V2 binaries: - sw-block/cmd/v2singleblock: single-node RF1 block server - sw-block/cmd/purev2rf1: minimal RF1 runtime binary Milestone capabilities: - RF1 write/read/sync with engine-driven mode projection - masterv2 ↔ volumev2 heartbeat convergence + assignment reissue - Promotion query with fresh CommittedLSN/WALHeadLSN evidence - Replica summary for bounded takeover reconstruction - Primary-loss reconstruction from peer summaries (fail-closed gate) - In-process failover driver with session observability - Local boundary observations feed engine (Committed/Durable/Checkpoint) Design docs: - v2-two-loop-protocol.md: identity vs data-control separation - v2-automata-ownership-map.md: event/command ownership split - v2-loop1-surface-draft.md: heartbeat/query/assignment field spec - v2-volumev2-single-node-mvp.md: target layering - v2-kernel-closure-review.md: per-volume micro-cluster principle - v2-pure-runtime-rf1-bootstrap.md, v2-capability-map.md, v2-proof-and-retest-pyramid.md Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 13:08:02 -07:00
pingqiuandClaude Opus 4.6	cf16e53b04	feat: Phase 16M/17 + promote fixes + testrunner updates Phase 16M: explicit replica readiness on heartbeat seam - master.proto: optional bool replica_ready = 19 (proto regenerated on M01) - block_heartbeat_proto.go: write/read ReplicaReady with presence semantics - master_block_registry.go: replicaReadyObservedFromHeartbeat prefers explicit proto field, falls back to address heuristic when absent - volume_server_block.go: heartbeat emits ReplicaReady from core projection Phase 17: host effects extraction + stop line - phase-17-log.md: Batch 10/11 delivery notes Promote fixes: - master_block_failover.go: deterministic replica addrs from path hash - qa_promote_replication_test.go: address-upgrade trigger test - qa_promote_rejoin_live_test.go: new live rejoin test Testrunner: - devops.go: action improvements - recovery-baseline-failover.yaml, suite-ha-failover.yaml: scenario updates - cp11b3-manual-promote.yaml: promote scenario alignment - fresh_volume_write_test.go: new component test Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>	2026-04-05 11:38:05 -07:00
pingqiu	7855d5240c	docs: add bounded productionization pilot artifacts Freeze the first bounded pilot/preflight/stop/rollout-review artifact set and sync the global product ledgers so productionization can start from an explicit chosen-envelope discipline instead of ad hoc rollout judgment. Made-with: Cursor	2026-04-04 19:01:56 -07:00
pingqiu	4f95a1e868	docs: package phase 17 product claim checkpoint Freeze the first Phase 17 branch/contract/policy/envelope package, add review and supported-matrix artifacts, and sync the product-completion and claim-evidence ledgers to the new bounded post-Phase-16 checkpoint. Made-with: Cursor	2026-04-04 18:21:16 -07:00
pingqiu	0f72c8d062	refactor: close bounded phase 16 restart truth seams Bind non-authoritative inventory, restart primary-truth rebasing, and sparse replica readiness retention into the heartbeat/master seam, and package the bounded finish-line checkpoint with explicit claims, non-claims, and proof commands. Made-with: Cursor	2026-04-04 16:13:06 -07:00
pingqiu	10833c8b68	refactor: preserve bounded volume mode reason heartbeat truth Carry explicit volume_mode_reason across the heartbeat/master/API seam so outward surfaces retain the bounded core-owned explanation behind mode transitions. Made-with: Cursor	2026-04-04 14:21:31 -07:00
pingqiu	f20ec2ef79	test: align collector readiness check with replica eligibility Use ReplicaEligible instead of PublishHealthy in the heartbeat collector test now that publish health is rebound to publication truth rather than receiver readiness. Made-with: Cursor	2026-04-04 14:03:21 -07:00
pingqiu	6cad5bb8e1	refactor: rebind bounded volume mode heartbeat truth Make the heartbeat/master boundary preserve explicit volume_mode truth so master consume no longer reconstructs outward mode only from secondary heartbeat signals. Keep backward compatibility by falling back to the previous reconstruction when older heartbeats do not send the field. Made-with: Cursor	2026-04-04 13:56:41 -07:00
pingqiu	6794f79df9	refactor: preserve bounded publish healthy heartbeat truth Make the heartbeat/master boundary preserve explicit publish_healthy truth so master consume no longer reconstructs healthy publication only from secondary readiness and degraded heuristics. Keep backward compatibility by falling back to the previous reconstruction when older heartbeats do not send the field. Made-with: Cursor	2026-04-04 13:43:19 -07:00
pingqiu	eb610deb92	refactor: preserve bounded needs_rebuild heartbeat truth Make the heartbeat/master boundary preserve explicit needs_rebuild truth so primary heartbeat consume no longer collapses that stronger mode into a generic degraded signal. Keep backward compatibility by falling back to the previous heuristic when older heartbeats do not send the field. Made-with: Cursor	2026-04-04 13:11:42 -07:00
pingqiu	69b41a7f16	refactor: rebind bounded replica-ready heartbeat truth Make the heartbeat/master boundary carry explicit replica readiness truth so the registry no longer depends only on replica transport-address presence as a readiness proxy. Keep backward compatibility by falling back to the old address heuristic when older heartbeats do not send the field. Made-with: Cursor	2026-04-04 12:06:53 -07:00
