docs: Phase 20 T5 — wire ClusterReplicationMode into diagnostic surface

Add ClusterReplicationMode and EngineProjectionMode to
FailoverVolumeState so each volume in the failover diagnostic
carries its cluster/engine mode at diagnosis time.

FailoverDiagnosticSnapshot() enriches each volume entry by looking up
the volume's registry entry. The modes are thus exposed on both the
block volume API (GET /block/volume/{name}) and the failover
diagnostic snapshot surface.

Update phase doc to reflect actual exposure paths.

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
pingqiu
2026-04-05 18:53:31 -07:00
parent ceb68cc66b
commit 3e6155c18e
2 changed files with 21 additions and 7 deletions


@@ -268,7 +268,10 @@ separate cluster-level concept.
- Any replica `needs_rebuild` → `"needs_rebuild"`
- Monotonic: worst replica state dominates
- Store as `entry.ClusterReplicationMode` (NOT `entry.VolumeMode`)
-- Expose in `/vol/status` API and heartbeat diagnostics
+- Expose in block volume API: `GET /block/volume/{name}` and
+`GET /block/volumes` return `cluster_replication_mode` and
+`engine_projection_mode` as distinct JSON fields alongside
+existing `volume_mode`
**Truth rule**: `ClusterReplicationMode` is the master's cluster-level
replication health judgment. It is distinct from `EngineProjectionMode`
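
The "worst replica state dominates" rule in the hunk above can be sketched as a severity ranking. This is only an illustration, not the master's actual code: `needs_rebuild` is the one state name taken from the phase doc, while `in_sync`, `catching_up`, the `severity` table, and the `clusterReplicationMode` helper are hypothetical names chosen for the example.

```go
package main

import "fmt"

// severity ranks replica states from healthiest to worst.
// Only "needs_rebuild" comes from the phase doc; the other
// state names are hypothetical placeholders.
var severity = map[string]int{
	"in_sync":       0, // hypothetical
	"catching_up":   1, // hypothetical
	"needs_rebuild": 2, // from the phase doc
}

// clusterReplicationMode returns the worst state among the replicas.
// Unknown states rank as 0 in this sketch, so they never override
// a known state.
func clusterReplicationMode(replicaStates []string) string {
	worst := "in_sync"
	for _, s := range replicaStates {
		if severity[s] > severity[worst] {
			worst = s
		}
	}
	return worst
}

func main() {
	states := []string{"in_sync", "needs_rebuild", "catching_up"}
	fmt.Println(clusterReplicationMode(states)) // prints "needs_rebuild"
}
```

Because the result is a maximum over the ranking, adding a healthier replica can never improve the reported mode, which is one way to read the "monotonic" bullet.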


@@ -40,12 +40,14 @@ type blockFailoverState struct {
// FailoverVolumeState is one volume's failover diagnosis entry.
type FailoverVolumeState struct {
-	VolumeName        string
-	CurrentPrimary    string
-	AffectedServer    string // dead server that triggered the failover/rebuild
-	DeferredPromotion bool   // true if a deferred promotion timer is pending
-	PendingRebuild    bool   // true if a rebuild is pending for this volume
-	Reason            string // "lease_wait", "rebuild_pending", or ""
+	VolumeName             string
+	CurrentPrimary         string
+	AffectedServer         string // dead server that triggered the failover/rebuild
+	DeferredPromotion      bool   // true if a deferred promotion timer is pending
+	PendingRebuild         bool   // true if a rebuild is pending for this volume
+	Reason                 string // "lease_wait", "rebuild_pending", or ""
+	ClusterReplicationMode string // T5: cluster-level RF2 health at diagnosis time
+	EngineProjectionMode   string // T1: VS-local engine projection at diagnosis time
}
// FailoverDiagnostic is a bounded read-only snapshot of failover state
@@ -106,6 +108,15 @@ func (ms *MasterServer) FailoverDiagnosticSnapshot() FailoverDiagnostic {
default:
diag.V2PromotionMode = "placeholder_fail_closed"
}
+	// T5: enrich each volume entry with cluster/engine mode from registry.
+	if ms.blockRegistry != nil {
+		for i := range diag.Volumes {
+			if entry, ok := ms.blockRegistry.Lookup(diag.Volumes[i].VolumeName); ok {
+				diag.Volumes[i].ClusterReplicationMode = entry.ClusterReplicationMode
+				diag.Volumes[i].EngineProjectionMode = entry.EngineProjectionMode
+			}
+		}
+	}
return diag
}
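
To make the behavior of the enrichment loop concrete, here is a minimal self-contained sketch. The `registry`, `registryEntry`, `volumeState`, and `enrich` names are hypothetical stand-ins for the real master types, and `"example_mode"` is a made-up value; only the loop shape and the two mode field names mirror the diff above.

```go
package main

import "fmt"

// registryEntry is a hypothetical stand-in for the real registry entry type.
type registryEntry struct {
	ClusterReplicationMode string
	EngineProjectionMode   string
}

// registry is a hypothetical stand-in for the block registry.
type registry map[string]registryEntry

// Lookup mirrors the (entry, ok) shape used in the diff.
func (r registry) Lookup(name string) (registryEntry, bool) {
	e, ok := r[name]
	return e, ok
}

// volumeState keeps only the fields relevant to this sketch.
type volumeState struct {
	VolumeName             string
	ClusterReplicationMode string
	EngineProjectionMode   string
}

// enrich copies both modes from the registry into each volume entry.
// Volumes without a registry entry keep empty mode strings.
func enrich(vols []volumeState, reg registry) {
	if reg == nil {
		return
	}
	for i := range vols {
		if e, ok := reg.Lookup(vols[i].VolumeName); ok {
			vols[i].ClusterReplicationMode = e.ClusterReplicationMode
			vols[i].EngineProjectionMode = e.EngineProjectionMode
		}
	}
}

func main() {
	reg := registry{
		"vol-a": {ClusterReplicationMode: "needs_rebuild", EngineProjectionMode: "example_mode"},
	}
	vols := []volumeState{{VolumeName: "vol-a"}, {VolumeName: "vol-missing"}}
	enrich(vols, reg)
	for _, v := range vols {
		fmt.Printf("%s cluster=%q engine=%q\n", v.VolumeName, v.ClusterReplicationMode, v.EngineProjectionMode)
	}
}
```

In this sketch a volume that is missing from the registry, or a nil registry, leaves both mode fields as empty strings, so a consumer of the diagnostic should probably treat `""` as "mode unknown" rather than as a healthy state.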