docs(p15): add G15b Kubernetes static PV mini-plan

pingqiu
2026-05-03 08:45:39 -07:00
parent 355c3b2cb7
commit 646f8207c8


@@ -0,0 +1,246 @@
# V3 Phase 15 — G15b Kubernetes Static PV Mini-Plan
**Date**: 2026-05-03
**Status**: architect draft; follows G15a close on `seaweed_block@ac49adb`
**Expected branch**: continue from `p15-g15a/csi-static-mvp`, or branch `p15-g15b/k8s-static-pv` from `ac49adb`
**Goal**: prove a Kubernetes pod can consume a pre-provisioned V3 block volume through `cmd/blockcsi`, using real Kubernetes CSI control flow and real Linux iSCSI staging.
---
## §1 Scope
G15b is the Kubernetes form of the G15a static/pre-provisioned CSI path.
The intended product chain is:
```text
cluster-spec / product-loop assignment
-> blockmaster publishes authority + frontend target facts
-> Kubernetes external-attacher calls blockcsi ControllerPublish
-> blockcsi reads frontend facts from blockmaster
-> Kubernetes kubelet calls NodeStage with publish_context
-> blockcsi NodeStage performs real iscsiadm login + mkfs/mount
-> kubelet mounts PVC into pod
-> pod writes/reads byte-equal data
-> pod/PVC cleanup unstages/logs out cleanly
```
Close claim:
> A static Kubernetes PV/PVC/pod can use an already-provisioned V3 volume through the V3 CSI driver without embedding stale target identity in the PV and without CSI minting authority.
---
## §2 Critical Design Decision: Attach Required
G15b must not copy V2's first deployment shape blindly.
The V2 `CSIDriver` used:
```yaml
attachRequired: false
```
That is not the right first-close shape for V3 G15b if the claim is "target facts come from blockmaster through `ControllerPublish`."
With `attachRequired: false`, Kubernetes skips the attach step, so `ControllerPublishVolume` is never called. In that mode, `NodeStageVolume` receives only the PV's `volumeAttributes` (delivered as `VolumeContext`), so a static PV would need to embed `iscsiAddr` and `iqn`. That is acceptable as an emergency/debug fallback, but it is not the G15b product path because it duplicates frontend target truth outside blockmaster.
G15b therefore uses:
```yaml
attachRequired: true
```
and deploys the CSI external-attacher. The external-attacher invokes `ControllerPublishVolume`, receives `publish_context`, and kubelet passes that context into `NodeStageVolume`.
Allowed fallback:
- `NodeStageVolume` may continue supporting `VolumeContext` target fields because G15a already uses this as a low-level mechanism fallback.
G15b non-claim:
- static PV target-address fallback is not the close path and must not be used as the primary Kubernetes evidence.
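The precedence described in this section can be sketched in Go as follows. This is a minimal illustration, not the `blockcsi` implementation: `Target`, `targetFrom`, and `resolveTarget` are hypothetical names, and reusing the `iscsiAddr` / `iqn` key names for `publish_context` is an assumption of the sketch.

```go
package main

import (
	"errors"
	"fmt"
)

// Key names follow the fields this plan says a fallback PV would embed.
// Reusing them for publish_context is an assumption of this sketch.
const (
	keyAddr = "iscsiAddr"
	keyIQN  = "iqn"
)

// Target holds the iSCSI facts NodeStage needs before login (hypothetical type).
type Target struct {
	Addr   string
	IQN    string
	Source string // "publish_context" or "volume_context"
}

// targetFrom extracts a complete target from one context map.
// Reading a nil map is safe in Go and yields empty strings.
func targetFrom(ctx map[string]string, source string) (Target, bool) {
	addr, iqn := ctx[keyAddr], ctx[keyIQN]
	if addr == "" || iqn == "" {
		return Target{}, false
	}
	return Target{Addr: addr, IQN: iqn, Source: source}, true
}

// resolveTarget prefers publish_context (the G15b close path) and only
// falls back to volume_context (the debug path) when it is incomplete.
func resolveTarget(publishCtx, volumeCtx map[string]string) (Target, error) {
	if t, ok := targetFrom(publishCtx, "publish_context"); ok {
		return t, nil
	}
	if t, ok := targetFrom(volumeCtx, "volume_context"); ok {
		return t, nil
	}
	return Target{}, errors.New("no iSCSI target facts in publish_context or volume_context")
}

func main() {
	// Illustrative values only.
	publish := map[string]string{keyAddr: "127.0.0.1:3260", keyIQN: "iqn.2026-05.example:vol-1"}
	t, err := resolveTarget(publish, nil)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Printf("login to %s (%s) via %s\n", t.Addr, t.IQN, t.Source)
}
```

Recording the source of the facts makes it possible for a lab run to assert that the close path really went through `publish_context` rather than the fallback.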
---
## §3 V2 Deploy Port Discipline
V2 source directory:
```text
weed/storage/blockvol/csi/deploy/
```
Port decisions:
| V2 file | G15b decision | Reason |
|---|---|---|
| `csi-driver.yaml` | PORT-REBIND, but change `attachRequired` to `true` | V3 needs ControllerPublish to carry master frontend facts. |
| `csi-node.yaml` | PORT-AS-IS / light rebind | Privileged node plugin, hostNetwork, kubelet/dev/iscsi mounts are mechanism. |
| `rbac.yaml` | PORT-AS-IS / trim to sidecars used | Kubernetes RBAC mechanism. |
| `csi-controller.yaml` | PORT-REBIND | Use `blockcsi` + `csi-attacher`; do not include `csi-provisioner` in G15b. |
| `storageclass.yaml` | SKIP | G15b is static PV, no dynamic provisioning. |
| `example-pvc.yaml` | REWRITE-TINY | Replace dynamic PVC with static PV+PVC+pod example. |
Boundary rule:
- Manifests may wire product binaries and sidecars.
- Manifests must not encode authority epoch, endpoint version, primary role, or replica readiness.
- Static PV must not embed `iscsiAddr` or `iqn` in the close-path example.
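One way to check the boundary rule mechanically is a key scan over each manifest. The sketch below is an assumption-heavy approximation: `findKeys` is a hypothetical helper, it works line by line instead of parsing YAML, and it treats everything after `#` as a comment even inside quoted strings. The forbidden names are the ones this plan lists.

```go
package main

import (
	"fmt"
	"strings"
)

// findKeys returns "line N: key" for every YAML-looking key in manifest that
// matches one of the forbidden names (case-insensitive).
// Line-based approximation only: no YAML parsing, naive '#' comment stripping.
func findKeys(manifest string, forbidden []string) []string {
	var hits []string
	for i, line := range strings.Split(manifest, "\n") {
		if idx := strings.Index(line, "#"); idx >= 0 {
			line = line[:idx] // drop the comment part
		}
		line = strings.TrimPrefix(strings.TrimSpace(line), "- ")
		key, _, found := strings.Cut(line, ":")
		if !found {
			continue
		}
		key = strings.Trim(strings.TrimSpace(key), `"'`)
		for _, f := range forbidden {
			if strings.EqualFold(key, f) {
				hits = append(hits, fmt.Sprintf("line %d: %s", i+1, key))
			}
		}
	}
	return hits
}

// Illustrative static PV that violates the rule by embedding a target IQN.
const badPV = `apiVersion: v1
kind: PersistentVolume
metadata:
  name: g15b-static-pv
spec:
  csi:
    driver: block.csi.seaweedfs.com
    volumeHandle: vol-1
    volumeAttributes:
      iqn: iqn.2026-05.example:vol-1
`

func main() {
	forbidden := []string{
		"epoch", "endpointVersion", "primary", "healthy", "ready", // authority-shaped
		"iscsiAddr", "iqn", "nqn", // target facts
	}
	for _, hit := range findKeys(badPV, forbidden) {
		fmt.Println("forbidden key at", hit)
	}
}
```

Matching whole key names rather than substrings keeps unrelated Kubernetes fields such as `readinessProbe` from tripping the scan.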
---
## §4 Red Tests / Guards
Land these before or with the first manifest commit:
1. `TestG15b_Manifest_CSIDriverRequiresAttach`
- Asserts `CSIDriver.spec.attachRequired == true`.
2. `TestG15b_Manifest_ControllerUsesAttacherNotProvisioner`
- Asserts controller manifest contains `csi-attacher`.
- Asserts controller manifest does not contain `csi-provisioner`.
3. `TestG15b_Manifest_StaticPVDoesNotEmbedTargetFacts`
- Asserts static PV example does not contain `iscsiAddr`, `iqn`, `nqn`, or endpoint/version fields.
4. `TestG15b_Manifest_NodePluginPrivilegedShape`
- Asserts node DaemonSet is privileged, sets `hostNetwork: true`, and mounts `/var/lib/kubelet`, `/dev`, and `/etc/iscsi`.
5. `TestG15b_Manifest_StaticPVUsesBlockCSIDriver`
- Asserts PV driver is `block.csi.seaweedfs.com`.
- Asserts `volumeHandle` is a V3 volume ID, not an endpoint.
6. `TestG15b_Manifest_NoAuthorityShapedFields`
- Scans deploy examples for `epoch`, `endpointVersion`, `primary`, `healthy`, `ready` outside comments.
These tests are not a substitute for Kubernetes. They prevent the highest-risk configuration drift before the privileged lab run.
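The first two guards could be built on helpers along these lines. This is a sketch under stated assumptions: `attachRequired` and `mentions` are hypothetical helper names, the manifests are scanned as text with naive `#` comment stripping instead of being parsed as YAML, and the sketch deliberately requires the `attachRequired` key to be written out explicitly.

```go
package main

import (
	"fmt"
	"strings"
)

// stripComment drops everything after '#'. Naive: it does not understand quoting.
func stripComment(line string) string {
	if i := strings.Index(line, "#"); i >= 0 {
		return line[:i]
	}
	return line
}

// attachRequired returns the value of the first explicit attachRequired key,
// and whether such a key with a true/false value was found at all.
func attachRequired(manifest string) (value, found bool) {
	for _, line := range strings.Split(manifest, "\n") {
		key, val, ok := strings.Cut(strings.TrimSpace(stripComment(line)), ":")
		if !ok || strings.TrimSpace(key) != "attachRequired" {
			continue
		}
		switch strings.TrimSpace(val) {
		case "true":
			return true, true
		case "false":
			return false, true
		}
	}
	return false, false
}

// mentions reports whether name appears anywhere outside comments.
func mentions(manifest, name string) bool {
	for _, line := range strings.Split(manifest, "\n") {
		if strings.Contains(stripComment(line), name) {
			return true
		}
	}
	return false
}

// Illustrative manifests; container layout is an assumption of this sketch.
const csiDriver = `apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: block.csi.seaweedfs.com
spec:
  attachRequired: true
`

const controller = `containers:
  - name: blockcsi
  - name: csi-attacher
  # - name: csi-provisioner   (dynamic provisioning is out of scope for G15b)
`

func main() {
	v, ok := attachRequired(csiDriver)
	fmt.Println("attachRequired explicit:", ok, "value:", v)
	fmt.Println("uses csi-attacher:", mentions(controller, "csi-attacher"))
	fmt.Println("uses csi-provisioner:", mentions(controller, "csi-provisioner"))
}
```

A real `TestG15b_Manifest_*` test would more likely decode the YAML into typed objects; the text scan is only meant to show what each guard asserts.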
---
## §5 Slices
### G15b-1 — K8s Manifest Skeleton + Static Guards
Code:
- Add V3 Kubernetes manifests, likely under:
```text
deploy/k8s/g15b/
```
Files:
- `csi-driver.yaml`
- `rbac.yaml`
- `csi-controller.yaml`
- `csi-node.yaml`
- `static-pv-pvc-pod.yaml`
Tests:
```powershell
go test ./cmd/blockcsi -run TestG15b_Manifest -count=1 -v
```
Pass:
- All §4 manifest guards green.
- No product behavior change.
### G15b-2 — K8s Lab Harness
Artifacts:
- `V:\share\g15b-k8s\run-g15b-k8s-static.sh`
- `sw-block/design/test/v3-phase-15-g15b-k8s-qa-test-instruction.md`
Harness responsibilities:
1. Build V3 binaries/images for `blockmaster`, `blockvolume`, and `blockcsi`.
2. Load images into the test cluster.
3. Apply cluster-spec/product-loop setup for one RF=2 volume.
4. Apply CSI driver/controller/node manifests.
5. Apply static PV/PVC/pod.
6. Wait for pod ready.
7. Exec into pod and perform write/read checksum.
8. Delete pod/PVC/PV and assert node plugin cleanup.
9. Collect logs and relevant Kubernetes events.
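Step 7's write/read checksum could look like the Go sketch below. It is an assumption about shape only: the harness may well do this with shell tools inside the pod, and `writeReadVerify` / `verifyInTempDir` are hypothetical names. The sketch runs against a temporary directory so it works anywhere; in the lab the directory would be the PVC mount path inside the pod.

```go
package main

import (
	"bytes"
	"crypto/rand"
	"crypto/sha256"
	"fmt"
	"os"
	"path/filepath"
)

// writeReadVerify writes size random bytes under dir, fsyncs, reads them back,
// and returns the hex SHA-256 of the data if the bytes match exactly.
func writeReadVerify(dir string, size int) (string, error) {
	want := make([]byte, size)
	if _, err := rand.Read(want); err != nil {
		return "", fmt.Errorf("generate payload: %w", err)
	}
	path := filepath.Join(dir, "g15b-oracle.bin")
	f, err := os.Create(path)
	if err != nil {
		return "", fmt.Errorf("create: %w", err)
	}
	if _, err := f.Write(want); err != nil {
		f.Close()
		return "", fmt.Errorf("write: %w", err)
	}
	if err := f.Sync(); err != nil { // push data toward the block device
		f.Close()
		return "", fmt.Errorf("sync: %w", err)
	}
	if err := f.Close(); err != nil {
		return "", fmt.Errorf("close: %w", err)
	}
	got, err := os.ReadFile(path)
	if err != nil {
		return "", fmt.Errorf("read back: %w", err)
	}
	if !bytes.Equal(got, want) {
		return "", fmt.Errorf("byte mismatch: wrote %d bytes, read %d", len(want), len(got))
	}
	return fmt.Sprintf("%x", sha256.Sum256(got)), nil
}

// verifyInTempDir runs the oracle in a throwaway directory.
func verifyInTempDir(size int) (string, error) {
	dir, err := os.MkdirTemp("", "g15b-oracle-")
	if err != nil {
		return "", err
	}
	defer os.RemoveAll(dir)
	return writeReadVerify(dir, size)
}

func main() {
	sum, err := verifyInTempDir(1 << 20) // 1 MiB payload, illustrative size
	if err != nil {
		fmt.Println("oracle failed:", err)
		os.Exit(1)
	}
	fmt.Println("byte-equal, sha256:", sum)
}
```

An immediate read-back is typically served from the page cache, so a stricter harness might record the checksum, restart the pod, and compare again after the volume is restaged.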
Pass:
- Pod sees mounted filesystem.
- Pod writes and reads byte-equal data.
- No dangling iSCSI session for the test IQN after cleanup.
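The dangling-session check could be a small parser over the text printed by `iscsiadm -m session`. The sketch below is a hedged illustration: `hasSessionFor` is a hypothetical helper, the sample line is illustrative rather than captured from a lab run, and the harness would still need to run the command on the node and handle the case where no sessions exist.

```go
package main

import (
	"fmt"
	"strings"
)

// hasSessionFor reports whether the session listing names iqn as a whole
// whitespace-separated field. Exact field matching avoids treating
// "…:vol-1" as a match for "…:vol-10".
func hasSessionFor(sessionOutput, iqn string) bool {
	for _, field := range strings.Fields(sessionOutput) {
		if field == iqn {
			return true
		}
	}
	return false
}

func main() {
	// Illustrative listing in the usual "transport: [sid] portal,tpgt iqn" shape.
	listing := "tcp: [1] 127.0.0.1:3260,1 iqn.2026-05.example:vol-1 (non-flash)\n"
	testIQN := "iqn.2026-05.example:vol-1"

	if hasSessionFor(listing, testIQN) {
		fmt.Println("dangling session for", testIQN)
	} else {
		fmt.Println("no session for", testIQN)
	}
}
```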
### G15b-3 — First Kubernetes Close Run
Evidence target:
- Real Kubernetes control plane.
- Real CSI external-attacher.
- Real kubelet calling NodeStage/NodePublish.
- Real iSCSI login/mkfs/mount on the node.
- Pod-level byte-equal oracle.
Close anchor:
- Pin commit, image digest, cluster node(s), kernel, Kubernetes version, and artifact directory.
---
## §6 Non-Claims
G15b does not claim:
- dynamic CSI `CreateVolume`;
- snapshots, clones, or expansion;
- NVMe CSI path;
- multi-node RWO enforcement;
- pod remount after primary kill;
- failover under a live mounted filesystem;
- network-partition behavior;
- security for routable iSCSI/NVMe target exposure;
- performance or soak.
---
## §7 Open Questions
Q1. First lab topology:
Recommended default: single-node Kubernetes on m01 first, because the current frontend loopback guard is intentional and the G15a privileged evidence already proves the Linux node path on m01.
Q2. Should `cmd/blockcsi` grow `--mode=controller|node|all` before G15b?
Recommended default: not required for first close. Running all CSI services in both controller and node pods is acceptable if sidecars call only the relevant service. Add `--mode` only if sidecar behavior or logs become confusing.
Q3. Should static PV include target fallback fields?
Recommended default: no for the close-path PV. Keep fallback only in code/tests for debug and plugin-restart recovery.
Q4. Should G15b include the V2 dynamic provisioning sidecar?
Recommended default: no. Dynamic provisioning belongs after the product API can create desired volumes and placement safely.
---
## §8 QA Command Targets
Before K8s lab:
```powershell
go test ./cmd/blockcsi -run TestG15b_Manifest -count=1 -v
go test ./core/csi ./cmd/blockcsi ./core/host/volume ./core/host/master ./core/authority ./cmd/blockmaster ./cmd/blockvolume -count=1
```
The K8s lab command will be added in `v3-phase-15-g15b-k8s-qa-test-instruction.md` once the harness exists.
---
## §9 Start Decision
Start with G15b-1 manifest skeleton + manifest red tests.
Do not start by applying YAML to Kubernetes. The attach semantics and manifest shape are the failure-prone boundary; pin them first, then run the cluster.