mirror of
https://github.com/seaweedfs/seaweedfs.git
synced 2026-05-28 12:41:15 +00:00
docs(p15): add G15b Kubernetes static PV mini-plan
246
sw-block/design/v3-phase-15-g15b-k8s-static-pv-mini-plan.md
Normal file
@@ -0,0 +1,246 @@
# V3 Phase 15 — G15b Kubernetes Static PV Mini-Plan

**Date**: 2026-05-03
**Status**: architect draft; follows G15a close on `seaweed_block@ac49adb`
**Expected branch**: continue from `p15-g15a/csi-static-mvp`, or branch `p15-g15b/k8s-static-pv` from `ac49adb`
**Goal**: prove a Kubernetes pod can consume a pre-provisioned V3 block volume through `cmd/blockcsi`, using real Kubernetes CSI control flow and real Linux iSCSI staging.

---

## §1 Scope

G15b is the Kubernetes form of the G15a static/pre-provisioned CSI path.

The intended product chain is:

```text
cluster-spec / product-loop assignment
  -> blockmaster publishes authority + frontend target facts
  -> Kubernetes external-attacher calls blockcsi ControllerPublish
  -> blockcsi reads frontend facts from blockmaster
  -> Kubernetes kubelet calls NodeStage with publish_context
  -> blockcsi NodeStage performs real iscsiadm login + mkfs/mount
  -> kubelet mounts PVC into pod
  -> pod writes/reads byte-equal data
  -> pod/PVC cleanup unstages/logs out cleanly
```

Close claim:

> A static Kubernetes PV/PVC/pod can use an already-provisioned V3 volume through the V3 CSI driver without embedding stale target identity in the PV and without CSI minting authority.

---

## §2 Critical Design Decision: Attach Required

G15b must not copy V2's first deployment shape blindly.

The V2 `CSIDriver` used:

```yaml
attachRequired: false
```

That is not the right first-close shape for V3 G15b if the claim is "target facts come from blockmaster through `ControllerPublish`."

With `attachRequired=false`, Kubernetes may bypass `ControllerPublish`. In that mode, `NodeStageVolume` receives only PV `volumeAttributes` / `VolumeContext`, so a static PV would need to embed `iscsiAddr` and `iqn`. That is acceptable as an emergency/debug fallback, but it is not the G15b product path because it duplicates frontend target truth outside blockmaster.

G15b therefore uses:

```yaml
attachRequired: true
```

and deploys the CSI external-attacher. The external-attacher invokes `ControllerPublishVolume`, receives `publish_context`, and kubelet passes that context into `NodeStageVolume`.

Allowed fallback:

- `NodeStageVolume` may continue supporting `VolumeContext` target fields because G15a already uses this as a low-level mechanism fallback.

G15b non-claim:

- static PV target-address fallback is not the close path and must not be used as the primary Kubernetes evidence.
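
The precedence described in this section (facts from `publish_context` first, `VolumeContext` only as a fallback) can be sketched in Go as below. This is a minimal illustration, not the `blockcsi` implementation: `Target`, `resolveTarget`, and `errNoTarget` are hypothetical names, and reusing the `iscsiAddr` / `iqn` key names for `publish_context` is an assumption, since the plan only names them for `VolumeContext`.

```go
package main

import (
	"errors"
	"fmt"
)

// Target holds the iSCSI facts a NodeStage call needs before logging in.
// Hypothetical type for this sketch.
type Target struct {
	Addr string // portal address
	IQN  string // target IQN
	From string // which map supplied the facts
}

// Key names: the plan names these for VolumeContext; using the same keys
// in publish_context is an assumption made for this sketch.
const (
	keyAddr = "iscsiAddr"
	keyIQN  = "iqn"
)

var errNoTarget = errors.New("no iSCSI target facts in publish_context or volume_context")

// lookup returns a Target only when both facts are present in m.
// Indexing a nil map is safe in Go and yields empty strings.
func lookup(m map[string]string, from string) (Target, bool) {
	addr, iqn := m[keyAddr], m[keyIQN]
	if addr == "" || iqn == "" {
		return Target{}, false
	}
	return Target{Addr: addr, IQN: iqn, From: from}, true
}

// resolveTarget prefers publish_context (the ControllerPublish path) and
// falls back to volume_context (the static PV debug path) only when
// publish_context carries no complete target.
func resolveTarget(publishCtx, volumeCtx map[string]string) (Target, error) {
	if t, ok := lookup(publishCtx, "publish_context"); ok {
		return t, nil
	}
	if t, ok := lookup(volumeCtx, "volume_context"); ok {
		return t, nil
	}
	return Target{}, errNoTarget
}

func main() {
	// Illustrative values only.
	publishCtx := map[string]string{keyAddr: "127.0.0.1:3260", keyIQN: "iqn.2026-05.example:vol-1"}
	t, err := resolveTarget(publishCtx, nil)
	fmt.Println(t.From, t.Addr, t.IQN, err)
}
```

Recording `From` is a design choice of the sketch: it would let a lab run show which path supplied the target, which matters because fallback-path evidence does not count toward the close.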

---

## §3 V2 Deploy Port Discipline

V2 source directory:

```text
weed/storage/blockvol/csi/deploy/
```

Port decisions:

| V2 file | G15b decision | Reason |
|---|---|---|
| `csi-driver.yaml` | PORT-REBIND, but change `attachRequired` to `true` | V3 needs ControllerPublish to carry master frontend facts. |
| `csi-node.yaml` | PORT-AS-IS / light rebind | Privileged node plugin, hostNetwork, kubelet/dev/iscsi mounts are mechanism. |
| `rbac.yaml` | PORT-AS-IS / trim to sidecars used | Kubernetes RBAC mechanism. |
| `csi-controller.yaml` | PORT-REBIND | Use `blockcsi` + `csi-attacher`; do not include `csi-provisioner` in G15b. |
| `storageclass.yaml` | SKIP | G15b is static PV, no dynamic provisioning. |
| `example-pvc.yaml` | REWRITE-TINY | Replace dynamic PVC with static PV+PVC+pod example. |

Boundary rule:

- Manifests may wire product binaries and sidecars.
- Manifests must not encode authority epoch, endpoint version, primary role, or replica readiness.
- Static PV must not embed `iscsiAddr` or `iqn` in the close-path example.

---

## §4 Red Tests / Guards

Land these before or with the first manifest commit:

1. `TestG15b_Manifest_CSIDriverRequiresAttach`
   - Asserts `CSIDriver.spec.attachRequired == true`.

2. `TestG15b_Manifest_ControllerUsesAttacherNotProvisioner`
   - Asserts controller manifest contains `csi-attacher`.
   - Asserts controller manifest does not contain `csi-provisioner`.

3. `TestG15b_Manifest_StaticPVDoesNotEmbedTargetFacts`
   - Asserts static PV example does not contain `iscsiAddr`, `iqn`, `nqn`, or endpoint/version fields.

4. `TestG15b_Manifest_NodePluginPrivilegedShape`
   - Asserts node DaemonSet is privileged, `hostNetwork: true`, and mounts `/var/lib/kubelet`, `/dev`, and `/etc/iscsi`.

5. `TestG15b_Manifest_StaticPVUsesBlockCSIDriver`
   - Asserts PV driver is `block.csi.seaweedfs.com`.
   - Asserts `volumeHandle` is a V3 volume ID, not an endpoint.

6. `TestG15b_Manifest_NoAuthorityShapedFields`
   - Scans deploy examples for `epoch`, `endpointVersion`, `primary`, `healthy`, `ready` outside comments.

These tests are not a substitute for Kubernetes. They prevent the highest-risk configuration drift before the privileged lab run.
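
As a rough illustration of guards 1 and 2, the checks could look like the Go sketch below. It is written as plain functions over manifest text so it runs standalone; the real guards would presumably be `testing` functions in `cmd/blockcsi` reading the deployed YAML files. `scalarField`, `checkCSIDriver`, and `checkController` are hypothetical helpers, and the embedded manifests are illustrative fragments, not the shipped files.

```go
package main

import (
	"fmt"
	"strings"
)

// scalarField returns the value of the first "key: value" line with the
// given key, skipping comment lines and trailing comments. Stdlib-only and
// indentation-blind; a real guard would likely decode the YAML.
func scalarField(manifest, key string) (string, bool) {
	for _, line := range strings.Split(manifest, "\n") {
		line = strings.TrimSpace(line)
		if strings.HasPrefix(line, "#") {
			continue
		}
		k, v, ok := strings.Cut(line, ":")
		if !ok || strings.TrimSpace(k) != key {
			continue
		}
		if i := strings.Index(v, " #"); i >= 0 {
			v = v[:i]
		}
		return strings.TrimSpace(v), true
	}
	return "", false
}

// checkCSIDriver mirrors guard 1. It requires the field to be spelled out,
// even though Kubernetes treats an omitted attachRequired as true, so the
// decision stays visible in the manifest.
func checkCSIDriver(manifest string) error {
	v, ok := scalarField(manifest, "attachRequired")
	if !ok {
		return fmt.Errorf("attachRequired is not set explicitly")
	}
	if v != "true" {
		return fmt.Errorf("attachRequired = %q, want true", v)
	}
	return nil
}

// checkController mirrors guard 2: attacher present, provisioner absent.
func checkController(manifest string) error {
	if !strings.Contains(manifest, "csi-attacher") {
		return fmt.Errorf("controller manifest lacks csi-attacher")
	}
	if strings.Contains(manifest, "csi-provisioner") {
		return fmt.Errorf("controller manifest must not include csi-provisioner")
	}
	return nil
}

// Illustrative fragments only.
const csiDriver = `apiVersion: storage.k8s.io/v1
kind: CSIDriver
metadata:
  name: block.csi.seaweedfs.com
spec:
  attachRequired: true
`

const controller = `containers:
  - name: blockcsi
  - name: csi-attacher
`

func main() {
	fmt.Println("CSIDriver:", checkCSIDriver(csiDriver))
	fmt.Println("controller:", checkController(controller))
}
```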

---

## §5 Slices

### G15b-1 — K8s Manifest Skeleton + Static Guards

Code:

- Add V3 Kubernetes manifests, likely under:

```text
deploy/k8s/g15b/
```

Files:

- `csi-driver.yaml`
- `rbac.yaml`
- `csi-controller.yaml`
- `csi-node.yaml`
- `static-pv-pvc-pod.yaml`

Tests:

```powershell
go test ./cmd/blockcsi -run TestG15b_Manifest -count=1 -v
```

Pass:

- All §4 manifest guards green.
- No product behavior change.

### G15b-2 — K8s Lab Harness

Artifacts:

- `V:\share\g15b-k8s\run-g15b-k8s-static.sh`
- `sw-block/design/test/v3-phase-15-g15b-k8s-qa-test-instruction.md`

Harness responsibilities:

1. Build V3 binaries/images for `blockmaster`, `blockvolume`, and `blockcsi`.
2. Load images into the test cluster.
3. Apply cluster-spec/product-loop setup for one RF=2 volume.
4. Apply CSI driver/controller/node manifests.
5. Apply static PV/PVC/pod.
6. Wait for pod ready.
7. Exec into pod and perform write/read checksum.
8. Delete pod/PVC/PV and assert node plugin cleanup.
9. Collect logs and relevant Kubernetes events.

Pass:

- Pod sees mounted filesystem.
- Pod writes and reads byte-equal data.
- No dangling iSCSI session for the test IQN after cleanup.

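The "no dangling iSCSI session" criterion above could be checked by looking for the test IQN in the text printed by `iscsiadm -m session` on the node. The Go sketch below only parses captured output; `hasSession` is a hypothetical helper, and the sample lines are illustrative of the usual one-session-per-line listing rather than taken from a lab run.

```go
package main

import (
	"fmt"
	"strings"
)

// hasSession reports whether any whitespace-separated field in the session
// listing equals iqn. Exact field comparison avoids treating "...:vol-1" as
// a match for "...:vol-10".
func hasSession(sessionOutput, iqn string) bool {
	for _, line := range strings.Split(sessionOutput, "\n") {
		for _, field := range strings.Fields(line) {
			if field == iqn {
				return true
			}
		}
	}
	return false
}

func main() {
	// Illustrative listing; addresses and IQNs are made up.
	before := "tcp: [1] 127.0.0.1:3260,1 iqn.2026-05.example:vol-1 (non-flash)\n" +
		"tcp: [2] 127.0.0.1:3260,1 iqn.2026-05.example:vol-10 (non-flash)\n"
	after := "tcp: [2] 127.0.0.1:3260,1 iqn.2026-05.example:vol-10 (non-flash)\n"

	iqn := "iqn.2026-05.example:vol-1"
	fmt.Println("before cleanup:", hasSession(before, iqn))
	fmt.Println("after cleanup:", hasSession(after, iqn))
}
```

In a harness, the listing would come from running the command on the node and capturing its output; when no sessions exist, `iscsiadm` reports that instead of a listing, which this helper treats as "no session".
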
### G15b-3 — First Kubernetes Close Run

Evidence target:

- Real Kubernetes control plane.
- Real CSI external-attacher.
- Real kubelet calling NodeStage/NodePublish.
- Real iSCSI login/mkfs/mount on the node.
- Pod-level byte-equal oracle.

Close anchor:

- Pin commit, image digest, cluster node(s), kernel, Kubernetes version, and artifact directory.

---

## §6 Non-Claims

G15b does not claim:

- dynamic CSI `CreateVolume`;
- snapshots, clones, or expansion;
- NVMe CSI path;
- multi-node RWO enforcement;
- pod remount after primary kill;
- failover under live mounted filesystem;
- network-partition behavior;
- security for routable iSCSI/NVMe target exposure;
- performance or soak.

---

## §7 Open Questions

Q1. First lab topology:

Recommended default: single-node Kubernetes on m01 first, because current frontend loopback guard is intentional and G15a privileged evidence already proves the Linux node path on m01.

Q2. Should `cmd/blockcsi` grow `--mode=controller|node|all` before G15b?

Recommended default: not required for first close. Running all CSI services in both controller and node pods is acceptable if sidecars call only the relevant service. Add `--mode` only if sidecar behavior or logs become confusing.

Q3. Should static PV include target fallback fields?

Recommended default: no for the close-path PV. Keep fallback only in code/tests for debug and plugin-restart recovery.

Q4. Should G15b include V2 dynamic provisioning sidecar?

Recommended default: no. Dynamic provisioning belongs after the product API can create desired volumes and placement safely.

---

## §8 QA Command Targets

Before K8s lab:

```powershell
go test ./cmd/blockcsi -run TestG15b_Manifest -count=1 -v
go test ./core/csi ./cmd/blockcsi ./core/host/volume ./core/host/master ./core/authority ./cmd/blockmaster ./cmd/blockvolume -count=1
```

K8s lab command will be added in `v3-phase-15-g15b-k8s-qa-test-instruction.md` once the harness exists.

---

## §9 Start Decision

Start with G15b-1 manifest skeleton + manifest red tests.

Do not start by applying YAML to Kubernetes. The attach semantics and manifest shape are the failure-prone boundary; pin them first, then run the cluster.