Commit Graph

14 Commits

Author SHA1 Message Date
Shubham Pampattiwar
5ad4e604b8 [release-1.18] Fix VolumeGroupSnapshot restore failure with Ceph RBD CSI driver (#9687)
Some checks failed
Run the E2E test on kind / get-go-version (push) Failing after 1m4s
Run the E2E test on kind / build (push) Has been skipped
Run the E2E test on kind / setup-test-matrix (push) Successful in 3s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / get-go-version (push) Failing after 12s
Main CI / Build (push) Has been skipped
* Fix VolumeGroupSnapshot restore failure with Ceph RBD CSI driver (#9516)

* Fix VolumeGroupSnapshot restore on Ceph RBD

This PR fixes two related issues affecting CSI snapshot restore on Ceph RBD:

1. VolumeGroupSnapshot restore fails because Ceph RBD populates
   volumeGroupSnapshotHandle on pre-provisioned VSCs, but Velero doesn't
   create the required VGSC during restore.

2. CSI snapshot restore fails because VolumeSnapshotClassName is removed
   from restored VSCs, preventing the CSI controller from getting
   credentials for snapshot verification.

Changes:
- Capture volumeGroupSnapshotHandle during backup as VS annotation
- Create stub VGSC during restore with matching handle in status
- Look up VolumeSnapshotClass by driver and set on restored VSC

Fixes #9512
Fixes #9515

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

* Add changelog for VGS restore fix

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

* Fix gofmt import order

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

* Add changelog for VGS restore fix

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

* Fix import alias corev1 to corev1api per lint config

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

* Fix: Add snapshot handles to existing stub VGSC and add unit tests

When multiple VolumeSnapshots from the same VolumeGroupSnapshot are
restored, they share the same VolumeGroupSnapshotHandle but have
different individual snapshot handles. This commit:

1. Fixes incomplete logic where existing VGSC wasn't updated with
   new snapshot handles (addresses review feedback)

2. Fixes race condition where Create returning AlreadyExists would
   skip adding the snapshot handle

3. Adds comprehensive unit tests for ensureStubVGSCExists (5 cases)
   and addSnapshotHandleToVGSC (4 cases) functions

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

* Clean up stub VolumeGroupSnapshotContents during restore finalization

Add cleanup logic for stub VGSCs created during VolumeGroupSnapshot restore.
The stub VGSCs are temporary objects needed to satisfy CSI controller
validation during VSC reconciliation. Once all related VSCs become
ReadyToUse, the stub VGSCs are no longer needed and should be removed.

The cleanup runs in the restore finalizer controller's execute() phase.
Before deleting each VGSC, it polls until all related VolumeSnapshotContents
(correlated by snapshot handle) are ReadyToUse, with a timeout fallback.
Deletion failures and CRD-not-installed scenarios are treated as warnings
rather than errors to avoid failing the restore.

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

* Fix lint: remove unused nolint directive and simplify cleanupStubVGSC return

The cleanupStubVGSC function only produces warnings (not errors), so
simplify its return signature. Also remove the now-unused nolint:unparam
directive on execute() since warnings are no longer always nil.

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

---------

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

* Rename changelog file to match cherry-pick PR number

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

---------

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2026-04-08 12:45:02 -07:00
Matthieu MOREL
c6a420bd3a chore: define common aliases for k8s packages (#8672)
Some checks failed
Run the E2E test on kind / build (push) Failing after 6m48s
Run the E2E test on kind / setup-test-matrix (push) Successful in 3s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / Build (push) Failing after 35s
Close stale issues and PRs / stale (push) Successful in 8s
Trivy Nightly Scan / Trivy nightly scan (velero, main) (push) Failing after 1m11s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-aws, main) (push) Failing after 47s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-gcp, main) (push) Failing after 49s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-microsoft-azure, main) (push) Failing after 43s
* lchore: define common alias for k8s packages

Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>

* Update .golangci.yaml

Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>

* Update .golangci.yaml

Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>

* Update .golangci.yaml

Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>

---------

Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-04-22 06:14:47 -04:00
Xun Jiang
6b7dd12bf7 Modify VS and VSC restore actions.
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2025-02-25 10:44:45 +08:00
Tiger Kaovilai
a3cee616dc Upgrade go.mod k8s.io/ go.mod to v0.31.3 and set klog.SetLogger() for client-go (#8450)
Some checks failed
Run the E2E test on kind / build (push) Failing after 5m44s
Run the E2E test on kind / setup-test-matrix (push) Successful in 3s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
build-image / Build (push) Failing after 10s
Main CI / Build (push) Failing after 31s
Close stale issues and PRs / stale (push) Successful in 7s
Trivy Nightly Scan / Trivy nightly scan (velero, main) (push) Failing after 59s
Trivy Nightly Scan / Trivy nightly scan (velero-restore-helper, main) (push) Failing after 45s
Also bumped to support upgraded k8s.io/ deps.
- controller-gen to v0.16.5
- sigs.k8s.io/controller-runtime v0.19.2

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
2025-02-17 15:05:10 -05:00
Daniel Jiang
dc02caf2b0 Skip patching the PV in finalization for failed operation
This commit makes change in restore finalizer controller, to make it
check the status in item operation of a PVC before patch the PV that is
bound to it.  If the operation is not successful it will skip patching
the PV.

Signed-off-by: Daniel Jiang <daniel.jiang@broadcom.com>
2025-01-09 01:42:50 +08:00
Tiger Kaovilai
c643ee5fd4 Retry completion status patch for backup and restore resources
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>

update to design #8063

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
2024-09-10 17:01:14 -04:00
Tiger Kaovilai
6d0d1aaccc tautological condition: non-nil != nil
https://pkg.go.dev/golang.org/x/tools/go/analysis/passes/nilness#cond:~:text=p%20%3A%3D%20%26v%0A...%0Aif%20p%20!%3D%20nil%20%7B%20//%20tautological%20condition%0A%7D
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
2024-07-31 22:20:48 -04:00
Shubham Pampattiwar
fd6c74715a Expose PVPatchMaximumDuration timeout for custom configuration
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

remove debug log

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

use resource timeout server arg

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

add changelog

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

remove hardcoded PVPatchMaximumtimeout const usagDe

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2024-07-19 12:44:26 -07:00
Shubham Pampattiwar
3bd8a7da7d Skip PV patch step in Restoe workflow for WaitForFirstConsumer VolumeBindingMode Pending state PVCs (#7953)
add changelog file



change log level and add more detailed comments



make update



add return for sc get call if error

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2024-07-11 18:02:21 -04:00
allenxu404
28552258ae Wait for results of restore exec hook executions in Finalizing phase instead of InProgress phase
Signed-off-by: allenxu404 <qix2@vmware.com>
2024-04-15 17:49:36 +08:00
Daniel Jiang
0a280e5786 Track and persist restore volume info
Signed-off-by: Daniel Jiang <jiangd@vmware.com>
2024-04-11 17:32:18 +08:00
Xun Jiang
efb94ae610 Refactor the native snapshot definition code.
Signed-off-by: Xun Jiang <blackpigletbruce@gmail.com>
2024-03-20 15:38:07 +08:00
allenxu404
67b5e82d49 Patch newly dynamically provisioned PV with volume info to restore custom setting of PV
Signed-off-by: allenxu404 <qix2@vmware.com>
2024-03-18 17:32:35 +08:00
allenxu404
2b8bb871d3 Add the finalization phase to the restore workflow
Signed-off-by: allenxu404 <qix2@vmware.com>
2024-02-29 13:51:45 +08:00