* Fix VolumeGroupSnapshot restore on Ceph RBD
This PR fixes two related issues affecting CSI snapshot restore on Ceph RBD:
1. VolumeGroupSnapshot restore fails because Ceph RBD populates
volumeGroupSnapshotHandle on pre-provisioned VSCs, but Velero doesn't
create the required VGSC during restore.
2. CSI snapshot restore fails because VolumeSnapshotClassName is removed
from restored VSCs, preventing the CSI controller from getting
credentials for snapshot verification.
Changes:
- Capture volumeGroupSnapshotHandle during backup as VS annotation
- Create stub VGSC during restore with matching handle in status
- Look up VolumeSnapshotClass by driver and set on restored VSC
Fixes#9512Fixes#9515
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
* Add changelog for VGS restore fix
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
* Fix gofmt import order
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
* Add changelog for VGS restore fix
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
* Fix import alias corev1 to corev1api per lint config
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
* Fix: Add snapshot handles to existing stub VGSC and add unit tests
When multiple VolumeSnapshots from the same VolumeGroupSnapshot are
restored, they share the same VolumeGroupSnapshotHandle but have
different individual snapshot handles. This commit:
1. Fixes incomplete logic where existing VGSC wasn't updated with
new snapshot handles (addresses review feedback)
2. Fixes race condition where Create returning AlreadyExists would
skip adding the snapshot handle
3. Adds comprehensive unit tests for ensureStubVGSCExists (5 cases)
and addSnapshotHandleToVGSC (4 cases) functions
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
* Clean up stub VolumeGroupSnapshotContents during restore finalization
Add cleanup logic for stub VGSCs created during VolumeGroupSnapshot restore.
The stub VGSCs are temporary objects needed to satisfy CSI controller
validation during VSC reconciliation. Once all related VSCs become
ReadyToUse, the stub VGSCs are no longer needed and should be removed.
The cleanup runs in the restore finalizer controller's execute() phase.
Before deleting each VGSC, it polls until all related VolumeSnapshotContents
(correlated by snapshot handle) are ReadyToUse, with a timeout fallback.
Deletion failures and CRD-not-installed scenarios are treated as warnings
rather than errors to avoid failing the restore.
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
* Fix lint: remove unused nolint directive and simplify cleanupStubVGSC return
The cleanupStubVGSC function only produces warnings (not errors), so
simplify its return signature. Also remove the now-unused nolint:unparam
directive on execute() since warnings are no longer always nil.
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
---------
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
This commit addresses reviewer feedback on PR #9441 regarding
concurrent backup caching concerns. Key changes:
1. Added lazy per-namespace caching for the CSI PVC BIA plugin path:
- Added IsNamespaceBuilt() method to check if namespace is cached
- Added BuildCacheForNamespace() for lazy, per-namespace cache building
- Plugin builds cache incrementally as namespaces are encountered
2. Added NewVolumeHelperImplWithCache constructor for plugins:
- Accepts externally-managed PVC-to-Pod cache
- Follows pattern from PR #9226 (Scott Seago's design)
3. Plugin instance lifecycle clarification:
- Plugin instances are unique per backup (created via newPluginManager)
- Cleaned up via CleanupClients at backup completion
- No mutex or backup UID tracking needed
4. Test coverage:
- Added tests for IsNamespaceBuilt and BuildCacheForNamespace
- Added tests for NewVolumeHelperImplWithCache constructor
- Added test verifying cache usage for fs-backup determination
This maintains the O(N+M) complexity improvement from issue #9179
while addressing architectural concerns about concurrent access.
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
Address review feedback to have a global VolumeHelper instance per
plugin process instead of creating one on each ShouldPerformSnapshot
call.
Changes:
- Add volumeHelper, cachedForBackup, and mu fields to pvcBackupItemAction
struct for caching the VolumeHelper per backup
- Add getOrCreateVolumeHelper() method for thread-safe lazy initialization
- Update Execute() to use cached VolumeHelper via
ShouldPerformSnapshotWithVolumeHelper()
- Update filterPVCsByVolumePolicy() to accept VolumeHelper parameter
- Add ShouldPerformSnapshotWithVolumeHelper() that accepts optional
VolumeHelper for reuse across multiple calls
- Add NewVolumeHelperForBackup() factory function for BIA plugins
- Add comprehensive unit tests for both nil and non-nil VolumeHelper paths
This completes the fix for issue #9179 by ensuring the PVC-to-Pod cache
is built once per backup and reused across all PVC processing, avoiding
O(N*M) complexity.
Fixes#9179
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
VolumeGroupSnapshots were querying all PVCs with matching labels
directly from the cluster without respecting volume policies. This
caused errors when labeled PVCs included both CSI and non-CSI volumes,
or volumes from different CSI drivers that were excluded by policies.
This change filters PVCs by volume policy before VGS grouping,
ensuring only PVCs that should be snapshotted are included in the
group. A warning is logged when PVCs are excluded from VGS due to
volume policy.
Fixes#9344
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>