Shubham Pampattiwar
f4c4653c08
Fix linter errors: use 'any' instead of 'interface{}'
...
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-15 14:18:05 -08:00
Shubham Pampattiwar
987edf5037
Add global VolumeHelper caching in CSI PVC BIA plugin
...
Address review feedback to have a global VolumeHelper instance per
plugin process instead of creating one on each ShouldPerformSnapshot
call.
Changes:
- Add volumeHelper, cachedForBackup, and mu fields to pvcBackupItemAction
  struct for caching the VolumeHelper per backup
- Add getOrCreateVolumeHelper() method for thread-safe lazy initialization
- Update Execute() to use cached VolumeHelper via
  ShouldPerformSnapshotWithVolumeHelper()
- Update filterPVCsByVolumePolicy() to accept VolumeHelper parameter
- Add ShouldPerformSnapshotWithVolumeHelper() that accepts optional
  VolumeHelper for reuse across multiple calls
- Add NewVolumeHelperForBackup() factory function for BIA plugins
- Add comprehensive unit tests for both nil and non-nil VolumeHelper paths
This completes the fix for issue #9179 by ensuring the PVC-to-Pod cache
is built once per backup and reused across all PVC processing, avoiding
O(N*M) complexity.
Fixes #9179
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-15 14:18:05 -08:00
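The per-backup lazy initialization described in commit 987edf5037 above can be sketched as follows. This is a minimal illustration, not the plugin's actual code: the `action` and `helper` types, the `getOrCreateHelper` method, the `builds` counter, and the use of a backup UID string as the cache key are all assumptions made for the example. It shows only the pattern the commit message names: a mutex-guarded field that is rebuilt when a different backup arrives and reused otherwise.

```go
package main

import (
	"fmt"
	"sync"
)

// helper is a hypothetical stand-in for the VolumeHelper held by the plugin.
type helper struct{ backupUID string }

// action is a hypothetical stand-in for the plugin struct; it mirrors the
// three fields named in the commit message (helper, cache key, mutex).
type action struct {
	mu              sync.Mutex
	helper          *helper
	cachedForBackup string
	builds          int // counts rebuilds, for demonstration only
}

// getOrCreateHelper returns the cached helper when it was built for the
// same backup, and builds a new one otherwise. The mutex makes concurrent
// calls safe.
func (a *action) getOrCreateHelper(backupUID string) *helper {
	a.mu.Lock()
	defer a.mu.Unlock()
	if a.helper != nil && a.cachedForBackup == backupUID {
		return a.helper
	}
	a.helper = &helper{backupUID: backupUID}
	a.cachedForBackup = backupUID
	a.builds++
	return a.helper
}

func main() {
	a := &action{}
	var wg sync.WaitGroup
	for i := 0; i < 8; i++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			a.getOrCreateHelper("backup-1") // all eight calls share one build
		}()
	}
	wg.Wait()
	a.getOrCreateHelper("backup-2") // a different backup triggers a rebuild
	fmt.Println("builds:", a.builds)
}
```

Keying the cache on the backup identity means a long-lived plugin process never serves a helper built for an earlier backup.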
Shubham Pampattiwar
99e821a870
Address review feedback: move cache building to volumehelper
...
- Rename NewVolumeHelperImplWithCache to NewVolumeHelperImplWithNamespaces
- Move cache building logic from backup.go into volumehelper
- Return error from NewVolumeHelperImplWithNamespaces if cache build fails
- Remove fallback in main backup path - backup fails if cache build fails
- Update NewVolumeHelperImpl to call NewVolumeHelperImplWithNamespaces
- Add comments clarifying fallback is only used by plugins
- Update tests for new error return signature
This addresses review comments from @Lyndon-Li and @kaovilai:
- Cache building is now encapsulated in volumehelper
- No fallback in main backup path ensures predictable performance
- Code reuse between constructors
Fixes #9179
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-15 14:18:05 -08:00
Shubham Pampattiwar
041e5e2a7e
Address review feedback
...
- Use ResolveNamespaceList() instead of GetIncludes() for more accurate
  namespace resolution when building the PVC-to-Pod cache
- Refactor NewVolumeHelperImpl to call NewVolumeHelperImplWithCache with
  nil cache parameter to avoid code duplication
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-15 14:18:05 -08:00
Shubham Pampattiwar
8e58099674
Add test for cache usage without volume policy
...
Add test case to verify that the PVC-to-Pod cache is used even when
no volume policy is configured. When defaultVolumesToFSBackup is true,
the cache is used to find pods using the PVC to determine if fs-backup
should be used instead of snapshot.
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-15 14:18:05 -08:00
Shubham Pampattiwar
a43f14b071
Add unit tests for ShouldPerformFSBackup with PVC-to-Pod cache
...
Add TestVolumeHelperImplWithCache_ShouldPerformFSBackup to verify:
- Volume policy match with cache returns correct fs-backup decision
- Volume policy match with snapshot action skips fs-backup
- Fallback to direct lookup when cache is not built
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-15 14:18:05 -08:00
Shubham Pampattiwar
26053ae6d6
Add unit tests for VolumeHelperImpl with PVC-to-Pod cache
...
Add TestVolumeHelperImplWithCache_ShouldPerformSnapshot to verify:
- Volume policy match with cache returns correct snapshot decision
- fs-backup via opt-out with cache properly skips snapshot
- Fallback to direct lookup when cache is not built
These tests verify the cache-enabled code path added in the previous
commit for improved volume policy performance.
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-15 14:18:05 -08:00
Shubham Pampattiwar
60203ad01b
Add changelog for PR #9441
...
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-15 14:18:05 -08:00
Shubham Pampattiwar
bcdc30b59a
Add PVC-to-Pod cache to improve volume policy performance
...
The GetPodsUsingPVC function had O(N*M) complexity - for each PVC,
it listed ALL pods in the namespace and iterated through each pod.
With many PVCs and pods, this caused significant performance
degradation (2+ seconds per PV in some cases).
This change introduces a PVC-to-Pod cache that is built once per
backup and reused for all PVC lookups, reducing complexity from
O(N*M) to O(N+M).
Changes:
- Add PVCPodCache struct with thread-safe caching in podvolume pkg
- Add NewVolumeHelperImplWithCache constructor for cache support
- Build cache before backup item processing in backup.go
- Add comprehensive unit tests for cache functionality
- Graceful fallback to direct lookups if cache fails
Fixes #9179
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-15 14:18:05 -08:00
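The complexity change described in commit bcdc30b59a above (scanning every pod for each PVC versus building an index once) can be illustrated with a small sketch. This is not the PVCPodCache from the podvolume package: the `Pod` type, `PVCPodIndex`, `BuildPVCPodIndex`, and `PodsUsingPVC` are simplified, hypothetical names, and real Kubernetes objects are replaced with plain structs so the example stays self-contained.

```go
package main

import (
	"fmt"
	"sync"
)

// Pod is a simplified stand-in for a Kubernetes Pod, keeping only the
// fields this example needs.
type Pod struct {
	Namespace string
	Name      string
	PVCNames  []string // claim names referenced by the pod's volumes
}

// PVCPodIndex maps "namespace/pvcName" to the names of the pods using it.
type PVCPodIndex struct {
	mu    sync.RWMutex
	index map[string][]string
}

// BuildPVCPodIndex visits every pod once, which is the O(M) part.
func BuildPVCPodIndex(pods []Pod) *PVCPodIndex {
	idx := &PVCPodIndex{index: make(map[string][]string)}
	for _, p := range pods {
		for _, claim := range p.PVCNames {
			key := p.Namespace + "/" + claim
			idx.index[key] = append(idx.index[key], p.Name)
		}
	}
	return idx
}

// PodsUsingPVC is a single map lookup, so N PVC lookups cost O(N) overall
// instead of N full scans of the pod list.
func (c *PVCPodIndex) PodsUsingPVC(namespace, pvc string) []string {
	c.mu.RLock()
	defer c.mu.RUnlock()
	return c.index[namespace+"/"+pvc]
}

func main() {
	pods := []Pod{
		{Namespace: "app", Name: "web-0", PVCNames: []string{"data"}},
		{Namespace: "app", Name: "web-1", PVCNames: []string{"data", "logs"}},
	}
	idx := BuildPVCPodIndex(pods)
	fmt.Println(idx.PodsUsingPVC("app", "data"))
	fmt.Println(idx.PodsUsingPVC("app", "missing"))
}
```

A lookup for a PVC that no pod references returns an empty result rather than an error, which keeps the caller's decision logic simple.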
Tiger Kaovilai
a1026cb531
Update golangci-lint installation script URL to use HEAD for latest version (#9451)
...
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
2025-12-12 12:25:45 -05:00
Shubham Pampattiwar
14b34f08cc
Merge pull request #9321 from shubham-pampattiwar/fix-azure-bsl-status-message-8368
...
Sanitize Azure HTTP responses in BSL status messages
2025-12-11 22:00:18 -08:00
Xun Jiang/Bruce Jiang
add66eac42
Merge pull request #9431 from vmware-tanzu/9033_fix
...
Remove VolumeSnapshotClass from CSI B/R process.
2025-12-12 10:59:47 +08:00
Xun Jiang
096436507e
Remove VolumeSnapshotClass from CSI restore and deletion process.
...
Remove VolumeSnapshotClass from backup sync process.
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2025-12-11 17:56:11 +08:00
Xun Jiang/Bruce Jiang
554b04e6ca
Merge pull request #9132 from mjnagel/crd-upgrade
...
feat: add apply flag to install command
2025-12-10 16:41:56 +08:00
Xun Jiang/Bruce Jiang
c594026c1f
Merge pull request #9446 from vmware-tanzu/dependabot/github_actions/actions/stale-10.1.1
...
Bump actions/stale from 10.1.0 to 10.1.1
2025-12-10 13:31:28 +08:00
Xun Jiang/Bruce Jiang
46776898ab
Merge branch 'main' into dependabot/github_actions/actions/stale-10.1.1
2025-12-10 11:34:29 +08:00
Xun Jiang/Bruce Jiang
fdcfed84f9
Add the node-agent ConfigMap document. (#9434)
...
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2025-12-09 04:57:30 -05:00
dependabot[bot]
dbeb16aad7
Bump actions/stale from 10.1.0 to 10.1.1
...
Bumps [actions/stale](https://github.com/actions/stale) from 10.1.0 to 10.1.1.
- [Release notes](https://github.com/actions/stale/releases)
- [Changelog](https://github.com/actions/stale/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/stale/compare/v10.1.0...v10.1.1)
---
updated-dependencies:
- dependency-name: actions/stale
  dependency-version: 10.1.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...
Signed-off-by: dependabot[bot] <support@github.com>
2025-12-08 19:02:57 +00:00
Shubham Pampattiwar
f0c97c489d
Merge pull request #9414 from shubham-pampattiwar/add-maintenance-job-metrics
...
Add Prometheus metrics for maintenance jobs
2025-12-08 09:23:44 -08:00
Micah Nagel
3244cc605f
feat: add apply flag to install command
...
Signed-off-by: Micah Nagel <micah.nagel@defenseunicorns.com>
2025-12-05 11:26:10 +08:00
Shubham Pampattiwar
6a0307142c
Merge pull request #9307 from sseago/parallel-backup
...
Parallel backup processing
2025-12-04 11:37:24 -08:00
Shubham Pampattiwar
1ec622245b
Run make update to fix gofmt alignment
...
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-03 16:13:13 -08:00
Shubham Pampattiwar
31fb828f8e
Add clarifying comment for histogram metric
...
Explain that the duration histogram tracks distribution of individual
job durations, not accumulated sums, to address reviewer concerns.
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-03 16:05:32 -08:00
Scott Seago
7286d24c35
Updates for merge conflict and to refine reconciler queueing logic
...
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-03 16:55:59 -05:00
Scott Seago
7e4797f588
Track running backup count via BackupTracker
...
This avoids an unnecessary apiserver List call when
the backup reconciler is already at capacity.
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 17:23:47 -05:00
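The idea in commit 7e4797f588 above (answering "is the reconciler at capacity?" from an in-memory count rather than from an apiserver List call) can be sketched roughly like this. The `runningTracker` type, its methods, and the `atCapacity` helper are hypothetical names chosen for illustration and are not taken from the BackupTracker code itself.

```go
package main

import (
	"fmt"
	"sync"
)

// runningTracker is a hypothetical in-memory set of backups that are
// currently running, keyed by "namespace/name".
type runningTracker struct {
	mu      sync.RWMutex
	running map[string]struct{}
}

func newRunningTracker() *runningTracker {
	return &runningTracker{running: make(map[string]struct{})}
}

// Add records that a backup has started.
func (t *runningTracker) Add(namespace, name string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	t.running[namespace+"/"+name] = struct{}{}
}

// Delete records that a backup has finished.
func (t *runningTracker) Delete(namespace, name string) {
	t.mu.Lock()
	defer t.mu.Unlock()
	delete(t.running, namespace+"/"+name)
}

// RunningCount returns how many backups are currently tracked.
func (t *runningTracker) RunningCount() int {
	t.mu.RLock()
	defer t.mu.RUnlock()
	return len(t.running)
}

// atCapacity answers from memory, so a caller can return early before
// issuing any List call when no slot is free.
func atCapacity(t *runningTracker, maxConcurrent int) bool {
	return t.RunningCount() >= maxConcurrent
}

func main() {
	t := newRunningTracker()
	t.Add("velero", "backup-a")
	t.Add("velero", "backup-b")
	fmt.Println(atCapacity(t, 2))
	t.Delete("velero", "backup-a")
	fmt.Println(atCapacity(t, 2))
}
```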
Scott Seago
f238a7e47b
make update
...
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 17:23:21 -05:00
Scott Seago
0b2e7d1238
Minor refactoring
...
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 17:23:21 -05:00
Scott Seago
73864e31ff
Fix linters
...
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 17:04:55 -05:00
Scott Seago
8a95d512b3
make update, changelog
...
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 17:04:07 -05:00
Scott Seago
4d1802233a
add various scenarios to queue controller unit tests
...
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 17:01:09 -05:00
Scott Seago
f73443659a
Backup queue controller implementation
...
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:57:18 -05:00
Scott Seago
7111f3cea2
feat: Remove pvc-for-tmp install arg
...
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:49:17 -05:00
Scott Seago
845eee4e60
feat: Create backup queue controller and add to disableable list
...
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:46:56 -05:00
Scott Seago
c50ab4a6ea
feat: Add pvc-for-tmp install arg to use PVC for server /tmp dir
...
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:40:49 -05:00
Scott Seago
6a3f821606
fix lint
...
Signed-off-by: Scott Seago <sseago@redhat.com>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:39:10 -05:00
Scott Seago
34dc381182
Refactor after review
...
Signed-off-by: Scott Seago <sseago@redhat.com>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:39:10 -05:00
Scott Seago
29b01c3170
make update
...
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:39:10 -05:00
Scott Seago
84571bc54d
Added doc note around parallel backups and resource limits
...
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:39:10 -05:00
Scott Seago
9c1c7d20ff
Minor refactoring
...
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:39:10 -05:00
Scott Seago
7bc57b5a5f
Refactor queue controller to reduce apiserver list calls
...
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:39:10 -05:00
Scott Seago
e7b5d20f4c
Fix linters
...
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:39:10 -05:00
Scott Seago
aedc0fe5e2
make update, changelog
...
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:39:07 -05:00
Scott Seago
dbaa25405d
move podVolumeContext into backupRequest
...
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:38:41 -05:00
Scott Seago
91357b28c4
Move worker pool creation to backup reconcile.
...
ItemBlockWorkerPool is now created for each backup.
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:38:41 -05:00
Scott Seago
e0c08f03cf
add various scenarios to queue controller unit tests
...
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:38:41 -05:00
Scott Seago
a56ab10f23
Move debug logs to info
...
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:38:41 -05:00
Scott Seago
d39ad6f208
run multiple backup reconcilers, only reconcile ReadyToStart backups
...
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:38:41 -05:00
Scott Seago
300bc70c68
Add queue position to backup list/describe
...
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:38:41 -05:00
Scott Seago
13041b40c2
Backup queue controller implementation
...
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:38:41 -05:00
Scott Seago
4ffb29d750
feat: Remove pvc-for-tmp install arg
...
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:28:08 -05:00