Compare commits

..

92 Commits

Author SHA1 Message Date
Xun Jiang b5ccc4373d Add functions to validate and compare the E2E Velero version.
Run the E2E test on kind / get-go-version (push) Failing after 1m8s
Run the E2E test on kind / build (push) Has been skipped
Run the E2E test on kind / setup-test-matrix (push) Successful in 4s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Remove 'self' from MIGRATE_FROM_VELERO_VERSION.
  * Need to modify the BYOT case to specify MIGRATE_FROM_VELERO_VERSION.
Remove the CSI plugin installation check for no older than v1.14

Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2025-12-26 12:53:09 +08:00
lyndon-li 327ea3ea13 Merge pull request #9206 from Joeavaikath/label-cleanup
Run the E2E test on kind / get-go-version (push) Failing after 1m0s
Run the E2E test on kind / build (push) Has been skipped
Run the E2E test on kind / setup-test-matrix (push) Successful in 3s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / get-go-version (push) Successful in 14s
Main CI / Build (push) Failing after 42s
Close stale issues and PRs / stale (push) Successful in 14s
Trivy Nightly Scan / Trivy nightly scan (velero, main) (push) Failing after 1m24s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-aws, main) (push) Failing after 1m6s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-gcp, main) (push) Failing after 1m11s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-microsoft-azure, main) (push) Failing after 1m7s
Remove labels associated with previous backups
2025-12-22 13:20:04 +08:00
Michal Pryc 4b6708de2c Fix plugin init container names exceeding DNS-1123 limit (#9445)
Run the E2E test on kind / get-go-version (push) Failing after 1m5s
Run the E2E test on kind / build (push) Has been skipped
Run the E2E test on kind / setup-test-matrix (push) Successful in 4s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / get-go-version (push) Successful in 18s
Main CI / Build (push) Failing after 40s
Close stale issues and PRs / stale (push) Successful in 11s
Trivy Nightly Scan / Trivy nightly scan (velero, main) (push) Failing after 1m45s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-aws, main) (push) Failing after 1m14s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-gcp, main) (push) Failing after 1m7s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-microsoft-azure, main) (push) Failing after 1m6s
Ensure plugin init container names satisfy DNS-1123 label constraints
(max 63 chars). Long names are truncated with an 8-char hash suffix to
maintain uniqueness.

Fixes: #9444

Signed-off-by: Michal Pryc <mpryc@redhat.com>
2025-12-19 13:39:27 -05:00
Tiger Kaovilai 65eaceee0b Refactor cleanupStaleVeleroLabels to use constants for label keys
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
2025-12-19 20:17:47 +07:00
Joseph 2d8a87fec4 Add condition to catch backup.velero.io labels
Signed-off-by: Joseph <jvaikath@redhat.com>
2025-12-19 20:17:46 +07:00
Joseph eae5bea469 Remove backup.velero.io/must-include-additional-items label
Signed-off-by: Joseph <jvaikath@redhat.com>
2025-12-19 20:17:46 +07:00
Joseph 87dbc16b0a Revert registry local debug value
Signed-off-by: Joseph <jvaikath@redhat.com>
2025-12-19 20:17:46 +07:00
Joseph db2193c53a Make update
Signed-off-by: Joseph <jvaikath@redhat.com>
2025-12-19 20:17:46 +07:00
Joseph 643dd784ea Refactor into func and add tests
Signed-off-by: Joseph <jvaikath@redhat.com>
2025-12-19 20:17:46 +07:00
Joseph e7166fc9e9 Add changelog
Signed-off-by: Joseph <jvaikath@redhat.com>
2025-12-19 20:17:46 +07:00
Joseph bfb431fcdf Clean up stale labels before backing up PVC
Signed-off-by: Joseph <jvaikath@redhat.com>
2025-12-19 20:17:46 +07:00
lyndon-li 2d93ab261e Merge pull request #9141 from kaovilai/9097
Run the E2E test on kind / get-go-version (push) Failing after 1m8s
Run the E2E test on kind / build (push) Has been skipped
Run the E2E test on kind / setup-test-matrix (push) Successful in 3s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / get-go-version (push) Successful in 16s
Main CI / Build (push) Failing after 45s
feat: Enhance BackupStorageLocation with Secret-based CA certificate support
2025-12-19 13:13:47 +08:00
Xun Jiang/Bruce Jiang fcb7fc9356 Merge pull request #9366 from blackpiglet/9359_fix
Run the E2E test on kind / get-go-version (push) Failing after 1m13s
Run the E2E test on kind / build (push) Has been skipped
Run the E2E test on kind / setup-test-matrix (push) Successful in 2s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / get-go-version (push) Successful in 16s
Main CI / Build (push) Failing after 42s
Close stale issues and PRs / stale (push) Successful in 15s
Trivy Nightly Scan / Trivy nightly scan (velero, main) (push) Failing after 1m37s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-aws, main) (push) Failing after 1m18s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-gcp, main) (push) Failing after 1m8s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-microsoft-azure, main) (push) Failing after 1m24s
Use hookIndex for recording multiple restore exec hooks.
2025-12-18 17:00:42 +08:00
Xun Jiang/Bruce Jiang 727a4fd0ed Merge pull request #9452 from blackpiglet/9201_fix
Add maintenance job and data mover pod's labels and annotations setting.
2025-12-18 16:39:24 +08:00
Xun Jiang/Bruce Jiang aa3bd251dd Merge branch 'main' into 9097 2025-12-18 14:18:04 +08:00
Shubham Pampattiwar dad85b6fc3 Merge pull request #9441 from shubham-pampattiwar/fix-volume-policy-performance-9179
Add PVC-to-Pod cache to improve volume policy performance
2025-12-17 21:47:11 -08:00
Shubham Pampattiwar 78e9470028 Remove unnecessary nolint directives from test file
The nolint:staticcheck directives are not needed in the test file because
it calls NewVolumeHelperImpl within the same package, which doesn't
trigger deprecation warnings. Only cross-package calls need the directive.

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-17 11:01:00 -08:00
Shubham Pampattiwar 4ba2effaac Add nolint directives for intentional deprecated function usage
The ShouldPerformSnapshotWithVolumeHelper function and tests intentionally
use NewVolumeHelperImpl (deprecated) for backwards compatibility with
third-party plugins. Add nolint:staticcheck to suppress the linter
warnings with explanatory comments.

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-17 10:54:11 -08:00
Shubham Pampattiwar f592a264a6 Address review feedback: remove deprecated functions
Remove deprecated functions that were marked for removal per review:
- Remove GetPodsUsingPVC (replaced by GetPodsUsingPVCWithCache)
- Remove IsPVCDefaultToFSBackup (replaced by IsPVCDefaultToFSBackupWithCache)
- Remove associated tests for deprecated functions
- Add deprecation marker to NewVolumeHelperImpl
- Add deprecation marker to ShouldPerformSnapshotWithBackup

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-17 10:44:15 -08:00
Xun Jiang e39374f335 Add maintenance job and data mover pod's labels and annotations setting.
Add wait in file_system_test's async test cases.
Add related documents.

Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2025-12-17 13:21:07 +08:00
Shubham Pampattiwar 10ef43e147 Fix gofmt formatting issues in test files
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-16 12:40:55 -08:00
Shubham Pampattiwar b7052c2cb1 Implement lazy per-namespace PVC-to-Pod caching for plugin path
This commit addresses reviewer feedback on PR #9441 regarding
concurrent backup caching concerns. Key changes:

1. Added lazy per-namespace caching for the CSI PVC BIA plugin path:
   - Added IsNamespaceBuilt() method to check if namespace is cached
   - Added BuildCacheForNamespace() for lazy, per-namespace cache building
   - Plugin builds cache incrementally as namespaces are encountered

2. Added NewVolumeHelperImplWithCache constructor for plugins:
   - Accepts externally-managed PVC-to-Pod cache
   - Follows pattern from PR #9226 (Scott Seago's design)

3. Plugin instance lifecycle clarification:
   - Plugin instances are unique per backup (created via newPluginManager)
   - Cleaned up via CleanupClients at backup completion
   - No mutex or backup UID tracking needed

4. Test coverage:
   - Added tests for IsNamespaceBuilt and BuildCacheForNamespace
   - Added tests for NewVolumeHelperImplWithCache constructor
   - Added test verifying cache usage for fs-backup determination

This maintains the O(N+M) complexity improvement from issue #9179
while addressing architectural concerns about concurrent access.

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-16 12:28:47 -08:00
Xun Jiang 57370296ab Use hookIndex for recording multiple restore exec hooks.
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2025-12-16 16:39:42 +08:00
Shubham Pampattiwar f4c4653c08 Fix linter errors: use 'any' instead of 'interface{}'
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-15 14:18:05 -08:00
Shubham Pampattiwar 987edf5037 Add global VolumeHelper caching in CSI PVC BIA plugin
Address review feedback to have a global VolumeHelper instance per
plugin process instead of creating one on each ShouldPerformSnapshot
call.

Changes:
- Add volumeHelper, cachedForBackup, and mu fields to pvcBackupItemAction
  struct for caching the VolumeHelper per backup
- Add getOrCreateVolumeHelper() method for thread-safe lazy initialization
- Update Execute() to use cached VolumeHelper via
  ShouldPerformSnapshotWithVolumeHelper()
- Update filterPVCsByVolumePolicy() to accept VolumeHelper parameter
- Add ShouldPerformSnapshotWithVolumeHelper() that accepts optional
  VolumeHelper for reuse across multiple calls
- Add NewVolumeHelperForBackup() factory function for BIA plugins
- Add comprehensive unit tests for both nil and non-nil VolumeHelper paths

This completes the fix for issue #9179 by ensuring the PVC-to-Pod cache
is built once per backup and reused across all PVC processing, avoiding
O(N*M) complexity.

Fixes #9179

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-15 14:18:05 -08:00
Shubham Pampattiwar 99e821a870 Address review feedback: move cache building to volumehelper
- Rename NewVolumeHelperImplWithCache to NewVolumeHelperImplWithNamespaces
- Move cache building logic from backup.go into volumehelper
- Return error from NewVolumeHelperImplWithNamespaces if cache build fails
- Remove fallback in main backup path - backup fails if cache build fails
- Update NewVolumeHelperImpl to call NewVolumeHelperImplWithNamespaces
- Add comments clarifying fallback is only used by plugins
- Update tests for new error return signature

This addresses review comments from @Lyndon-Li and @kaovilai:
- Cache building is now encapsulated in volumehelper
- No fallback in main backup path ensures predictable performance
- Code reuse between constructors

Fixes #9179

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-15 14:18:05 -08:00
Shubham Pampattiwar 041e5e2a7e Address review feedback
- Use ResolveNamespaceList() instead of GetIncludes() for more accurate
  namespace resolution when building the PVC-to-Pod cache
- Refactor NewVolumeHelperImpl to call NewVolumeHelperImplWithCache with
  nil cache parameter to avoid code duplication

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-15 14:18:05 -08:00
Shubham Pampattiwar 8e58099674 Add test for cache usage without volume policy
Add test case to verify that the PVC-to-Pod cache is used even when
no volume policy is configured. When defaultVolumesToFSBackup is true,
the cache is used to find pods using the PVC to determine if fs-backup
should be used instead of snapshot.

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-15 14:18:05 -08:00
Shubham Pampattiwar a43f14b071 Add unit tests for ShouldPerformFSBackup with PVC-to-Pod cache
Add TestVolumeHelperImplWithCache_ShouldPerformFSBackup to verify:
- Volume policy match with cache returns correct fs-backup decision
- Volume policy match with snapshot action skips fs-backup
- Fallback to direct lookup when cache is not built

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-15 14:18:05 -08:00
Shubham Pampattiwar 26053ae6d6 Add unit tests for VolumeHelperImpl with PVC-to-Pod cache
Add TestVolumeHelperImplWithCache_ShouldPerformSnapshot to verify:
- Volume policy match with cache returns correct snapshot decision
- fs-backup via opt-out with cache properly skips snapshot
- Fallback to direct lookup when cache is not built

These tests verify the cache-enabled code path added in the previous
commit for improved volume policy performance.

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-15 14:18:05 -08:00
Shubham Pampattiwar 60203ad01b Add changelog for PR #9441
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-15 14:18:05 -08:00
Shubham Pampattiwar bcdc30b59a Add PVC-to-Pod cache to improve volume policy performance
The GetPodsUsingPVC function had O(N*M) complexity - for each PVC,
it listed ALL pods in the namespace and iterated through each pod.
With many PVCs and pods, this caused significant performance
degradation (2+ seconds per PV in some cases).

This change introduces a PVC-to-Pod cache that is built once per
backup and reused for all PVC lookups, reducing complexity from
O(N*M) to O(N+M).

Changes:
- Add PVCPodCache struct with thread-safe caching in podvolume pkg
- Add NewVolumeHelperImplWithCache constructor for cache support
- Build cache before backup item processing in backup.go
- Add comprehensive unit tests for cache functionality
- Graceful fallback to direct lookups if cache fails

Fixes #9179

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-15 14:18:05 -08:00
Tiger Kaovilai a1026cb531 Update golangci-lint installation script URL to use HEAD for latest version (#9451)
Run the E2E test on kind / get-go-version (push) Failing after 1m16s
Run the E2E test on kind / build (push) Has been skipped
Run the E2E test on kind / setup-test-matrix (push) Successful in 4s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
build-image / Build (push) Failing after 15s
Main CI / get-go-version (push) Successful in 13s
Main CI / Build (push) Failing after 39s
Close stale issues and PRs / stale (push) Successful in 14s
Trivy Nightly Scan / Trivy nightly scan (velero, main) (push) Failing after 1m39s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-aws, main) (push) Failing after 1m8s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-gcp, main) (push) Failing after 1m27s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-microsoft-azure, main) (push) Failing after 1m27s
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
2025-12-12 12:25:45 -05:00
Tiger Kaovilai f30b9f9504 add comment for restic exec function
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
2025-12-13 00:09:09 +07:00
Tiger Kaovilai 8688568ffc feat: Resolve caCertRef in object store getter for plugin compatibility
This change enables BSL validation to work when using caCertRef
(Secret-based CA certificate) by resolving the certificate from
the Secret in velero core before passing it to the object store
plugin as 'caCert' in the config map.

This approach requires no changes to provider plugins since they
already understand the 'caCert' config key.

Changes:
- Add SecretStore to objectBackupStoreGetter struct
- Add NewObjectBackupStoreGetterWithSecretStore constructor
- Update Get method to resolve caCertRef from Secret
- Update server.go to use new constructor with SecretStore
- Add CACertRef builder method and unit tests

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
2025-12-12 21:07:38 +07:00
Tiger Kaovilai 61bf2ef777 feat: Enhance BackupStorageLocation with Secret-based CA certificate support
- Introduced `CACertRef` field in `ObjectStorageLocation` to reference a Secret containing the CA certificate, replacing the deprecated `CACert` field.
- Implemented validation logic to ensure mutual exclusivity between `CACert` and `CACertRef`.
- Updated BSL controller and repository provider to handle the new certificate resolution logic.
- Enhanced CLI to support automatic certificate discovery from BSL configurations.
- Added unit and integration tests to validate new functionality and ensure backward compatibility.
- Documented migration strategy for users transitioning from inline certificates to Secret-based management.

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
2025-12-12 21:07:37 +07:00
Shubham Pampattiwar 14b34f08cc Merge pull request #9321 from shubham-pampattiwar/fix-azure-bsl-status-message-8368
Run the E2E test on kind / get-go-version (push) Failing after 1m11s
Run the E2E test on kind / build (push) Has been skipped
Run the E2E test on kind / setup-test-matrix (push) Successful in 3s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / get-go-version (push) Successful in 13s
Main CI / Build (push) Failing after 33s
Sanitize Azure HTTP responses in BSL status messages
2025-12-11 22:00:18 -08:00
Xun Jiang/Bruce Jiang add66eac42 Merge pull request #9431 from vmware-tanzu/9033_fix
Remove VolumeSnapshotClass from CSI B/R process.
2025-12-12 10:59:47 +08:00
Xun Jiang 096436507e Remove VolumeSnapshotClass from CSI restore and deletion process.
Run the E2E test on kind / get-go-version (push) Failing after 1m1s
Run the E2E test on kind / build (push) Has been skipped
Run the E2E test on kind / setup-test-matrix (push) Successful in 3s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Remove VolumeSnapshotClass from backup sync process.

Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2025-12-11 17:56:11 +08:00
Xun Jiang/Bruce Jiang 554b04e6ca Merge pull request #9132 from mjnagel/crd-upgrade
Run the E2E test on kind / get-go-version (push) Failing after 56s
Run the E2E test on kind / build (push) Has been skipped
Run the E2E test on kind / setup-test-matrix (push) Successful in 3s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / get-go-version (push) Successful in 13s
Main CI / Build (push) Failing after 25s
Close stale issues and PRs / stale (push) Successful in 12s
Trivy Nightly Scan / Trivy nightly scan (velero, main) (push) Failing after 1m36s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-aws, main) (push) Failing after 1m16s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-gcp, main) (push) Failing after 1m13s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-microsoft-azure, main) (push) Failing after 1m4s
feat: add apply flag to install command
2025-12-10 16:41:56 +08:00
Xun Jiang/Bruce Jiang c594026c1f Merge pull request #9446 from vmware-tanzu/dependabot/github_actions/actions/stale-10.1.1
Run the E2E test on kind / get-go-version (push) Failing after 1m5s
Run the E2E test on kind / build (push) Has been skipped
Run the E2E test on kind / setup-test-matrix (push) Successful in 3s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / get-go-version (push) Successful in 13s
Main CI / Build (push) Failing after 32s
Bump actions/stale from 10.1.0 to 10.1.1
2025-12-10 13:31:28 +08:00
Xun Jiang/Bruce Jiang 46776898ab Merge branch 'main' into dependabot/github_actions/actions/stale-10.1.1 2025-12-10 11:34:29 +08:00
Xun Jiang/Bruce Jiang fdcfed84f9 Add the node-agent ConfigMap document. (#9434)
Run the E2E test on kind / get-go-version (push) Failing after 1m1s
Run the E2E test on kind / build (push) Has been skipped
Run the E2E test on kind / setup-test-matrix (push) Successful in 3s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / get-go-version (push) Successful in 13s
Main CI / Build (push) Failing after 31s
Close stale issues and PRs / stale (push) Successful in 15s
Trivy Nightly Scan / Trivy nightly scan (velero, main) (push) Failing after 1m45s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-aws, main) (push) Failing after 1m9s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-gcp, main) (push) Failing after 1m6s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-microsoft-azure, main) (push) Failing after 1m14s
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2025-12-09 04:57:30 -05:00
dependabot[bot] dbeb16aad7 Bump actions/stale from 10.1.0 to 10.1.1
Bumps [actions/stale](https://github.com/actions/stale) from 10.1.0 to 10.1.1.
- [Release notes](https://github.com/actions/stale/releases)
- [Changelog](https://github.com/actions/stale/blob/main/CHANGELOG.md)
- [Commits](https://github.com/actions/stale/compare/v10.1.0...v10.1.1)

---
updated-dependencies:
- dependency-name: actions/stale
  dependency-version: 10.1.1
  dependency-type: direct:production
  update-type: version-update:semver-patch
...

Signed-off-by: dependabot[bot] <support@github.com>
2025-12-08 19:02:57 +00:00
Shubham Pampattiwar f0c97c489d Merge pull request #9414 from shubham-pampattiwar/add-maintenance-job-metrics
Run the E2E test on kind / get-go-version (push) Failing after 1m8s
Run the E2E test on kind / build (push) Has been skipped
Run the E2E test on kind / setup-test-matrix (push) Successful in 5s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / get-go-version (push) Successful in 14s
Main CI / Build (push) Failing after 37s
Close stale issues and PRs / stale (push) Successful in 15s
Trivy Nightly Scan / Trivy nightly scan (velero, main) (push) Failing after 1m43s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-aws, main) (push) Failing after 58s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-gcp, main) (push) Failing after 1m8s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-microsoft-azure, main) (push) Failing after 58s
Add Prometheus metrics for maintenance jobs
2025-12-08 09:23:44 -08:00
Micah Nagel 3244cc605f feat: add apply flag to install command
Signed-off-by: Micah Nagel <micah.nagel@defenseunicorns.com>
2025-12-05 11:26:10 +08:00
Shubham Pampattiwar 6a0307142c Merge pull request #9307 from sseago/parallel-backup
Run the E2E test on kind / get-go-version (push) Failing after 1m4s
Run the E2E test on kind / build (push) Has been skipped
Run the E2E test on kind / setup-test-matrix (push) Successful in 3s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / get-go-version (push) Successful in 13s
Main CI / Build (push) Failing after 30s
Close stale issues and PRs / stale (push) Successful in 17s
Trivy Nightly Scan / Trivy nightly scan (velero, main) (push) Failing after 1m46s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-aws, main) (push) Failing after 1m30s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-gcp, main) (push) Failing after 1m36s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-microsoft-azure, main) (push) Failing after 1m31s
Parallel backup processing
2025-12-04 11:37:24 -08:00
Shubham Pampattiwar 1ec622245b Run make update to fix gofmt alignment
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-03 16:13:13 -08:00
Shubham Pampattiwar 31fb828f8e Add clarifying comment for histogram metric
Explain that the duration histogram tracks distribution of individual
job durations, not accumulated sums, to address reviewer concerns.

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-03 16:05:32 -08:00
Scott Seago 7286d24c35 Updates for merge conflict and to refine reconciler queueing logic
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-03 16:55:59 -05:00
Scott Seago 7e4797f588 Track running backup count via BackupTracker
This avoids an unnecessary apiserver List call when
the backup reconciler is already at capacity.

Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 17:23:47 -05:00
Scott Seago f238a7e47b make update
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 17:23:21 -05:00
Scott Seago 0b2e7d1238 Minor refactoring
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 17:23:21 -05:00
Scott Seago 73864e31ff Fix linters
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 17:04:55 -05:00
Scott Seago 8a95d512b3 make update, changelog
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 17:04:07 -05:00
Scott Seago 4d1802233a add various scenarios to queue controller unit tests
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 17:01:09 -05:00
Scott Seago f73443659a Backup queue controller implementation
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:57:18 -05:00
Scott Seago 7111f3cea2 feat: Remove pvc-for-tmp install arg
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:49:17 -05:00
Scott Seago 845eee4e60 feat: Create backup queue controller and add to disableable list
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:46:56 -05:00
Scott Seago c50ab4a6ea feat: Add pvc-for-tmp install arg to use PVC for server /tmp dir
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:40:49 -05:00
Scott Seago 6a3f821606 fix lint
Signed-off-by: Scott Seago <sseago@redhat.com>

Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:39:10 -05:00
Scott Seago 34dc381182 Refactor after review
Signed-off-by: Scott Seago <sseago@redhat.com>

Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:39:10 -05:00
Scott Seago 29b01c3170 make update
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:39:10 -05:00
Scott Seago 84571bc54d Added doc note around parallel backups and resource limits
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:39:10 -05:00
Scott Seago 9c1c7d20ff Minor refactoring
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:39:10 -05:00
Scott Seago 7bc57b5a5f Refactor queue controller to reduce apiserver list calls
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:39:10 -05:00
Scott Seago e7b5d20f4c Fix linters
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:39:10 -05:00
Scott Seago aedc0fe5e2 make update, changelog
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:39:07 -05:00
Scott Seago dbaa25405d move podVolumeContext into backupRequest
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:38:41 -05:00
Scott Seago 91357b28c4 Move worker pool creation to backup reconcile.
ItemBlockWorkerPool is now created for each backup.

Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:38:41 -05:00
Scott Seago e0c08f03cf add various scenarios to queue controller unit tests
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:38:41 -05:00
Scott Seago a56ab10f23 Move debug logs to info
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:38:41 -05:00
Scott Seago d39ad6f208 run multiple backup reconcilers, only reconcile ReadyToStart backups
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:38:41 -05:00
Scott Seago 300bc70c68 Add queue position to backup list/describe
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:38:41 -05:00
Scott Seago 13041b40c2 Backup queue controller implementation
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:38:41 -05:00
Scott Seago 4ffb29d750 feat: Remove pvc-for-tmp install arg
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:28:08 -05:00
Scott Seago fe799d7546 feat: Add concurrent backups configuration to backup reconciler
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:28:08 -05:00
Scott Seago d91d50f696 feat: Add concurrentBackups to backupQueueReconciler
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:28:08 -05:00
Scott Seago 9dfa108579 feat: initialize backup queue controller
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:28:08 -05:00
Scott Seago 4cac891fb9 refactor: Extract backup-queue controller name to constant
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:28:08 -05:00
Scott Seago 5d02af3ce3 feat: Create backup queue controller and add to disableable list
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:28:08 -05:00
Scott Seago 2944c0dad4 update CRDs
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:28:08 -05:00
Scott Seago cd103add11 feat: Add QueuePosition field to BackupStatus
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:28:08 -05:00
Scott Seago dc91d6ee67 feat: Add pvc-for-tmp install arg to use PVC for server /tmp dir
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:28:08 -05:00
Scott Seago cfc12dc6bf feat: Add install arg and config for concurrent backups
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:28:08 -05:00
Scott Seago 9c09d04979 feat: Add Queued and ReadyToStart phases to BackupPhase
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:28:08 -05:00
Shubham Pampattiwar 20af2c20c5 Address PR review comments: sanitize errors and add SAS token scrubbing
This commit addresses three review comments on PR #9321:

1. Keep sanitization in controller (response to @ywk253100)
   - Maintaining centralized error handling for easier extension
   - Azure-specific patterns detected and others passed through unchanged

2. Sanitize unavailableErrors array (@priyansh17)
   - Now using sanitizeStorageError() for both unavailableErrors array
     and location.Status.Message for consistency

3. Add SAS token scrubbing (@anshulahuja98)
   - Scrubs Azure SAS token parameters to prevent credential leakage
   - Redacts: sig, se, st, sp, spr, sv, sr, sip, srt, ss
   - Example: ?sig=secret becomes ?sig=***REDACTED***

Added comprehensive test coverage for SAS token scrubbing with 4 new
test cases covering various scenarios.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-02 11:37:50 -08:00
Shubham Pampattiwar 60dd3dc832 Add changelog for PR #9321
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-02 11:37:50 -08:00
Shubham Pampattiwar a5d32f29da Sanitize Azure HTTP responses in BSL status messages
Azure storage errors include verbose HTTP response details and XML
in error messages, making the BSL status.message field cluttered
and hard to read. This change adds sanitization to extract only
the error code and meaningful message.

Before:
  BackupStorageLocation "test" is unavailable: rpc error: code = Unknown
  desc = GET https://...
  RESPONSE 404: 404 The specified container does not exist.
  ERROR CODE: ContainerNotFound
  <?xml version="1.0"...>

After:
  BackupStorageLocation "test" is unavailable: rpc error: code = Unknown
  desc = ContainerNotFound: The specified container does not exist.

AWS and GCP error messages are preserved as-is since they don't
contain verbose HTTP responses.

Fixes #8368

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-02 11:37:50 -08:00
Shubham Pampattiwar 27ca08b5a5 Address review comments: rename metrics to repo_maintenance_*
- Rename metric constants from maintenance_job_* to repo_maintenance_*
- Update metric help text to clarify these are for repo maintenance
- Rename functions: RegisterMaintenanceJob* → RegisterRepoMaintenance*
- Update all test references to use new names

Addresses review comments from @Lyndon-Li on PR #9414

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-02 11:36:15 -08:00
Shubham Pampattiwar fdf439963c Add Prometheus metrics for maintenance jobs
Adds three new Prometheus metrics to track backup repository
maintenance job execution:

- velero_maintenance_job_success_total: Counter for successful jobs
- velero_maintenance_job_failure_total: Counter for failed jobs
- velero_maintenance_job_duration_seconds: Histogram for job duration

Metrics use repository_name label to identify specific BackupRepositories.
Duration is recorded for both successful and failed jobs (when job runs),
but not when job fails to start.

Includes comprehensive unit and integration tests.

Fixes #9225

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-02 11:36:15 -08:00
Joseph Antony Vaikath 975f647323 Wildcard ns implement (#9255)
Run the E2E test on kind / get-go-version (push) Failing after 1m5s
Run the E2E test on kind / build (push) Has been skipped
Run the E2E test on kind / setup-test-matrix (push) Successful in 3s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / get-go-version (push) Successful in 11s
Main CI / Build (push) Failing after 25s
Close stale issues and PRs / stale (push) Successful in 14s
Trivy Nightly Scan / Trivy nightly scan (velero, main) (push) Failing after 1m36s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-aws, main) (push) Failing after 1m2s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-gcp, main) (push) Failing after 1m19s
Trivy Nightly Scan / Trivy nightly scan (velero-plugin-for-microsoft-azure, main) (push) Failing after 1m2s
* Add wildcard status fields

Signed-off-by: Joseph <jvaikath@redhat.com>

* Implement wildcard namespace expansion in item collector

- Introduced methods to get active namespaces and expand wildcard includes/excludes in the item collector.
- Updated getNamespacesToList to handle wildcard patterns and return expanded lists.
- Added utility functions for setting includes and excludes in the IncludesExcludes struct.
- Created a new package for wildcard handling, including functions to determine when to expand wildcards and to perform the expansion.

This enhances the backup process by allowing more flexible namespace selection based on wildcard patterns.

Signed-off-by: Joseph <jvaikath@redhat.com>

* Enhance wildcard expansion logic and logging in item collector

- Improved logging to include original includes and excludes when expanding wildcards.
- Updated the ShouldExpandWildcards function to check for wildcard patterns in excludes.
- Added comments for clarity in the expandWildcards function regarding pattern handling.

These changes enhance the clarity and functionality of the wildcard expansion process in the backup system.

Signed-off-by: Joseph <jvaikath@redhat.com>

* Add wildcard namespace fields to Backup CRD and update deepcopy methods

- Introduced `wildcardIncludedNamespaces` and `wildcardExcludedNamespaces` fields to the Backup CRD to support wildcard patterns for namespace inclusion and exclusion.
- Updated deepcopy methods to handle the new fields, ensuring proper copying of data during object manipulation.

These changes enhance the flexibility of namespace selection in backup operations, aligning with recent improvements in wildcard handling.

Signed-off-by: Joseph <jvaikath@redhat.com>

* Refactor Backup CRD to rename wildcard namespace fields

- Updated `BackupStatus` struct to rename `WildcardIncludedNamespaces` to `WildcardExpandedIncludedNamespaces` and `WildcardExcludedNamespaces` to `WildcardExpandedExcludedNamespaces` for clarity.
- Adjusted associated comments to reflect the new naming and ensure consistency in documentation.
- Modified deepcopy methods to accommodate the renamed fields, ensuring proper data handling during object manipulation.

These changes enhance the clarity and maintainability of the Backup CRD, aligning with recent improvements in wildcard handling.

Signed-off-by: Joseph <jvaikath@redhat.com>

* Fix

Signed-off-by: Joseph <jvaikath@redhat.com>

* Refactor where wildcard expansion happens

Signed-off-by: Joseph <jvaikath@redhat.com>

* Refactor Backup CRD and related components for expanded namespace handling

- Updated `BackupStatus` struct to rename fields for clarity: `WildcardExpandedIncludedNamespaces` and `WildcardExpandedExcludedNamespaces` are now `ExpandedIncludedNamespaces` and `ExpandedExcludedNamespaces`, respectively.
- Adjusted associated comments and deepcopy methods to reflect the new naming conventions.
- Removed the `getActiveNamespaces` function from the item collector, streamlining the namespace handling process.
- Enhanced logging during wildcard expansion to provide clearer insights into the process.

These changes improve the clarity and maintainability of the Backup CRD and enhance the functionality of namespace selection in backup operations.

Signed-off-by: Joseph <jvaikath@redhat.com>

* Refactor wildcard expansion logic in item collector and enhance testing

- Moved the wildcard expansion logic into a dedicated method, `expandNamespaceWildcards`, improving code organization and readability.
- Updated logging to provide detailed insights during the wildcard expansion process.
- Introduced comprehensive unit tests for wildcard handling, covering various scenarios and edge cases.
- Enhanced the `ShouldExpandWildcards` function to better identify wildcard patterns and validate inputs.

These changes improve the maintainability and robustness of the wildcard handling in the backup system.

Signed-off-by: Joseph <jvaikath@redhat.com>

* Enhance Restore CRD with expanded namespace fields and update logic

- Added `ExpandedIncludedNamespaces` and `ExpandedExcludedNamespaces` fields to the `RestoreStatus` struct to support expanded wildcard namespace handling.
- Updated the `DeepCopyInto` method to ensure proper copying of the new fields.
- Implemented logic in the restore process to expand wildcard patterns for included and excluded namespaces, improving flexibility in namespace selection during restores.
- Enhanced logging to provide insights into the expanded namespaces.

These changes improve the functionality and maintainability of the restore process, aligning with recent enhancements in wildcard handling.

Signed-off-by: Joseph <jvaikath@redhat.com>

* Refactor Backup and Restore CRDs to enhance wildcard namespace handling

- Renamed fields in `BackupStatus` and `RestoreStatus` from `ExpandedIncludedNamespaces` and `ExpandedExcludedNamespaces` to `IncludeWildcardMatches` and `ExcludeWildcardMatches` for clarity.
- Introduced a new field `WildcardResult` to record the final namespaces after applying wildcard logic.
- Updated the `DeepCopyInto` methods to accommodate the new field names and ensure proper data handling.
- Enhanced comments to reflect the changes and improve documentation clarity.

These updates improve the maintainability and clarity of the CRDs, aligning with recent enhancements in wildcard handling.

Signed-off-by: Joseph <jvaikath@redhat.com>

* Enhance wildcard namespace handling in Backup and Restore processes

- Updated `BackupRequest` and `Restore` status structures to include a new field `WildcardResult`, which captures the final list of namespaces after applying wildcard logic.
- Renamed existing fields to `IncludeWildcardMatches` and `ExcludeWildcardMatches` for improved clarity.
- Enhanced logging to provide detailed insights into the expanded namespaces and final results during backup and restore operations.
- Introduced a new utility function `GetWildcardResult` to streamline the selection of namespaces based on include/exclude criteria.

These changes improve the clarity and functionality of namespace selection in both backup and restore processes, aligning with recent enhancements in wildcard handling.

Signed-off-by: Joseph <jvaikath@redhat.com>

* Refactor namespace wildcard expansion logic in restore process

- Moved the wildcard expansion logic into a dedicated method, `expandNamespaceWildcards`, improving code organization and readability.
- Enhanced error handling and logging to provide detailed insights into the expanded namespaces during the restore operation.
- Updated the restore context with expanded namespace patterns and final results, ensuring clarity in the restore status.

These changes improve the maintainability and clarity of the restore process, aligning with recent enhancements in wildcard handling.

Signed-off-by: Joseph <jvaikath@redhat.com>

* Add checks for "*" in exclude

Signed-off-by: Joseph <jvaikath@redhat.com>

* Rebase

Signed-off-by: Joseph <jvaikath@redhat.com>

* Create NamespaceIncludesExcludes to get full NS listing for backup w/

Signed-off-by: Scott Seago <sseago@redhat.com>
Signed-off-by: Joseph <jvaikath@redhat.com>

* Add new NamespaceIncludesExcludes struct

Signed-off-by: Joseph <jvaikath@redhat.com>

* Move namespace expansion logic

Signed-off-by: Joseph <jvaikath@redhat.com>

* Update backup status with expansion

Signed-off-by: Joseph <jvaikath@redhat.com>

* Wildcard status update

Signed-off-by: Joseph <jvaikath@redhat.com>

* Skip ns check if wildcard expansion

Signed-off-by: Joseph <jvaikath@redhat.com>

* Move wildcard expansion to getResourceItems

Signed-off-by: Joseph <jvaikath@redhat.com>

* lint

Signed-off-by: Joseph <jvaikath@redhat.com>

* Changelog

Signed-off-by: Joseph <jvaikath@redhat.com>

* linting issues

Signed-off-by: Joseph <jvaikath@redhat.com>

* Remove wildcard restore to check if tests pass

Signed-off-by: Joseph <jvaikath@redhat.com>

* Fix namespace mapping test bug from lint fix

The previous commit (0a4aabcf4) attempted to fix linting issues by
using strings.Builder, but incorrectly wrote commas to a separate
builder and concatenated them at the end instead of between namespace
mappings.

This caused the namespace mapping string to be malformed:
  Before: ns-1:ns-1-mapped,ns-2:ns-2-mapped
  Bug:    ns-1:ns-1-mappedns-2:ns-2-mapped,,

The malformed string was parsed as a single mapping with an invalid
namespace name containing a colon, causing Kubernetes to reject it:
  "ns-1-mappedns-2:ns-2-mapped" is invalid

Fix by properly using strings.Builder to construct the mapping string
with commas between entries, addressing both the linting concern and
the functional bug.

Fixes the MultiNamespacesMappingResticTest and
MultiNamespacesMappingSnapshotTest failures.

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
Signed-off-by: Joseph <jvaikath@redhat.com>

* Fix wildcard namespace expansion edge cases

This commit fixes two bugs in the wildcard namespace expansion feature:

1. Empty wildcard results: When a wildcard pattern (e.g., "invalid*")
   matched no namespaces, the backup would incorrectly back up ALL
   namespaces instead of backing up nothing. This was because the empty
   includes list was indistinguishable from "no filter specified".

   Fix: Added wildcardExpanded flag to NamespaceIncludesExcludes to
   track when wildcard expansion has occurred. When true and the
   includes list is empty, ShouldInclude now correctly returns false.

2. Premature namespace filtering: An earlier attempt to fix bug #1
   filtered namespaces too early in collectNamespaces, breaking
   LabelSelector tests where namespaces should be included based on
   resources within them matching the label selector.

   Fix: Removed the premature filtering and rely on the existing
   filterNamespaces call at the end of getAllItems, which correctly
   handles both wildcard expansion and label selector scenarios.

The fixes ensure:
- Wildcard patterns matching nothing result in empty backups
- Label selectors still work correctly (namespace included if any
  resource in it matches the selector)
- State is preserved across multiple ResolveNamespaceList calls

🤖 Generated with [Claude Code](https://claude.com/claude-code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
Signed-off-by: Joseph <jvaikath@redhat.com>

* Run wildcard expansion during backup processing

Signed-off-by: Joseph <jvaikath@redhat.com>

* Lint fix

Signed-off-by: Joseph <jvaikath@redhat.com>

* Improve coverage

Signed-off-by: Joseph <jvaikath@redhat.com>

* gofmt fix

Signed-off-by: Joseph <jvaikath@redhat.com>

* Add wildcard details to describe backup status

Signed-off-by: Joseph <jvaikath@redhat.com>

* Revert "Remove wildcard restore to check if tests pass"

This reverts commit 4e22c2af855b71447762cb0a9fab7e7049f38a5f.

Signed-off-by: Joseph <jvaikath@redhat.com>

* Add restore describe for wildcard namespaces Revert restore wildcard removal

Signed-off-by: Joseph <jvaikath@redhat.com>

* Add coverage

Signed-off-by: Joseph <jvaikath@redhat.com>

* Lint

Signed-off-by: Joseph <jvaikath@redhat.com>

* Remove unintentional changes

Signed-off-by: Joseph <jvaikath@redhat.com>

* Remove wildcard status fields and mentionsRemove usage of wildcard fields for backup and restore status.

Signed-off-by: Joseph <jvaikath@redhat.com>

* Remove status update changelog line

Signed-off-by: Joseph <jvaikath@redhat.com>

* Rename getNamespaceIncludesExcludes
Signed-off-by: Scott Seago <sseago@redhat.com>

Signed-off-by: Scott Seago <sseago@redhat.com>

* Rewrite brace pattern validation

Signed-off-by: Joseph <jvaikath@redhat.com>

* Different var for internal loop

Signed-off-by: Joseph <jvaikath@redhat.com>

---------

Signed-off-by: Joseph <jvaikath@redhat.com>
Signed-off-by: Scott Seago <sseago@redhat.com>
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
Co-authored-by: Scott Seago <sseago@redhat.com>
Co-authored-by: Tiger Kaovilai <tkaovila@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
2025-12-02 12:28:03 -05:00
131 changed files with 8918 additions and 1332 deletions
+1 -1
View File
@@ -7,7 +7,7 @@ jobs:
stale:
runs-on: ubuntu-latest
steps:
- uses: actions/stale@v10.1.0
- uses: actions/stale@v10.1.1
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
stale-issue-message: "This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days. If a Velero team member has requested log or more information, please provide the output of the shared commands."
+1
View File
@@ -0,0 +1 @@
Add `--apply` flag to `install` command, allowing usage of Kubernetes apply to make changes to existing installs
+1
View File
@@ -0,0 +1 @@
feat: Enhance BackupStorageLocation with Secret-based CA certificate support
+1
View File
@@ -0,0 +1 @@
Remove labels associated with previous backups
+10
View File
@@ -0,0 +1,10 @@
Implement wildcard namespace pattern expansion for backup namespace includes/excludes.
This change adds support for wildcard patterns (*, ?, [abc], {a,b,c}) in namespace includes and excludes during backup operations.
When wildcard patterns are detected, they are expanded against the list of active namespaces in the cluster before the backup proceeds.
Key features:
- Wildcard patterns in namespace includes/excludes are automatically detected and expanded
- Pattern validation ensures unsupported patterns (regex, consecutive asterisks) are rejected
- Empty wildcard results (e.g., "invalid*" matching no namespaces) correctly result in empty backups
- Exact namespace names and "*" continue to work as before (no expansion needed)
+1
View File
@@ -0,0 +1 @@
Concurrent backup processing
@@ -0,0 +1 @@
Sanitize Azure HTTP responses in BSL status messages
+1
View File
@@ -0,0 +1 @@
Use hookIndex for recording multiple restore exec hooks.
@@ -0,0 +1 @@
Add Prometheus metrics for maintenance jobs
+1
View File
@@ -0,0 +1 @@
Remove VolumeSnapshotClass from CSI B/R process.
@@ -0,0 +1 @@
Add PVC-to-Pod cache to improve volume policy performance
+1
View File
@@ -0,0 +1 @@
Fix plugin init container names exceeding DNS-1123 limit
+1
View File
@@ -0,0 +1 @@
Add maintenance job and data mover pod's labels and annotations setting.
@@ -594,6 +594,8 @@ spec:
description: Phase is the current state of the Backup.
enum:
- New
- Queued
- ReadyToStart
- FailedValidation
- InProgress
- WaitingForPluginOperations
@@ -625,6 +627,11 @@ spec:
filters that happen as items are processed.
type: integer
type: object
queuePosition:
description: |-
QueuePosition is the position of the backup in the queue.
Only relevant when Phase is "Queued"
type: integer
startTimestamp:
description: |-
StartTimestamp records the time a backup was started.
@@ -113,10 +113,38 @@ spec:
description: Bucket is the bucket to use for object storage.
type: string
caCert:
description: CACert defines a CA bundle to use when verifying
TLS connections to the provider.
description: |-
CACert defines a CA bundle to use when verifying TLS connections to the provider.
Deprecated: Use CACertRef instead.
format: byte
type: string
caCertRef:
description: |-
CACertRef is a reference to a Secret containing the CA certificate bundle to use
when verifying TLS connections to the provider. The Secret must be in the same
namespace as the BackupStorageLocation.
properties:
key:
description: The key of the secret to select from. Must be
a valid secret key.
type: string
name:
default: ""
description: |-
Name of the referent.
This field is effectively required, but due to backwards compatibility is
allowed to be empty. Instances of this type with an empty value here are
almost certainly wrong.
More info: https://kubernetes.io/docs/concepts/overview/working-with-objects/names/#names
type: string
optional:
description: Specify whether the Secret or its key must be
defined
type: boolean
required:
- key
type: object
x-kubernetes-map-type: atomic
prefix:
description: Prefix is the path inside a bucket to use for Velero
storage. Optional.
File diff suppressed because one or more lines are too long
+70
View File
@@ -0,0 +1,70 @@
# Apply flag for install command
## Abstract
Add an `--apply` flag to the install command that enables applying existing resources rather than creating them. This can be useful as part of the upgrade process for existing installations.
## Background
The current Velero install command creates resources but doesn't provide a direct way to apply updates to an existing installation.
Users attempting to run the install command on an existing installation receive "already exists" messages.
Upgrade steps for existing installs typically involve a three (or more) step process to apply updated CRDs (using `--dry-run` and piping to `kubectl apply`) and then updating/setting images on the Velero deployment and node-agent.
## Goals
- Provide a simple flag to enable applying resources on an existing Velero installation.
- Use server-side apply to update existing resources rather than attempting to create them.
- Maintain consistency with the regular install flow.
## Non Goals
- Implement special logic for specific version-to-version upgrades (i.e. resource deletion, etc).
- Add complex upgrade validation or pre/post-upgrade hooks.
- Provide rollback capabilities.
## High-Level Design
The `--apply` flag will be added to the Velero install command.
When this flag is set, the installation process will use server-side apply to update existing resources instead of using create on new resources.
This flag can be used as _part_ of the upgrade process, but will not always fully handle an upgrade.
## Detailed Design
The implementation adds a new boolean flag `--apply` to the install command.
This flag will be passed through to the underlying install functions where the resource creation logic resides.
When the flag is set to true:
- The `createOrApplyResource` function will use server-side apply with field manager "velero-cli" and `force=true` to update resources.
- Resources will be applied in the same order as they would be created during installation.
- Custom Resource Definitions will still be processed first, and the system will wait for them to be established before continuing.
The server-side apply approach with `force=true` ensures that resources are updated even if there are conflicts with the last applied state.
This provides a best-effort mechanism to apply resources that follows the same flow as installation but updates resources instead of creating them.
No special handling is added for specific versions or resource structures, making this a general-purpose mechanism for applying resources.
## Alternatives Considered
1. Creating a separate `upgrade` command that would duplicate much of the install command logic.
- Rejected due to code duplication and maintenance overhead.
2. Implementing version-specific upgrade logic to handle breaking changes between versions.
- Rejected as overly complex and difficult to maintain across multiple version paths.
- This could be considered again in the future, but is not in the scope of the current design.
3. Adding automatic detection of existing resources and switching to apply mode.
- Rejected as it could lead to unexpected behavior and confusion if users unintentionally apply changes to existing resources.
## Security Considerations
The apply flag maintains the same security profile as the install command.
No additional permissions are required beyond what is needed for resource creation.
The use of `force=true` with server-side apply could potentially override manual changes made to resources, but this is a necessary trade-off to ensure apply is successful.
## Compatibility
This enhancement is compatible with all existing Velero installations as it is a new opt-in flag.
It does not change any resource formats or API contracts.
The apply process is best-effort and does not guarantee compatibility between arbitrary versions of Velero.
Users should still consult release notes for any breaking changes that may require manual intervention.
This flag could be adopted by the helm chart, specifically for CRD updates, to simplify the CRD update job.
## Implementation
The implementation involves:
1. Adding support for `Apply` to the existing Kubernetes client code.
1. Adding the `--apply` flag to the install command options.
1. Changing `createResource` to `createOrApplyResource` and updating it to use server-side apply when the `apply` boolean is set.
The implementation is straightforward and follows existing code patterns.
No migration of state or special handling of specific resources is required.
+417
View File
@@ -0,0 +1,417 @@
# Design for BSL Certificate Support Enhancement
## Abstract
This design document describes the enhancement of BackupStorageLocation (BSL) certificate management in Velero, introducing a Secret-based certificate reference mechanism (`caCertRef`) alongside the existing inline certificate field (`caCert`). This enhancement provides a more secure, Kubernetes-native approach to certificate management while enabling future CLI improvements for automatic certificate discovery.
## Background
Currently, Velero supports TLS certificate verification for object storage providers through an inline `caCert` field in the BSL specification. While functional, this approach has several limitations:
- **Security**: Certificates are stored directly in the BSL YAML, potentially exposing sensitive data
- **Management**: Certificate rotation requires updating the BSL resource itself
- **CLI Usability**: Users must manually specify certificates when using CLI commands
- **Size Limitations**: Large certificate bundles can make BSL resources unwieldy
Issue #9097 and PR #8557 highlight the need for improved certificate management that addresses these concerns while maintaining backward compatibility.
## Goals
- Provide a secure, Secret-based certificate storage mechanism
- Maintain full backward compatibility with existing BSL configurations
- Enable future CLI enhancements for automatic certificate discovery
- Simplify certificate rotation and management
- Provide clear migration path for existing users
## Non-Goals
- Removing support for inline certificates immediately
- Changing the behavior of existing BSL configurations
- Implementing client-side certificate validation
- Supporting certificates from ConfigMaps or other resource types
## High-Level Design
### API Changes
#### New Field: CACertRef
```go
type ObjectStorageLocation struct {
// Existing field (now deprecated)
// +optional
// +kubebuilder:deprecatedversion:warning="caCert is deprecated, use caCertRef instead"
CACert []byte `json:"caCert,omitempty"`
// New field for Secret reference
// +optional
CACertRef *corev1api.SecretKeySelector `json:"caCertRef,omitempty"`
}
```
The `SecretKeySelector` follows standard Kubernetes patterns:
```go
type SecretKeySelector struct {
// Name of the Secret
Name string `json:"name"`
// Key within the Secret
Key string `json:"key"`
}
```
### Certificate Resolution Logic
The system follows a priority-based resolution:
1. If `caCertRef` is specified, retrieve certificate from the referenced Secret
2. If `caCert` is specified (and `caCertRef` is not), use the inline certificate
3. If neither is specified, no custom CA certificate is used
### Validation
BSL validation ensures mutual exclusivity:
```go
func (bsl *BackupStorageLocation) Validate() error {
if bsl.Spec.ObjectStorage != nil &&
bsl.Spec.ObjectStorage.CACert != nil &&
bsl.Spec.ObjectStorage.CACertRef != nil {
return errors.New("cannot specify both caCert and caCertRef in objectStorage")
}
return nil
}
```
## Detailed Design
### BSL Controller Changes
The BSL controller incorporates validation during reconciliation:
```go
func (r *backupStorageLocationReconciler) Reconcile(req ctrl.Request) (ctrl.Result, error) {
// ... existing code ...
// Validate BSL configuration
if err := location.Validate(); err != nil {
r.logger.WithError(err).Error("BSL validation failed")
return ctrl.Result{}, err
}
// ... continue reconciliation ...
}
```
### Repository Provider Integration
All repository providers implement consistent certificate handling:
```go
func configureCACert(bsl *velerov1api.BackupStorageLocation, credGetter *credentials.CredentialGetter) ([]byte, error) {
if bsl.Spec.ObjectStorage == nil {
return nil, nil
}
// Prefer caCertRef (new method)
if bsl.Spec.ObjectStorage.CACertRef != nil {
certString, err := credGetter.FromSecret.Get(bsl.Spec.ObjectStorage.CACertRef)
if err != nil {
return nil, errors.Wrap(err, "error getting CA certificate from secret")
}
return []byte(certString), nil
}
// Fall back to caCert (deprecated)
if bsl.Spec.ObjectStorage.CACert != nil {
return bsl.Spec.ObjectStorage.CACert, nil
}
return nil, nil
}
```
### CLI Certificate Discovery Integration
#### Background: PR #8557 Implementation
PR #8557 ("CLI automatically discovers and uses cacert from BSL") was merged in August 2025, introducing automatic CA certificate discovery from BackupStorageLocation for Velero CLI download operations. This eliminated the need for users to manually specify the `--cacert` flag when performing operations like `backup describe`, `backup download`, `backup logs`, and `restore logs`.
#### Current Implementation (Post PR #8557)
The CLI now automatically discovers certificates from BSL through the `pkg/cmd/util/cacert/bsl_cacert.go` module:
```go
// Current implementation only supports inline caCert
func GetCACertFromBSL(ctx context.Context, client kbclient.Client, namespace, bslName string) (string, error) {
// ... fetch BSL ...
if bsl.Spec.ObjectStorage != nil && len(bsl.Spec.ObjectStorage.CACert) > 0 {
return string(bsl.Spec.ObjectStorage.CACert), nil
}
return "", nil
}
```
#### Enhancement with caCertRef Support
This design extends the existing CLI certificate discovery to support the new `caCertRef` field:
```go
// Enhanced implementation supporting both caCert and caCertRef
func GetCACertFromBSL(ctx context.Context, client kbclient.Client, namespace, bslName string) (string, error) {
// ... fetch BSL ...
// Prefer caCertRef over inline caCert
if bsl.Spec.ObjectStorage.CACertRef != nil {
secret := &corev1api.Secret{}
key := types.NamespacedName{
Name: bsl.Spec.ObjectStorage.CACertRef.Name,
Namespace: namespace,
}
if err := client.Get(ctx, key, secret); err != nil {
return "", errors.Wrap(err, "error getting certificate secret")
}
certData, ok := secret.Data[bsl.Spec.ObjectStorage.CACertRef.Key]
if !ok {
return "", errors.Errorf("key %s not found in secret",
bsl.Spec.ObjectStorage.CACertRef.Key)
}
return string(certData), nil
}
// Fall back to inline caCert (deprecated)
if bsl.Spec.ObjectStorage.CACert != nil {
return string(bsl.Spec.ObjectStorage.CACert), nil
}
return "", nil
}
```
#### Certificate Resolution Priority
The CLI follows this priority order for certificate resolution:
1. **`--cacert` flag** - Manual override, highest priority
2. **`caCertRef`** - Secret-based certificate (recommended)
3. **`caCert`** - Inline certificate (deprecated)
4. **System certificate pool** - Default fallback
#### User Experience Improvements
With both PR #8557 and this enhancement:
```bash
# Automatic discovery - works with both caCert and caCertRef
velero backup describe my-backup
velero backup download my-backup
velero backup logs my-backup
velero restore logs my-restore
# Manual override still available
velero backup describe my-backup --cacert /custom/ca.crt
# Debug output shows certificate source
velero backup download my-backup --log-level=debug
# [DEBUG] Resolved CA certificate from BSL 'default' Secret 'storage-ca-cert' key 'ca-bundle.crt'
```
#### RBAC Considerations for CLI
CLI users need read access to Secrets when using `caCertRef`:
```yaml
apiVersion: rbac.authorization.k8s.io/v1
kind: Role
metadata:
name: velero-cli-user
namespace: velero
rules:
- apiGroups: ["velero.io"]
resources: ["backups", "restores", "backupstoragelocations"]
verbs: ["get", "list"]
- apiGroups: [""]
resources: ["secrets"]
verbs: ["get"]
# Limited to secrets referenced by BSLs
```
### Migration Strategy
#### Phase 1: Introduction (Current)
- Add `caCertRef` field
- Mark `caCert` as deprecated
- Both fields supported, mutual exclusivity enforced
#### Phase 2: Migration Period
- Documentation and tools to help users migrate
- Warning messages for `caCert` usage
- CLI enhancements to leverage `caCertRef`
#### Phase 3: Future Removal
- Remove `caCert` field in major version update
- Provide migration tool for automatic conversion
## User Experience
### Creating a BSL with Certificate Reference
1. Create a Secret containing the CA certificate:
```yaml
apiVersion: v1
kind: Secret
metadata:
name: storage-ca-cert
namespace: velero
type: Opaque
data:
ca-bundle.crt: <base64-encoded-certificate>
```
2. Reference the Secret in BSL:
```yaml
apiVersion: velero.io/v1
kind: BackupStorageLocation
metadata:
name: default
namespace: velero
spec:
provider: aws
objectStorage:
bucket: my-bucket
caCertRef:
name: storage-ca-cert
key: ca-bundle.crt
```
### Certificate Rotation
With Secret-based certificates:
```bash
# Update the Secret with new certificate
kubectl create secret generic storage-ca-cert \
--from-file=ca-bundle.crt=new-ca.crt \
--dry-run=client -o yaml | kubectl apply -f -
# No BSL update required - changes take effect on next use
```
### CLI Usage Examples
#### Immediate Benefits
- No change required for existing workflows
- Certificate validation errors include helpful context
#### Future CLI Enhancements
```bash
# Automatic certificate discovery
velero backup download my-backup
# Manual override still available
velero backup download my-backup --cacert /custom/ca.crt
# Debug certificate resolution
velero backup download my-backup --log-level=debug
# [DEBUG] Resolved CA certificate from BSL 'default' Secret 'storage-ca-cert'
```
## Security Considerations
### Advantages of Secret-based Storage
1. **Encryption at Rest**: Secrets are encrypted in etcd
2. **RBAC Control**: Fine-grained access control via Kubernetes RBAC
3. **Audit Trail**: Secret access is auditable
4. **Separation of Concerns**: Certificates separate from configuration
### Required Permissions
The Velero server requires additional RBAC permissions:
```yaml
- apiGroups: [""]
resources: ["secrets"]
verbs: ["get"]
# Scoped to secrets referenced by BSLs
```
## Compatibility
### Backward Compatibility
- Existing BSLs with `caCert` continue to function unchanged
- No breaking changes to API
- Gradual migration path
### Forward Compatibility
- Design allows for future enhancements:
- Multiple certificate support
- Certificate chain validation
- Automatic certificate discovery from cloud providers
## Implementation Phases
### Phase 1: Core Implementation ✓ (Current PR)
- API changes with new `caCertRef` field
- Controller validation
- Repository provider updates
- Basic testing
### Phase 2: CLI Enhancement (Future)
- Automatic certificate discovery in CLI
- Enhanced error messages
- Debug logging for certificate resolution
### Phase 3: Migration Tools (Future)
- Automated migration scripts
- Validation tools
- Documentation updates
## Testing
### Unit Tests
- BSL validation logic
- Certificate resolution in providers
- Controller behavior
### Integration Tests
- End-to-end backup/restore with `caCertRef`
- Certificate rotation scenarios
- Migration from `caCert` to `caCertRef`
### Manual Testing Scenarios
1. Create BSL with `caCertRef`
2. Perform backup/restore operations
3. Rotate certificate in Secret
4. Verify continued operation
## Documentation
### User Documentation
- Migration guide from `caCert` to `caCertRef`
- Examples for common cloud providers
- Troubleshooting guide
### API Documentation
- Updated API reference
- Deprecation notices
- Field descriptions
## Alternatives Considered
### ConfigMap-based Storage
- Pros: Similar to Secrets, simpler API
- Cons: Not designed for sensitive data, no encryption at rest
- Decision: Secrets are the Kubernetes-standard for sensitive data
### External Certificate Management
- Pros: Integration with cert-manager, etc.
- Cons: Additional complexity, dependencies
- Decision: Keep it simple, allow users to manage certificates as needed
### Immediate Removal of Inline Certificates
- Pros: Cleaner API, forces best practices
- Cons: Breaking change, migration burden
- Decision: Gradual deprecation respects existing users
## Conclusion
This design provides a secure, Kubernetes-native approach to certificate management in Velero while maintaining backward compatibility. It establishes the foundation for enhanced CLI functionality and improved user experience, addressing the concerns raised in issue #9097 and enabling the features proposed in PR #8557.
The phased approach ensures smooth migration for existing users while delivering immediate security benefits for new deployments.
+1 -1
View File
@@ -94,7 +94,7 @@ RUN ARCH=$(go env GOARCH) && \
chmod +x /usr/bin/goreleaser
# get golangci-lint
RUN curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/master/install.sh | sh -s -- -b $(go env GOPATH)/bin v2.5.0
RUN curl -sSfL https://raw.githubusercontent.com/golangci/golangci-lint/HEAD/install.sh | sh -s -- -b $(go env GOPATH)/bin v2.5.0
# install kubectl
RUN curl -LO https://storage.googleapis.com/kubernetes-release/release/$(curl -s https://storage.googleapis.com/kubernetes-release/release/stable.txt)/bin/linux/$(go env GOARCH)/kubectl
@@ -103,6 +103,14 @@ func (p *volumeSnapshotContentDeleteItemAction) Execute(
snapCont.ResourceVersion = ""
if snapCont.Spec.VolumeSnapshotClassName != nil {
// Delete VolumeSnapshotClass from the VolumeSnapshotContent.
// This is necessary to make the deletion independent of the VolumeSnapshotClass.
snapCont.Spec.VolumeSnapshotClassName = nil
p.log.Debugf("Deleted VolumeSnapshotClassName from VolumeSnapshotContent %s to make deletion independent of VolumeSnapshotClass",
snapCont.Name)
}
if err := p.crClient.Create(context.TODO(), &snapCont); err != nil {
return errors.Wrapf(err, "fail to create VolumeSnapshotContent %s", snapCont.Name)
}
@@ -70,7 +70,7 @@ func TestVSCExecute(t *testing.T) {
},
{
name: "Normal case, VolumeSnapshot should be deleted",
vsc: builder.ForVolumeSnapshotContent("bar").ObjectMeta(builder.WithLabelsMap(map[string]string{velerov1api.BackupNameLabel: "backup"})).Status(&snapshotv1api.VolumeSnapshotContentStatus{SnapshotHandle: &snapshotHandleStr}).Result(),
vsc: builder.ForVolumeSnapshotContent("bar").ObjectMeta(builder.WithLabelsMap(map[string]string{velerov1api.BackupNameLabel: "backup"})).VolumeSnapshotClassName("volumesnapshotclass").Status(&snapshotv1api.VolumeSnapshotContentStatus{SnapshotHandle: &snapshotHandleStr}).Result(),
backup: builder.ForBackup("velero", "backup").ObjectMeta(builder.WithAnnotationsMap(map[string]string{velerov1api.ResourceTimeoutAnnotation: "5s"})).Result(),
expectErr: false,
function: func(
@@ -82,7 +82,7 @@ func TestVSCExecute(t *testing.T) {
},
},
{
name: "Normal case, VolumeSnapshot should be deleted",
name: "Error case, deletion fails",
vsc: builder.ForVolumeSnapshotContent("bar").ObjectMeta(builder.WithLabelsMap(map[string]string{velerov1api.BackupNameLabel: "backup"})).Status(&snapshotv1api.VolumeSnapshotContentStatus{SnapshotHandle: &snapshotHandleStr}).Result(),
backup: builder.ForBackup("velero", "backup").ObjectMeta(builder.WithAnnotationsMap(map[string]string{velerov1api.ResourceTimeoutAnnotation: "5s"})).Result(),
expectErr: true,
+4 -4
View File
@@ -169,7 +169,7 @@ func (e *DefaultWaitExecHookHandler) HandleHooks(
hookLog.Error(err)
errors = append(errors, err)
errTracker := multiHookTracker.Record(restoreName, newPod.Namespace, newPod.Name, hook.Hook.Container, hook.HookSource, hook.HookName, HookPhase(""), i, true, err)
errTracker := multiHookTracker.Record(restoreName, newPod.Namespace, newPod.Name, hook.Hook.Container, hook.HookSource, hook.HookName, HookPhase(""), hook.hookIndex, true, err)
if errTracker != nil {
hookLog.WithError(errTracker).Warn("Error recording the hook in hook tracker")
}
@@ -195,7 +195,7 @@ func (e *DefaultWaitExecHookHandler) HandleHooks(
hookFailed = true
}
errTracker := multiHookTracker.Record(restoreName, newPod.Namespace, newPod.Name, hook.Hook.Container, hook.HookSource, hook.HookName, HookPhase(""), i, hookFailed, hookErr)
errTracker := multiHookTracker.Record(restoreName, newPod.Namespace, newPod.Name, hook.Hook.Container, hook.HookSource, hook.HookName, HookPhase(""), hook.hookIndex, hookFailed, hookErr)
if errTracker != nil {
hookLog.WithError(errTracker).Warn("Error recording the hook in hook tracker")
}
@@ -239,7 +239,7 @@ func (e *DefaultWaitExecHookHandler) HandleHooks(
// containers to become ready.
// Each unexecuted hook is logged as an error and this error will be returned from this function.
for _, hooks := range byContainer {
for i, hook := range hooks {
for _, hook := range hooks {
if hook.executed {
continue
}
@@ -252,7 +252,7 @@ func (e *DefaultWaitExecHookHandler) HandleHooks(
},
)
errTracker := multiHookTracker.Record(restoreName, pod.Namespace, pod.Name, hook.Hook.Container, hook.HookSource, hook.HookName, HookPhase(""), i, true, err)
errTracker := multiHookTracker.Record(restoreName, pod.Namespace, pod.Name, hook.Hook.Container, hook.HookSource, hook.HookName, HookPhase(""), hook.hookIndex, true, err)
if errTracker != nil {
hookLog.WithError(errTracker).Warn("Error recording the hook in hook tracker")
}
@@ -706,6 +706,130 @@ func TestWaitExecHandleHooks(t *testing.T) {
},
},
},
{
name: "Multiple hooks with non-sequential indices (bug #9359)",
initialPod: builder.ForPod("default", "my-pod").
Containers(&corev1api.Container{
Name: "container1",
}).
ContainerStatuses(&corev1api.ContainerStatus{
Name: "container1",
State: corev1api.ContainerState{
Running: &corev1api.ContainerStateRunning{},
},
}).
Result(),
groupResource: "pods",
byContainer: map[string][]PodExecRestoreHook{
"container1": {
{
HookName: "first-hook",
HookSource: HookSourceAnnotation,
Hook: velerov1api.ExecRestoreHook{
Container: "container1",
Command: []string{"/usr/bin/foo"},
OnError: velerov1api.HookErrorModeContinue,
ExecTimeout: metav1.Duration{Duration: time.Second},
WaitTimeout: metav1.Duration{Duration: time.Minute},
},
hookIndex: 0,
},
{
HookName: "second-hook",
HookSource: HookSourceAnnotation,
Hook: velerov1api.ExecRestoreHook{
Container: "container1",
Command: []string{"/usr/bin/bar"},
OnError: velerov1api.HookErrorModeContinue,
ExecTimeout: metav1.Duration{Duration: time.Second},
WaitTimeout: metav1.Duration{Duration: time.Minute},
},
hookIndex: 2,
},
{
HookName: "third-hook",
HookSource: HookSourceAnnotation,
Hook: velerov1api.ExecRestoreHook{
Container: "container1",
Command: []string{"/usr/bin/third"},
OnError: velerov1api.HookErrorModeContinue,
ExecTimeout: metav1.Duration{Duration: time.Second},
WaitTimeout: metav1.Duration{Duration: time.Minute},
},
hookIndex: 4,
},
},
},
expectedExecutions: []expectedExecution{
{
name: "first-hook",
hook: &velerov1api.ExecHook{
Container: "container1",
Command: []string{"/usr/bin/foo"},
OnError: velerov1api.HookErrorModeContinue,
Timeout: metav1.Duration{Duration: time.Second},
},
error: nil,
pod: builder.ForPod("default", "my-pod").
ObjectMeta(builder.WithResourceVersion("1")).
Containers(&corev1api.Container{
Name: "container1",
}).
ContainerStatuses(&corev1api.ContainerStatus{
Name: "container1",
State: corev1api.ContainerState{
Running: &corev1api.ContainerStateRunning{},
},
}).
Result(),
},
{
name: "second-hook",
hook: &velerov1api.ExecHook{
Container: "container1",
Command: []string{"/usr/bin/bar"},
OnError: velerov1api.HookErrorModeContinue,
Timeout: metav1.Duration{Duration: time.Second},
},
error: nil,
pod: builder.ForPod("default", "my-pod").
ObjectMeta(builder.WithResourceVersion("1")).
Containers(&corev1api.Container{
Name: "container1",
}).
ContainerStatuses(&corev1api.ContainerStatus{
Name: "container1",
State: corev1api.ContainerState{
Running: &corev1api.ContainerStateRunning{},
},
}).
Result(),
},
{
name: "third-hook",
hook: &velerov1api.ExecHook{
Container: "container1",
Command: []string{"/usr/bin/third"},
OnError: velerov1api.HookErrorModeContinue,
Timeout: metav1.Duration{Duration: time.Second},
},
error: nil,
pod: builder.ForPod("default", "my-pod").
ObjectMeta(builder.WithResourceVersion("1")).
Containers(&corev1api.Container{
Name: "container1",
}).
ContainerStatuses(&corev1api.ContainerStatus{
Name: "container1",
State: corev1api.ContainerState{
Running: &corev1api.ContainerStateRunning{},
},
}).
Result(),
},
},
expectedErrors: nil,
},
}
for _, test := range tests {
+77 -1
View File
@@ -1,9 +1,11 @@
package volumehelper
import (
"context"
"fmt"
"strings"
"github.com/pkg/errors"
"github.com/sirupsen/logrus"
corev1api "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/runtime"
@@ -11,6 +13,7 @@ import (
crclient "sigs.k8s.io/controller-runtime/pkg/client"
"github.com/vmware-tanzu/velero/internal/resourcepolicies"
velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
"github.com/vmware-tanzu/velero/pkg/kuberesource"
"github.com/vmware-tanzu/velero/pkg/util/boolptr"
kubeutil "github.com/vmware-tanzu/velero/pkg/util/kube"
@@ -33,8 +36,16 @@ type volumeHelperImpl struct {
// to the volume policy check, but fs-backup is based on the pod resource,
// the resource filter on PVC and PV doesn't work on this scenario.
backupExcludePVC bool
// pvcPodCache provides cached PVC to Pod mappings for improved performance.
// When there are many PVCs and pods, using this cache avoids O(N*M) lookups.
pvcPodCache *podvolumeutil.PVCPodCache
}
// NewVolumeHelperImpl creates a VolumeHelper without PVC-to-Pod caching.
//
// Deprecated: Use NewVolumeHelperImplWithNamespaces or NewVolumeHelperImplWithCache instead
// for better performance. These functions provide PVC-to-Pod caching which avoids O(N*M)
// complexity when there are many PVCs and pods. See issue #9179 for details.
func NewVolumeHelperImpl(
volumePolicy *resourcepolicies.Policies,
snapshotVolumes *bool,
@@ -43,6 +54,43 @@ func NewVolumeHelperImpl(
defaultVolumesToFSBackup bool,
backupExcludePVC bool,
) VolumeHelper {
// Pass nil namespaces - no cache will be built, so this never fails.
// This is used by plugins that don't need the cache optimization.
vh, _ := NewVolumeHelperImplWithNamespaces(
volumePolicy,
snapshotVolumes,
logger,
client,
defaultVolumesToFSBackup,
backupExcludePVC,
nil,
)
return vh
}
// NewVolumeHelperImplWithNamespaces creates a VolumeHelper with a PVC-to-Pod cache for improved performance.
// The cache is built internally from the provided namespaces list.
// This avoids O(N*M) complexity when there are many PVCs and pods.
// See issue #9179 for details.
// Returns an error if cache building fails - callers should not proceed with backup in this case.
func NewVolumeHelperImplWithNamespaces(
volumePolicy *resourcepolicies.Policies,
snapshotVolumes *bool,
logger logrus.FieldLogger,
client crclient.Client,
defaultVolumesToFSBackup bool,
backupExcludePVC bool,
namespaces []string,
) (VolumeHelper, error) {
var pvcPodCache *podvolumeutil.PVCPodCache
if len(namespaces) > 0 {
pvcPodCache = podvolumeutil.NewPVCPodCache()
if err := pvcPodCache.BuildCacheForNamespaces(context.Background(), namespaces, client); err != nil {
return nil, err
}
logger.Infof("Built PVC-to-Pod cache for %d namespaces", len(namespaces))
}
return &volumeHelperImpl{
volumePolicy: volumePolicy,
snapshotVolumes: snapshotVolumes,
@@ -50,7 +98,33 @@ func NewVolumeHelperImpl(
client: client,
defaultVolumesToFSBackup: defaultVolumesToFSBackup,
backupExcludePVC: backupExcludePVC,
pvcPodCache: pvcPodCache,
}, nil
}
// NewVolumeHelperImplWithCache creates a VolumeHelper using an externally managed PVC-to-Pod cache.
// This is used by plugins that build the cache lazily per-namespace (following the pattern from PR #9226).
// The cache can be nil, in which case PVC-to-Pod lookups will fall back to direct API calls.
func NewVolumeHelperImplWithCache(
backup velerov1api.Backup,
client crclient.Client,
logger logrus.FieldLogger,
pvcPodCache *podvolumeutil.PVCPodCache,
) (VolumeHelper, error) {
resourcePolicies, err := resourcepolicies.GetResourcePoliciesFromBackup(backup, client, logger)
if err != nil {
return nil, errors.Wrap(err, "failed to get volume policies from backup")
}
return &volumeHelperImpl{
volumePolicy: resourcePolicies,
snapshotVolumes: backup.Spec.SnapshotVolumes,
logger: logger,
client: client,
defaultVolumesToFSBackup: boolptr.IsSetToTrue(backup.Spec.DefaultVolumesToFsBackup),
backupExcludePVC: boolptr.IsSetToTrue(backup.Spec.SnapshotMoveData),
pvcPodCache: pvcPodCache,
}, nil
}
func (v *volumeHelperImpl) ShouldPerformSnapshot(obj runtime.Unstructured, groupResource schema.GroupResource) (bool, error) {
@@ -105,10 +179,12 @@ func (v *volumeHelperImpl) ShouldPerformSnapshot(obj runtime.Unstructured, group
// If this PV is claimed, see if we've already taken a (pod volume backup)
// snapshot of the contents of this PV. If so, don't take a snapshot.
if pv.Spec.ClaimRef != nil {
pods, err := podvolumeutil.GetPodsUsingPVC(
// Use cached lookup if available for better performance with many PVCs/pods
pods, err := podvolumeutil.GetPodsUsingPVCWithCache(
pv.Spec.ClaimRef.Namespace,
pv.Spec.ClaimRef.Name,
v.client,
v.pvcPodCache,
)
if err != nil {
v.logger.WithError(err).Errorf("fail to get pod for PV %s", pv.Name)
@@ -34,6 +34,7 @@ import (
"github.com/vmware-tanzu/velero/pkg/builder"
"github.com/vmware-tanzu/velero/pkg/kuberesource"
velerotest "github.com/vmware-tanzu/velero/pkg/test"
podvolumeutil "github.com/vmware-tanzu/velero/pkg/util/podvolume"
)
func TestVolumeHelperImpl_ShouldPerformSnapshot(t *testing.T) {
@@ -738,3 +739,498 @@ func TestGetVolumeFromResource(t *testing.T) {
assert.ErrorContains(t, err, "resource is not a PersistentVolume or Volume")
})
}
func TestVolumeHelperImplWithCache_ShouldPerformSnapshot(t *testing.T) {
testCases := []struct {
name string
inputObj runtime.Object
groupResource schema.GroupResource
pod *corev1api.Pod
resourcePolicies *resourcepolicies.ResourcePolicies
snapshotVolumesFlag *bool
defaultVolumesToFSBackup bool
buildCache bool
shouldSnapshot bool
expectedErr bool
}{
{
name: "VolumePolicy match with cache, returns true",
inputObj: builder.ForPersistentVolume("example-pv").StorageClass("gp2-csi").ClaimRef("ns", "pvc-1").Result(),
groupResource: kuberesource.PersistentVolumes,
resourcePolicies: &resourcepolicies.ResourcePolicies{
Version: "v1",
VolumePolicies: []resourcepolicies.VolumePolicy{
{
Conditions: map[string]any{
"storageClass": []string{"gp2-csi"},
},
Action: resourcepolicies.Action{
Type: resourcepolicies.Snapshot,
},
},
},
},
snapshotVolumesFlag: ptr.To(true),
buildCache: true,
shouldSnapshot: true,
expectedErr: false,
},
{
name: "VolumePolicy not match, fs-backup via opt-out with cache, skips snapshot",
inputObj: builder.ForPersistentVolume("example-pv").StorageClass("gp3-csi").ClaimRef("ns", "pvc-1").Result(),
groupResource: kuberesource.PersistentVolumes,
pod: builder.ForPod("ns", "pod-1").Volumes(
&corev1api.Volume{
Name: "volume",
VolumeSource: corev1api.VolumeSource{
PersistentVolumeClaim: &corev1api.PersistentVolumeClaimVolumeSource{
ClaimName: "pvc-1",
},
},
},
).Result(),
resourcePolicies: &resourcepolicies.ResourcePolicies{
Version: "v1",
VolumePolicies: []resourcepolicies.VolumePolicy{
{
Conditions: map[string]any{
"storageClass": []string{"gp2-csi"},
},
Action: resourcepolicies.Action{
Type: resourcepolicies.Snapshot,
},
},
},
},
snapshotVolumesFlag: ptr.To(true),
defaultVolumesToFSBackup: true,
buildCache: true,
shouldSnapshot: false,
expectedErr: false,
},
{
name: "Cache not built, falls back to direct lookup",
inputObj: builder.ForPersistentVolume("example-pv").StorageClass("gp2-csi").ClaimRef("ns", "pvc-1").Result(),
groupResource: kuberesource.PersistentVolumes,
resourcePolicies: &resourcepolicies.ResourcePolicies{
Version: "v1",
VolumePolicies: []resourcepolicies.VolumePolicy{
{
Conditions: map[string]any{
"storageClass": []string{"gp2-csi"},
},
Action: resourcepolicies.Action{
Type: resourcepolicies.Snapshot,
},
},
},
},
snapshotVolumesFlag: ptr.To(true),
buildCache: false,
shouldSnapshot: true,
expectedErr: false,
},
{
name: "No volume policy, defaultVolumesToFSBackup with cache, skips snapshot",
inputObj: builder.ForPersistentVolume("example-pv").StorageClass("gp2-csi").ClaimRef("ns", "pvc-1").Result(),
groupResource: kuberesource.PersistentVolumes,
pod: builder.ForPod("ns", "pod-1").Volumes(
&corev1api.Volume{
Name: "volume",
VolumeSource: corev1api.VolumeSource{
PersistentVolumeClaim: &corev1api.PersistentVolumeClaimVolumeSource{
ClaimName: "pvc-1",
},
},
},
).Result(),
resourcePolicies: nil,
snapshotVolumesFlag: ptr.To(true),
defaultVolumesToFSBackup: true,
buildCache: true,
shouldSnapshot: false,
expectedErr: false,
},
}
objs := []runtime.Object{
&corev1api.PersistentVolumeClaim{
ObjectMeta: metav1.ObjectMeta{
Namespace: "ns",
Name: "pvc-1",
},
},
}
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
fakeClient := velerotest.NewFakeControllerRuntimeClient(t, objs...)
if tc.pod != nil {
require.NoError(t, fakeClient.Create(t.Context(), tc.pod))
}
var p *resourcepolicies.Policies
if tc.resourcePolicies != nil {
p = &resourcepolicies.Policies{}
err := p.BuildPolicy(tc.resourcePolicies)
require.NoError(t, err)
}
var namespaces []string
if tc.buildCache {
namespaces = []string{"ns"}
}
vh, err := NewVolumeHelperImplWithNamespaces(
p,
tc.snapshotVolumesFlag,
logrus.StandardLogger(),
fakeClient,
tc.defaultVolumesToFSBackup,
false,
namespaces,
)
require.NoError(t, err)
obj, err := runtime.DefaultUnstructuredConverter.ToUnstructured(tc.inputObj)
require.NoError(t, err)
actualShouldSnapshot, actualError := vh.ShouldPerformSnapshot(&unstructured.Unstructured{Object: obj}, tc.groupResource)
if tc.expectedErr {
require.Error(t, actualError)
return
}
require.NoError(t, actualError)
require.Equalf(t, tc.shouldSnapshot, actualShouldSnapshot, "Want shouldSnapshot as %t; Got shouldSnapshot as %t", tc.shouldSnapshot, actualShouldSnapshot)
})
}
}
func TestVolumeHelperImplWithCache_ShouldPerformFSBackup(t *testing.T) {
testCases := []struct {
name string
pod *corev1api.Pod
resources []runtime.Object
resourcePolicies *resourcepolicies.ResourcePolicies
snapshotVolumesFlag *bool
defaultVolumesToFSBackup bool
buildCache bool
shouldFSBackup bool
expectedErr bool
}{
{
name: "VolumePolicy match with cache, return true",
pod: builder.ForPod("ns", "pod-1").
Volumes(
&corev1api.Volume{
Name: "vol-1",
VolumeSource: corev1api.VolumeSource{
PersistentVolumeClaim: &corev1api.PersistentVolumeClaimVolumeSource{
ClaimName: "pvc-1",
},
},
}).Result(),
resources: []runtime.Object{
builder.ForPersistentVolumeClaim("ns", "pvc-1").
VolumeName("pv-1").
StorageClass("gp2-csi").Phase(corev1api.ClaimBound).Result(),
builder.ForPersistentVolume("pv-1").StorageClass("gp2-csi").Result(),
},
resourcePolicies: &resourcepolicies.ResourcePolicies{
Version: "v1",
VolumePolicies: []resourcepolicies.VolumePolicy{
{
Conditions: map[string]any{
"storageClass": []string{"gp2-csi"},
},
Action: resourcepolicies.Action{
Type: resourcepolicies.FSBackup,
},
},
},
},
buildCache: true,
shouldFSBackup: true,
expectedErr: false,
},
{
name: "VolumePolicy match with cache, action is snapshot, return false",
pod: builder.ForPod("ns", "pod-1").
Volumes(
&corev1api.Volume{
Name: "vol-1",
VolumeSource: corev1api.VolumeSource{
PersistentVolumeClaim: &corev1api.PersistentVolumeClaimVolumeSource{
ClaimName: "pvc-1",
},
},
}).Result(),
resources: []runtime.Object{
builder.ForPersistentVolumeClaim("ns", "pvc-1").
VolumeName("pv-1").
StorageClass("gp2-csi").Phase(corev1api.ClaimBound).Result(),
builder.ForPersistentVolume("pv-1").StorageClass("gp2-csi").Result(),
},
resourcePolicies: &resourcepolicies.ResourcePolicies{
Version: "v1",
VolumePolicies: []resourcepolicies.VolumePolicy{
{
Conditions: map[string]any{
"storageClass": []string{"gp2-csi"},
},
Action: resourcepolicies.Action{
Type: resourcepolicies.Snapshot,
},
},
},
},
buildCache: true,
shouldFSBackup: false,
expectedErr: false,
},
{
name: "Cache not built, falls back to direct lookup, opt-in annotation",
pod: builder.ForPod("ns", "pod-1").
ObjectMeta(builder.WithAnnotations(velerov1api.VolumesToBackupAnnotation, "vol-1")).
Volumes(
&corev1api.Volume{
Name: "vol-1",
VolumeSource: corev1api.VolumeSource{
PersistentVolumeClaim: &corev1api.PersistentVolumeClaimVolumeSource{
ClaimName: "pvc-1",
},
},
}).Result(),
resources: []runtime.Object{
builder.ForPersistentVolumeClaim("ns", "pvc-1").
VolumeName("pv-1").
StorageClass("gp2-csi").Phase(corev1api.ClaimBound).Result(),
builder.ForPersistentVolume("pv-1").StorageClass("gp2-csi").Result(),
},
buildCache: false,
defaultVolumesToFSBackup: false,
shouldFSBackup: true,
expectedErr: false,
},
}
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
fakeClient := velerotest.NewFakeControllerRuntimeClient(t, tc.resources...)
if tc.pod != nil {
require.NoError(t, fakeClient.Create(t.Context(), tc.pod))
}
var p *resourcepolicies.Policies
if tc.resourcePolicies != nil {
p = &resourcepolicies.Policies{}
err := p.BuildPolicy(tc.resourcePolicies)
require.NoError(t, err)
}
var namespaces []string
if tc.buildCache {
namespaces = []string{"ns"}
}
vh, err := NewVolumeHelperImplWithNamespaces(
p,
tc.snapshotVolumesFlag,
logrus.StandardLogger(),
fakeClient,
tc.defaultVolumesToFSBackup,
false,
namespaces,
)
require.NoError(t, err)
actualShouldFSBackup, actualError := vh.ShouldPerformFSBackup(tc.pod.Spec.Volumes[0], *tc.pod)
if tc.expectedErr {
require.Error(t, actualError)
return
}
require.NoError(t, actualError)
require.Equalf(t, tc.shouldFSBackup, actualShouldFSBackup, "Want shouldFSBackup as %t; Got shouldFSBackup as %t", tc.shouldFSBackup, actualShouldFSBackup)
})
}
}
// TestNewVolumeHelperImplWithCache tests the NewVolumeHelperImplWithCache constructor
// which is used by plugins that build the cache lazily per-namespace.
func TestNewVolumeHelperImplWithCache(t *testing.T) {
testCases := []struct {
name string
backup velerov1api.Backup
resourcePolicyConfigMap *corev1api.ConfigMap
pvcPodCache bool // whether to pass a cache
expectError bool
}{
{
name: "creates VolumeHelper with nil cache",
backup: velerov1api.Backup{
ObjectMeta: metav1.ObjectMeta{
Name: "test-backup",
Namespace: "velero",
},
Spec: velerov1api.BackupSpec{
SnapshotVolumes: ptr.To(true),
DefaultVolumesToFsBackup: ptr.To(false),
},
},
pvcPodCache: false,
expectError: false,
},
{
name: "creates VolumeHelper with non-nil cache",
backup: velerov1api.Backup{
ObjectMeta: metav1.ObjectMeta{
Name: "test-backup",
Namespace: "velero",
},
Spec: velerov1api.BackupSpec{
SnapshotVolumes: ptr.To(true),
DefaultVolumesToFsBackup: ptr.To(true),
SnapshotMoveData: ptr.To(true),
},
},
pvcPodCache: true,
expectError: false,
},
{
name: "creates VolumeHelper with resource policies",
backup: velerov1api.Backup{
ObjectMeta: metav1.ObjectMeta{
Name: "test-backup",
Namespace: "velero",
},
Spec: velerov1api.BackupSpec{
SnapshotVolumes: ptr.To(true),
ResourcePolicy: &corev1api.TypedLocalObjectReference{
Kind: "ConfigMap",
Name: "resource-policy",
},
},
},
resourcePolicyConfigMap: &corev1api.ConfigMap{
ObjectMeta: metav1.ObjectMeta{
Name: "resource-policy",
Namespace: "velero",
},
Data: map[string]string{
"policy": `version: v1
volumePolicies:
- conditions:
storageClass:
- gp2-csi
action:
type: snapshot`,
},
},
pvcPodCache: true,
expectError: false,
},
{
name: "fails when resource policy ConfigMap not found",
backup: velerov1api.Backup{
ObjectMeta: metav1.ObjectMeta{
Name: "test-backup",
Namespace: "velero",
},
Spec: velerov1api.BackupSpec{
ResourcePolicy: &corev1api.TypedLocalObjectReference{
Kind: "ConfigMap",
Name: "non-existent-policy",
},
},
},
pvcPodCache: false,
expectError: true,
},
}
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
var objs []runtime.Object
if tc.resourcePolicyConfigMap != nil {
objs = append(objs, tc.resourcePolicyConfigMap)
}
fakeClient := velerotest.NewFakeControllerRuntimeClient(t, objs...)
var cache *podvolumeutil.PVCPodCache
if tc.pvcPodCache {
cache = podvolumeutil.NewPVCPodCache()
}
vh, err := NewVolumeHelperImplWithCache(
tc.backup,
fakeClient,
logrus.StandardLogger(),
cache,
)
if tc.expectError {
require.Error(t, err)
require.Nil(t, vh)
} else {
require.NoError(t, err)
require.NotNil(t, vh)
}
})
}
}
// TestNewVolumeHelperImplWithCache_UsesCache verifies that the VolumeHelper created
// via NewVolumeHelperImplWithCache actually uses the provided cache for lookups.
func TestNewVolumeHelperImplWithCache_UsesCache(t *testing.T) {
// Create a pod that uses a PVC via opt-out (defaultVolumesToFsBackup=true)
pod := builder.ForPod("ns", "pod-1").Volumes(
&corev1api.Volume{
Name: "volume",
VolumeSource: corev1api.VolumeSource{
PersistentVolumeClaim: &corev1api.PersistentVolumeClaimVolumeSource{
ClaimName: "pvc-1",
},
},
},
).Result()
pvc := &corev1api.PersistentVolumeClaim{
ObjectMeta: metav1.ObjectMeta{
Namespace: "ns",
Name: "pvc-1",
},
}
pv := builder.ForPersistentVolume("example-pv").StorageClass("gp2-csi").ClaimRef("ns", "pvc-1").Result()
fakeClient := velerotest.NewFakeControllerRuntimeClient(t, pvc, pv, pod)
// Build cache for the namespace
cache := podvolumeutil.NewPVCPodCache()
err := cache.BuildCacheForNamespace(t.Context(), "ns", fakeClient)
require.NoError(t, err)
backup := velerov1api.Backup{
ObjectMeta: metav1.ObjectMeta{
Name: "test-backup",
Namespace: "velero",
},
Spec: velerov1api.BackupSpec{
SnapshotVolumes: ptr.To(true),
DefaultVolumesToFsBackup: ptr.To(true), // opt-out mode
},
}
vh, err := NewVolumeHelperImplWithCache(backup, fakeClient, logrus.StandardLogger(), cache)
require.NoError(t, err)
// Convert PV to unstructured
obj, err := runtime.DefaultUnstructuredConverter.ToUnstructured(pv)
require.NoError(t, err)
// ShouldPerformSnapshot should return false because the volume is selected for fs-backup
// This relies on the cache to find the pod using the PVC
shouldSnapshot, err := vh.ShouldPerformSnapshot(&unstructured.Unstructured{Object: obj}, kuberesource.PersistentVolumes)
require.NoError(t, err)
require.False(t, shouldSnapshot, "Expected snapshot to be skipped due to fs-backup selection via cache")
}
+12 -1
View File
@@ -288,7 +288,7 @@ const (
// BackupPhase is a string representation of the lifecycle phase
// of a Velero backup.
// +kubebuilder:validation:Enum=New;FailedValidation;InProgress;WaitingForPluginOperations;WaitingForPluginOperationsPartiallyFailed;Finalizing;FinalizingPartiallyFailed;Completed;PartiallyFailed;Failed;Deleting
// +kubebuilder:validation:Enum=New;Queued;ReadyToStart;FailedValidation;InProgress;WaitingForPluginOperations;WaitingForPluginOperationsPartiallyFailed;Finalizing;FinalizingPartiallyFailed;Completed;PartiallyFailed;Failed;Deleting
type BackupPhase string
const (
@@ -296,6 +296,12 @@ const (
// yet processed by the BackupController.
BackupPhaseNew BackupPhase = "New"
// BackupPhaseQueued means the backup has been added to the queue and is waiting for the Queue to move it out of the queue.
BackupPhaseQueued BackupPhase = "Queued"
// BackupPhaseReadyToStart means the backup has been pulled from the queue and is ready to start.
BackupPhaseReadyToStart BackupPhase = "ReadyToStart"
// BackupPhaseFailedValidation means the backup has failed
// the controller's validations and therefore will not run.
BackupPhaseFailedValidation BackupPhase = "FailedValidation"
@@ -371,6 +377,11 @@ type BackupStatus struct {
// +optional
Phase BackupPhase `json:"phase,omitempty"`
// QueuePosition is the position of the backup in the queue.
// Only relevant when Phase is "Queued"
// +optional
QueuePosition int `json:"queuePosition,omitempty"`
// ValidationErrors is a slice of all validation errors (if
// applicable).
// +optional
@@ -17,6 +17,8 @@ limitations under the License.
package v1
import (
"errors"
corev1api "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/types"
@@ -146,8 +148,15 @@ type ObjectStorageLocation struct {
Prefix string `json:"prefix,omitempty"`
// CACert defines a CA bundle to use when verifying TLS connections to the provider.
// Deprecated: Use CACertRef instead.
// +optional
CACert []byte `json:"caCert,omitempty"`
// CACertRef is a reference to a Secret containing the CA certificate bundle to use
// when verifying TLS connections to the provider. The Secret must be in the same
// namespace as the BackupStorageLocation.
// +optional
CACertRef *corev1api.SecretKeySelector `json:"caCertRef,omitempty"`
}
// BackupStorageLocationPhase is the lifecycle phase of a Velero BackupStorageLocation.
@@ -177,3 +186,13 @@ const (
// TODO(2.0): remove the AccessMode field from BackupStorageLocationStatus.
// TODO(2.0): remove the LastSyncedRevision field from BackupStorageLocationStatus.
// Validate validates the BackupStorageLocation to ensure that only one of CACert or CACertRef is set.
func (bsl *BackupStorageLocation) Validate() error {
if bsl.Spec.ObjectStorage != nil &&
bsl.Spec.ObjectStorage.CACert != nil &&
bsl.Spec.ObjectStorage.CACertRef != nil {
return errors.New("cannot specify both caCert and caCertRef in objectStorage")
}
return nil
}
@@ -0,0 +1,121 @@
/*
Copyright The Velero Contributors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package v1
import (
"testing"
corev1api "k8s.io/api/core/v1"
)
func TestBackupStorageLocationValidate(t *testing.T) {
tests := []struct {
name string
bsl *BackupStorageLocation
expectError bool
}{
{
name: "valid - neither CACert nor CACertRef set",
bsl: &BackupStorageLocation{
Spec: BackupStorageLocationSpec{
StorageType: StorageType{
ObjectStorage: &ObjectStorageLocation{
Bucket: "test-bucket",
},
},
},
},
expectError: false,
},
{
name: "valid - only CACert set",
bsl: &BackupStorageLocation{
Spec: BackupStorageLocationSpec{
StorageType: StorageType{
ObjectStorage: &ObjectStorageLocation{
Bucket: "test-bucket",
CACert: []byte("test-cert"),
},
},
},
},
expectError: false,
},
{
name: "valid - only CACertRef set",
bsl: &BackupStorageLocation{
Spec: BackupStorageLocationSpec{
StorageType: StorageType{
ObjectStorage: &ObjectStorageLocation{
Bucket: "test-bucket",
CACertRef: &corev1api.SecretKeySelector{
LocalObjectReference: corev1api.LocalObjectReference{
Name: "ca-cert-secret",
},
Key: "ca.crt",
},
},
},
},
},
expectError: false,
},
{
name: "invalid - both CACert and CACertRef set",
bsl: &BackupStorageLocation{
Spec: BackupStorageLocationSpec{
StorageType: StorageType{
ObjectStorage: &ObjectStorageLocation{
Bucket: "test-bucket",
CACert: []byte("test-cert"),
CACertRef: &corev1api.SecretKeySelector{
LocalObjectReference: corev1api.LocalObjectReference{
Name: "ca-cert-secret",
},
Key: "ca.crt",
},
},
},
},
},
expectError: true,
},
{
name: "valid - no ObjectStorage",
bsl: &BackupStorageLocation{
Spec: BackupStorageLocationSpec{
StorageType: StorageType{
ObjectStorage: nil,
},
},
},
expectError: false,
},
}
for _, test := range tests {
t.Run(test.name, func(t *testing.T) {
err := test.bsl.Validate()
if test.expectError && err == nil {
t.Errorf("expected error but got none")
}
if !test.expectError && err != nil {
t.Errorf("expected no error but got: %v", err)
}
})
}
}
@@ -915,6 +915,11 @@ func (in *ObjectStorageLocation) DeepCopyInto(out *ObjectStorageLocation) {
*out = make([]byte, len(*in))
copy(*out, *in)
}
if in.CACertRef != nil {
in, out := &in.CACertRef, &out.CACertRef
*out = new(corev1.SecretKeySelector)
(*in).DeepCopyInto(*out)
}
}
// DeepCopy is an autogenerated deepcopy function, copying the receiver, creating a new ObjectStorageLocation.
+49 -8
View File
@@ -76,14 +76,8 @@ func (a *PVCAction) Execute(item runtime.Unstructured, backup *v1.Backup) (runti
pvc.Spec.Selector = nil
}
// remove label selectors with "velero.io/" prefixing in the key which is left by Velero restore
if pvc.Spec.Selector != nil && pvc.Spec.Selector.MatchLabels != nil {
for k := range pvc.Spec.Selector.MatchLabels {
if strings.HasPrefix(k, "velero.io/") {
delete(pvc.Spec.Selector.MatchLabels, k)
}
}
}
// Clean stale Velero labels from PVC metadata and selector
a.cleanupStaleVeleroLabels(pvc, backup)
pvcMap, err := runtime.DefaultUnstructuredConverter.ToUnstructured(&pvc)
if err != nil {
@@ -92,3 +86,50 @@ func (a *PVCAction) Execute(item runtime.Unstructured, backup *v1.Backup) (runti
return &unstructured.Unstructured{Object: pvcMap}, actionhelpers.RelatedItemsForPVC(pvc, a.log), nil
}
// cleanupStaleVeleroLabels removes stale Velero labels from both the PVC metadata
// and the selector's match labels to ensure clean backups
func (a *PVCAction) cleanupStaleVeleroLabels(pvc *corev1api.PersistentVolumeClaim, backup *v1.Backup) {
// Clean stale Velero labels from selector match labels
if pvc.Spec.Selector != nil && pvc.Spec.Selector.MatchLabels != nil {
for k := range pvc.Spec.Selector.MatchLabels {
if strings.HasPrefix(k, "velero.io/") {
a.log.Infof("Deleting stale Velero label %s from PVC %s selector", k, pvc.Name)
delete(pvc.Spec.Selector.MatchLabels, k)
}
}
}
// Clean stale Velero labels from main metadata
if pvc.Labels != nil {
for k, v := range pvc.Labels {
// Only remove labels that are clearly stale from previous operations
shouldRemove := false
// Always remove restore-name labels as these are from previous restores
if k == v1.RestoreNameLabel {
shouldRemove = true
}
if k == v1.MustIncludeAdditionalItemAnnotation {
shouldRemove = true
}
// Remove backup-name labels that don't match current backup
if k == v1.BackupNameLabel && v != backup.Name {
shouldRemove = true
}
// Remove volume-snapshot-name labels from previous CSI backups
// Note: If this backup creates new CSI snapshots, the CSI action will add them back
if k == v1.VolumeSnapshotLabel {
shouldRemove = true
}
if shouldRemove {
a.log.Infof("Deleting stale Velero label %s=%s from PVC %s", k, v, pvc.Name)
delete(pvc.Labels, k)
}
}
}
}
+173
View File
@@ -149,3 +149,176 @@ func TestBackupPVAction(t *testing.T) {
require.NoError(t, err)
assert.Empty(t, additional)
}
func TestCleanupStaleVeleroLabels(t *testing.T) {
tests := []struct {
name string
inputPVC *corev1api.PersistentVolumeClaim
backup *v1.Backup
expectedLabels map[string]string
expectedSelector *metav1.LabelSelector
}{
{
name: "removes restore-name labels",
inputPVC: &corev1api.PersistentVolumeClaim{
ObjectMeta: metav1.ObjectMeta{
Name: "test-pvc",
Labels: map[string]string{
"velero.io/restore-name": "old-restore",
"app": "myapp",
},
},
},
backup: &v1.Backup{ObjectMeta: metav1.ObjectMeta{Name: "current-backup"}},
expectedLabels: map[string]string{
"app": "myapp",
},
},
{
name: "removes backup-name labels that don't match current backup",
inputPVC: &corev1api.PersistentVolumeClaim{
ObjectMeta: metav1.ObjectMeta{
Name: "test-pvc",
Labels: map[string]string{
"velero.io/backup-name": "old-backup",
"app": "myapp",
},
},
},
backup: &v1.Backup{ObjectMeta: metav1.ObjectMeta{Name: "current-backup"}},
expectedLabels: map[string]string{
"app": "myapp",
},
},
{
name: "keeps backup-name labels that match current backup",
inputPVC: &corev1api.PersistentVolumeClaim{
ObjectMeta: metav1.ObjectMeta{
Name: "test-pvc",
Labels: map[string]string{
"velero.io/backup-name": "current-backup",
"app": "myapp",
},
},
},
backup: &v1.Backup{ObjectMeta: metav1.ObjectMeta{Name: "current-backup"}},
expectedLabels: map[string]string{
"velero.io/backup-name": "current-backup",
"app": "myapp",
},
},
{
name: "removes volume-snapshot-name labels",
inputPVC: &corev1api.PersistentVolumeClaim{
ObjectMeta: metav1.ObjectMeta{
Name: "test-pvc",
Labels: map[string]string{
"velero.io/volume-snapshot-name": "old-snapshot",
"app": "myapp",
},
},
},
backup: &v1.Backup{ObjectMeta: metav1.ObjectMeta{Name: "current-backup"}},
expectedLabels: map[string]string{
"app": "myapp",
},
},
{
name: "removes velero labels from selector match labels",
inputPVC: &corev1api.PersistentVolumeClaim{
ObjectMeta: metav1.ObjectMeta{
Name: "test-pvc",
},
Spec: corev1api.PersistentVolumeClaimSpec{
Selector: &metav1.LabelSelector{
MatchLabels: map[string]string{
"velero.io/restore-name": "old-restore",
"velero.io/backup-name": "old-backup",
"app": "myapp",
},
},
},
},
backup: &v1.Backup{ObjectMeta: metav1.ObjectMeta{Name: "current-backup"}},
expectedLabels: nil,
expectedSelector: &metav1.LabelSelector{
MatchLabels: map[string]string{
"app": "myapp",
},
},
},
{
name: "handles PVC with no labels",
inputPVC: &corev1api.PersistentVolumeClaim{
ObjectMeta: metav1.ObjectMeta{
Name: "test-pvc",
},
},
backup: &v1.Backup{ObjectMeta: metav1.ObjectMeta{Name: "current-backup"}},
expectedLabels: nil,
},
{
name: "handles PVC with no selector",
inputPVC: &corev1api.PersistentVolumeClaim{
ObjectMeta: metav1.ObjectMeta{
Name: "test-pvc",
Labels: map[string]string{
"app": "myapp",
},
},
},
backup: &v1.Backup{ObjectMeta: metav1.ObjectMeta{Name: "current-backup"}},
expectedLabels: map[string]string{
"app": "myapp",
},
expectedSelector: nil,
},
{
name: "removes multiple stale velero labels",
inputPVC: &corev1api.PersistentVolumeClaim{
ObjectMeta: metav1.ObjectMeta{
Name: "test-pvc",
Labels: map[string]string{
"velero.io/restore-name": "old-restore",
"velero.io/backup-name": "old-backup",
"velero.io/volume-snapshot-name": "old-snapshot",
"app": "myapp",
"env": "prod",
},
},
Spec: corev1api.PersistentVolumeClaimSpec{
Selector: &metav1.LabelSelector{
MatchLabels: map[string]string{
"velero.io/restore-name": "old-restore",
"app": "myapp",
},
},
},
},
backup: &v1.Backup{ObjectMeta: metav1.ObjectMeta{Name: "current-backup"}},
expectedLabels: map[string]string{
"app": "myapp",
"env": "prod",
},
expectedSelector: &metav1.LabelSelector{
MatchLabels: map[string]string{
"app": "myapp",
},
},
},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
action := NewPVCAction(velerotest.NewLogger())
// Create a copy of the input PVC to avoid modifying the test case
pvcCopy := tc.inputPVC.DeepCopy()
action.cleanupStaleVeleroLabels(pvcCopy, tc.backup)
assert.Equal(t, tc.expectedLabels, pvcCopy.Labels, "Labels should match expected values")
assert.Equal(t, tc.expectedSelector, pvcCopy.Spec.Selector, "Selector should match expected values")
})
}
}
+93 -4
View File
@@ -44,6 +44,7 @@ import (
"k8s.io/apimachinery/pkg/api/resource"
internalvolumehelper "github.com/vmware-tanzu/velero/internal/volumehelper"
velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
velerov2alpha1 "github.com/vmware-tanzu/velero/pkg/apis/velero/v2alpha1"
veleroclient "github.com/vmware-tanzu/velero/pkg/client"
@@ -57,6 +58,7 @@ import (
"github.com/vmware-tanzu/velero/pkg/util/boolptr"
"github.com/vmware-tanzu/velero/pkg/util/csi"
kubeutil "github.com/vmware-tanzu/velero/pkg/util/kube"
podvolumeutil "github.com/vmware-tanzu/velero/pkg/util/podvolume"
)
// TODO: Replace hardcoded VolumeSnapshot finalizer strings with constants from
@@ -72,6 +74,14 @@ const (
type pvcBackupItemAction struct {
log logrus.FieldLogger
crClient crclient.Client
// pvcPodCache provides lazy per-namespace caching of PVC-to-Pod mappings.
// Since plugin instances are unique per backup (created via newPluginManager and
// cleaned up via CleanupClients at backup completion), we can safely cache this
// without mutex or backup UID tracking.
// This avoids the O(N*M) performance issue when there are many PVCs and pods.
// See issue #9179 and PR #9226 for details.
pvcPodCache *podvolumeutil.PVCPodCache
}
// AppliesTo returns information indicating that the PVCBackupItemAction
@@ -97,6 +107,59 @@ func (p *pvcBackupItemAction) validateBackup(backup velerov1api.Backup) (valid b
return true
}
// ensurePVCPodCacheForNamespace ensures the PVC-to-Pod cache is built for the given namespace.
// This uses lazy per-namespace caching following the pattern from PR #9226.
// Since plugin instances are unique per backup, we can safely cache without mutex or backup UID tracking.
func (p *pvcBackupItemAction) ensurePVCPodCacheForNamespace(ctx context.Context, namespace string) error {
// Initialize cache if needed
if p.pvcPodCache == nil {
p.pvcPodCache = podvolumeutil.NewPVCPodCache()
}
// Build cache for namespace if not already done
if !p.pvcPodCache.IsNamespaceBuilt(namespace) {
p.log.Debugf("Building PVC-to-Pod cache for namespace %s", namespace)
if err := p.pvcPodCache.BuildCacheForNamespace(ctx, namespace, p.crClient); err != nil {
return errors.Wrapf(err, "failed to build PVC-to-Pod cache for namespace %s", namespace)
}
}
return nil
}
// getVolumeHelperWithCache creates a VolumeHelper using the pre-built PVC-to-Pod cache.
// The cache should be ensured for the relevant namespace(s) before calling this.
func (p *pvcBackupItemAction) getVolumeHelperWithCache(backup *velerov1api.Backup) (internalvolumehelper.VolumeHelper, error) {
// Create VolumeHelper with our lazy-built cache
vh, err := internalvolumehelper.NewVolumeHelperImplWithCache(
*backup,
p.crClient,
p.log,
p.pvcPodCache,
)
if err != nil {
return nil, errors.Wrap(err, "failed to create VolumeHelper")
}
return vh, nil
}
// getOrCreateVolumeHelper returns a VolumeHelper with lazy per-namespace caching.
// The VolumeHelper uses the pvcPodCache which is populated lazily as namespaces are encountered.
// Callers should use ensurePVCPodCacheForNamespace before calling methods that need
// PVC-to-Pod lookups for a specific namespace.
// Since plugin instances are unique per backup (created via newPluginManager and
// cleaned up via CleanupClients at backup completion), we can safely cache this.
// See issue #9179 and PR #9226 for details.
func (p *pvcBackupItemAction) getOrCreateVolumeHelper(backup *velerov1api.Backup) (internalvolumehelper.VolumeHelper, error) {
// Initialize the PVC-to-Pod cache if needed
if p.pvcPodCache == nil {
p.pvcPodCache = podvolumeutil.NewPVCPodCache()
}
// Return the VolumeHelper with our lazily-built cache
// The cache will be populated incrementally as namespaces are encountered
return p.getVolumeHelperWithCache(backup)
}
func (p *pvcBackupItemAction) validatePVCandPV(
pvc corev1api.PersistentVolumeClaim,
item runtime.Unstructured,
@@ -248,12 +311,24 @@ func (p *pvcBackupItemAction) Execute(
return item, nil, "", nil, nil
}
shouldSnapshot, err := volumehelper.ShouldPerformSnapshotWithBackup(
// Ensure PVC-to-Pod cache is built for this namespace (lazy per-namespace caching)
if err := p.ensurePVCPodCacheForNamespace(context.TODO(), pvc.Namespace); err != nil {
return nil, nil, "", nil, err
}
// Get or create the cached VolumeHelper for this backup
vh, err := p.getOrCreateVolumeHelper(backup)
if err != nil {
return nil, nil, "", nil, err
}
shouldSnapshot, err := volumehelper.ShouldPerformSnapshotWithVolumeHelper(
item,
kuberesource.PersistentVolumeClaims,
*backup,
p.crClient,
p.log,
vh,
)
if err != nil {
return nil, nil, "", nil, err
@@ -621,8 +696,19 @@ func (p *pvcBackupItemAction) getVolumeSnapshotReference(
return nil, errors.Wrapf(err, "failed to list PVCs in VolumeGroupSnapshot group %q in namespace %q", group, pvc.Namespace)
}
// Ensure PVC-to-Pod cache is built for this namespace (lazy per-namespace caching)
if err := p.ensurePVCPodCacheForNamespace(ctx, pvc.Namespace); err != nil {
return nil, errors.Wrapf(err, "failed to build PVC-to-Pod cache for namespace %s", pvc.Namespace)
}
// Get the cached VolumeHelper for filtering PVCs by volume policy
vh, err := p.getOrCreateVolumeHelper(backup)
if err != nil {
return nil, errors.Wrapf(err, "failed to get VolumeHelper for filtering PVCs in group %q", group)
}
// Filter PVCs by volume policy
filteredPVCs, err := p.filterPVCsByVolumePolicy(groupedPVCs, backup)
filteredPVCs, err := p.filterPVCsByVolumePolicy(groupedPVCs, backup, vh)
if err != nil {
return nil, errors.Wrapf(err, "failed to filter PVCs by volume policy for VolumeGroupSnapshot group %q", group)
}
@@ -759,11 +845,12 @@ func (p *pvcBackupItemAction) listGroupedPVCs(ctx context.Context, namespace, la
func (p *pvcBackupItemAction) filterPVCsByVolumePolicy(
pvcs []corev1api.PersistentVolumeClaim,
backup *velerov1api.Backup,
vh internalvolumehelper.VolumeHelper,
) ([]corev1api.PersistentVolumeClaim, error) {
var filteredPVCs []corev1api.PersistentVolumeClaim
for _, pvc := range pvcs {
// Convert PVC to unstructured for ShouldPerformSnapshotWithBackup
// Convert PVC to unstructured for ShouldPerformSnapshotWithVolumeHelper
pvcMap, err := runtime.DefaultUnstructuredConverter.ToUnstructured(&pvc)
if err != nil {
return nil, errors.Wrapf(err, "failed to convert PVC %s/%s to unstructured", pvc.Namespace, pvc.Name)
@@ -771,12 +858,14 @@ func (p *pvcBackupItemAction) filterPVCsByVolumePolicy(
unstructuredPVC := &unstructured.Unstructured{Object: pvcMap}
// Check if this PVC should be snapshotted according to volume policies
shouldSnapshot, err := volumehelper.ShouldPerformSnapshotWithBackup(
// Uses the cached VolumeHelper for better performance with many PVCs/pods
shouldSnapshot, err := volumehelper.ShouldPerformSnapshotWithVolumeHelper(
unstructuredPVC,
kuberesource.PersistentVolumeClaims,
*backup,
p.crClient,
p.log,
vh,
)
if err != nil {
return nil, errors.Wrapf(err, "failed to check volume policy for PVC %s/%s", pvc.Namespace, pvc.Name)
+147 -1
View File
@@ -842,7 +842,9 @@ volumePolicies:
crClient: client,
}
result, err := action.filterPVCsByVolumePolicy(tt.pvcs, backup)
// Pass nil for VolumeHelper in tests - it will fall back to creating a new one per call
// This is the expected behavior for testing and third-party plugins
result, err := action.filterPVCsByVolumePolicy(tt.pvcs, backup, nil)
if tt.expectError {
require.Error(t, err)
} else {
@@ -860,6 +862,111 @@ volumePolicies:
}
}
// TestFilterPVCsByVolumePolicyWithVolumeHelper tests filterPVCsByVolumePolicy when a
// pre-created VolumeHelper is passed (non-nil). This exercises the cached path used
// by the CSI PVC BIA plugin for better performance.
func TestFilterPVCsByVolumePolicyWithVolumeHelper(t *testing.T) {
// Create test PVCs and PVs
pvcs := []corev1api.PersistentVolumeClaim{
{
ObjectMeta: metav1.ObjectMeta{Name: "pvc-csi", Namespace: "ns-1"},
Spec: corev1api.PersistentVolumeClaimSpec{
VolumeName: "pv-csi",
StorageClassName: pointer.String("sc-csi"),
},
Status: corev1api.PersistentVolumeClaimStatus{Phase: corev1api.ClaimBound},
},
{
ObjectMeta: metav1.ObjectMeta{Name: "pvc-nfs", Namespace: "ns-1"},
Spec: corev1api.PersistentVolumeClaimSpec{
VolumeName: "pv-nfs",
StorageClassName: pointer.String("sc-nfs"),
},
Status: corev1api.PersistentVolumeClaimStatus{Phase: corev1api.ClaimBound},
},
}
pvs := []corev1api.PersistentVolume{
{
ObjectMeta: metav1.ObjectMeta{Name: "pv-csi"},
Spec: corev1api.PersistentVolumeSpec{
PersistentVolumeSource: corev1api.PersistentVolumeSource{
CSI: &corev1api.CSIPersistentVolumeSource{Driver: "csi-driver"},
},
},
},
{
ObjectMeta: metav1.ObjectMeta{Name: "pv-nfs"},
Spec: corev1api.PersistentVolumeSpec{
PersistentVolumeSource: corev1api.PersistentVolumeSource{
NFS: &corev1api.NFSVolumeSource{
Server: "nfs-server",
Path: "/export",
},
},
},
},
}
// Create fake client with PVs
objs := []runtime.Object{}
for i := range pvs {
objs = append(objs, &pvs[i])
}
client := velerotest.NewFakeControllerRuntimeClient(t, objs...)
// Create backup with volume policy that skips NFS volumes
volumePolicyStr := `
version: v1
volumePolicies:
- conditions:
nfs: {}
action:
type: skip
`
cm := &corev1api.ConfigMap{
ObjectMeta: metav1.ObjectMeta{
Name: "volume-policy",
Namespace: "velero",
},
Data: map[string]string{
"volume-policy": volumePolicyStr,
},
}
require.NoError(t, client.Create(t.Context(), cm))
backup := &velerov1api.Backup{
ObjectMeta: metav1.ObjectMeta{
Name: "test-backup",
Namespace: "velero",
},
Spec: velerov1api.BackupSpec{
ResourcePolicy: &corev1api.TypedLocalObjectReference{
Kind: "ConfigMap",
Name: "volume-policy",
},
},
}
action := &pvcBackupItemAction{
log: velerotest.NewLogger(),
crClient: client,
}
// Create a VolumeHelper using the same method the plugin would use
vh, err := action.getOrCreateVolumeHelper(backup)
require.NoError(t, err)
require.NotNil(t, vh)
// Test with the pre-created VolumeHelper (non-nil path)
result, err := action.filterPVCsByVolumePolicy(pvcs, backup, vh)
require.NoError(t, err)
// Should filter out the NFS PVC, leaving only the CSI PVC
require.Len(t, result, 1)
require.Equal(t, "pvc-csi", result[0].Name)
}
func TestDetermineCSIDriver(t *testing.T) {
tests := []struct {
name string
@@ -1959,3 +2066,42 @@ func TestPVCRequestSize(t *testing.T) {
})
}
}
// TestGetOrCreateVolumeHelper tests the VolumeHelper and PVC-to-Pod cache behavior.
// Since plugin instances are unique per backup (created via newPluginManager and
// cleaned up via CleanupClients at backup completion), we verify that the pvcPodCache
// is properly initialized and reused across calls.
func TestGetOrCreateVolumeHelper(t *testing.T) {
client := velerotest.NewFakeControllerRuntimeClient(t)
action := &pvcBackupItemAction{
log: velerotest.NewLogger(),
crClient: client,
}
backup := &velerov1api.Backup{
ObjectMeta: metav1.ObjectMeta{
Name: "test-backup",
Namespace: "velero",
UID: types.UID("test-uid-1"),
},
}
// Initially, pvcPodCache should be nil
require.Nil(t, action.pvcPodCache, "pvcPodCache should be nil initially")
// Get VolumeHelper first time - should create new cache and VolumeHelper
vh1, err := action.getOrCreateVolumeHelper(backup)
require.NoError(t, err)
require.NotNil(t, vh1)
// pvcPodCache should now be initialized
require.NotNil(t, action.pvcPodCache, "pvcPodCache should be initialized after first call")
cache1 := action.pvcPodCache
// Get VolumeHelper second time - should reuse the same cache
vh2, err := action.getOrCreateVolumeHelper(backup)
require.NoError(t, err)
require.NotNil(t, vh2)
// The pvcPodCache should be the same instance
require.Same(t, cache1, action.pvcPodCache, "Expected same pvcPodCache instance on repeated calls")
}
+18 -11
View File
@@ -84,17 +84,6 @@ func (p *volumeSnapshotBackupItemAction) Execute(
return nil, nil, "", nil, errors.WithStack(err)
}
additionalItems := make([]velero.ResourceIdentifier, 0)
if vs.Spec.VolumeSnapshotClassName != nil {
additionalItems = append(
additionalItems,
velero.ResourceIdentifier{
GroupResource: kuberesource.VolumeSnapshotClasses,
Name: *vs.Spec.VolumeSnapshotClassName,
},
)
}
if backup.Status.Phase == velerov1api.BackupPhaseFinalizing ||
backup.Status.Phase == velerov1api.BackupPhaseFinalizingPartiallyFailed {
p.log.
@@ -105,6 +94,24 @@ func (p *volumeSnapshotBackupItemAction) Execute(
return item, nil, "", nil, nil
}
additionalItems := make([]velero.ResourceIdentifier, 0)
if vs.Spec.VolumeSnapshotClassName != nil {
// This is still needed to add the VolumeSnapshotClass to the backup.
// The secret with VolumeSnapshotClass is still relevant to backup.
additionalItems = append(
additionalItems,
velero.ResourceIdentifier{
GroupResource: kuberesource.VolumeSnapshotClasses,
Name: *vs.Spec.VolumeSnapshotClassName,
},
)
// Because async operation will update VolumeSnapshot during finalizing phase.
// No matter what we do, VolumeSnapshotClass cannot be deleted. So skip it.
// Just deleting VolumeSnapshotClass during restore and delete is enough.
}
p.log.Infof("Getting VolumesnapshotContent for Volumesnapshot %s/%s",
vs.Namespace, vs.Name)
@@ -97,6 +97,10 @@ func (p *volumeSnapshotContentBackupItemAction) Execute(
})
}
// Because async operation will update VolumeSnapshotContent during finalizing phase.
// No matter what we do, VolumeSnapshotClass cannot be deleted. So skip it.
// Just deleting VolumeSnapshotClass during restore and delete is enough.
snapContMap, err := runtime.DefaultUnstructuredConverter.ToUnstructured(&snapCont)
if err != nil {
return nil, nil, "", nil, errors.WithStack(err)
@@ -42,7 +42,7 @@ func TestVSCExecute(t *testing.T) {
expectedItems []velero.ResourceIdentifier
}{
{
name: "Invalid VolumeSnapshotClass",
name: "Invalid VolumeSnapshotContent",
item: velerotest.UnstructuredOrDie(
`
{
+92 -49
View File
@@ -117,7 +117,6 @@ type kubernetesBackupper struct {
podCommandExecutor podexec.PodCommandExecutor
podVolumeBackupperFactory podvolume.BackupperFactory
podVolumeTimeout time.Duration
podVolumeContext context.Context
defaultVolumesToFsBackup bool
clientPageSize int
uploaderType string
@@ -168,10 +167,39 @@ func NewKubernetesBackupper(
}, nil
}
// getNamespaceIncludesExcludes returns an IncludesExcludes list containing which namespaces to
// include and exclude from the backup.
func getNamespaceIncludesExcludes(backup *velerov1api.Backup) *collections.IncludesExcludes {
return collections.NewIncludesExcludes().Includes(backup.Spec.IncludedNamespaces...).Excludes(backup.Spec.ExcludedNamespaces...)
// getNamespaceIncludesExcludesAndArgoCDNamespaces returns an IncludesExcludes list containing which namespaces to
// include and exclude from the backup and a list of namespaces managed by ArgoCD.
func getNamespaceIncludesExcludesAndArgoCDNamespaces(backup *velerov1api.Backup, kbClient kbclient.Client) (*collections.NamespaceIncludesExcludes, []string, error) {
nsList := corev1api.NamespaceList{}
activeNamespaces := []string{}
nsManagedByArgoCD := []string{}
if err := kbClient.List(context.Background(), &nsList); err != nil {
return nil, nsManagedByArgoCD, err
}
for _, ns := range nsList.Items {
activeNamespaces = append(activeNamespaces, ns.Name)
}
// Set ActiveNamespaces first, then set includes/excludes
includesExcludes := collections.NewNamespaceIncludesExcludes().
ActiveNamespaces(activeNamespaces).
Includes(backup.Spec.IncludedNamespaces...).
Excludes(backup.Spec.ExcludedNamespaces...)
// Expand wildcards if needed
if err := includesExcludes.ExpandIncludesExcludes(); err != nil {
return nil, []string{}, err
}
// Check for ArgoCD managed namespaces in the namespaces that will be included
for _, ns := range nsList.Items {
nsLabels := ns.GetLabels()
if len(nsLabels[ArgoCDManagedByNamespaceLabel]) > 0 && includesExcludes.ShouldInclude(ns.Name) {
nsManagedByArgoCD = append(nsManagedByArgoCD, ns.Name)
}
}
return includesExcludes, nsManagedByArgoCD, nil
}
func getResourceHooks(hookSpecs []velerov1api.BackupResourceHookSpec, discoveryHelper discovery.Helper) ([]hook.ResourceHook, error) {
@@ -245,8 +273,35 @@ func (kb *kubernetesBackupper) BackupWithResolvers(
if err := kb.writeBackupVersion(tw); err != nil {
return errors.WithStack(err)
}
var err error
var nsManagedByArgoCD []string
backupRequest.NamespaceIncludesExcludes, nsManagedByArgoCD, err = getNamespaceIncludesExcludesAndArgoCDNamespaces(backupRequest.Backup, kb.kbClient)
if err != nil {
log.WithError(err).Errorf("error getting namespace includes/excludes")
return err
}
if backupRequest.NamespaceIncludesExcludes.IsWildcardExpanded() {
expandedIncludes := backupRequest.NamespaceIncludesExcludes.GetIncludes()
expandedExcludes := backupRequest.NamespaceIncludesExcludes.GetExcludes()
// Get the final namespace list after wildcard expansion
wildcardResult, err := backupRequest.NamespaceIncludesExcludes.ResolveNamespaceList()
if err != nil {
log.WithError(err).Errorf("error resolving namespace list")
return err
}
log.WithFields(logrus.Fields{
"expandedIncludes": expandedIncludes,
"expandedExcludes": expandedExcludes,
"wildcardResult": wildcardResult,
"includedCount": len(expandedIncludes),
"excludedCount": len(expandedExcludes),
"resultCount": len(wildcardResult),
}).Info("Successfully expanded wildcard patterns")
}
backupRequest.NamespaceIncludesExcludes = getNamespaceIncludesExcludes(backupRequest.Backup)
log.Infof("Including namespaces: %s", backupRequest.NamespaceIncludesExcludes.IncludesString())
log.Infof("Excluding namespaces: %s", backupRequest.NamespaceIncludesExcludes.ExcludesString())
@@ -254,12 +309,8 @@ func (kb *kubernetesBackupper) BackupWithResolvers(
// We will check for the existence of a ArgoCD label in the includedNamespaces and add a warning
// so that users are at least aware about the existence of argoCD managed ns in their backup
// Related Issue: https://github.com/vmware-tanzu/velero/issues/7905
if len(backupRequest.Spec.IncludedNamespaces) > 0 {
nsManagedByArgoCD := getNamespacesManagedByArgoCD(kb.kbClient, backupRequest.Spec.IncludedNamespaces, log)
if len(nsManagedByArgoCD) > 0 {
log.Warnf("backup operation may encounter complications and potentially produce undesirable results due to the inclusion of namespaces %v managed by ArgoCD in the backup.", nsManagedByArgoCD)
}
if len(nsManagedByArgoCD) > 0 {
log.Warnf("backup operation may encounter complications and potentially produce undesirable results due to the inclusion of namespaces %v managed by ArgoCD in the backup.", nsManagedByArgoCD)
}
if collections.UseOldResourceFilters(backupRequest.Spec) {
@@ -284,7 +335,6 @@ func (kb *kubernetesBackupper) BackupWithResolvers(
log.Infof("Backing up all volumes using pod volume backup: %t", boolptr.IsSetToTrue(backupRequest.Backup.Spec.DefaultVolumesToFsBackup))
var err error
backupRequest.ResourceHooks, err = getResourceHooks(backupRequest.Spec.Hooks.Resources, kb.discoveryHelper)
if err != nil {
log.WithError(errors.WithStack(err)).Debugf("Error from getResourceHooks")
@@ -314,12 +364,12 @@ func (kb *kubernetesBackupper) BackupWithResolvers(
}
var podVolumeCancelFunc context.CancelFunc
kb.podVolumeContext, podVolumeCancelFunc = context.WithTimeout(context.Background(), podVolumeTimeout)
podVolumeContext, podVolumeCancelFunc := context.WithTimeout(context.Background(), podVolumeTimeout)
defer podVolumeCancelFunc()
var podVolumeBackupper podvolume.Backupper
if kb.podVolumeBackupperFactory != nil {
podVolumeBackupper, err = kb.podVolumeBackupperFactory.NewBackupper(kb.podVolumeContext, log, backupRequest.Backup, kb.uploaderType)
podVolumeBackupper, err = kb.podVolumeBackupperFactory.NewBackupper(podVolumeContext, log, backupRequest.Backup, kb.uploaderType)
if err != nil {
log.WithError(errors.WithStack(err)).Debugf("Error from NewBackupper")
return errors.WithStack(err)
@@ -358,6 +408,28 @@ func (kb *kubernetesBackupper) BackupWithResolvers(
}
backupRequest.Status.Progress = &velerov1api.BackupProgress{TotalItems: len(items)}
// Resolve namespaces for PVC-to-Pod cache building in volumehelper.
// See issue #9179 for details.
namespaces, err := backupRequest.NamespaceIncludesExcludes.ResolveNamespaceList()
if err != nil {
log.WithError(err).Error("Failed to resolve namespace list for PVC-to-Pod cache")
return err
}
volumeHelperImpl, err := volumehelper.NewVolumeHelperImplWithNamespaces(
backupRequest.ResPolicies,
backupRequest.Spec.SnapshotVolumes,
log,
kb.kbClient,
boolptr.IsSetToTrue(backupRequest.Spec.DefaultVolumesToFsBackup),
!backupRequest.ResourceIncludesExcludes.ShouldInclude(kuberesource.PersistentVolumeClaims.String()),
namespaces,
)
if err != nil {
log.WithError(err).Error("Failed to build PVC-to-Pod cache for volume policy lookups")
return err
}
itemBackupper := &itemBackupper{
backupRequest: backupRequest,
tarWriter: tw,
@@ -365,20 +437,14 @@ func (kb *kubernetesBackupper) BackupWithResolvers(
kbClient: kb.kbClient,
discoveryHelper: kb.discoveryHelper,
podVolumeBackupper: podVolumeBackupper,
podVolumeContext: podVolumeContext,
podVolumeSnapshotTracker: podvolume.NewTracker(),
volumeSnapshotterCache: NewVolumeSnapshotterCache(volumeSnapshotterGetter),
itemHookHandler: &hook.DefaultItemHookHandler{
PodCommandExecutor: kb.podCommandExecutor,
},
hookTracker: hook.NewHookTracker(),
volumeHelperImpl: volumehelper.NewVolumeHelperImpl(
backupRequest.ResPolicies,
backupRequest.Spec.SnapshotVolumes,
log,
kb.kbClient,
boolptr.IsSetToTrue(backupRequest.Spec.DefaultVolumesToFsBackup),
!backupRequest.ResourceIncludesExcludes.ShouldInclude(kuberesource.PersistentVolumeClaims.String()),
),
hookTracker: hook.NewHookTracker(),
volumeHelperImpl: volumeHelperImpl,
kubernetesBackupper: kb,
}
@@ -546,7 +612,7 @@ func (kb *kubernetesBackupper) BackupWithResolvers(
log.Infof("Backing Up Item Block including %s %s/%s (%v items in block)", items[i].groupResource.String(), items[i].namespace, items[i].name, len(itemBlock.Items))
wg.Add(1)
backupRequest.ItemBlockChannel <- ItemBlockInput{
backupRequest.WorkerPool.GetInputChannel() <- ItemBlockInput{
itemBlock: itemBlock,
returnChan: itemBlockReturn,
}
@@ -797,7 +863,7 @@ func (kb *kubernetesBackupper) handleItemBlockPostHooks(itemBlock *BackupItemBlo
log := itemBlock.Log
// the post hooks will not execute until all PVBs of the item block pods are processed
if err := kb.waitUntilPVBsProcessed(kb.podVolumeContext, log, itemBlock, hookPods); err != nil {
if err := kb.waitUntilPVBsProcessed(itemBlock.itemBackupper.podVolumeContext, log, itemBlock, hookPods); err != nil {
log.WithError(err).Error("failed to wait PVBs processed for the ItemBlock")
return
}
@@ -1256,26 +1322,3 @@ func putVolumeInfos(
return backupStore.PutBackupVolumeInfos(backupName, backupVolumeInfoBuf)
}
func getNamespacesManagedByArgoCD(kbClient kbclient.Client, includedNamespaces []string, log logrus.FieldLogger) []string {
var nsManagedByArgoCD []string
for _, nsName := range includedNamespaces {
ns := corev1api.Namespace{}
if err := kbClient.Get(context.Background(), kbclient.ObjectKey{Name: nsName}, &ns); err != nil {
// check for only those ns that exist and are included in backup
// here we ignore cases like "" or "*" specified under includedNamespaces
if apierrors.IsNotFound(err) {
continue
}
log.WithError(err).Errorf("error getting namespace %s", nsName)
continue
}
nsLabels := ns.GetLabels()
if len(nsLabels[ArgoCDManagedByNamespaceLabel]) > 0 {
nsManagedByArgoCD = append(nsManagedByArgoCD, nsName)
}
}
return nsManagedByArgoCD
}
+36 -36
View File
@@ -79,7 +79,7 @@ func TestBackedUpItemsMatchesTarballContents(t *testing.T) {
Backup: defaultBackup().Result(),
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: h.itemBlockPool.GetInputChannel(),
WorkerPool: &h.itemBlockPool,
}
backupFile := bytes.NewBuffer([]byte{})
@@ -141,7 +141,7 @@ func TestBackupProgressIsUpdated(t *testing.T) {
Backup: defaultBackup().Result(),
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: h.itemBlockPool.GetInputChannel(),
WorkerPool: &h.itemBlockPool,
}
backupFile := bytes.NewBuffer([]byte{})
@@ -881,7 +881,7 @@ func TestBackupOldResourceFiltering(t *testing.T) {
Backup: tc.backup,
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
}
backupFile = bytes.NewBuffer([]byte{})
)
@@ -1062,7 +1062,7 @@ func TestCRDInclusion(t *testing.T) {
Backup: tc.backup,
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
}
backupFile = bytes.NewBuffer([]byte{})
)
@@ -1161,7 +1161,7 @@ func TestBackupResourceCohabitation(t *testing.T) {
Backup: tc.backup,
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
}
backupFile = bytes.NewBuffer([]byte{})
)
@@ -1190,7 +1190,7 @@ func TestBackupUsesNewCohabitatingResourcesForEachBackup(t *testing.T) {
Backup: defaultBackup().Result(),
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: h.itemBlockPool.GetInputChannel(),
WorkerPool: &h.itemBlockPool,
}
backup1File := bytes.NewBuffer([]byte{})
@@ -1206,7 +1206,7 @@ func TestBackupUsesNewCohabitatingResourcesForEachBackup(t *testing.T) {
Backup: defaultBackup().Result(),
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: h.itemBlockPool.GetInputChannel(),
WorkerPool: &h.itemBlockPool,
}
backup2File := bytes.NewBuffer([]byte{})
@@ -1260,7 +1260,7 @@ func TestBackupResourceOrdering(t *testing.T) {
Backup: tc.backup,
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
}
backupFile = bytes.NewBuffer([]byte{})
)
@@ -1381,7 +1381,7 @@ func TestBackupItemActionsForSkippedPV(t *testing.T) {
Backup: defaultBackup().SnapshotVolumes(false).Result(),
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
},
resPolicies: &resourcepolicies.ResourcePolicies{
Version: "v1",
@@ -1428,8 +1428,8 @@ func TestBackupItemActionsForSkippedPV(t *testing.T) {
},
includedPVs: map[string]struct{}{},
},
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
BackedUpItems: NewBackedUpItemsMap(),
WorkerPool: itemBlockPool,
},
apiResources: []*test.APIResource{
test.PVCs(
@@ -1679,7 +1679,7 @@ func TestBackupActionsRunForCorrectItems(t *testing.T) {
Backup: tc.backup,
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
}
backupFile = bytes.NewBuffer([]byte{})
)
@@ -1764,7 +1764,7 @@ func TestBackupWithInvalidActions(t *testing.T) {
Backup: tc.backup,
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
}
backupFile = bytes.NewBuffer([]byte{})
)
@@ -1918,7 +1918,7 @@ func TestBackupActionModifications(t *testing.T) {
Backup: tc.backup,
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
}
backupFile = bytes.NewBuffer([]byte{})
)
@@ -2178,7 +2178,7 @@ func TestBackupActionAdditionalItems(t *testing.T) {
Backup: tc.backup,
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
}
backupFile = bytes.NewBuffer([]byte{})
)
@@ -2439,7 +2439,7 @@ func TestItemBlockActionsRunForCorrectItems(t *testing.T) {
Backup: tc.backup,
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
}
backupFile = bytes.NewBuffer([]byte{})
)
@@ -2524,7 +2524,7 @@ func TestBackupWithInvalidItemBlockActions(t *testing.T) {
Backup: tc.backup,
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
}
backupFile = bytes.NewBuffer([]byte{})
)
@@ -2780,7 +2780,7 @@ func TestItemBlockActionRelatedItems(t *testing.T) {
Backup: tc.backup,
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
}
backupFile = bytes.NewBuffer([]byte{})
)
@@ -2948,7 +2948,7 @@ func TestBackupWithSnapshots(t *testing.T) {
},
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
},
apiResources: []*test.APIResource{
test.PVs(
@@ -2984,7 +2984,7 @@ func TestBackupWithSnapshots(t *testing.T) {
},
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
},
apiResources: []*test.APIResource{
test.PVs(
@@ -3021,7 +3021,7 @@ func TestBackupWithSnapshots(t *testing.T) {
},
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
},
apiResources: []*test.APIResource{
test.PVs(
@@ -3058,7 +3058,7 @@ func TestBackupWithSnapshots(t *testing.T) {
},
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
},
apiResources: []*test.APIResource{
test.PVs(
@@ -3095,7 +3095,7 @@ func TestBackupWithSnapshots(t *testing.T) {
},
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
},
apiResources: []*test.APIResource{
test.PVs(
@@ -3130,7 +3130,7 @@ func TestBackupWithSnapshots(t *testing.T) {
},
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
},
apiResources: []*test.APIResource{
test.PVs(
@@ -3148,7 +3148,7 @@ func TestBackupWithSnapshots(t *testing.T) {
Backup: defaultBackup().Result(),
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
},
apiResources: []*test.APIResource{
test.PVs(
@@ -3169,7 +3169,7 @@ func TestBackupWithSnapshots(t *testing.T) {
},
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
},
apiResources: []*test.APIResource{
test.PVs(
@@ -3188,7 +3188,7 @@ func TestBackupWithSnapshots(t *testing.T) {
},
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
},
apiResources: []*test.APIResource{
test.PVs(
@@ -3210,7 +3210,7 @@ func TestBackupWithSnapshots(t *testing.T) {
},
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
},
apiResources: []*test.APIResource{
test.PVs(
@@ -3344,7 +3344,7 @@ func TestBackupWithAsyncOperations(t *testing.T) {
Backup: defaultBackup().Result(),
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
},
apiResources: []*test.APIResource{
test.Pods(
@@ -3376,7 +3376,7 @@ func TestBackupWithAsyncOperations(t *testing.T) {
Backup: defaultBackup().Result(),
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
},
apiResources: []*test.APIResource{
test.Pods(
@@ -3408,7 +3408,7 @@ func TestBackupWithAsyncOperations(t *testing.T) {
Backup: defaultBackup().Result(),
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
},
apiResources: []*test.APIResource{
test.Pods(
@@ -3494,7 +3494,7 @@ func TestBackupWithInvalidHooks(t *testing.T) {
Backup: tc.backup,
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
}
backupFile = bytes.NewBuffer([]byte{})
)
@@ -3968,7 +3968,7 @@ func TestBackupWithHooks(t *testing.T) {
Backup: tc.backup,
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
}
backupFile = bytes.NewBuffer([]byte{})
podCommandExecutor = new(test.MockPodCommandExecutor)
@@ -4193,7 +4193,7 @@ func TestBackupWithPodVolume(t *testing.T) {
SnapshotLocations: []*velerov1.VolumeSnapshotLocation{tc.vsl},
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
}
backupFile = bytes.NewBuffer([]byte{})
)
@@ -5312,7 +5312,7 @@ func TestBackupNewResourceFiltering(t *testing.T) {
Backup: tc.backup,
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
}
backupFile = bytes.NewBuffer([]byte{})
)
@@ -5477,7 +5477,7 @@ func TestBackupNamespaces(t *testing.T) {
Backup: tc.backup,
SkippedPVTracker: NewSkipPVTracker(),
BackedUpItems: NewBackedUpItemsMap(),
ItemBlockChannel: itemBlockPool.GetInputChannel(),
WorkerPool: itemBlockPool,
}
backupFile = bytes.NewBuffer([]byte{})
)
+1
View File
@@ -69,6 +69,7 @@ type itemBackupper struct {
kbClient kbClient.Client
discoveryHelper discovery.Helper
podVolumeBackupper podvolume.Backupper
podVolumeContext context.Context
podVolumeSnapshotTracker *podvolume.Tracker
kubernetesBackupper *kubernetesBackupper
volumeSnapshotterCache *VolumeSnapshotterCache
+19 -11
View File
@@ -71,7 +71,7 @@ type itemCollector struct {
type nsTracker struct {
singleLabelSelector labels.Selector
orLabelSelector []labels.Selector
namespaceFilter *collections.IncludesExcludes
namespaceFilter *collections.NamespaceIncludesExcludes
logger logrus.FieldLogger
namespaceMap map[string]bool
@@ -103,7 +103,7 @@ func (nt *nsTracker) init(
unstructuredNSs []unstructured.Unstructured,
singleLabelSelector labels.Selector,
orLabelSelector []labels.Selector,
namespaceFilter *collections.IncludesExcludes,
namespaceFilter *collections.NamespaceIncludesExcludes,
logger logrus.FieldLogger,
) {
if nt.namespaceMap == nil {
@@ -635,7 +635,7 @@ func coreGroupResourcePriority(resource string) int {
// getNamespacesToList examines ie and resolves the includes and excludes to a full list of
// namespaces to list. If ie is nil or it includes *, the result is just "" (list across all
// namespaces). Otherwise, the result is a list of every included namespace minus all excluded ones.
func getNamespacesToList(ie *collections.IncludesExcludes) []string {
func getNamespacesToList(ie *collections.NamespaceIncludesExcludes) []string {
if ie == nil {
return []string{""}
}
@@ -753,21 +753,28 @@ func (r *itemCollector) collectNamespaces(
}
unstructuredList, err := resourceClient.List(metav1.ListOptions{})
activeNamespacesHashSet := make(map[string]bool)
for _, namespace := range unstructuredList.Items {
activeNamespacesHashSet[namespace.GetName()] = true
}
if err != nil {
log.WithError(errors.WithStack(err)).Error("error list namespaces")
return nil, errors.WithStack(err)
}
for _, includedNSName := range r.backupRequest.Backup.Spec.IncludedNamespaces {
// Change to look at the struct includes/excludes
// In case wildcards are expanded, we need to look at the struct includes/excludes
for _, includedNSName := range r.backupRequest.NamespaceIncludesExcludes.GetIncludes() {
nsExists := false
// Skip checking the namespace existing when it's "*".
if includedNSName == "*" {
continue
}
for _, unstructuredNS := range unstructuredList.Items {
if unstructuredNS.GetName() == includedNSName {
nsExists = true
}
if _, ok := activeNamespacesHashSet[includedNSName]; ok {
nsExists = true
}
if !nsExists {
@@ -809,17 +816,18 @@ func (r *itemCollector) collectNamespaces(
var items []*kubernetesResource
for index := range unstructuredList.Items {
nsName := unstructuredList.Items[index].GetName()
path, err := r.writeToFile(&unstructuredList.Items[index])
if err != nil {
log.WithError(err).Errorf("Error writing item %s to file",
unstructuredList.Items[index].GetName())
log.WithError(err).Errorf("Error writing item %s to file", nsName)
continue
}
items = append(items, &kubernetesResource{
groupResource: gr,
preferredGVR: preferredGVR,
name: unstructuredList.Items[index].GetName(),
name: nsName,
path: path,
kind: resource.Kind,
})
+9 -9
View File
@@ -153,7 +153,7 @@ func TestFilterNamespaces(t *testing.T) {
func TestItemCollectorBackupNamespaces(t *testing.T) {
tests := []struct {
name string
ie *collections.IncludesExcludes
ie *collections.NamespaceIncludesExcludes
namespaces []*corev1api.Namespace
backup *velerov1api.Backup
expectedTrackedNS []string
@@ -162,7 +162,7 @@ func TestItemCollectorBackupNamespaces(t *testing.T) {
{
name: "ns filter by namespace IE filter",
backup: builder.ForBackup("velero", "backup").Result(),
ie: collections.NewIncludesExcludes().Includes("ns1"),
ie: collections.NewNamespaceIncludesExcludes().Includes("ns1"),
namespaces: []*corev1api.Namespace{
builder.ForNamespace("ns1").Phase(corev1api.NamespaceActive).Result(),
builder.ForNamespace("ns2").Phase(corev1api.NamespaceActive).Result(),
@@ -174,7 +174,7 @@ func TestItemCollectorBackupNamespaces(t *testing.T) {
backup: builder.ForBackup("velero", "backup").LabelSelector(&metav1.LabelSelector{
MatchLabels: map[string]string{"name": "ns1"},
}).Result(),
ie: collections.NewIncludesExcludes().Includes("*"),
ie: collections.NewNamespaceIncludesExcludes().Includes("*"),
namespaces: []*corev1api.Namespace{
builder.ForNamespace("ns1").ObjectMeta(builder.WithLabels("name", "ns1")).Phase(corev1api.NamespaceActive).Result(),
builder.ForNamespace("ns2").Phase(corev1api.NamespaceActive).Result(),
@@ -186,7 +186,7 @@ func TestItemCollectorBackupNamespaces(t *testing.T) {
backup: builder.ForBackup("velero", "backup").OrLabelSelector([]*metav1.LabelSelector{
{MatchLabels: map[string]string{"name": "ns1"}},
}).Result(),
ie: collections.NewIncludesExcludes().Includes("*"),
ie: collections.NewNamespaceIncludesExcludes().Includes("*"),
namespaces: []*corev1api.Namespace{
builder.ForNamespace("ns1").ObjectMeta(builder.WithLabels("name", "ns1")).Phase(corev1api.NamespaceActive).Result(),
builder.ForNamespace("ns2").Phase(corev1api.NamespaceActive).Result(),
@@ -198,7 +198,7 @@ func TestItemCollectorBackupNamespaces(t *testing.T) {
backup: builder.ForBackup("velero", "backup").LabelSelector(&metav1.LabelSelector{
MatchLabels: map[string]string{"name": "ns1"},
}).Result(),
ie: collections.NewIncludesExcludes().Excludes("ns1"),
ie: collections.NewNamespaceIncludesExcludes().Excludes("ns1"),
namespaces: []*corev1api.Namespace{
builder.ForNamespace("ns1").ObjectMeta(builder.WithLabels("name", "ns1")).Phase(corev1api.NamespaceActive).Result(),
builder.ForNamespace("ns2").Phase(corev1api.NamespaceActive).Result(),
@@ -210,7 +210,7 @@ func TestItemCollectorBackupNamespaces(t *testing.T) {
backup: builder.ForBackup("velero", "backup").OrLabelSelector([]*metav1.LabelSelector{
{MatchLabels: map[string]string{"name": "ns1"}},
}).Result(),
ie: collections.NewIncludesExcludes().Excludes("ns1", "ns2"),
ie: collections.NewNamespaceIncludesExcludes().Excludes("ns1", "ns2"),
namespaces: []*corev1api.Namespace{
builder.ForNamespace("ns1").ObjectMeta(builder.WithLabels("name", "ns1")).Phase(corev1api.NamespaceActive).Result(),
builder.ForNamespace("ns2").Phase(corev1api.NamespaceActive).Result(),
@@ -221,7 +221,7 @@ func TestItemCollectorBackupNamespaces(t *testing.T) {
{
name: "No ns filters",
backup: builder.ForBackup("velero", "backup").Result(),
ie: collections.NewIncludesExcludes().Includes("*"),
ie: collections.NewNamespaceIncludesExcludes().Includes("*"),
namespaces: []*corev1api.Namespace{
builder.ForNamespace("ns1").ObjectMeta(builder.WithLabels("name", "ns1")).Phase(corev1api.NamespaceActive).Result(),
builder.ForNamespace("ns2").Phase(corev1api.NamespaceActive).Result(),
@@ -231,7 +231,7 @@ func TestItemCollectorBackupNamespaces(t *testing.T) {
{
name: "ns specified by the IncludeNamespaces cannot be found",
backup: builder.ForBackup("velero", "backup").IncludedNamespaces("ns1", "invalid", "*").Result(),
ie: collections.NewIncludesExcludes().Includes("ns1", "invalid", "*"),
ie: collections.NewNamespaceIncludesExcludes().Includes("ns1", "invalid", "*"),
namespaces: []*corev1api.Namespace{
builder.ForNamespace("ns1").ObjectMeta(builder.WithLabels("name", "ns1")).Phase(corev1api.NamespaceActive).Result(),
builder.ForNamespace("ns2").Phase(corev1api.NamespaceActive).Result(),
@@ -242,7 +242,7 @@ func TestItemCollectorBackupNamespaces(t *testing.T) {
{
name: "terminating ns should not tracked",
backup: builder.ForBackup("velero", "backup").Result(),
ie: collections.NewIncludesExcludes().Includes("ns1", "ns2"),
ie: collections.NewNamespaceIncludesExcludes().Includes("ns1", "ns2"),
namespaces: []*corev1api.Namespace{
builder.ForNamespace("ns1").Phase(corev1api.NamespaceTerminating).Result(),
builder.ForNamespace("ns2").Phase(corev1api.NamespaceActive).Result(),
+6 -2
View File
@@ -57,7 +57,7 @@ type Request struct {
*velerov1api.Backup
StorageLocation *velerov1api.BackupStorageLocation
SnapshotLocations []*velerov1api.VolumeSnapshotLocation
NamespaceIncludesExcludes *collections.IncludesExcludes
NamespaceIncludesExcludes *collections.NamespaceIncludesExcludes
ResourceIncludesExcludes collections.IncludesExcludesInterface
ResourceHooks []hook.ResourceHook
ResolvedActions []framework.BackupItemResolvedActionV2
@@ -69,7 +69,7 @@ type Request struct {
ResPolicies *resourcepolicies.Policies
SkippedPVTracker *skipPVTracker
VolumesInformation volume.BackupVolumesInformation
ItemBlockChannel chan ItemBlockInput
WorkerPool *ItemBlockWorkerPool
}
// BackupVolumesInformation contains the information needs by generating
@@ -103,3 +103,7 @@ func (r *Request) FillVolumesInformation() {
r.VolumesInformation.BackupOperations = *r.GetItemOperationsList()
r.VolumesInformation.BackupName = r.Backup.Name
}
func (r *Request) StopWorkerPool() {
r.WorkerPool.Stop()
}
+6
View File
@@ -222,6 +222,12 @@ func (b *BackupBuilder) Phase(phase velerov1api.BackupPhase) *BackupBuilder {
return b
}
// Phase sets the Backup's queue position.
func (b *BackupBuilder) QueuePosition(queuePos int) *BackupBuilder {
b.object.Status.QueuePosition = queuePos
return b
}
// StorageLocation sets the Backup's storage location.
func (b *BackupBuilder) StorageLocation(location string) *BackupBuilder {
b.object.Spec.StorageLocation = location
@@ -93,6 +93,15 @@ func (b *BackupStorageLocationBuilder) CACert(val []byte) *BackupStorageLocation
return b
}
// CACertRef sets the BackupStorageLocation's object storage CACertRef (Secret reference).
func (b *BackupStorageLocationBuilder) CACertRef(selector *corev1api.SecretKeySelector) *BackupStorageLocationBuilder {
if b.object.Spec.StorageType.ObjectStorage == nil {
b.object.Spec.StorageType.ObjectStorage = new(velerov1api.ObjectStorageLocation)
}
b.object.Spec.ObjectStorage.CACertRef = selector
return b
}
// Default sets the BackupStorageLocation's is default or not
func (b *BackupStorageLocationBuilder) Default(isDefault bool) *BackupStorageLocationBuilder {
b.object.Spec.Default = isDefault
+9 -4
View File
@@ -22,6 +22,8 @@ import (
corev1api "k8s.io/api/core/v1"
apimachineryRuntime "k8s.io/apimachinery/pkg/runtime"
"github.com/vmware-tanzu/velero/pkg/label"
)
// ContainerBuilder builds Container objects
@@ -45,9 +47,9 @@ func ForPluginContainer(image string, pullPolicy corev1api.PullPolicy) *Containe
return ForContainer(getName(image), image).PullPolicy(pullPolicy).VolumeMounts(volumeMount)
}
// getName returns the 'name' component of a docker
// image that includes the entire string except the registry name, and transforms the combined
// string into a RFC-1123 compatible name.
// getName returns the 'name' component of a docker image that includes the entire string
// except the registry name, and transforms the combined string into a DNS-1123 compatible name
// that fits within the 63-character limit for Kubernetes container names.
func getName(image string) string {
slashIndex := strings.Index(image, "/")
slashCount := 0
@@ -83,7 +85,10 @@ func getName(image string) string {
re := strings.NewReplacer("/", "-",
"_", "-",
".", "-")
return re.Replace(image[start:end])
name := re.Replace(image[start:end])
// Ensure the name doesn't exceed Kubernetes container name length limit
return label.GetValidName(name)
}
// Result returns the built Container.
+47
View File
@@ -100,3 +100,50 @@ func TestGetName(t *testing.T) {
})
}
}
func TestGetNameWithLongPaths(t *testing.T) {
tests := []struct {
name string
image string
validate func(t *testing.T, result string)
}{
{
name: "plugin with deeply nested repository path exceeding 63 characters",
image: "arohcpsvcdev.azurecr.io/redhat-user-workloads/ocp-art-tenant/oadp-hypershift-oadp-plugin-main@sha256:adb840bf3890b4904a8cdda1a74c82cf8d96c52eba9944ac10e795335d6fd450",
validate: func(t *testing.T, result string) {
t.Helper()
// Should not exceed DNS-1123 label limit of 63 characters
assert.LessOrEqual(t, len(result), 63, "Container name must satisfy DNS-1123 label constraints (max 63 chars)")
// Should be exactly 63 characters (truncated with hash)
assert.Len(t, result, 63)
// Should be deterministic
result2 := getName("arohcpsvcdev.azurecr.io/redhat-user-workloads/ocp-art-tenant/oadp-hypershift-oadp-plugin-main@sha256:adb840bf3890b4904a8cdda1a74c82cf8d96c52eba9944ac10e795335d6fd450")
assert.Equal(t, result, result2)
},
},
{
name: "plugin with normal path length (should remain unchanged)",
image: "arohcpsvcdev.azurecr.io/konveyor/velero-plugin-for-microsoft-azure@sha256:b2db5f09da514e817a74c992dcca5f90b77c2ab0b2797eba947d224271d6070e",
validate: func(t *testing.T, result string) {
t.Helper()
assert.Equal(t, "konveyor-velero-plugin-for-microsoft-azure", result)
assert.LessOrEqual(t, len(result), 63)
},
},
{
name: "very long nested path",
image: "registry.example.com/org/team/project/subproject/component/service/application-name-with-many-words:v1.2.3",
validate: func(t *testing.T, result string) {
t.Helper()
assert.LessOrEqual(t, len(result), 63)
},
},
}
for _, test := range tests {
t.Run(test.name, func(t *testing.T) {
result := getName(test.image)
test.validate(t, result)
})
}
}
+10
View File
@@ -102,6 +102,11 @@ type StatusUpdater interface {
UpdateStatus(obj *unstructured.Unstructured, opts metav1.UpdateOptions) (*unstructured.Unstructured, error)
}
// Applier applies changes to an object using server-side apply
type Applier interface {
Apply(name string, obj *unstructured.Unstructured, opts metav1.ApplyOptions) (*unstructured.Unstructured, error)
}
// Dynamic contains client methods that Velero needs for backing up and restoring resources.
type Dynamic interface {
Creator
@@ -111,6 +116,7 @@ type Dynamic interface {
Patcher
Deletor
StatusUpdater
Applier
}
// dynamicResourceClient implements Dynamic.
@@ -136,6 +142,10 @@ func (d *dynamicResourceClient) Get(name string, opts metav1.GetOptions) (*unstr
return d.resourceClient.Get(context.TODO(), name, opts)
}
func (d *dynamicResourceClient) Apply(name string, obj *unstructured.Unstructured, opts metav1.ApplyOptions) (*unstructured.Unstructured, error) {
return d.resourceClient.Apply(context.TODO(), name, obj, opts)
}
func (d *dynamicResourceClient) Patch(name string, data []byte) (*unstructured.Unstructured, error) {
return d.resourceClient.Patch(context.TODO(), name, types.MergePatchType, data, metav1.PatchOptions{})
}
+11 -1
View File
@@ -89,8 +89,10 @@ type Options struct {
RepoMaintenanceJobConfigMap string
NodeAgentConfigMap string
ItemBlockWorkerCount int
ConcurrentBackups int
NodeAgentDisableHostPath bool
kubeletRootDir string
Apply bool
ServerPriorityClassName string
NodeAgentPriorityClassName string
}
@@ -101,6 +103,7 @@ func (o *Options) BindFlags(flags *pflag.FlagSet) {
flags.StringVar(&o.BucketName, "bucket", o.BucketName, "Name of the object storage bucket where backups should be stored")
flags.StringVar(&o.SecretFile, "secret-file", o.SecretFile, "File containing credentials for backup and volume provider. If not specified, --no-secret must be used for confirmation. Optional.")
flags.BoolVar(&o.NoSecret, "no-secret", o.NoSecret, "Flag indicating if a secret should be created. Must be used as confirmation if --secret-file is not provided. Optional.")
flags.BoolVar(&o.Apply, "apply", o.Apply, "Flag indicating if resources should be applied instead of created. This can be used for updating existing resources.")
flags.BoolVar(&o.NoDefaultBackupLocation, "no-default-backup-location", o.NoDefaultBackupLocation, "Flag indicating if a default backup location should be created. Must be used as confirmation if --bucket or --provider are not provided. Optional.")
flags.StringVar(&o.Image, "image", o.Image, "Image to use for the Velero and node agent pods. Optional.")
flags.StringVar(&o.Prefix, "prefix", o.Prefix, "Prefix under which all Velero data should be stored within the bucket. Optional.")
@@ -196,6 +199,12 @@ func (o *Options) BindFlags(flags *pflag.FlagSet) {
o.ItemBlockWorkerCount,
"Number of worker threads to process ItemBlocks. Default is one. Optional.",
)
flags.IntVar(
&o.ConcurrentBackups,
"concurrent-backups",
o.ConcurrentBackups,
"Number of backups to process concurrently. Default is one. Optional.",
)
flags.StringVar(
&o.ServerPriorityClassName,
"server-priority-class-name",
@@ -313,6 +322,7 @@ func (o *Options) AsVeleroOptions() (*install.VeleroOptions, error) {
RepoMaintenanceJobConfigMap: o.RepoMaintenanceJobConfigMap,
NodeAgentConfigMap: o.NodeAgentConfigMap,
ItemBlockWorkerCount: o.ItemBlockWorkerCount,
ConcurrentBackups: o.ConcurrentBackups,
KubeletRootDir: o.kubeletRootDir,
NodeAgentDisableHostPath: o.NodeAgentDisableHostPath,
ServerPriorityClassName: o.ServerPriorityClassName,
@@ -408,7 +418,7 @@ func (o *Options) Run(c *cobra.Command, f client.Factory) error {
errorMsg := fmt.Sprintf("\n\nError installing Velero. Use `kubectl logs deploy/velero -n %s` to check the deploy logs", o.Namespace)
err = install.Install(dynamicFactory, kbClient, resources, os.Stdout)
err = install.Install(dynamicFactory, kbClient, resources, os.Stdout, o.Apply)
if err != nil {
return errors.Wrap(err, errorMsg)
}
+52 -2
View File
@@ -354,16 +354,62 @@ func (s *nodeAgentServer) run() {
s.logger.Infof("Using customized cachePVC config %v", cachePVCConfig)
}
var podLabels map[string]string
if s.dataPathConfigs != nil && len(s.dataPathConfigs.PodLabels) > 0 {
podLabels = s.dataPathConfigs.PodLabels
s.logger.Infof("Using customized pod labels %+v", podLabels)
}
var podAnnotations map[string]string
if s.dataPathConfigs != nil && len(s.dataPathConfigs.PodAnnotations) > 0 {
podAnnotations = s.dataPathConfigs.PodAnnotations
s.logger.Infof("Using customized pod annotations %+v", podAnnotations)
}
if s.backupRepoConfigs != nil {
s.logger.Infof("Using backup repo config %v", s.backupRepoConfigs)
}
pvbReconciler := controller.NewPodVolumeBackupReconciler(s.mgr.GetClient(), s.mgr, s.kubeClient, s.dataPathMgr, s.vgdpCounter, s.nodeName, s.config.dataMoverPrepareTimeout, s.config.resourceTimeout, podResources, s.metrics, s.logger, dataMovePriorityClass, privilegedFsBackup)
pvbReconciler := controller.NewPodVolumeBackupReconciler(
s.mgr.GetClient(),
s.mgr,
s.kubeClient,
s.dataPathMgr,
s.vgdpCounter,
s.nodeName,
s.config.dataMoverPrepareTimeout,
s.config.resourceTimeout,
podResources,
s.metrics,
s.logger,
dataMovePriorityClass,
privilegedFsBackup,
podLabels,
podAnnotations,
)
if err := pvbReconciler.SetupWithManager(s.mgr); err != nil {
s.logger.Fatal(err, "unable to create controller", "controller", constant.ControllerPodVolumeBackup)
}
pvrReconciler := controller.NewPodVolumeRestoreReconciler(s.mgr.GetClient(), s.mgr, s.kubeClient, s.dataPathMgr, s.vgdpCounter, s.nodeName, s.config.dataMoverPrepareTimeout, s.config.resourceTimeout, s.backupRepoConfigs, cachePVCConfig, podResources, s.logger, dataMovePriorityClass, privilegedFsBackup, s.repoConfigMgr)
pvrReconciler := controller.NewPodVolumeRestoreReconciler(
s.mgr.GetClient(),
s.mgr,
s.kubeClient,
s.dataPathMgr,
s.vgdpCounter,
s.nodeName,
s.config.dataMoverPrepareTimeout,
s.config.resourceTimeout,
s.backupRepoConfigs,
cachePVCConfig,
podResources,
s.logger,
dataMovePriorityClass,
privilegedFsBackup,
s.repoConfigMgr,
podLabels,
podAnnotations,
)
if err := pvrReconciler.SetupWithManager(s.mgr); err != nil {
s.logger.WithError(err).Fatal("Unable to create the pod volume restore controller")
}
@@ -388,6 +434,8 @@ func (s *nodeAgentServer) run() {
s.logger,
s.metrics,
dataMovePriorityClass,
podLabels,
podAnnotations,
)
if err := dataUploadReconciler.SetupWithManager(s.mgr); err != nil {
s.logger.WithError(err).Fatal("Unable to create the data upload controller")
@@ -416,6 +464,8 @@ func (s *nodeAgentServer) run() {
s.metrics,
dataMovePriorityClass,
s.repoConfigMgr,
podLabels,
podAnnotations,
)
if err := dataDownloadReconciler.SetupWithManager(s.mgr); err != nil {
+10
View File
@@ -47,11 +47,13 @@ const (
defaultDisableInformerCache = false
DefaultItemBlockWorkerCount = 1
DefaultConcurrentBackups = 1
)
var (
// DisableableControllers is a list of controllers that can be disabled
DisableableControllers = []string{
constant.ControllerBackupQueue,
constant.ControllerBackup,
constant.ControllerBackupOperations,
constant.ControllerBackupDeletion,
@@ -174,6 +176,7 @@ type Config struct {
BackupRepoConfig string
RepoMaintenanceJobConfig string
ItemBlockWorkerCount int
ConcurrentBackups int
}
func GetDefaultConfig() *Config {
@@ -206,6 +209,7 @@ func GetDefaultConfig() *Config {
ScheduleSkipImmediately: false,
CredentialsDirectory: credentials.DefaultStoreDirectory(),
ItemBlockWorkerCount: DefaultItemBlockWorkerCount,
ConcurrentBackups: DefaultConcurrentBackups,
}
return config
@@ -261,4 +265,10 @@ func (c *Config) BindFlags(flags *pflag.FlagSet) {
c.ItemBlockWorkerCount,
"Number of worker threads to process ItemBlocks. Default is one. Optional.",
)
flags.IntVar(
&c.ConcurrentBackups,
"concurrent-backups",
c.ConcurrentBackups,
"Number of backups to process concurrently. Default is one. Optional.",
)
}
+16 -1
View File
@@ -558,7 +558,7 @@ func (s *server) runControllers(defaultVolumeSnapshotLocations map[string]string
return clientmgmt.NewManager(logger, s.logLevel, s.pluginRegistry)
}
backupStoreGetter := persistence.NewObjectBackupStoreGetter(s.credentialFileStore)
backupStoreGetter := persistence.NewObjectBackupStoreGetterWithSecretStore(s.credentialFileStore, s.credentialSecretStore)
backupTracker := controller.NewBackupTracker()
@@ -581,6 +581,7 @@ func (s *server) runControllers(defaultVolumeSnapshotLocations map[string]string
constant.ControllerSchedule: {},
constant.ControllerServerStatusRequest: {},
constant.ControllerRestoreFinalizer: {},
constant.ControllerBackupQueue: {},
}
if s.config.RestoreOnly {
@@ -668,6 +669,7 @@ func (s *server) runControllers(defaultVolumeSnapshotLocations map[string]string
s.config.MaxConcurrentK8SConnections,
s.config.DefaultSnapshotMoveData,
s.config.ItemBlockWorkerCount,
s.config.ConcurrentBackups,
s.crClient,
).SetupWithManager(s.mgr); err != nil {
s.logger.Fatal(err, "unable to create controller", "controller", constant.ControllerBackup)
@@ -756,6 +758,7 @@ func (s *server) runControllers(defaultVolumeSnapshotLocations map[string]string
s.config.RepoMaintenanceJobConfig,
s.logLevel,
s.config.LogFormat,
s.metrics,
).SetupWithManager(s.mgr); err != nil {
s.logger.Fatal(err, "unable to create controller", "controller", constant.ControllerBackupRepo)
}
@@ -909,6 +912,18 @@ func (s *server) runControllers(defaultVolumeSnapshotLocations map[string]string
}
}
if _, ok := enabledRuntimeControllers[constant.ControllerBackupQueue]; ok {
if err := controller.NewBackupQueueReconciler(
s.mgr.GetClient(),
s.mgr.GetScheme(),
s.logger,
s.config.ConcurrentBackups,
backupTracker,
).SetupWithManager(s.mgr); err != nil {
s.logger.Fatal(err, "unable to create controller", "controller", constant.ControllerBackupQueue)
}
}
s.logger.Info("Server starting...")
if err := s.mgr.Start(s.ctx); err != nil {
+41 -1
View File
@@ -20,7 +20,9 @@ import (
"context"
"github.com/pkg/errors"
corev1api "k8s.io/api/core/v1"
apierrors "k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/types"
kbclient "sigs.k8s.io/controller-runtime/pkg/client"
velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
@@ -52,6 +54,7 @@ func GetCACertFromRestore(ctx context.Context, client kbclient.Client, namespace
}
// GetCACertFromBSL fetches a BackupStorageLocation directly and returns its cacert
// Priority order: caCertRef (from Secret) > caCert (inline, deprecated)
func GetCACertFromBSL(ctx context.Context, client kbclient.Client, namespace, bslName string) (string, error) {
if bslName == "" {
return "", nil
@@ -71,7 +74,44 @@ func GetCACertFromBSL(ctx context.Context, client kbclient.Client, namespace, bs
return "", errors.Wrapf(err, "error getting backup storage location %s", bslName)
}
if bsl.Spec.ObjectStorage != nil && len(bsl.Spec.ObjectStorage.CACert) > 0 {
if bsl.Spec.ObjectStorage == nil {
return "", nil
}
// Prefer caCertRef over inline caCert
if bsl.Spec.ObjectStorage.CACertRef != nil {
// Fetch certificate from Secret
secret := &corev1api.Secret{}
secretKey := types.NamespacedName{
Name: bsl.Spec.ObjectStorage.CACertRef.Name,
Namespace: namespace,
}
if err := client.Get(ctx, secretKey, secret); err != nil {
if apierrors.IsNotFound(err) {
return "", errors.Errorf("certificate secret %s not found in namespace %s",
bsl.Spec.ObjectStorage.CACertRef.Name, namespace)
}
return "", errors.Wrapf(err, "error getting certificate secret %s",
bsl.Spec.ObjectStorage.CACertRef.Name)
}
keyName := bsl.Spec.ObjectStorage.CACertRef.Key
if keyName == "" {
return "", errors.New("caCertRef key is empty")
}
certData, ok := secret.Data[keyName]
if !ok {
return "", errors.Errorf("key %s not found in secret %s",
keyName, bsl.Spec.ObjectStorage.CACertRef.Name)
}
return string(certData), nil
}
// Fall back to inline caCert (deprecated)
if len(bsl.Spec.ObjectStorage.CACert) > 0 {
return string(bsl.Spec.ObjectStorage.CACert), nil
}
+266
View File
@@ -21,6 +21,7 @@ import (
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
corev1api "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/runtime"
"sigs.k8s.io/controller-runtime/pkg/client/fake"
@@ -294,6 +295,271 @@ func TestGetCACertFromBSL(t *testing.T) {
}
}
// TestGetCACertFromBSL_WithCACertRef tests the new caCertRef functionality
func TestGetCACertFromBSL_WithCACertRef(t *testing.T) {
testCases := []struct {
name string
bslName string
bsl *velerov1api.BackupStorageLocation
secret *corev1api.Secret
expectedCACert string
expectedError bool
errorContains string
}{
{
name: "BSL with caCertRef pointing to valid secret",
bslName: "test-bsl",
bsl: &velerov1api.BackupStorageLocation{
ObjectMeta: metav1.ObjectMeta{
Namespace: "test-ns",
Name: "test-bsl",
},
Spec: velerov1api.BackupStorageLocationSpec{
Provider: "aws",
StorageType: velerov1api.StorageType{
ObjectStorage: &velerov1api.ObjectStorageLocation{
Bucket: "test-bucket",
CACertRef: &corev1api.SecretKeySelector{
LocalObjectReference: corev1api.LocalObjectReference{
Name: "test-secret",
},
Key: "ca-bundle.crt",
},
},
},
},
},
secret: &corev1api.Secret{
ObjectMeta: metav1.ObjectMeta{
Namespace: "test-ns",
Name: "test-secret",
},
Data: map[string][]byte{
"ca-bundle.crt": []byte("test-cacert-from-secret"),
},
},
expectedCACert: "test-cacert-from-secret",
expectedError: false,
},
{
name: "BSL with both caCertRef and caCert - caCertRef takes precedence",
bslName: "test-bsl",
bsl: &velerov1api.BackupStorageLocation{
ObjectMeta: metav1.ObjectMeta{
Namespace: "test-ns",
Name: "test-bsl",
},
Spec: velerov1api.BackupStorageLocationSpec{
Provider: "aws",
StorageType: velerov1api.StorageType{
ObjectStorage: &velerov1api.ObjectStorageLocation{
Bucket: "test-bucket",
CACert: []byte("inline-cacert-deprecated"),
CACertRef: &corev1api.SecretKeySelector{
LocalObjectReference: corev1api.LocalObjectReference{
Name: "test-secret",
},
Key: "ca-bundle.crt",
},
},
},
},
},
secret: &corev1api.Secret{
ObjectMeta: metav1.ObjectMeta{
Namespace: "test-ns",
Name: "test-secret",
},
Data: map[string][]byte{
"ca-bundle.crt": []byte("cacert-from-secret-takes-precedence"),
},
},
expectedCACert: "cacert-from-secret-takes-precedence",
expectedError: false,
},
{
name: "BSL with caCertRef but secret not found",
bslName: "test-bsl",
bsl: &velerov1api.BackupStorageLocation{
ObjectMeta: metav1.ObjectMeta{
Namespace: "test-ns",
Name: "test-bsl",
},
Spec: velerov1api.BackupStorageLocationSpec{
Provider: "aws",
StorageType: velerov1api.StorageType{
ObjectStorage: &velerov1api.ObjectStorageLocation{
Bucket: "test-bucket",
CACertRef: &corev1api.SecretKeySelector{
LocalObjectReference: corev1api.LocalObjectReference{
Name: "missing-secret",
},
Key: "ca-bundle.crt",
},
},
},
},
},
secret: nil,
expectedCACert: "",
expectedError: true,
errorContains: "certificate secret missing-secret not found",
},
{
name: "BSL with caCertRef but key not found in secret",
bslName: "test-bsl",
bsl: &velerov1api.BackupStorageLocation{
ObjectMeta: metav1.ObjectMeta{
Namespace: "test-ns",
Name: "test-bsl",
},
Spec: velerov1api.BackupStorageLocationSpec{
Provider: "aws",
StorageType: velerov1api.StorageType{
ObjectStorage: &velerov1api.ObjectStorageLocation{
Bucket: "test-bucket",
CACertRef: &corev1api.SecretKeySelector{
LocalObjectReference: corev1api.LocalObjectReference{
Name: "test-secret",
},
Key: "missing-key",
},
},
},
},
},
secret: &corev1api.Secret{
ObjectMeta: metav1.ObjectMeta{
Namespace: "test-ns",
Name: "test-secret",
},
Data: map[string][]byte{
"ca-bundle.crt": []byte("test-cacert"),
},
},
expectedCACert: "",
expectedError: true,
errorContains: "key missing-key not found in secret test-secret",
},
{
name: "BSL with caCertRef but empty key",
bslName: "test-bsl",
bsl: &velerov1api.BackupStorageLocation{
ObjectMeta: metav1.ObjectMeta{
Namespace: "test-ns",
Name: "test-bsl",
},
Spec: velerov1api.BackupStorageLocationSpec{
Provider: "aws",
StorageType: velerov1api.StorageType{
ObjectStorage: &velerov1api.ObjectStorageLocation{
Bucket: "test-bucket",
CACertRef: &corev1api.SecretKeySelector{
LocalObjectReference: corev1api.LocalObjectReference{
Name: "test-secret",
},
Key: "",
},
},
},
},
},
secret: &corev1api.Secret{
ObjectMeta: metav1.ObjectMeta{
Namespace: "test-ns",
Name: "test-secret",
},
Data: map[string][]byte{
"ca-bundle.crt": []byte("test-cacert"),
},
},
expectedCACert: "",
expectedError: true,
errorContains: "caCertRef key is empty",
},
{
name: "BSL with caCertRef containing multi-line PEM certificate",
bslName: "test-bsl",
bsl: &velerov1api.BackupStorageLocation{
ObjectMeta: metav1.ObjectMeta{
Namespace: "test-ns",
Name: "test-bsl",
},
Spec: velerov1api.BackupStorageLocationSpec{
Provider: "aws",
StorageType: velerov1api.StorageType{
ObjectStorage: &velerov1api.ObjectStorageLocation{
Bucket: "test-bucket",
CACertRef: &corev1api.SecretKeySelector{
LocalObjectReference: corev1api.LocalObjectReference{
Name: "test-secret",
},
Key: "ca.pem",
},
},
},
},
},
secret: &corev1api.Secret{
ObjectMeta: metav1.ObjectMeta{
Namespace: "test-ns",
Name: "test-secret",
},
Data: map[string][]byte{
"ca.pem": []byte("-----BEGIN CERTIFICATE-----\nMIIDETC...\n-----END CERTIFICATE-----\n"),
},
},
expectedCACert: "-----BEGIN CERTIFICATE-----\nMIIDETC...\n-----END CERTIFICATE-----\n",
expectedError: false,
},
{
name: "BSL falls back to inline caCert when caCertRef is nil",
bslName: "test-bsl",
bsl: builder.ForBackupStorageLocation("test-ns", "test-bsl").
Provider("aws").
Bucket("test-bucket").
CACert([]byte("fallback-inline-cacert")).
Result(),
secret: nil,
expectedCACert: "fallback-inline-cacert",
expectedError: false,
},
}
for _, tc := range testCases {
t.Run(tc.name, func(t *testing.T) {
var objs []runtime.Object
if tc.bsl != nil {
objs = append(objs, tc.bsl)
}
if tc.secret != nil {
objs = append(objs, tc.secret)
}
scheme := runtime.NewScheme()
_ = velerov1api.AddToScheme(scheme)
_ = corev1api.AddToScheme(scheme)
fakeClient := fake.NewClientBuilder().
WithScheme(scheme).
WithRuntimeObjects(objs...).
Build()
cacert, err := GetCACertFromBSL(t.Context(), fakeClient, "test-ns", tc.bslName)
if tc.expectedError {
require.Error(t, err)
if tc.errorContains != "" {
assert.Contains(t, err.Error(), tc.errorContains)
}
} else {
require.NoError(t, err)
assert.Equal(t, tc.expectedCACert, cacert)
}
})
}
}
// TestGetCACertFromBackup_ClientError tests error scenarios where client.Get returns non-NotFound errors
func TestGetCACertFromBackup_ClientError(t *testing.T) {
testCases := []struct {
+12 -2
View File
@@ -75,6 +75,7 @@ func DescribeBackup(
case velerov1api.BackupPhaseFinalizing, velerov1api.BackupPhaseFinalizingPartiallyFailed:
case velerov1api.BackupPhaseInProgress:
case velerov1api.BackupPhaseNew:
case velerov1api.BackupPhaseQueued, velerov1api.BackupPhaseReadyToStart:
}
logsNote := ""
@@ -83,6 +84,9 @@ func DescribeBackup(
}
d.Printf("Phase:\t%s%s\n", phaseString, logsNote)
if phase == velerov1api.BackupPhaseQueued {
d.Printf("Queue position:\t%v\n", backup.Status.QueuePosition)
}
if backup.Spec.ResourcePolicy != nil {
d.Println()
@@ -315,8 +319,14 @@ func DescribeBackupSpec(d *Describer, spec velerov1api.BackupSpec) {
}
// DescribeBackupStatus describes a backup status in human-readable format.
func DescribeBackupStatus(ctx context.Context, kbClient kbclient.Client, d *Describer, backup *velerov1api.Backup, details bool,
insecureSkipTLSVerify bool, caCertPath string, podVolumeBackups []velerov1api.PodVolumeBackup) {
func DescribeBackupStatus(ctx context.Context,
kbClient kbclient.Client,
d *Describer,
backup *velerov1api.Backup,
details bool,
insecureSkipTLSVerify bool,
caCertPath string,
podVolumeBackups []velerov1api.PodVolumeBackup) {
status := backup.Status
// Status.Version has been deprecated, use Status.FormatVersion
+11
View File
@@ -20,6 +20,7 @@ import (
"fmt"
"regexp"
"sort"
"strconv"
"time"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
@@ -40,6 +41,7 @@ var (
{Name: "Created"},
{Name: "Expires"},
{Name: "Storage Location"},
{Name: "Queue Position"},
{Name: "Selector"},
}
)
@@ -108,6 +110,7 @@ func printBackup(backup *velerov1api.Backup) []metav1.TableRow {
backup.Status.StartTimestamp,
humanReadableTimeFromNow(expiration),
backup.Spec.StorageLocation,
queuePosition(backup.Status.QueuePosition),
metav1.FormatLabelSelector(backup.Spec.LabelSelector),
)
@@ -127,3 +130,11 @@ func humanReadableTimeFromNow(when time.Time) string {
return fmt.Sprintf("%s ago", duration.ShortHumanDuration(now.Sub(when)))
}
}
func queuePosition(pos int) string {
if pos == 0 {
return ""
} else {
return strconv.Itoa(pos)
}
}
+1
View File
@@ -1,6 +1,7 @@
package constant
const (
ControllerBackupQueue = "backup-queue"
ControllerBackup = "backup"
ControllerBackupOperations = "backup-operations"
ControllerBackupDeletion = "backup-deletion"
+34 -8
View File
@@ -36,7 +36,11 @@ import (
"k8s.io/apimachinery/pkg/util/wait"
"k8s.io/utils/clock"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/builder"
kbclient "sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/controller"
"sigs.k8s.io/controller-runtime/pkg/event"
"sigs.k8s.io/controller-runtime/pkg/predicate"
"github.com/vmware-tanzu/velero/internal/credentials"
"github.com/vmware-tanzu/velero/internal/resourcepolicies"
@@ -105,7 +109,7 @@ type backupReconciler struct {
defaultSnapshotMoveData bool
globalCRClient kbclient.Client
itemBlockWorkerCount int
workerPool *pkgbackup.ItemBlockWorkerPool
concurrentBackups int
}
func NewBackupReconciler(
@@ -132,6 +136,7 @@ func NewBackupReconciler(
maxConcurrentK8SConnections int,
defaultSnapshotMoveData bool,
itemBlockWorkerCount int,
concurrentBackups int,
globalCRClient kbclient.Client,
) *backupReconciler {
b := &backupReconciler{
@@ -159,8 +164,8 @@ func NewBackupReconciler(
maxConcurrentK8SConnections: maxConcurrentK8SConnections,
defaultSnapshotMoveData: defaultSnapshotMoveData,
itemBlockWorkerCount: itemBlockWorkerCount,
concurrentBackups: max(concurrentBackups, 1),
globalCRClient: globalCRClient,
workerPool: pkgbackup.StartItemBlockWorkerPool(ctx, itemBlockWorkerCount, logger),
}
b.updateTotalBackupMetric()
return b
@@ -168,7 +173,24 @@ func NewBackupReconciler(
func (b *backupReconciler) SetupWithManager(mgr ctrl.Manager) error {
return ctrl.NewControllerManagedBy(mgr).
For(&velerov1api.Backup{}).
For(&velerov1api.Backup{}, builder.WithPredicates(predicate.Funcs{
UpdateFunc: func(ue event.UpdateEvent) bool {
backup := ue.ObjectNew.(*velerov1api.Backup)
return backup.Status.Phase == velerov1api.BackupPhaseReadyToStart
},
CreateFunc: func(ce event.CreateEvent) bool {
return false
},
DeleteFunc: func(de event.DeleteEvent) bool {
return false
},
GenericFunc: func(ge event.GenericEvent) bool {
return false
},
})).
WithOptions(controller.Options{
MaxConcurrentReconciles: b.concurrentBackups,
}).
Named(constant.ControllerBackup).
Complete(b)
}
@@ -254,8 +276,8 @@ func (b *backupReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctr
// InProgress, we still need this check so we can return nil to indicate we've finished processing
// this key (even though it was a no-op).
switch original.Status.Phase {
case "", velerov1api.BackupPhaseNew:
// only process new backups
case velerov1api.BackupPhaseReadyToStart:
// only process ReadytToStart backups
default:
b.logger.WithFields(logrus.Fields{
"backup": kubeutil.NamespaceAndName(original),
@@ -265,7 +287,9 @@ func (b *backupReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctr
}
log.Debug("Preparing backup request")
request := b.prepareBackupRequest(original, log)
request := b.prepareBackupRequest(ctx, original, log)
// delete worker pool after reconcile
defer request.StopWorkerPool()
if len(request.Status.ValidationErrors) > 0 {
request.Status.Phase = velerov1api.BackupPhaseFailedValidation
} else {
@@ -299,6 +323,8 @@ func (b *backupReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctr
switch request.Status.Phase {
case velerov1api.BackupPhaseCompleted, velerov1api.BackupPhasePartiallyFailed, velerov1api.BackupPhaseFailed, velerov1api.BackupPhaseFailedValidation:
b.backupTracker.Delete(request.Namespace, request.Name)
case velerov1api.BackupPhaseWaitingForPluginOperations, velerov1api.BackupPhaseWaitingForPluginOperationsPartiallyFailed, velerov1api.BackupPhaseFinalizing, velerov1api.BackupPhaseFinalizingPartiallyFailed:
b.backupTracker.AddPostProcessing(request.Namespace, request.Name)
}
}()
@@ -347,12 +373,12 @@ func (b *backupReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctr
return ctrl.Result{}, nil
}
func (b *backupReconciler) prepareBackupRequest(backup *velerov1api.Backup, logger logrus.FieldLogger) *pkgbackup.Request {
func (b *backupReconciler) prepareBackupRequest(ctx context.Context, backup *velerov1api.Backup, logger logrus.FieldLogger) *pkgbackup.Request {
request := &pkgbackup.Request{
Backup: backup.DeepCopy(), // don't modify items in the cache
SkippedPVTracker: pkgbackup.NewSkipPVTracker(),
BackedUpItems: pkgbackup.NewBackedUpItemsMap(),
ItemBlockChannel: b.workerPool.GetInputChannel(),
WorkerPool: pkgbackup.StartItemBlockWorkerPool(ctx, b.itemBlockWorkerCount, logger),
}
request.VolumesInformation.Init()
+34 -33
View File
@@ -95,7 +95,11 @@ func (b *fakeBackupper) FinalizeBackup(
}
func defaultBackup() *builder.BackupBuilder {
return builder.ForBackup(velerov1api.DefaultNamespace, "backup-1")
return builder.ForBackup(velerov1api.DefaultNamespace, "backup-1").Phase(velerov1api.BackupPhaseReadyToStart)
}
func namedBackup(name string) *builder.BackupBuilder {
return builder.ForBackup(velerov1api.DefaultNamespace, name).Phase(velerov1api.BackupPhaseReadyToStart)
}
func TestProcessBackupNonProcessedItems(t *testing.T) {
@@ -104,6 +108,16 @@ func TestProcessBackupNonProcessedItems(t *testing.T) {
key string
backup *velerov1api.Backup
}{
{
name: "New backup is not processed",
key: "velero/backup-1",
backup: defaultBackup().Phase(velerov1api.BackupPhaseNew).Result(),
},
{
name: "Queued backup is not processed",
key: "velero/backup-1",
backup: defaultBackup().Phase(velerov1api.BackupPhaseQueued).Result(),
},
{
name: "FailedValidation backup is not processed",
key: "velero/backup-1",
@@ -135,9 +149,7 @@ func TestProcessBackupNonProcessedItems(t *testing.T) {
kbClient: velerotest.NewFakeControllerRuntimeClient(t),
formatFlag: formatFlag,
logger: logger,
workerPool: pkgbackup.StartItemBlockWorkerPool(t.Context(), 1, logger),
}
defer c.workerPool.Stop()
if test.backup != nil {
require.NoError(t, c.kbClient.Create(t.Context(), test.backup))
}
@@ -234,9 +246,7 @@ func TestProcessBackupValidationFailures(t *testing.T) {
clock: &clock.RealClock{},
formatFlag: formatFlag,
metrics: metrics.NewServerMetrics(),
workerPool: pkgbackup.StartItemBlockWorkerPool(t.Context(), 1, logger),
}
defer c.workerPool.Stop()
require.NotNil(t, test.backup)
require.NoError(t, c.kbClient.Create(t.Context(), test.backup))
@@ -299,11 +309,10 @@ func TestBackupLocationLabel(t *testing.T) {
defaultBackupLocation: test.backupLocation.Name,
clock: &clock.RealClock{},
formatFlag: formatFlag,
workerPool: pkgbackup.StartItemBlockWorkerPool(t.Context(), 1, logger),
}
defer c.workerPool.Stop()
res := c.prepareBackupRequest(test.backup, logger)
res := c.prepareBackupRequest(ctx, test.backup, logger)
defer res.WorkerPool.Stop()
assert.NotNil(t, res)
assert.Equal(t, test.expectedBackupLocation, res.Labels[velerov1api.StorageLocationLabel])
})
@@ -331,7 +340,7 @@ func Test_prepareBackupRequest_BackupStorageLocation(t *testing.T) {
}{
{
name: "BackupLocation is specified in backup CR'spec and it can be found in ApiServer",
backup: builder.ForBackup("velero", "backup-1").Result(),
backup: defaultBackup().Result(),
backupLocationNameInBackup: "test-backup-location",
backupLocationInAPIServer: builder.ForBackupStorageLocation("velero", "test-backup-location").Result(),
defaultBackupLocationInAPIServer: builder.ForBackupStorageLocation("velero", "default-location").Result(),
@@ -340,7 +349,7 @@ func Test_prepareBackupRequest_BackupStorageLocation(t *testing.T) {
},
{
name: "BackupLocation is specified in backup CR'spec and it can't be found in ApiServer",
backup: builder.ForBackup("velero", "backup-1").Result(),
backup: defaultBackup().Result(),
backupLocationNameInBackup: "test-backup-location",
backupLocationInAPIServer: nil,
defaultBackupLocationInAPIServer: nil,
@@ -349,7 +358,7 @@ func Test_prepareBackupRequest_BackupStorageLocation(t *testing.T) {
},
{
name: "Using default BackupLocation and it can be found in ApiServer",
backup: builder.ForBackup("velero", "backup-1").Result(),
backup: defaultBackup().Result(),
backupLocationNameInBackup: "",
backupLocationInAPIServer: builder.ForBackupStorageLocation("velero", "test-backup-location").Result(),
defaultBackupLocationInAPIServer: builder.ForBackupStorageLocation("velero", "default-location").Result(),
@@ -358,7 +367,7 @@ func Test_prepareBackupRequest_BackupStorageLocation(t *testing.T) {
},
{
name: "Using default BackupLocation and it can't be found in ApiServer",
backup: builder.ForBackup("velero", "backup-1").Result(),
backup: defaultBackup().Result(),
backupLocationNameInBackup: "",
backupLocationInAPIServer: nil,
defaultBackupLocationInAPIServer: nil,
@@ -396,14 +405,13 @@ func Test_prepareBackupRequest_BackupStorageLocation(t *testing.T) {
defaultBackupTTL: defaultBackupTTL.Duration,
clock: testclocks.NewFakeClock(now),
formatFlag: formatFlag,
workerPool: pkgbackup.StartItemBlockWorkerPool(t.Context(), 1, logger),
}
defer c.workerPool.Stop()
test.backup.Spec.StorageLocation = test.backupLocationNameInBackup
// Run
res := c.prepareBackupRequest(test.backup, logger)
res := c.prepareBackupRequest(ctx, test.backup, logger)
defer res.WorkerPool.Stop()
// Assert
if test.expectedSuccess {
@@ -472,11 +480,10 @@ func TestDefaultBackupTTL(t *testing.T) {
defaultBackupTTL: defaultBackupTTL.Duration,
clock: testclocks.NewFakeClock(now),
formatFlag: formatFlag,
workerPool: pkgbackup.StartItemBlockWorkerPool(t.Context(), 1, logger),
}
defer c.workerPool.Stop()
res := c.prepareBackupRequest(test.backup, logger)
res := c.prepareBackupRequest(ctx, test.backup, logger)
defer res.WorkerPool.Stop()
assert.NotNil(t, res)
assert.Equal(t, test.expectedTTL, res.Spec.TTL)
assert.Equal(t, test.expectedExpiration, *res.Status.Expiration)
@@ -497,7 +504,7 @@ func TestPrepareBackupRequest_SetsVGSLabelKey(t *testing.T) {
}{
{
name: "backup with spec label key set",
backup: builder.ForBackup("velero", "backup-1").
backup: defaultBackup().
VolumeGroupSnapshotLabelKey("spec-key").
Result(),
serverFlagKey: "server-key",
@@ -505,13 +512,13 @@ func TestPrepareBackupRequest_SetsVGSLabelKey(t *testing.T) {
},
{
name: "backup with no spec key, uses server flag",
backup: builder.ForBackup("velero", "backup-2").Result(),
backup: namedBackup("backup-2").Result(),
serverFlagKey: "server-key",
expectedLabelKey: "server-key",
},
{
name: "backup with no spec or server flag, uses default",
backup: builder.ForBackup("velero", "backup-3").Result(),
backup: namedBackup("backup-3").Result(),
serverFlagKey: velerov1api.DefaultVGSLabelKey,
expectedLabelKey: velerov1api.DefaultVGSLabelKey,
},
@@ -533,11 +540,10 @@ func TestPrepareBackupRequest_SetsVGSLabelKey(t *testing.T) {
defaultVGSLabelKey: test.serverFlagKey,
discoveryHelper: discoveryHelper,
clock: testclocks.NewFakeClock(now),
workerPool: pkgbackup.StartItemBlockWorkerPool(t.Context(), 1, logger),
}
defer c.workerPool.Stop()
res := c.prepareBackupRequest(test.backup, logger)
res := c.prepareBackupRequest(ctx, test.backup, logger)
defer res.WorkerPool.Stop()
assert.NotNil(t, res)
assert.Equal(t, test.expectedLabelKey, res.Spec.VolumeGroupSnapshotLabelKey)
@@ -635,11 +641,10 @@ func TestDefaultVolumesToResticDeprecation(t *testing.T) {
clock: &clock.RealClock{},
formatFlag: formatFlag,
defaultVolumesToFsBackup: test.globalVal,
workerPool: pkgbackup.StartItemBlockWorkerPool(t.Context(), 1, logger),
}
defer c.workerPool.Stop()
res := c.prepareBackupRequest(test.backup, logger)
res := c.prepareBackupRequest(ctx, test.backup, logger)
defer res.WorkerPool.Stop()
assert.NotNil(t, res)
assert.NotNil(t, res.Spec.DefaultVolumesToFsBackup)
if test.expectRemap {
@@ -1390,7 +1395,7 @@ func TestProcessBackupCompletions(t *testing.T) {
},
{
name: "backup with namespace-scoped and cluster-scoped resource filters",
backup: builder.ForBackup(velerov1api.DefaultNamespace, "backup-1").
backup: defaultBackup().
ExcludedClusterScopedResources("clusterroles").
IncludedClusterScopedResources("storageclasses").
ExcludedNamespaceScopedResources("secrets").
@@ -1439,7 +1444,7 @@ func TestProcessBackupCompletions(t *testing.T) {
},
{
name: "backup's include filter overlap with default exclude resources",
backup: builder.ForBackup(velerov1api.DefaultNamespace, "backup-1").
backup: defaultBackup().
ExcludedClusterScopedResources("clusterroles").
IncludedClusterScopedResources("storageclasses", "volumesnapshotcontents.snapshot.storage.k8s.io").
ExcludedNamespaceScopedResources("secrets").
@@ -1555,9 +1560,7 @@ func TestProcessBackupCompletions(t *testing.T) {
backupper: backupper,
formatFlag: formatFlag,
globalCRClient: fakeGlobalClient,
workerPool: pkgbackup.StartItemBlockWorkerPool(t.Context(), 1, logger),
}
defer c.workerPool.Stop()
pluginManager.On("GetBackupItemActionsV2").Return(nil, nil)
pluginManager.On("GetItemBlockActions").Return(nil, nil)
@@ -1763,9 +1766,7 @@ func TestValidateAndGetSnapshotLocations(t *testing.T) {
logger: logger,
defaultSnapshotLocations: test.defaultLocations,
kbClient: velerotest.NewFakeControllerRuntimeClient(t),
workerPool: pkgbackup.StartItemBlockWorkerPool(t.Context(), 1, logger),
}
defer c.workerPool.Stop()
// set up a Backup object to represent what we expect to be passed to backupper.Backup()
backup := test.backup.DeepCopy()
+363
View File
@@ -0,0 +1,363 @@
/*
Copyright the Velero contributors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package controller
import (
"context"
"slices"
"time"
"github.com/pkg/errors"
"github.com/sirupsen/logrus"
corev1api "k8s.io/api/core/v1"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/types"
"k8s.io/apimachinery/pkg/util/sets"
ctrl "sigs.k8s.io/controller-runtime"
"sigs.k8s.io/controller-runtime/pkg/builder"
"sigs.k8s.io/controller-runtime/pkg/client"
"sigs.k8s.io/controller-runtime/pkg/event"
"sigs.k8s.io/controller-runtime/pkg/handler"
"sigs.k8s.io/controller-runtime/pkg/predicate"
"sigs.k8s.io/controller-runtime/pkg/reconcile"
velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
"github.com/vmware-tanzu/velero/pkg/constant"
"github.com/vmware-tanzu/velero/pkg/util/collections"
"github.com/vmware-tanzu/velero/pkg/util/kube"
)
// backupQueueReconciler reconciles a Backup object
type backupQueueReconciler struct {
client.Client
Scheme *runtime.Scheme
logger logrus.FieldLogger
concurrentBackups int
backupTracker BackupTracker
frequency time.Duration
}
const (
defaultQueuedBackupRecheckFrequency = time.Minute
)
// NewBackupQueueReconciler returns a new backupQueueReconciler
func NewBackupQueueReconciler(
client client.Client,
scheme *runtime.Scheme,
logger logrus.FieldLogger,
concurrentBackups int,
backupTracker BackupTracker,
) *backupQueueReconciler {
return &backupQueueReconciler{
Client: client,
Scheme: scheme,
logger: logger,
concurrentBackups: max(concurrentBackups, 1),
backupTracker: backupTracker,
frequency: defaultQueuedBackupRecheckFrequency,
}
}
func queuePositionOrderFunc(objList client.ObjectList) client.ObjectList {
backupList := objList.(*velerov1api.BackupList)
slices.SortFunc(backupList.Items, func(backup1, backup2 velerov1api.Backup) int {
if backup1.Status.QueuePosition < backup2.Status.QueuePosition {
return -1
} else if backup1.Status.QueuePosition == backup2.Status.QueuePosition {
return 0
} else {
return 1
}
})
return backupList
}
// SetupWithManager adds the reconciler to the manager
func (r *backupQueueReconciler) SetupWithManager(mgr ctrl.Manager) error {
// For periodic requeue, only consider Queued backups, order by QueuePosition
gp := kube.NewGenericEventPredicate(func(object client.Object) bool {
backup := object.(*velerov1api.Backup)
return backup.Status.Phase == velerov1api.BackupPhaseQueued
})
s := kube.NewPeriodicalEnqueueSource(r.logger.WithField("controller", constant.ControllerBackupQueue), mgr.GetClient(), &velerov1api.BackupList{}, r.frequency, kube.PeriodicalEnqueueSourceOption{
Predicates: []predicate.Predicate{gp},
OrderFunc: queuePositionOrderFunc,
})
return ctrl.NewControllerManagedBy(mgr).
For(&velerov1api.Backup{}, builder.WithPredicates(predicate.Funcs{
UpdateFunc: func(ue event.UpdateEvent) bool {
backup := ue.ObjectNew.(*velerov1api.Backup)
return backup.Status.Phase == "" || backup.Status.Phase == velerov1api.BackupPhaseNew
},
CreateFunc: func(ce event.CreateEvent) bool {
backup := ce.Object.(*velerov1api.Backup)
return backup.Status.Phase == "" || backup.Status.Phase == velerov1api.BackupPhaseNew
},
DeleteFunc: func(de event.DeleteEvent) bool {
return false
},
GenericFunc: func(ge event.GenericEvent) bool {
return false
},
})).
Watches(
&velerov1api.Backup{},
handler.EnqueueRequestsFromMapFunc(r.findQueuedBackupsToRequeue),
builder.WithPredicates(predicate.Funcs{
UpdateFunc: func(ue event.UpdateEvent) bool {
oldBackup := ue.ObjectOld.(*velerov1api.Backup)
newBackup := ue.ObjectNew.(*velerov1api.Backup)
return oldBackup.Status.Phase == velerov1api.BackupPhaseInProgress &&
newBackup.Status.Phase != velerov1api.BackupPhaseInProgress ||
oldBackup.Status.Phase != velerov1api.BackupPhaseQueued &&
newBackup.Status.Phase == velerov1api.BackupPhaseQueued &&
r.backupTracker.RunningCount() < r.concurrentBackups
},
CreateFunc: func(event.CreateEvent) bool {
return false
},
DeleteFunc: func(de event.DeleteEvent) bool {
return false
},
GenericFunc: func(ge event.GenericEvent) bool {
return false
},
})).
WatchesRawSource(s).
Named(constant.ControllerBackupQueue).
Complete(r)
}
func (r *backupQueueReconciler) detectNamespaceConflict(ctx context.Context, backup *velerov1api.Backup, earlierBackups []velerov1api.Backup) (bool, string, []string, error) {
nsList := &corev1api.NamespaceList{}
if err := r.Client.List(ctx, nsList); err != nil {
return false, "", nil, err
}
var clusterNamespaces []string
for _, ns := range nsList.Items {
clusterNamespaces = append(clusterNamespaces, ns.Name)
}
foundConflict, conflictBackup := detectNSConflictsInternal(backup, earlierBackups, clusterNamespaces)
return foundConflict, conflictBackup, clusterNamespaces, nil
}
func detectNSConflictsInternal(backup *velerov1api.Backup, earlierBackups []velerov1api.Backup, clusterNamespaces []string) (bool, string) {
backupNamespaces := sets.NewString(namespacesForBackup(backup, clusterNamespaces)...)
for _, earlierBackup := range earlierBackups {
// This will never be true for the primary backup, but for the secondary
// runnability check for queued backups ahead of the current backup, we
// only care about backups ahead of it.
// Backup isn't earlier than this one, skip
if earlierBackup.Status.Phase == velerov1api.BackupPhaseQueued &&
earlierBackup.Status.QueuePosition >= backup.Status.QueuePosition {
continue
}
if backupNamespaces.HasAny(namespacesForBackup(&earlierBackup, clusterNamespaces)...) {
return true, earlierBackup.Name
}
}
return false, ""
}
// Returns true if there are backups ahead of the current backup that are runnable
// This could happen if velero just reconciled the one earlier in the queue and rejected it
// due to too many running backups, but a backup completed in between that reconcile and this one
// so exit, as the recent completion has triggered another reconcile of all queued backups
func (r *backupQueueReconciler) checkForEarlierRunnableBackups(backup *velerov1api.Backup, earlierBackups []velerov1api.Backup, clusterNamespaces []string) (bool, string) {
for _, earlierBackup := range earlierBackups {
// if this backup is queued and ahead of current backup, check for conflicts
if earlierBackup.Status.Phase != velerov1api.BackupPhaseQueued ||
earlierBackup.Status.QueuePosition >= backup.Status.QueuePosition {
continue
}
conflict, _ := detectNSConflictsInternal(&earlierBackup, earlierBackups, clusterNamespaces)
// !conflict means we've found an earlier backup that is currently runnable
// so current reconcile should exit to run this one
if !conflict {
return true, earlierBackup.Name
}
}
return false, ""
}
func namespacesForBackup(backup *velerov1api.Backup, clusterNamespaces []string) []string {
// Ignore error here. If a backup has invalid namespace wildcards, the backup controller
// will validate and fail it. Consider the ns list empty for conflict detection purposes.
nsList, err := collections.NewNamespaceIncludesExcludes().Includes(backup.Spec.IncludedNamespaces...).Excludes(backup.Spec.ExcludedNamespaces...).ActiveNamespaces(clusterNamespaces).ResolveNamespaceList()
if err != nil {
return []string{}
}
return nsList
}
func (r *backupQueueReconciler) getMaxQueuePosition(lister *queuedBackupsLister) int {
queuedBackups := lister.orderedQueued()
maxPos := 0
if len(queuedBackups) > 0 {
maxPos = queuedBackups[len(queuedBackups)-1].Status.QueuePosition
}
return maxPos
}
func (r *backupQueueReconciler) findQueuedBackupsToRequeue(ctx context.Context, obj client.Object) []reconcile.Request {
backup := obj.(*velerov1api.Backup)
requests := []reconcile.Request{}
allBackups := &velerov1api.BackupList{}
if err := r.Client.List(ctx, allBackups, &client.ListOptions{Namespace: backup.Namespace}); err != nil {
r.logger.WithError(err).Error("error listing backups")
return requests
}
backups := r.newQueuedBackupsLister(allBackups).orderedQueued()
for _, item := range backups {
requests = append(requests, reconcile.Request{
NamespacedName: types.NamespacedName{
Namespace: item.GetNamespace(),
Name: item.GetName(),
},
})
}
return requests
}
// Reconcile reconciles a Backup object
func (r *backupQueueReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctrl.Result, error) {
log := r.logger.WithField("backup", req.NamespacedName.String())
log.Debug("Getting backup")
backup := &velerov1api.Backup{}
if err := r.Get(ctx, req.NamespacedName, backup); err != nil {
log.WithError(err).Error("unable to get backup")
return ctrl.Result{}, client.IgnoreNotFound(err)
}
switch backup.Status.Phase {
case "", velerov1api.BackupPhaseNew:
// queue new backup
allBackups := &velerov1api.BackupList{}
if err := r.Client.List(ctx, allBackups, &client.ListOptions{Namespace: backup.Namespace}); err != nil {
r.logger.WithError(err).Error("error listing backups")
return ctrl.Result{}, nil
}
lister := r.newQueuedBackupsLister(allBackups)
maxQueuePosition := r.getMaxQueuePosition(lister)
original := backup.DeepCopy()
backup.Status.Phase = velerov1api.BackupPhaseQueued
backup.Status.QueuePosition = maxQueuePosition + 1
log.Infof("Queueing backup %v, queue position %v", backup.Name, backup.Status.QueuePosition)
if err := kube.PatchResource(original, backup, r.Client); err != nil {
return ctrl.Result{}, errors.Wrapf(err, "error updating Backup status to %s", backup.Status.Phase)
}
case velerov1api.BackupPhaseQueued:
// handle queued backup
// Find backups ahead of this one (InProgress, ReadyToStart, or Queued with higher position)
allBackups := &velerov1api.BackupList{}
if err := r.Client.List(ctx, allBackups, &client.ListOptions{Namespace: backup.Namespace}); err != nil {
r.logger.WithError(err).Error("error listing backups")
return ctrl.Result{}, nil
}
lister := r.newQueuedBackupsLister(allBackups)
if r.backupTracker.RunningCount() >= r.concurrentBackups {
log.Debugf("%v concurrent backups are already running, leaving %v queued", r.concurrentBackups, backup.Name)
return ctrl.Result{}, nil
}
earlierBackups := lister.earlierThan(backup.Status.QueuePosition)
foundConflict, conflictBackup, clusterNamespaces, err := r.detectNamespaceConflict(ctx, backup, earlierBackups)
if err != nil {
log.WithError(err).Error("error listing namespaces")
return ctrl.Result{}, nil
}
if foundConflict {
log.Infof("Backup %v has namespace conflict with %v, leaving queued", backup.Name, conflictBackup)
return ctrl.Result{}, nil
}
foundEarlierRunnable, earlierRunnable := r.checkForEarlierRunnableBackups(backup, earlierBackups, clusterNamespaces)
if foundEarlierRunnable {
log.Infof("Earlier queued backup %v is runnable, leaving %v queued", earlierRunnable, backup.Name)
return ctrl.Result{}, nil
}
log.Infof("Dequeueing backup %v, moving to ReadyToStart", backup.Name)
original := backup.DeepCopy()
backup.Status.Phase = velerov1api.BackupPhaseReadyToStart
backup.Status.QueuePosition = 0
if err := kube.PatchResource(original, backup, r.Client); err != nil {
return ctrl.Result{}, errors.Wrapf(err, "error updating Backup status to %s", backup.Status.Phase)
}
r.backupTracker.AddReadyToStart(backup.Namespace, backup.Name)
log.Debug("Updating queuePosition for remaining queued backups")
queuedBackups := lister.orderedQueued()
newQueuePos := 1
for _, queuedBackup := range queuedBackups {
if queuedBackup.Name != backup.Name {
original := queuedBackup.DeepCopy()
queuedBackup.Status.QueuePosition = newQueuePos
if err := kube.PatchResource(original, &queuedBackup, r.Client); err != nil {
log.WithError(errors.Wrapf(err, "error updating Backup %s queuePosition to %v", queuedBackup.Name, newQueuePos))
return ctrl.Result{}, nil
}
newQueuePos++
}
}
return ctrl.Result{}, nil
default:
log.Debug("Backup is not New or Queued, skipping")
return ctrl.Result{}, nil
}
return ctrl.Result{}, nil
}
// queuedBackupsLister manages a list of all backups Queued, ReadyToStart, or InProgress
// with methods to return specific subsets as needed
type queuedBackupsLister struct {
backups *velerov1api.BackupList
}
func (r *backupQueueReconciler) newQueuedBackupsLister(backupList *velerov1api.BackupList) *queuedBackupsLister {
backups := []velerov1api.Backup{}
for _, backup := range backupList.Items {
if backup.Status.Phase == velerov1api.BackupPhaseQueued ||
backup.Status.Phase == velerov1api.BackupPhaseInProgress ||
backup.Status.Phase == velerov1api.BackupPhaseReadyToStart {
backups = append(backups, backup)
}
}
backupList.Items = backups
return &queuedBackupsLister{backupList}
}
func (l *queuedBackupsLister) earlierThan(queuePos int) []velerov1api.Backup {
backups := []velerov1api.Backup{}
for _, backup := range l.backups.Items {
// InProgress and ReadyToStart backups have QueuePosition==0
if backup.Status.QueuePosition < queuePos {
backups = append(backups, backup)
}
}
return backups
}
func (l *queuedBackupsLister) orderedQueued() []velerov1api.Backup {
var returnList []velerov1api.Backup
orderedBackupList := queuePositionOrderFunc(l.backups).(*velerov1api.BackupList)
for _, item := range orderedBackupList.Items {
if item.Status.Phase == velerov1api.BackupPhaseQueued {
returnList = append(returnList, item)
}
}
return returnList
}
@@ -0,0 +1,241 @@
/*
Copyright the Velero contributors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package controller
import (
"testing"
"github.com/sirupsen/logrus"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
//metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/types"
ctrl "sigs.k8s.io/controller-runtime"
//"sigs.k8s.io/controller-runtime/pkg/client/fake"
velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
"github.com/vmware-tanzu/velero/pkg/builder"
velerotest "github.com/vmware-tanzu/velero/pkg/test"
)
func TestBackupQueueReconciler(t *testing.T) {
scheme := runtime.NewScheme()
velerov1api.AddToScheme(scheme)
tests := []struct {
name string
priorBackups []*velerov1api.Backup
namespaces []string
backup *velerov1api.Backup
concurrentBackups int
expectError bool
expectPhase velerov1api.BackupPhase
expectQueuePosition int
}{
{
name: "New Backup gets queued",
backup: builder.ForBackup(velerov1api.DefaultNamespace, "backup-11").Result(),
expectPhase: velerov1api.BackupPhaseQueued,
expectQueuePosition: 1,
},
{
name: "InProgress Backup is ignored",
backup: builder.ForBackup(velerov1api.DefaultNamespace, "backup-11").Phase(velerov1api.BackupPhaseInProgress).Result(),
expectPhase: velerov1api.BackupPhaseInProgress,
},
{
name: "Second New Backup gets queued with queuePosition 2",
priorBackups: []*velerov1api.Backup{
builder.ForBackup(velerov1api.DefaultNamespace, "backup-11").Phase(velerov1api.BackupPhaseQueued).QueuePosition(1).Result(),
},
backup: builder.ForBackup(velerov1api.DefaultNamespace, "backup-12").Result(),
expectPhase: velerov1api.BackupPhaseQueued,
expectQueuePosition: 2,
},
{
name: "Queued Backup moves to ReadyToStart if no others are running",
backup: builder.ForBackup(velerov1api.DefaultNamespace, "backup-11").Phase(velerov1api.BackupPhaseQueued).Result(),
expectPhase: velerov1api.BackupPhaseReadyToStart,
},
{
name: "Queued Backup remains queued if no spaces available",
priorBackups: []*velerov1api.Backup{
builder.ForBackup(velerov1api.DefaultNamespace, "backup-11").Phase(velerov1api.BackupPhaseInProgress).Result(),
builder.ForBackup(velerov1api.DefaultNamespace, "backup-12").Phase(velerov1api.BackupPhaseInProgress).Result(),
},
concurrentBackups: 2,
backup: builder.ForBackup(velerov1api.DefaultNamespace, "backup-20").Phase(velerov1api.BackupPhaseQueued).QueuePosition(1).Result(),
expectPhase: velerov1api.BackupPhaseQueued,
expectQueuePosition: 1,
},
{
name: "Queued Backup remains queued if no spaces available including ReadyToStart",
priorBackups: []*velerov1api.Backup{
builder.ForBackup(velerov1api.DefaultNamespace, "backup-11").Phase(velerov1api.BackupPhaseInProgress).Result(),
builder.ForBackup(velerov1api.DefaultNamespace, "backup-12").Phase(velerov1api.BackupPhaseReadyToStart).Result(),
},
concurrentBackups: 2,
backup: builder.ForBackup(velerov1api.DefaultNamespace, "backup-20").Phase(velerov1api.BackupPhaseQueued).QueuePosition(1).Result(),
expectPhase: velerov1api.BackupPhaseQueued,
expectQueuePosition: 1,
},
{
name: "Queued Backup remains queued if earlier runnable backup is also queued",
priorBackups: []*velerov1api.Backup{
builder.ForBackup(velerov1api.DefaultNamespace, "backup-11").Phase(velerov1api.BackupPhaseInProgress).Result(),
builder.ForBackup(velerov1api.DefaultNamespace, "backup-12").Phase(velerov1api.BackupPhaseQueued).QueuePosition(1).Result(),
},
concurrentBackups: 3,
backup: builder.ForBackup(velerov1api.DefaultNamespace, "backup-20").Phase(velerov1api.BackupPhaseQueued).QueuePosition(2).Result(),
expectPhase: velerov1api.BackupPhaseQueued,
expectQueuePosition: 2,
},
{
name: "Queued Backup remains queued if in conflict with running backup",
priorBackups: []*velerov1api.Backup{
builder.ForBackup(velerov1api.DefaultNamespace, "backup-11").Phase(velerov1api.BackupPhaseInProgress).IncludedNamespaces("foo").Result(),
},
namespaces: []string{"foo"},
concurrentBackups: 3,
backup: builder.ForBackup(velerov1api.DefaultNamespace, "backup-20").Phase(velerov1api.BackupPhaseQueued).QueuePosition(1).IncludedNamespaces("foo").Result(),
expectPhase: velerov1api.BackupPhaseQueued,
expectQueuePosition: 1,
},
{
name: "Queued Backup remains queued if in conflict with ReadyToStart backup",
priorBackups: []*velerov1api.Backup{
builder.ForBackup(velerov1api.DefaultNamespace, "backup-11").Phase(velerov1api.BackupPhaseReadyToStart).IncludedNamespaces("foo").Result(),
},
namespaces: []string{"foo"},
concurrentBackups: 3,
backup: builder.ForBackup(velerov1api.DefaultNamespace, "backup-20").Phase(velerov1api.BackupPhaseQueued).QueuePosition(1).IncludedNamespaces("foo").Result(),
expectPhase: velerov1api.BackupPhaseQueued,
expectQueuePosition: 1,
},
{
name: "Queued Backup remains queued if in conflict with earlier queued backup",
priorBackups: []*velerov1api.Backup{
builder.ForBackup(velerov1api.DefaultNamespace, "backup-11").Phase(velerov1api.BackupPhaseQueued).QueuePosition(1).IncludedNamespaces("foo").Result(),
},
namespaces: []string{"foo", "bar"},
concurrentBackups: 3,
backup: builder.ForBackup(velerov1api.DefaultNamespace, "backup-20").Phase(velerov1api.BackupPhaseQueued).QueuePosition(2).IncludedNamespaces("foo", "bar").Result(),
expectPhase: velerov1api.BackupPhaseQueued,
expectQueuePosition: 2,
},
{
name: "Queued Backup remains queued if earlier non-ns-conflict backup exists",
priorBackups: []*velerov1api.Backup{
builder.ForBackup(velerov1api.DefaultNamespace, "backup-11").Phase(velerov1api.BackupPhaseInProgress).IncludedNamespaces("bar").Result(),
builder.ForBackup(velerov1api.DefaultNamespace, "backup-12").Phase(velerov1api.BackupPhaseQueued).QueuePosition(1).IncludedNamespaces("foo").Result(),
},
namespaces: []string{"foo", "bar", "baz"},
concurrentBackups: 3,
backup: builder.ForBackup(velerov1api.DefaultNamespace, "backup-20").Phase(velerov1api.BackupPhaseQueued).QueuePosition(2).IncludedNamespaces("baz").Result(),
expectPhase: velerov1api.BackupPhaseQueued,
expectQueuePosition: 2,
},
{
name: "Running all-namespace backup conflicts with queued one-namespace backup ",
priorBackups: []*velerov1api.Backup{
builder.ForBackup(velerov1api.DefaultNamespace, "backup-11").Phase(velerov1api.BackupPhaseInProgress).IncludedNamespaces("*").Result(),
},
namespaces: []string{"foo", "bar"},
concurrentBackups: 3,
backup: builder.ForBackup(velerov1api.DefaultNamespace, "backup-20").Phase(velerov1api.BackupPhaseQueued).QueuePosition(1).IncludedNamespaces("foo").Result(),
expectPhase: velerov1api.BackupPhaseQueued,
expectQueuePosition: 1,
},
{
name: "Running one-namespace backup conflicts with queued all-namespace backup ",
priorBackups: []*velerov1api.Backup{
builder.ForBackup(velerov1api.DefaultNamespace, "backup-11").Phase(velerov1api.BackupPhaseInProgress).IncludedNamespaces("bar").Result(),
},
namespaces: []string{"foo", "bar"},
concurrentBackups: 3,
backup: builder.ForBackup(velerov1api.DefaultNamespace, "backup-20").Phase(velerov1api.BackupPhaseQueued).QueuePosition(1).IncludedNamespaces("*").Result(),
expectPhase: velerov1api.BackupPhaseQueued,
expectQueuePosition: 1,
},
{
name: "Queued Backup moves to ReadyToStart if running count < concurrentBackups",
priorBackups: []*velerov1api.Backup{
builder.ForBackup(velerov1api.DefaultNamespace, "backup-11").Phase(velerov1api.BackupPhaseInProgress).Result(),
builder.ForBackup(velerov1api.DefaultNamespace, "backup-12").Phase(velerov1api.BackupPhaseInProgress).Result(),
},
concurrentBackups: 3,
backup: builder.ForBackup(velerov1api.DefaultNamespace, "backup-20").Phase(velerov1api.BackupPhaseQueued).QueuePosition(1).Result(),
expectPhase: velerov1api.BackupPhaseReadyToStart,
},
{
name: "Queued Backup moves to ReadyToStart if running count < concurrentBackups and no ns conflict found",
priorBackups: []*velerov1api.Backup{
builder.ForBackup(velerov1api.DefaultNamespace, "backup-11").Phase(velerov1api.BackupPhaseReadyToStart).IncludedNamespaces("foo").Result(),
},
namespaces: []string{"foo", "bar"},
concurrentBackups: 3,
backup: builder.ForBackup(velerov1api.DefaultNamespace, "backup-20").Phase(velerov1api.BackupPhaseQueued).QueuePosition(1).IncludedNamespaces("bar").Result(),
expectPhase: velerov1api.BackupPhaseReadyToStart,
},
}
for _, test := range tests {
t.Run(test.name, func(t *testing.T) {
if test.backup == nil {
return
}
backupTracker := NewBackupTracker()
initObjs := []runtime.Object{}
for _, priorBackup := range test.priorBackups {
initObjs = append(initObjs, priorBackup)
if priorBackup.Status.Phase == velerov1api.BackupPhaseReadyToStart {
backupTracker.AddReadyToStart(priorBackup.Namespace, priorBackup.Name)
} else if priorBackup.Status.Phase == velerov1api.BackupPhaseInProgress {
backupTracker.Add(priorBackup.Namespace, priorBackup.Name)
}
}
for _, ns := range test.namespaces {
initObjs = append(initObjs, builder.ForNamespace(ns).Result())
}
initObjs = append(initObjs, test.backup)
fakeClient := velerotest.NewFakeControllerRuntimeClient(t, initObjs...)
logger := logrus.New()
log := logger.WithField("controller", "backup-queue-test")
r := NewBackupQueueReconciler(fakeClient, scheme, log, test.concurrentBackups, backupTracker)
req := ctrl.Request{NamespacedName: types.NamespacedName{Namespace: test.backup.Namespace, Name: test.backup.Name}}
res, err := r.Reconcile(t.Context(), req)
gotErr := err != nil
require.NoError(t, err)
assert.Equal(t, ctrl.Result{}, res)
assert.Equal(t, test.expectError, gotErr)
backupAfter := velerov1api.Backup{}
err = fakeClient.Get(t.Context(), types.NamespacedName{
Namespace: test.backup.Namespace,
Name: test.backup.Name,
}, &backupAfter)
require.NoError(t, err)
assert.Equal(t, test.expectPhase, backupAfter.Status.Phase)
assert.Equal(t, test.expectQueuePosition, backupAfter.Status.QueuePosition)
})
}
}
@@ -42,6 +42,7 @@ import (
velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
"github.com/vmware-tanzu/velero/pkg/constant"
"github.com/vmware-tanzu/velero/pkg/label"
"github.com/vmware-tanzu/velero/pkg/metrics"
repoconfig "github.com/vmware-tanzu/velero/pkg/repository/config"
"github.com/vmware-tanzu/velero/pkg/repository/maintenance"
repomanager "github.com/vmware-tanzu/velero/pkg/repository/manager"
@@ -66,6 +67,7 @@ type BackupRepoReconciler struct {
repoMaintenanceConfig string
logLevel logrus.Level
logFormat *logging.FormatFlag
metrics *metrics.ServerMetrics
}
func NewBackupRepoReconciler(
@@ -78,6 +80,7 @@ func NewBackupRepoReconciler(
repoMaintenanceConfig string,
logLevel logrus.Level,
logFormat *logging.FormatFlag,
metrics *metrics.ServerMetrics,
) *BackupRepoReconciler {
c := &BackupRepoReconciler{
client,
@@ -90,6 +93,7 @@ func NewBackupRepoReconciler(
repoMaintenanceConfig,
logLevel,
logFormat,
metrics,
}
return c
@@ -197,11 +201,22 @@ func (r *BackupRepoReconciler) needInvalidBackupRepo(oldObj client.Object, newOb
return true
}
// Check if either CACert or CACertRef has changed
if !bytes.Equal(oldStorage.CACert, newStorage.CACert) {
logger.Info("BSL's CACert has changed, invalid backup repositories")
return true
}
// Check if CACertRef has changed
if (oldStorage.CACertRef == nil && newStorage.CACertRef != nil) ||
(oldStorage.CACertRef != nil && newStorage.CACertRef == nil) ||
(oldStorage.CACertRef != nil && newStorage.CACertRef != nil &&
(oldStorage.CACertRef.Name != newStorage.CACertRef.Name ||
oldStorage.CACertRef.Key != newStorage.CACertRef.Key)) {
logger.Info("BSL's CACertRef has changed, invalid backup repositories")
return true
}
if !reflect.DeepEqual(oldConfig, newConfig) {
logger.Info("BSL's storage config has changed, invalid backup repositories")
@@ -491,6 +506,12 @@ func (r *BackupRepoReconciler) runMaintenanceIfDue(ctx context.Context, req *vel
job, err := funcStartMaintenanceJob(r.Client, ctx, req, r.repoMaintenanceConfig, r.logLevel, r.logFormat, log)
if err != nil {
log.WithError(err).Warn("Starting repo maintenance failed")
// Record failure metric when job fails to start
if r.metrics != nil {
r.metrics.RegisterRepoMaintenanceFailure(req.Name)
}
return r.patchBackupRepository(ctx, req, func(rr *velerov1api.BackupRepository) {
updateRepoMaintenanceHistory(rr, velerov1api.BackupRepositoryMaintenanceFailed, &metav1.Time{Time: startTime}, nil, fmt.Sprintf("Failed to start maintenance job, err: %v", err))
})
@@ -505,11 +526,30 @@ func (r *BackupRepoReconciler) runMaintenanceIfDue(ctx context.Context, req *vel
if status.Result == velerov1api.BackupRepositoryMaintenanceFailed {
log.WithError(err).Warn("Pruning repository failed")
// Record failure metric
if r.metrics != nil {
r.metrics.RegisterRepoMaintenanceFailure(req.Name)
if status.StartTimestamp != nil && status.CompleteTimestamp != nil {
duration := status.CompleteTimestamp.Sub(status.StartTimestamp.Time).Seconds()
r.metrics.ObserveRepoMaintenanceDuration(req.Name, duration)
}
}
return r.patchBackupRepository(ctx, req, func(rr *velerov1api.BackupRepository) {
updateRepoMaintenanceHistory(rr, velerov1api.BackupRepositoryMaintenanceFailed, status.StartTimestamp, status.CompleteTimestamp, status.Message)
})
}
// Record success metric
if r.metrics != nil {
r.metrics.RegisterRepoMaintenanceSuccess(req.Name)
if status.StartTimestamp != nil && status.CompleteTimestamp != nil {
duration := status.CompleteTimestamp.Sub(status.StartTimestamp.Time).Seconds()
r.metrics.ObserveRepoMaintenanceDuration(req.Name, duration)
}
}
return r.patchBackupRepository(ctx, req, func(rr *velerov1api.BackupRepository) {
rr.Status.LastMaintenanceTime = &metav1.Time{Time: status.CompleteTimestamp.Time}
updateRepoMaintenanceHistory(rr, velerov1api.BackupRepositoryMaintenanceSucceeded, status.StartTimestamp, status.CompleteTimestamp, status.Message)
@@ -19,6 +19,8 @@ import (
"testing"
"time"
"github.com/prometheus/client_golang/prometheus"
dto "github.com/prometheus/client_model/go"
"github.com/sirupsen/logrus"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/mock"
@@ -32,6 +34,7 @@ import (
velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
"github.com/vmware-tanzu/velero/pkg/builder"
"github.com/vmware-tanzu/velero/pkg/metrics"
"github.com/vmware-tanzu/velero/pkg/repository"
"github.com/vmware-tanzu/velero/pkg/repository/maintenance"
repomaintenance "github.com/vmware-tanzu/velero/pkg/repository/maintenance"
@@ -65,6 +68,7 @@ func mockBackupRepoReconciler(t *testing.T, mockOn string, arg any, ret ...any)
"",
logrus.InfoLevel,
nil,
nil,
)
}
@@ -584,6 +588,7 @@ func TestGetRepositoryMaintenanceFrequency(t *testing.T) {
"",
logrus.InfoLevel,
nil,
nil,
)
freq := reconciler.getRepositoryMaintenanceFrequency(test.repo)
@@ -716,6 +721,7 @@ func TestNeedInvalidBackupRepo(t *testing.T) {
"",
logrus.InfoLevel,
nil,
nil,
)
need := reconciler.needInvalidBackupRepo(test.oldBSL, test.newBSL)
@@ -1581,6 +1587,7 @@ func TestDeleteOldMaintenanceJobWithConfigMap(t *testing.T) {
repoMaintenanceConfigName,
logrus.InfoLevel,
nil,
nil,
)
_, err := reconciler.Reconcile(t.Context(), ctrl.Request{NamespacedName: types.NamespacedName{Namespace: test.repo.Namespace, Name: "repo"}})
@@ -1638,6 +1645,7 @@ func TestInitializeRepoWithRepositoryTypes(t *testing.T) {
"",
logrus.InfoLevel,
nil,
nil,
)
err := reconciler.initializeRepo(t.Context(), rr, location, reconciler.logger)
@@ -1689,6 +1697,7 @@ func TestInitializeRepoWithRepositoryTypes(t *testing.T) {
"",
logrus.InfoLevel,
nil,
nil,
)
err := reconciler.initializeRepo(t.Context(), rr, location, reconciler.logger)
@@ -1739,6 +1748,7 @@ func TestInitializeRepoWithRepositoryTypes(t *testing.T) {
"",
logrus.InfoLevel,
nil,
nil,
)
err := reconciler.initializeRepo(t.Context(), rr, location, reconciler.logger)
@@ -1750,3 +1760,189 @@ func TestInitializeRepoWithRepositoryTypes(t *testing.T) {
assert.Equal(t, velerov1api.BackupRepositoryPhaseReady, rr.Status.Phase)
})
}
func TestRepoMaintenanceMetricsRecording(t *testing.T) {
now := time.Now().Round(time.Second)
tests := []struct {
name string
repo *velerov1api.BackupRepository
startJobFunc func(client.Client, context.Context, *velerov1api.BackupRepository, string, logrus.Level, *logging.FormatFlag, logrus.FieldLogger) (string, error)
waitJobFunc func(client.Client, context.Context, string, string, logrus.FieldLogger) (velerov1api.BackupRepositoryMaintenanceStatus, error)
expectSuccess bool
expectFailure bool
expectDuration bool
}{
{
name: "metrics recorded on successful maintenance",
repo: &velerov1api.BackupRepository{
ObjectMeta: metav1.ObjectMeta{
Namespace: velerov1api.DefaultNamespace,
Name: "test-repo-success",
},
Spec: velerov1api.BackupRepositorySpec{
MaintenanceFrequency: metav1.Duration{Duration: time.Hour},
},
Status: velerov1api.BackupRepositoryStatus{
LastMaintenanceTime: &metav1.Time{Time: now.Add(-2 * time.Hour)},
},
},
startJobFunc: startMaintenanceJobSucceed,
waitJobFunc: waitMaintenanceJobCompleteFunc(now, velerov1api.BackupRepositoryMaintenanceSucceeded, ""),
expectSuccess: true,
expectFailure: false,
expectDuration: true,
},
{
name: "metrics recorded on failed maintenance",
repo: &velerov1api.BackupRepository{
ObjectMeta: metav1.ObjectMeta{
Namespace: velerov1api.DefaultNamespace,
Name: "test-repo-failure",
},
Spec: velerov1api.BackupRepositorySpec{
MaintenanceFrequency: metav1.Duration{Duration: time.Hour},
},
Status: velerov1api.BackupRepositoryStatus{
LastMaintenanceTime: &metav1.Time{Time: now.Add(-2 * time.Hour)},
},
},
startJobFunc: startMaintenanceJobSucceed,
waitJobFunc: func(client.Client, context.Context, string, string, logrus.FieldLogger) (velerov1api.BackupRepositoryMaintenanceStatus, error) {
return velerov1api.BackupRepositoryMaintenanceStatus{
StartTimestamp: &metav1.Time{Time: now},
CompleteTimestamp: &metav1.Time{Time: now.Add(time.Minute)}, // Job ran for 1 minute then failed
Result: velerov1api.BackupRepositoryMaintenanceFailed,
Message: "test error",
}, nil
},
expectSuccess: false,
expectFailure: true,
expectDuration: true,
},
{
name: "metrics recorded on job start failure",
repo: &velerov1api.BackupRepository{
ObjectMeta: metav1.ObjectMeta{
Namespace: velerov1api.DefaultNamespace,
Name: "test-repo-start-fail",
},
Spec: velerov1api.BackupRepositorySpec{
MaintenanceFrequency: metav1.Duration{Duration: time.Hour},
},
Status: velerov1api.BackupRepositoryStatus{
LastMaintenanceTime: &metav1.Time{Time: now.Add(-2 * time.Hour)},
},
},
startJobFunc: startMaintenanceJobFail,
expectSuccess: false,
expectFailure: true,
expectDuration: false, // No duration when job fails to start
},
}
for _, test := range tests {
t.Run(test.name, func(t *testing.T) {
// Create metrics instance
m := metrics.NewServerMetrics()
// Create reconciler with metrics
reconciler := mockBackupRepoReconciler(t, "", test.repo, nil)
reconciler.metrics = m
reconciler.clock = &fakeClock{now}
err := reconciler.Client.Create(t.Context(), test.repo)
require.NoError(t, err)
// Set up job functions
funcStartMaintenanceJob = test.startJobFunc
funcWaitMaintenanceJobComplete = test.waitJobFunc
// Run maintenance
_ = reconciler.runMaintenanceIfDue(t.Context(), test.repo, velerotest.NewLogger())
// Verify metrics were recorded
successCount := getMaintenanceMetricValue(t, m, "repo_maintenance_success_total", test.repo.Name)
failureCount := getMaintenanceMetricValue(t, m, "repo_maintenance_failure_total", test.repo.Name)
durationCount := getMaintenanceDurationCount(t, m, test.repo.Name)
if test.expectSuccess {
assert.Equal(t, float64(1), successCount, "Success metric should be recorded")
} else {
assert.Equal(t, float64(0), successCount, "Success metric should not be recorded")
}
if test.expectFailure {
assert.Equal(t, float64(1), failureCount, "Failure metric should be recorded")
} else {
assert.Equal(t, float64(0), failureCount, "Failure metric should not be recorded")
}
if test.expectDuration {
assert.Equal(t, uint64(1), durationCount, "Duration metric should be recorded")
} else {
assert.Equal(t, uint64(0), durationCount, "Duration metric should not be recorded")
}
})
}
}
// Helper to get maintenance metric value from ServerMetrics
func getMaintenanceMetricValue(t *testing.T, m *metrics.ServerMetrics, metricName, repoName string) float64 {
t.Helper()
metricMap := m.Metrics()
collector, ok := metricMap[metricName]
if !ok {
return 0
}
ch := make(chan prometheus.Metric, 1)
collector.Collect(ch)
close(ch)
for metric := range ch {
dto := &dto.Metric{}
err := metric.Write(dto)
require.NoError(t, err)
for _, label := range dto.Label {
if *label.Name == "repository_name" && *label.Value == repoName {
if dto.Counter != nil {
return *dto.Counter.Value
}
}
}
}
return 0
}
// Helper to get maintenance duration histogram count
func getMaintenanceDurationCount(t *testing.T, m *metrics.ServerMetrics, repoName string) uint64 {
t.Helper()
metricMap := m.Metrics()
collector, ok := metricMap["repo_maintenance_duration_seconds"]
if !ok {
return 0
}
ch := make(chan prometheus.Metric, 1)
collector.Collect(ch)
close(ch)
for metric := range ch {
dto := &dto.Metric{}
err := metric.Write(dto)
require.NoError(t, err)
for _, label := range dto.Label {
if *label.Name == "repository_name" && *label.Value == repoName {
if dto.Histogram != nil {
return *dto.Histogram.SampleCount
}
}
}
}
return 0
}
@@ -18,6 +18,7 @@ package controller
import (
"context"
"regexp"
"strings"
"time"
@@ -46,6 +47,104 @@ const (
bslValidationEnqueuePeriod = 10 * time.Second
)
// sanitizeStorageError cleans up verbose HTTP responses from cloud provider errors,
// particularly Azure which includes full HTTP response details and XML in error messages.
// It extracts the error code and message while removing HTTP headers and response bodies.
// It also scrubs sensitive information like SAS tokens from URLs.
func sanitizeStorageError(err error) string {
if err == nil {
return ""
}
errMsg := err.Error()
// Scrub sensitive information from URLs (SAS tokens, credentials, etc.)
// Azure SAS token parameters: sig, se, st, sp, spr, sv, sr, sip, srt, ss
// These appear as query parameters in URLs like: ?sig=value&se=value
sasParamsRegex := regexp.MustCompile(`([?&])(sig|se|st|sp|spr|sv|sr|sip|srt|ss)=([^&\s<>\n]+)`)
errMsg = sasParamsRegex.ReplaceAllString(errMsg, `${1}${2}=***REDACTED***`)
// Check if this looks like an Azure HTTP response error
// Azure errors contain patterns like "RESPONSE 404:" and "ERROR CODE:"
if !strings.Contains(errMsg, "RESPONSE") || !strings.Contains(errMsg, "ERROR CODE:") {
// Not an Azure-style error, return as-is
return errMsg
}
// Extract the error code (e.g., "ContainerNotFound", "BlobNotFound")
errorCodeRegex := regexp.MustCompile(`ERROR CODE:\s*(\w+)`)
errorCodeMatch := errorCodeRegex.FindStringSubmatch(errMsg)
var errorCode string
if len(errorCodeMatch) > 1 {
errorCode = errorCodeMatch[1]
}
// Extract the error message from the XML or plain text
// Look for message between <Message> tags or after "RESPONSE XXX:"
var errorMessage string
// Try to extract from XML first
messageRegex := regexp.MustCompile(`<Message>(.*?)</Message>`)
messageMatch := messageRegex.FindStringSubmatch(errMsg)
if len(messageMatch) > 1 {
errorMessage = messageMatch[1]
// Remove RequestId and Time from the message
if idx := strings.Index(errorMessage, "\nRequestId:"); idx != -1 {
errorMessage = errorMessage[:idx]
}
} else {
// Try to extract from plain text response (e.g., "RESPONSE 404: 404 The specified container does not exist.")
responseRegex := regexp.MustCompile(`RESPONSE\s+\d+:\s+\d+\s+([^\n]+)`)
responseMatch := responseRegex.FindStringSubmatch(errMsg)
if len(responseMatch) > 1 {
errorMessage = strings.TrimSpace(responseMatch[1])
}
}
// Build a clean error message
var cleanMsg string
if errorCode != "" && errorMessage != "" {
cleanMsg = errorCode + ": " + errorMessage
} else if errorCode != "" {
cleanMsg = errorCode
} else if errorMessage != "" {
cleanMsg = errorMessage
} else {
// Fallback: try to extract the desc part from gRPC error
descRegex := regexp.MustCompile(`desc\s*=\s*(.+)`)
descMatch := descRegex.FindStringSubmatch(errMsg)
if len(descMatch) > 1 {
// Take everything up to the first newline or "RESPONSE" marker
desc := descMatch[1]
if idx := strings.Index(desc, "\n"); idx != -1 {
desc = desc[:idx]
}
if idx := strings.Index(desc, "RESPONSE"); idx != -1 {
desc = strings.TrimSpace(desc[:idx])
}
cleanMsg = desc
} else {
// Last resort: return first line
if idx := strings.Index(errMsg, "\n"); idx != -1 {
cleanMsg = errMsg[:idx]
} else {
cleanMsg = errMsg
}
}
}
// Preserve the prefix part of the error (e.g., "rpc error: code = Unknown desc = ")
// but replace the verbose description with our clean message
if strings.Contains(errMsg, "desc = ") {
parts := strings.SplitN(errMsg, "desc = ", 2)
if len(parts) == 2 {
return parts[0] + "desc = " + cleanMsg
}
}
return cleanMsg
}
// BackupStorageLocationReconciler reconciles a BackupStorageLocation object
type backupStorageLocationReconciler struct {
ctx context.Context
@@ -125,9 +224,9 @@ func (r *backupStorageLocationReconciler) Reconcile(ctx context.Context, req ctr
if err != nil {
log.Info("BackupStorageLocation is invalid, marking as unavailable")
err = errors.Wrapf(err, "BackupStorageLocation %q is unavailable", location.Name)
unavailableErrors = append(unavailableErrors, err.Error())
unavailableErrors = append(unavailableErrors, sanitizeStorageError(err))
location.Status.Phase = velerov1api.BackupStorageLocationPhaseUnavailable
location.Status.Message = err.Error()
location.Status.Message = sanitizeStorageError(err)
} else {
log.Info("BackupStorageLocations is valid, marking as available")
location.Status.Phase = velerov1api.BackupStorageLocationPhaseAvailable
@@ -138,6 +237,12 @@ func (r *backupStorageLocationReconciler) Reconcile(ctx context.Context, req ctr
}
}()
// Validate the BackupStorageLocation spec
if err = location.Validate(); err != nil {
log.WithError(err).Error("BackupStorageLocation spec is invalid")
return
}
backupStore, err := r.backupStoreGetter.Get(&location, pluginManager, log)
if err != nil {
log.WithError(err).Error("Error getting a backup store")
@@ -303,3 +303,115 @@ func TestBSLReconcile(t *testing.T) {
})
}
}
func TestSanitizeStorageError(t *testing.T) {
tests := []struct {
name string
input error
expected string
}{
{
name: "Nil error",
input: nil,
expected: "",
},
{
name: "Simple error without Azure formatting",
input: errors.New("simple error message"),
expected: "simple error message",
},
{
name: "AWS style error",
input: errors.New("NoSuchBucket: The specified bucket does not exist"),
expected: "NoSuchBucket: The specified bucket does not exist",
},
{
name: "Azure container not found error with full HTTP response",
input: errors.New(`rpc error: code = Unknown desc = GET https://oadp100711zl59k.blob.core.windows.net/oadp100711zl59k1
--------------------------------------------------------------------------------
RESPONSE 404: 404 The specified container does not exist.
ERROR CODE: ContainerNotFound
--------------------------------------------------------------------------------
<?xml version="1.0" encoding="utf-8"?><Error><Code>ContainerNotFound</Code><Message>The specified container does not exist.
RequestId:63cf34d8-801e-0078-09b4-2e4682000000
Time:2024-11-04T12:23:04.5623627Z</Message></Error>
--------------------------------------------------------------------------------
`),
expected: "rpc error: code = Unknown desc = ContainerNotFound: The specified container does not exist.",
},
{
name: "Azure blob not found error",
input: errors.New(`rpc error: code = Unknown desc = GET https://storage.blob.core.windows.net/container/blob
--------------------------------------------------------------------------------
RESPONSE 404: 404 The specified blob does not exist.
ERROR CODE: BlobNotFound
--------------------------------------------------------------------------------
<?xml version="1.0" encoding="utf-8"?><Error><Code>BlobNotFound</Code><Message>The specified blob does not exist.
RequestId:12345678-1234-1234-1234-123456789012
Time:2024-11-04T12:23:04.5623627Z</Message></Error>
--------------------------------------------------------------------------------
`),
expected: "rpc error: code = Unknown desc = BlobNotFound: The specified blob does not exist.",
},
{
name: "Azure error with plain text response (no XML)",
input: errors.New(`rpc error: code = Unknown desc = GET https://storage.blob.core.windows.net/container
--------------------------------------------------------------------------------
RESPONSE 404: 404 The specified container does not exist.
ERROR CODE: ContainerNotFound
--------------------------------------------------------------------------------
`),
expected: "rpc error: code = Unknown desc = ContainerNotFound: The specified container does not exist.",
},
{
name: "Azure error without XML message but with error code",
input: errors.New(`rpc error: code = Unknown desc = operation failed
RESPONSE 403: 403 Forbidden
ERROR CODE: AuthorizationFailure
--------------------------------------------------------------------------------
`),
expected: "rpc error: code = Unknown desc = AuthorizationFailure: Forbidden",
},
{
name: "Error with Azure SAS token in URL",
input: errors.New(`rpc error: code = Unknown desc = GET https://storage.blob.core.windows.net/backup?sv=2020-08-04&sig=abc123secrettoken&se=2024-12-31T23:59:59Z&sp=rwdl
--------------------------------------------------------------------------------
RESPONSE 404: 404 The specified container does not exist.
ERROR CODE: ContainerNotFound
--------------------------------------------------------------------------------
`),
expected: "rpc error: code = Unknown desc = ContainerNotFound: The specified container does not exist.",
},
{
name: "Error with multiple SAS parameters",
input: errors.New(`GET https://mystorageaccount.blob.core.windows.net/container?sv=2020-08-04&ss=b&srt=sco&sp=rwdlac&se=2024-12-31&st=2024-01-01&sip=168.1.5.60&spr=https&sig=SIGNATURE_HASH`),
expected: "GET https://mystorageaccount.blob.core.windows.net/container?sv=***REDACTED***&ss=***REDACTED***&srt=***REDACTED***&sp=***REDACTED***&se=***REDACTED***&st=***REDACTED***&sip=***REDACTED***&spr=***REDACTED***&sig=***REDACTED***",
},
{
name: "Simple URL without SAS tokens unchanged",
input: errors.New("GET https://storage.blob.core.windows.net/container/blob"),
expected: "GET https://storage.blob.core.windows.net/container/blob",
},
{
name: "Azure error with SAS token in full HTTP response",
input: errors.New(`rpc error: code = Unknown desc = GET https://oadp100711zl59k.blob.core.windows.net/backup?sig=secretsignature123&se=2024-12-31
--------------------------------------------------------------------------------
RESPONSE 404: 404 The specified container does not exist.
ERROR CODE: ContainerNotFound
--------------------------------------------------------------------------------
<?xml version="1.0" encoding="utf-8"?><Error><Code>ContainerNotFound</Code><Message>The specified container does not exist.
RequestId:63cf34d8-801e-0078-09b4-2e4682000000
Time:2024-11-04T12:23:04.5623627Z</Message></Error>
--------------------------------------------------------------------------------
`),
expected: "rpc error: code = Unknown desc = ContainerNotFound: The specified container does not exist.",
},
}
for _, test := range tests {
t.Run(test.name, func(t *testing.T) {
actual := sanitizeStorageError(test.input)
assert.Equal(t, test.expected, actual)
})
}
}
-57
View File
@@ -21,7 +21,6 @@ import (
"fmt"
"time"
snapshotv1api "github.com/kubernetes-csi/external-snapshotter/client/v8/apis/volumesnapshot/v1"
"github.com/pkg/errors"
"github.com/sirupsen/logrus"
apierrors "k8s.io/apimachinery/pkg/api/errors"
@@ -36,7 +35,6 @@ import (
velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
"github.com/vmware-tanzu/velero/pkg/constant"
"github.com/vmware-tanzu/velero/pkg/features"
"github.com/vmware-tanzu/velero/pkg/label"
"github.com/vmware-tanzu/velero/pkg/persistence"
"github.com/vmware-tanzu/velero/pkg/plugin/clientmgmt"
@@ -243,31 +241,6 @@ func (b *backupSyncReconciler) Reconcile(ctx context.Context, req ctrl.Request)
log.Debug("Synced pod volume backup into cluster")
}
}
if features.IsEnabled(velerov1api.CSIFeatureFlag) {
// we are syncing these objects only to ensure that the storage snapshots are cleaned up
// on backup deletion or expiry.
log.Info("Syncing CSI VolumeSnapshotClasses in backup")
vsClasses, err := backupStore.GetCSIVolumeSnapshotClasses(backupName)
if err != nil {
log.WithError(errors.WithStack(err)).Error("Error getting CSI VolumeSnapClasses for this backup from backup store")
continue
}
for _, vsClass := range vsClasses {
vsClass.ResourceVersion = ""
err := b.client.Create(ctx, vsClass, &client.CreateOptions{})
switch {
case err != nil && apierrors.IsAlreadyExists(err):
log.Debugf("VolumeSnapshotClass %s already exists in cluster", vsClass.Name)
continue
case err != nil && !apierrors.IsAlreadyExists(err):
log.WithError(errors.WithStack(err)).Errorf("Error syncing VolumeSnapshotClass %s into cluster", vsClass.Name)
continue
default:
log.Infof("Created CSI VolumeSnapshotClass %s", vsClass.Name)
}
}
}
}
b.deleteOrphanedBackups(ctx, location.Name, backupStoreBackups, log)
@@ -364,40 +337,10 @@ func (b *backupSyncReconciler) deleteOrphanedBackups(ctx context.Context, locati
if err := b.client.Delete(ctx, &backupList.Items[i], &client.DeleteOptions{}); err != nil {
log.WithError(errors.WithStack(err)).Error("Error deleting orphaned backup from cluster")
} else {
log.Debug("Deleted orphaned backup from cluster")
b.deleteCSISnapshotsByBackup(ctx, backup.Name, log)
}
}
}
func (b *backupSyncReconciler) deleteCSISnapshotsByBackup(ctx context.Context, backupName string, log logrus.FieldLogger) {
if !features.IsEnabled(velerov1api.CSIFeatureFlag) {
return
}
m := client.MatchingLabels{velerov1api.BackupNameLabel: label.GetValidName(backupName)}
var vsList snapshotv1api.VolumeSnapshotList
listOptions := &client.ListOptions{
LabelSelector: label.NewSelectorForBackup(label.GetValidName(backupName)),
}
if err := b.client.List(ctx, &vsList, listOptions); err != nil {
log.WithError(err).Warnf("Failed to list volumesnapshots for backup: %s, the deletion will be skipped", backupName)
} else {
for i, vs := range vsList.Items {
name := kube.NamespaceAndName(vs.GetObjectMeta())
log.Debugf("Deleting volumesnapshot %s", name)
if err := b.client.Delete(context.TODO(), &vsList.Items[i]); err != nil {
log.WithError(err).Warnf("Failed to delete volumesnapshot %s", name)
}
}
}
vsc := &snapshotv1api.VolumeSnapshotContent{}
log.Debugf("Deleting volumesnapshotcontents for backup: %s", backupName)
if err := b.client.DeleteAllOf(context.TODO(), vsc, m); err != nil {
log.WithError(err).Warnf("Failed to delete volumesnapshotcontents for backup: %s", backupName)
}
}
// backupSyncSourceOrderFunc returns a new slice with the default backup location first (if it exists),
// followed by the rest of the locations in no particular order.
func backupSyncSourceOrderFunc(objList client.ObjectList) client.ObjectList {
@@ -24,7 +24,6 @@ import (
. "github.com/onsi/ginkgo/v2"
. "github.com/onsi/gomega"
snapshotv1api "github.com/kubernetes-csi/external-snapshotter/client/v8/apis/volumesnapshot/v1"
"github.com/sirupsen/logrus"
apierrors "k8s.io/apimachinery/pkg/api/errors"
"k8s.io/apimachinery/pkg/api/meta"
@@ -451,8 +450,6 @@ var _ = Describe("Backup Sync Reconciler", func() {
backupStore.On("GetBackupMetadata", backup.backup.Name).Return(backup.backup, nil)
backupStore.On("GetPodVolumeBackups", backup.backup.Name).Return(backup.podVolumeBackups, nil)
backupStore.On("BackupExists", "bucket-1", backup.backup.Name).Return(true, nil)
backupStore.On("GetCSIVolumeSnapshotClasses", backup.backup.Name).Return([]*snapshotv1api.VolumeSnapshotClass{}, nil)
backupStore.On("GetCSIVolumeSnapshotContents", backup.backup.Name).Return([]*snapshotv1api.VolumeSnapshotContent{}, nil)
}
backupStore.On("ListBackups").Return(backupNames, nil)
}
+53 -8
View File
@@ -25,45 +25,90 @@ import (
// BackupTracker keeps track of in-progress backups.
type BackupTracker interface {
// Add informs the tracker that a backup is ReadyToStart.
AddReadyToStart(ns, name string)
// Add informs the tracker that a backup is in progress.
Add(ns, name string)
// Delete informs the tracker that a backup is no longer in progress.
// Add informs the tracker that a backup has moved beyond InProgress
AddPostProcessing(ns, name string)
// Delete informs the tracker that a backup has reached a terminal state.
Delete(ns, name string)
// Contains returns true if the tracker is tracking the backup.
// Contains returns true if backup is InProgress or post-InProgress
Contains(ns, name string) bool
// RunningCount returns the number of backups which are ReadyToStart or InProgress
RunningCount() int
}
type backupTracker struct {
lock sync.RWMutex
backups sets.Set[string]
lock sync.RWMutex
readyToStartBackups sets.Set[string]
inProgressBackups sets.Set[string]
postProgressBackups sets.Set[string]
}
// NewBackupTracker returns a new BackupTracker.
func NewBackupTracker() BackupTracker {
return &backupTracker{
backups: sets.New[string](),
readyToStartBackups: sets.New[string](),
inProgressBackups: sets.New[string](),
postProgressBackups: sets.New[string](),
}
}
func (bt *backupTracker) AddReadyToStart(ns, name string) {
bt.lock.Lock()
defer bt.lock.Unlock()
bt.readyToStartBackups.Insert(backupTrackerKey(ns, name))
}
func (bt *backupTracker) Add(ns, name string) {
bt.lock.Lock()
defer bt.lock.Unlock()
bt.backups.Insert(backupTrackerKey(ns, name))
key := backupTrackerKey(ns, name)
bt.readyToStartBackups.Delete(key)
bt.inProgressBackups.Insert(key)
}
func (bt *backupTracker) AddPostProcessing(ns, name string) {
bt.lock.Lock()
defer bt.lock.Unlock()
key := backupTrackerKey(ns, name)
bt.readyToStartBackups.Delete(key)
bt.inProgressBackups.Delete(key)
bt.postProgressBackups.Insert(key)
}
func (bt *backupTracker) Delete(ns, name string) {
bt.lock.Lock()
defer bt.lock.Unlock()
bt.backups.Delete(backupTrackerKey(ns, name))
key := backupTrackerKey(ns, name)
bt.readyToStartBackups.Delete(key)
bt.inProgressBackups.Delete(key)
bt.postProgressBackups.Delete(key)
}
// Contains returns true if backup is InProgress or post-InProgress
// ignores ReadyToStart, since this is used to determine whether
// a backup is in progress and thus not able to be deleted now.
func (bt *backupTracker) Contains(ns, name string) bool {
bt.lock.RLock()
defer bt.lock.RUnlock()
return bt.backups.Has(backupTrackerKey(ns, name))
key := backupTrackerKey(ns, name)
return bt.inProgressBackups.Has(key) || bt.postProgressBackups.Has(key)
}
// RunningCount returns the number of backups which are ReadyToStart or InProgress
// used by queue controller to determine whether a new backup can be started.
func (bt *backupTracker) RunningCount() int {
bt.lock.RLock()
defer bt.lock.RUnlock()
return bt.inProgressBackups.Len() + bt.readyToStartBackups.Len()
}
func backupTrackerKey(ns, name string) string {
+30 -12
View File
@@ -77,6 +77,8 @@ type DataDownloadReconciler struct {
cancelledDataDownload map[string]time.Time
dataMovePriorityClass string
repoConfigMgr repository.ConfigManager
podLabels map[string]string
podAnnotations map[string]string
}
func NewDataDownloadReconciler(
@@ -96,6 +98,8 @@ func NewDataDownloadReconciler(
metrics *metrics.ServerMetrics,
dataMovePriorityClass string,
repoConfigMgr repository.ConfigManager,
podLabels map[string]string,
podAnnotations map[string]string,
) *DataDownloadReconciler {
return &DataDownloadReconciler{
client: client,
@@ -117,6 +121,8 @@ func NewDataDownloadReconciler(
cancelledDataDownload: make(map[string]time.Time),
dataMovePriorityClass: dataMovePriorityClass,
repoConfigMgr: repoConfigMgr,
podLabels: podLabels,
podAnnotations: podAnnotations,
}
}
@@ -860,25 +866,37 @@ func (r *DataDownloadReconciler) setupExposeParam(dd *velerov2alpha1api.DataDown
}
hostingPodLabels := map[string]string{velerov1api.DataDownloadLabel: dd.Name}
for _, k := range util.ThirdPartyLabels {
if v, err := nodeagent.GetLabelValue(context.Background(), r.kubeClient, dd.Namespace, k, nodeOS); err != nil {
if err != nodeagent.ErrNodeAgentLabelNotFound {
log.WithError(err).Warnf("Failed to check node-agent label, skip adding host pod label %s", k)
}
} else {
if len(r.podLabels) > 0 {
for k, v := range r.podLabels {
hostingPodLabels[k] = v
}
} else {
for _, k := range util.ThirdPartyLabels {
if v, err := nodeagent.GetLabelValue(context.Background(), r.kubeClient, dd.Namespace, k, nodeOS); err != nil {
if err != nodeagent.ErrNodeAgentLabelNotFound {
log.WithError(err).Warnf("Failed to check node-agent label, skip adding host pod label %s", k)
}
} else {
hostingPodLabels[k] = v
}
}
}
hostingPodAnnotation := map[string]string{}
for _, k := range util.ThirdPartyAnnotations {
if v, err := nodeagent.GetAnnotationValue(context.Background(), r.kubeClient, dd.Namespace, k, nodeOS); err != nil {
if err != nodeagent.ErrNodeAgentAnnotationNotFound {
log.WithError(err).Warnf("Failed to check node-agent annotation, skip adding host pod annotation %s", k)
}
} else {
if len(r.podAnnotations) > 0 {
for k, v := range r.podAnnotations {
hostingPodAnnotation[k] = v
}
} else {
for _, k := range util.ThirdPartyAnnotations {
if v, err := nodeagent.GetAnnotationValue(context.Background(), r.kubeClient, dd.Namespace, k, nodeOS); err != nil {
if err != nodeagent.ErrNodeAgentAnnotationNotFound {
log.WithError(err).Warnf("Failed to check node-agent annotation, skip adding host pod annotation %s", k)
}
} else {
hostingPodAnnotation[k] = v
}
}
}
hostingPodTolerations := []corev1api.Toleration{}
+144 -1
View File
@@ -129,7 +129,26 @@ func initDataDownloadReconcilerWithError(t *testing.T, objects []any, needError
dataPathMgr := datapath.NewManager(1)
return NewDataDownloadReconciler(&fakeClient, nil, fakeKubeClient, dataPathMgr, nil, nil, velerotypes.RestorePVC{}, nil, nil, corev1api.ResourceRequirements{}, "test-node", time.Minute*5, velerotest.NewLogger(), metrics.NewServerMetrics(), "", nil), nil
return NewDataDownloadReconciler(
&fakeClient,
nil,
fakeKubeClient,
dataPathMgr,
nil,
nil,
velerotypes.RestorePVC{},
nil,
nil,
corev1api.ResourceRequirements{},
"test-node",
time.Minute*5,
velerotest.NewLogger(),
metrics.NewServerMetrics(),
"",
nil,
nil, // podLabels
nil, // podAnnotations
), nil
}
func TestDataDownloadReconcile(t *testing.T) {
@@ -1292,3 +1311,127 @@ func TestResumeCancellableRestore(t *testing.T) {
})
}
}
func TestDataDownloadSetupExposeParam(t *testing.T) {
// Common objects for all cases
node := builder.ForNode("worker-1").Labels(map[string]string{kube.NodeOSLabel: kube.NodeOSLinux}).Result()
baseDataDownload := dataDownloadBuilder().Result()
baseDataDownload.Namespace = velerov1api.DefaultNamespace
baseDataDownload.Spec.OperationTimeout = metav1.Duration{Duration: time.Minute * 10}
baseDataDownload.Spec.SnapshotSize = 5368709120 // 5Gi
type args struct {
customLabels map[string]string
customAnnotations map[string]string
}
type want struct {
labels map[string]string
annotations map[string]string
}
tests := []struct {
name string
args args
want want
}{
{
name: "label has customize values",
args: args{
customLabels: map[string]string{"custom-label": "label-value"},
customAnnotations: nil,
},
want: want{
labels: map[string]string{
velerov1api.DataDownloadLabel: baseDataDownload.Name,
"custom-label": "label-value",
},
annotations: map[string]string{},
},
},
{
name: "label has no customize values",
args: args{
customLabels: nil,
customAnnotations: nil,
},
want: want{
labels: map[string]string{velerov1api.DataDownloadLabel: baseDataDownload.Name},
annotations: map[string]string{},
},
},
{
name: "annotation has customize values",
args: args{
customLabels: nil,
customAnnotations: map[string]string{"custom-annotation": "annotation-value"},
},
want: want{
labels: map[string]string{velerov1api.DataDownloadLabel: baseDataDownload.Name},
annotations: map[string]string{"custom-annotation": "annotation-value"},
},
},
{
name: "both label and annotation have customize values",
args: args{
customLabels: map[string]string{"custom-label": "label-value"},
customAnnotations: map[string]string{"custom-annotation": "annotation-value"},
},
want: want{
labels: map[string]string{
velerov1api.DataDownloadLabel: baseDataDownload.Name,
"custom-label": "label-value",
},
annotations: map[string]string{"custom-annotation": "annotation-value"},
},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
// Fake clients per case
fakeClient := FakeClient{
Client: velerotest.NewFakeControllerRuntimeClient(t, node, baseDataDownload.DeepCopy()),
}
fakeKubeClient := clientgofake.NewSimpleClientset(node)
// Reconciler config per case
preparingTimeout := time.Minute * 3
podRes := corev1api.ResourceRequirements{}
r := NewDataDownloadReconciler(
&fakeClient,
nil,
fakeKubeClient,
datapath.NewManager(1),
nil,
nil,
velerotypes.RestorePVC{},
nil,
nil,
podRes,
"test-node",
preparingTimeout,
velerotest.NewLogger(),
metrics.NewServerMetrics(),
"download-priority",
nil, // repoConfigMgr (unused when cacheVolumeConfigs is nil)
tt.args.customLabels,
tt.args.customAnnotations,
)
// Act
got, err := r.setupExposeParam(baseDataDownload)
// Assert no error
require.NoError(t, err)
// Core fields
assert.Equal(t, baseDataDownload.Spec.TargetVolume.PVC, got.TargetPVCName)
assert.Equal(t, baseDataDownload.Spec.TargetVolume.Namespace, got.TargetNamespace)
// Labels and Annotations
assert.Equal(t, tt.want.labels, got.HostingPodLabels)
assert.Equal(t, tt.want.annotations, got.HostingPodAnnotations)
})
}
}
+30 -12
View File
@@ -83,6 +83,8 @@ type DataUploadReconciler struct {
metrics *metrics.ServerMetrics
cancelledDataUpload map[string]time.Time
dataMovePriorityClass string
podLabels map[string]string
podAnnotations map[string]string
}
func NewDataUploadReconciler(
@@ -101,6 +103,8 @@ func NewDataUploadReconciler(
log logrus.FieldLogger,
metrics *metrics.ServerMetrics,
dataMovePriorityClass string,
podLabels map[string]string,
podAnnotations map[string]string,
) *DataUploadReconciler {
return &DataUploadReconciler{
client: client,
@@ -126,6 +130,8 @@ func NewDataUploadReconciler(
metrics: metrics,
cancelledDataUpload: make(map[string]time.Time),
dataMovePriorityClass: dataMovePriorityClass,
podLabels: podLabels,
podAnnotations: podAnnotations,
}
}
@@ -936,25 +942,37 @@ func (r *DataUploadReconciler) setupExposeParam(du *velerov2alpha1api.DataUpload
}
hostingPodLabels := map[string]string{velerov1api.DataUploadLabel: du.Name}
for _, k := range util.ThirdPartyLabels {
if v, err := nodeagent.GetLabelValue(context.Background(), r.kubeClient, du.Namespace, k, nodeOS); err != nil {
if err != nodeagent.ErrNodeAgentLabelNotFound {
log.WithError(err).Warnf("Failed to check node-agent label, skip adding host pod label %s", k)
}
} else {
if len(r.podLabels) > 0 {
for k, v := range r.podLabels {
hostingPodLabels[k] = v
}
} else {
for _, k := range util.ThirdPartyLabels {
if v, err := nodeagent.GetLabelValue(context.Background(), r.kubeClient, du.Namespace, k, nodeOS); err != nil {
if err != nodeagent.ErrNodeAgentLabelNotFound {
log.WithError(err).Warnf("Failed to check node-agent label, skip adding host pod label %s", k)
}
} else {
hostingPodLabels[k] = v
}
}
}
hostingPodAnnotation := map[string]string{}
for _, k := range util.ThirdPartyAnnotations {
if v, err := nodeagent.GetAnnotationValue(context.Background(), r.kubeClient, du.Namespace, k, nodeOS); err != nil {
if err != nodeagent.ErrNodeAgentAnnotationNotFound {
log.WithError(err).Warnf("Failed to check node-agent annotation, skip adding host pod annotation %s", k)
}
} else {
if len(r.podAnnotations) > 0 {
for k, v := range r.podAnnotations {
hostingPodAnnotation[k] = v
}
} else {
for _, k := range util.ThirdPartyAnnotations {
if v, err := nodeagent.GetAnnotationValue(context.Background(), r.kubeClient, du.Namespace, k, nodeOS); err != nil {
if err != nodeagent.ErrNodeAgentAnnotationNotFound {
log.WithError(err).Warnf("Failed to check node-agent annotation, skip adding host pod annotation %s", k)
}
} else {
hostingPodAnnotation[k] = v
}
}
}
hostingPodTolerations := []corev1api.Toleration{}
+149 -1
View File
@@ -248,7 +248,9 @@ func initDataUploaderReconcilerWithError(needError ...error) (*DataUploadReconci
time.Minute*5,
velerotest.NewLogger(),
metrics.NewServerMetrics(),
"", // dataMovePriorityClass
"", // dataMovePriorityClass
nil, // podLabels
nil, // podAnnotations
), nil
}
@@ -1384,3 +1386,149 @@ func TestResumeCancellableBackup(t *testing.T) {
})
}
}
func TestDataUploadSetupExposeParam(t *testing.T) {
// Common objects for all cases
fileMode := corev1api.PersistentVolumeFilesystem
node := builder.ForNode("worker-1").Labels(map[string]string{kube.NodeOSLabel: kube.NodeOSLinux}).Result()
pvc := &corev1api.PersistentVolumeClaim{
ObjectMeta: metav1.ObjectMeta{
Namespace: "app-ns",
Name: "test-pvc",
},
Spec: corev1api.PersistentVolumeClaimSpec{
VolumeName: "test-pv",
VolumeMode: &fileMode,
Resources: corev1api.VolumeResourceRequirements{
Requests: corev1api.ResourceList{
corev1api.ResourceStorage: resource.MustParse("10Gi"),
},
},
},
}
pv := &corev1api.PersistentVolume{
ObjectMeta: metav1.ObjectMeta{
Name: "test-pv",
},
}
baseDataUpload := dataUploadBuilder().Result()
baseDataUpload.Spec.SourceNamespace = "app-ns"
baseDataUpload.Spec.SourcePVC = "test-pvc"
baseDataUpload.Namespace = velerov1api.DefaultNamespace
baseDataUpload.Spec.OperationTimeout = metav1.Duration{Duration: time.Minute * 10}
type args struct {
customLabels map[string]string
customAnnotations map[string]string
}
type want struct {
labels map[string]string
annotations map[string]string
}
tests := []struct {
name string
args args
want want
}{
{
name: "label has customize values",
args: args{
customLabels: map[string]string{"custom-label": "label-value"},
customAnnotations: nil,
},
want: want{
labels: map[string]string{
velerov1api.DataUploadLabel: baseDataUpload.Name,
"custom-label": "label-value",
},
annotations: map[string]string{},
},
},
{
name: "label has no customize values",
args: args{
customLabels: nil,
customAnnotations: nil,
},
want: want{
labels: map[string]string{velerov1api.DataUploadLabel: baseDataUpload.Name},
annotations: map[string]string{},
},
},
{
name: "annotation has customize values",
args: args{
customLabels: nil,
customAnnotations: map[string]string{"custom-annotation": "annotation-value"},
},
want: want{
labels: map[string]string{velerov1api.DataUploadLabel: baseDataUpload.Name},
annotations: map[string]string{"custom-annotation": "annotation-value"},
},
},
{
name: "both label and annotation have customize values",
args: args{
customLabels: map[string]string{"custom-label": "label-value"},
customAnnotations: map[string]string{"custom-annotation": "annotation-value"},
},
want: want{
labels: map[string]string{
velerov1api.DataUploadLabel: baseDataUpload.Name,
"custom-label": "label-value",
},
annotations: map[string]string{"custom-annotation": "annotation-value"},
},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
// Fake clients per case
fakeCRClient := velerotest.NewFakeControllerRuntimeClient(t, pvc, pv, node, baseDataUpload.DeepCopy())
fakeKubeClient := clientgofake.NewSimpleClientset(node)
// Reconciler config per case
preparingTimeout := time.Minute * 3
podRes := corev1api.ResourceRequirements{}
r := NewDataUploadReconciler(
fakeCRClient,
nil,
fakeKubeClient,
nil, // snapshotClient (unused in setupExposeParam)
datapath.NewManager(1),
nil, // dataPathMgr
nil, // exposer (unused in setupExposeParam)
map[string]velerotypes.BackupPVC{},
podRes,
testclocks.NewFakeClock(time.Now()),
"test-node",
preparingTimeout,
velerotest.NewLogger(),
metrics.NewServerMetrics(),
"upload-priority",
tt.args.customLabels,
tt.args.customAnnotations,
)
// Act
got, err := r.setupExposeParam(baseDataUpload)
// Assert no error
require.NoError(t, err)
require.NotNil(t, got)
// Type assertion to CSISnapshotExposeParam
csiParam, ok := got.(*exposer.CSISnapshotExposeParam)
require.True(t, ok, "expected CSISnapshotExposeParam type")
// Labels and Annotations
assert.Equal(t, tt.want.labels, csiParam.HostingPodLabels)
assert.Equal(t, tt.want.annotations, csiParam.HostingPodAnnotations)
})
}
}
+45 -15
View File
@@ -58,9 +58,23 @@ const (
)
// NewPodVolumeBackupReconciler creates the PodVolumeBackupReconciler instance
func NewPodVolumeBackupReconciler(client client.Client, mgr manager.Manager, kubeClient kubernetes.Interface, dataPathMgr *datapath.Manager,
counter *exposer.VgdpCounter, nodeName string, preparingTimeout time.Duration, resourceTimeout time.Duration, podResources corev1api.ResourceRequirements,
metrics *metrics.ServerMetrics, logger logrus.FieldLogger, dataMovePriorityClass string, privileged bool) *PodVolumeBackupReconciler {
func NewPodVolumeBackupReconciler(
client client.Client,
mgr manager.Manager,
kubeClient kubernetes.Interface,
dataPathMgr *datapath.Manager,
counter *exposer.VgdpCounter,
nodeName string,
preparingTimeout time.Duration,
resourceTimeout time.Duration,
podResources corev1api.ResourceRequirements,
metrics *metrics.ServerMetrics,
logger logrus.FieldLogger,
dataMovePriorityClass string,
privileged bool,
podLabels map[string]string,
podAnnotations map[string]string,
) *PodVolumeBackupReconciler {
return &PodVolumeBackupReconciler{
client: client,
mgr: mgr,
@@ -78,6 +92,8 @@ func NewPodVolumeBackupReconciler(client client.Client, mgr manager.Manager, kub
cancelledPVB: make(map[string]time.Time),
dataMovePriorityClass: dataMovePriorityClass,
privileged: privileged,
podLabels: podLabels,
podAnnotations: podAnnotations,
}
}
@@ -99,6 +115,8 @@ type PodVolumeBackupReconciler struct {
cancelledPVB map[string]time.Time
dataMovePriorityClass string
privileged bool
podLabels map[string]string
podAnnotations map[string]string
}
// +kubebuilder:rbac:groups=velero.io,resources=podvolumebackups,verbs=get;list;watch;create;update;patch;delete
@@ -796,25 +814,37 @@ func (r *PodVolumeBackupReconciler) setupExposeParam(pvb *velerov1api.PodVolumeB
}
hostingPodLabels := map[string]string{velerov1api.PVBLabel: pvb.Name}
for _, k := range util.ThirdPartyLabels {
if v, err := nodeagent.GetLabelValue(context.Background(), r.kubeClient, pvb.Namespace, k, nodeOS); err != nil {
if err != nodeagent.ErrNodeAgentLabelNotFound {
log.WithError(err).Warnf("Failed to check node-agent label, skip adding host pod label %s", k)
}
} else {
if len(r.podLabels) > 0 {
for k, v := range r.podLabels {
hostingPodLabels[k] = v
}
} else {
for _, k := range util.ThirdPartyLabels {
if v, err := nodeagent.GetLabelValue(context.Background(), r.kubeClient, pvb.Namespace, k, nodeOS); err != nil {
if err != nodeagent.ErrNodeAgentLabelNotFound {
log.WithError(err).Warnf("Failed to check node-agent label, skip adding host pod label %s", k)
}
} else {
hostingPodLabels[k] = v
}
}
}
hostingPodAnnotation := map[string]string{}
for _, k := range util.ThirdPartyAnnotations {
if v, err := nodeagent.GetAnnotationValue(context.Background(), r.kubeClient, pvb.Namespace, k, nodeOS); err != nil {
if err != nodeagent.ErrNodeAgentAnnotationNotFound {
log.WithError(err).Warnf("Failed to check node-agent annotation, skip adding host pod annotation %s", k)
}
} else {
if len(r.podAnnotations) > 0 {
for k, v := range r.podAnnotations {
hostingPodAnnotation[k] = v
}
} else {
for _, k := range util.ThirdPartyAnnotations {
if v, err := nodeagent.GetAnnotationValue(context.Background(), r.kubeClient, pvb.Namespace, k, nodeOS); err != nil {
if err != nodeagent.ErrNodeAgentAnnotationNotFound {
log.WithError(err).Warnf("Failed to check node-agent annotation, skip adding host pod annotation %s", k)
}
} else {
hostingPodAnnotation[k] = v
}
}
}
hostingPodTolerations := []corev1api.Toleration{}
@@ -47,13 +47,12 @@ import (
velerov2alpha1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v2alpha1"
"github.com/vmware-tanzu/velero/pkg/builder"
"github.com/vmware-tanzu/velero/pkg/datapath"
datapathmocks "github.com/vmware-tanzu/velero/pkg/datapath/mocks"
"github.com/vmware-tanzu/velero/pkg/exposer"
"github.com/vmware-tanzu/velero/pkg/metrics"
velerotest "github.com/vmware-tanzu/velero/pkg/test"
"github.com/vmware-tanzu/velero/pkg/uploader"
"github.com/vmware-tanzu/velero/pkg/util/kube"
datapathmocks "github.com/vmware-tanzu/velero/pkg/datapath/mocks"
)
const pvbName = "pvb-1"
@@ -153,6 +152,8 @@ func initPVBReconcilerWithError(needError ...error) (*PodVolumeBackupReconciler,
velerotest.NewLogger(),
"", // dataMovePriorityClass
false, // privileged
nil, // podLabels
nil, // podAnnotations
), nil
}
@@ -1187,3 +1188,123 @@ func TestResumeCancellablePodVolumeBackup(t *testing.T) {
})
}
}
func TestPodVolumeBackupSetupExposeParam(t *testing.T) {
// common objects for all cases
node := builder.ForNode("worker-1").Labels(map[string]string{kube.NodeOSLabel: kube.NodeOSLinux}).Result()
basePVB := pvbBuilder().Result()
basePVB.Spec.Node = "worker-1"
basePVB.Spec.Pod.Namespace = "app-ns"
basePVB.Spec.Pod.Name = "app-pod"
basePVB.Spec.Volume = "data-vol"
type args struct {
customLabels map[string]string
customAnnotations map[string]string
}
type want struct {
labels map[string]string
annotations map[string]string
}
tests := []struct {
name string
args args
want want
}{
{
name: "label has customize values",
args: args{
customLabels: map[string]string{"custom-label": "label-value"},
customAnnotations: nil,
},
want: want{
labels: map[string]string{
velerov1api.PVBLabel: basePVB.Name,
"custom-label": "label-value",
},
annotations: map[string]string{},
},
},
{
name: "label has no customize values",
args: args{
customLabels: nil,
customAnnotations: nil,
},
want: want{
labels: map[string]string{velerov1api.PVBLabel: basePVB.Name},
annotations: map[string]string{},
},
},
{
name: "annotation has customize values",
args: args{
customLabels: nil,
customAnnotations: map[string]string{"custom-annotation": "annotation-value"},
},
want: want{
labels: map[string]string{velerov1api.PVBLabel: basePVB.Name},
annotations: map[string]string{"custom-annotation": "annotation-value"},
},
},
{
name: "annotation has no customize values",
args: args{
customLabels: map[string]string{"another-label": "lval"},
customAnnotations: nil,
},
want: want{
labels: map[string]string{
velerov1api.PVBLabel: basePVB.Name,
"another-label": "lval",
},
annotations: map[string]string{},
},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
// Fake clients per case
fakeCRClient := velerotest.NewFakeControllerRuntimeClient(t, node, basePVB.DeepCopy())
fakeKubeClient := clientgofake.NewSimpleClientset(node)
// Reconciler config per case
preparingTimeout := time.Minute * 3
resourceTimeout := time.Minute * 10
podRes := corev1api.ResourceRequirements{}
r := NewPodVolumeBackupReconciler(
fakeCRClient,
nil,
fakeKubeClient,
datapath.NewManager(1),
nil,
"test-node",
preparingTimeout,
resourceTimeout,
podRes,
metrics.NewServerMetrics(),
velerotest.NewLogger(),
"backup-priority",
true,
tt.args.customLabels,
tt.args.customAnnotations,
)
// Act
got := r.setupExposeParam(basePVB)
// Core fields
assert.Equal(t, exposer.PodVolumeExposeTypeBackup, got.Type)
assert.Equal(t, basePVB.Spec.Pod.Namespace, got.ClientNamespace)
assert.Equal(t, basePVB.Spec.Pod.Name, got.ClientPodName)
assert.Equal(t, basePVB.Spec.Volume, got.ClientPodVolume)
// Labels/Annotations
assert.Equal(t, tt.want.labels, got.HostingPodLabels)
assert.Equal(t, tt.want.annotations, got.HostingPodAnnotations)
})
}
}
+47 -16
View File
@@ -56,10 +56,25 @@ import (
"github.com/vmware-tanzu/velero/pkg/util/kube"
)
func NewPodVolumeRestoreReconciler(client client.Client, mgr manager.Manager, kubeClient kubernetes.Interface, dataPathMgr *datapath.Manager,
counter *exposer.VgdpCounter, nodeName string, preparingTimeout time.Duration, resourceTimeout time.Duration, backupRepoConfigs map[string]string,
cacheVolumeConfigs *velerotypes.CachePVC, podResources corev1api.ResourceRequirements, logger logrus.FieldLogger, dataMovePriorityClass string,
privileged bool, repoConfigMgr repository.ConfigManager) *PodVolumeRestoreReconciler {
func NewPodVolumeRestoreReconciler(
client client.Client,
mgr manager.Manager,
kubeClient kubernetes.Interface,
dataPathMgr *datapath.Manager,
counter *exposer.VgdpCounter,
nodeName string,
preparingTimeout time.Duration,
resourceTimeout time.Duration,
backupRepoConfigs map[string]string,
cacheVolumeConfigs *velerotypes.CachePVC,
podResources corev1api.ResourceRequirements,
logger logrus.FieldLogger,
dataMovePriorityClass string,
privileged bool,
repoConfigMgr repository.ConfigManager,
podLabels map[string]string,
podAnnotations map[string]string,
) *PodVolumeRestoreReconciler {
return &PodVolumeRestoreReconciler{
client: client,
mgr: mgr,
@@ -79,6 +94,8 @@ func NewPodVolumeRestoreReconciler(client client.Client, mgr manager.Manager, ku
dataMovePriorityClass: dataMovePriorityClass,
privileged: privileged,
repoConfigMgr: repoConfigMgr,
podLabels: podLabels,
podAnnotations: podAnnotations,
}
}
@@ -101,6 +118,8 @@ type PodVolumeRestoreReconciler struct {
dataMovePriorityClass string
privileged bool
repoConfigMgr repository.ConfigManager
podLabels map[string]string
podAnnotations map[string]string
}
// +kubebuilder:rbac:groups=velero.io,resources=podvolumerestores,verbs=get;list;watch;create;update;patch;delete
@@ -863,25 +882,37 @@ func (r *PodVolumeRestoreReconciler) setupExposeParam(pvr *velerov1api.PodVolume
}
hostingPodLabels := map[string]string{velerov1api.PVRLabel: pvr.Name}
for _, k := range util.ThirdPartyLabels {
if v, err := nodeagent.GetLabelValue(context.Background(), r.kubeClient, pvr.Namespace, k, nodeOS); err != nil {
if err != nodeagent.ErrNodeAgentLabelNotFound {
log.WithError(err).Warnf("Failed to check node-agent label, skip adding host pod label %s", k)
}
} else {
if len(r.podLabels) > 0 {
for k, v := range r.podLabels {
hostingPodLabels[k] = v
}
} else {
for _, k := range util.ThirdPartyLabels {
if v, err := nodeagent.GetLabelValue(context.Background(), r.kubeClient, pvr.Namespace, k, nodeOS); err != nil {
if err != nodeagent.ErrNodeAgentLabelNotFound {
log.WithError(err).Warnf("Failed to check node-agent label, skip adding host pod label %s", k)
}
} else {
hostingPodLabels[k] = v
}
}
}
hostingPodAnnotation := map[string]string{}
for _, k := range util.ThirdPartyAnnotations {
if v, err := nodeagent.GetAnnotationValue(context.Background(), r.kubeClient, pvr.Namespace, k, nodeOS); err != nil {
if err != nodeagent.ErrNodeAgentAnnotationNotFound {
log.WithError(err).Warnf("Failed to check node-agent annotation, skip adding host pod annotation %s", k)
}
} else {
if len(r.podAnnotations) > 0 {
for k, v := range r.podAnnotations {
hostingPodAnnotation[k] = v
}
} else {
for _, k := range util.ThirdPartyAnnotations {
if v, err := nodeagent.GetAnnotationValue(context.Background(), r.kubeClient, pvr.Namespace, k, nodeOS); err != nil {
if err != nodeagent.ErrNodeAgentAnnotationNotFound {
log.WithError(err).Warnf("Failed to check node-agent annotation, skip adding host pod annotation %s", k)
}
} else {
hostingPodAnnotation[k] = v
}
}
}
hostingPodTolerations := []corev1api.Toleration{}
@@ -617,7 +617,25 @@ func initPodVolumeRestoreReconcilerWithError(objects []runtime.Object, cliObj []
dataPathMgr := datapath.NewManager(1)
return NewPodVolumeRestoreReconciler(fakeClient, nil, fakeKubeClient, dataPathMgr, nil, "test-node", time.Minute*5, time.Minute, nil, nil, corev1api.ResourceRequirements{}, velerotest.NewLogger(), "", false, nil), nil
return NewPodVolumeRestoreReconciler(
fakeClient,
nil,
fakeKubeClient,
dataPathMgr,
nil,
"test-node",
time.Minute*5,
time.Minute,
nil,
nil,
corev1api.ResourceRequirements{},
velerotest.NewLogger(),
"",
false,
nil,
nil, // podLabels
nil, // podAnnotations
), nil
}
func TestPodVolumeRestoreReconcile(t *testing.T) {
@@ -1082,6 +1100,128 @@ func TestPodVolumeRestoreReconcile(t *testing.T) {
}
}
func TestPodVolumeRestoreSetupExposeParam(t *testing.T) {
// common objects for all cases
node := builder.ForNode("worker-1").Labels(map[string]string{kube.NodeOSLabel: kube.NodeOSLinux}).Result()
basePVR := pvrBuilder().Result()
basePVR.Status.Node = "worker-1"
basePVR.Spec.Pod.Namespace = "app-ns"
basePVR.Spec.Pod.Name = "app-pod"
basePVR.Spec.Volume = "data-vol"
type args struct {
customLabels map[string]string
customAnnotations map[string]string
}
type want struct {
labels map[string]string
annotations map[string]string
}
tests := []struct {
name string
args args
want want
}{
{
name: "label has customize values",
args: args{
customLabels: map[string]string{"custom-label": "label-value"},
customAnnotations: nil,
},
want: want{
labels: map[string]string{
velerov1api.PVRLabel: basePVR.Name,
"custom-label": "label-value",
},
annotations: map[string]string{},
},
},
{
name: "label has no customize values",
args: args{
customLabels: nil,
customAnnotations: nil,
},
want: want{
labels: map[string]string{velerov1api.PVRLabel: basePVR.Name},
annotations: map[string]string{},
},
},
{
name: "annotation has customize values",
args: args{
customLabels: nil,
customAnnotations: map[string]string{"custom-annotation": "annotation-value"},
},
want: want{
labels: map[string]string{velerov1api.PVRLabel: basePVR.Name},
annotations: map[string]string{"custom-annotation": "annotation-value"},
},
},
{
name: "annotation has no customize values",
args: args{
customLabels: map[string]string{"another-label": "lval"},
customAnnotations: nil,
},
want: want{
labels: map[string]string{
velerov1api.PVRLabel: basePVR.Name,
"another-label": "lval",
},
annotations: map[string]string{},
},
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
// Fake clients per case
fakeCRClient := velerotest.NewFakeControllerRuntimeClient(t, node, basePVR.DeepCopy())
fakeKubeClient := clientgofake.NewSimpleClientset(node)
// Reconciler config per case
preparingTimeout := time.Minute * 3
resourceTimeout := time.Minute * 10
podRes := corev1api.ResourceRequirements{}
r := NewPodVolumeRestoreReconciler(
fakeCRClient,
nil,
fakeKubeClient,
datapath.NewManager(1),
nil,
"test-node",
preparingTimeout,
resourceTimeout,
nil, // backupRepoConfigs
nil, // cacheVolumeConfigs -> keep nil so CacheVolume is nil
podRes,
velerotest.NewLogger(),
"restore-priority",
true,
nil, // repoConfigMgr (unused when cacheVolumeConfigs is nil)
tt.args.customLabels,
tt.args.customAnnotations,
)
// Act
got := r.setupExposeParam(basePVR)
// Core fields
assert.Equal(t, exposer.PodVolumeExposeTypeRestore, got.Type)
assert.Equal(t, basePVR.Spec.Pod.Namespace, got.ClientNamespace)
assert.Equal(t, basePVR.Spec.Pod.Name, got.ClientPodName)
assert.Equal(t, basePVR.Spec.Volume, got.ClientPodVolume)
// Labels/Annotations
assert.Equal(t, tt.want.labels, got.HostingPodLabels)
assert.Equal(t, tt.want.annotations, got.HostingPodAnnotations)
})
}
}
func TestOnPodVolumeRestoreFailed(t *testing.T) {
for _, getErr := range []bool{true, false} {
ctx := t.Context()
+1 -1
View File
@@ -252,7 +252,7 @@ func (fs *fileSystemBR) boostRepoConnect(ctx context.Context, repositoryType str
return err
}
} else {
if err := repoProvider.NewResticRepositoryProvider(credentialGetter.FromFile, filesystem.NewFileSystem(), fs.log).BoostRepoConnect(ctx, repoProvider.RepoParam{BackupLocation: fs.backupLocation, BackupRepo: fs.backupRepo}); err != nil {
if err := repoProvider.NewResticRepositoryProvider(*credentialGetter, filesystem.NewFileSystem(), fs.log).BoostRepoConnect(ctx, repoProvider.RepoParam{BackupLocation: fs.backupLocation, BackupRepo: fs.backupRepo}); err != nil {
return err
}
}
+6
View File
@@ -107,6 +107,9 @@ func TestAsyncBackup(t *testing.T) {
<-finish
// Ensure the goroutine finishes so deferred fs.close executes, satisfying mock expectations.
fs.wgDataPath.Wait()
assert.Equal(t, test.err, asyncErr)
assert.Equal(t, test.result, asyncResult)
})
@@ -192,6 +195,9 @@ func TestAsyncRestore(t *testing.T) {
<-finish
// Ensure the goroutine finishes so deferred fs.close executes, satisfying mock expectations.
fs.wgDataPath.Wait()
assert.Equal(t, asyncErr, test.err)
assert.Equal(t, asyncResult, test.result)
})
+32 -3
View File
@@ -184,7 +184,22 @@ func (e *podVolumeExposer) Expose(ctx context.Context, ownerObject corev1api.Obj
}
}
hostingPod, err := e.createHostingPod(ctx, ownerObject, param.Type, path.ByPath, param.OperationTimeout, param.HostingPodLabels, param.HostingPodAnnotations, param.HostingPodTolerations, pod.Spec.NodeName, param.Resources, nodeOS, param.PriorityClassName, param.Privileged, cachePVC)
hostingPod, err := e.createHostingPod(
ctx,
ownerObject,
param.Type,
path.ByPath,
param.OperationTimeout,
param.HostingPodLabels,
param.HostingPodAnnotations,
param.HostingPodTolerations,
pod.Spec.NodeName,
param.Resources,
nodeOS,
param.PriorityClassName,
param.Privileged,
cachePVC,
)
if err != nil {
return errors.Wrapf(err, "error to create hosting pod")
}
@@ -328,8 +343,22 @@ func (e *podVolumeExposer) CleanUp(ctx context.Context, ownerObject corev1api.Ob
kube.DeletePVAndPVCIfAny(ctx, e.kubeClient.CoreV1(), cachePVCName, ownerObject.Namespace, 0, e.log)
}
func (e *podVolumeExposer) createHostingPod(ctx context.Context, ownerObject corev1api.ObjectReference, exposeType string, hostPath string,
operationTimeout time.Duration, label map[string]string, annotation map[string]string, toleration []corev1api.Toleration, selectedNode string, resources corev1api.ResourceRequirements, nodeOS string, priorityClassName string, privileged bool, cachePVC *corev1api.PersistentVolumeClaim) (*corev1api.Pod, error) {
func (e *podVolumeExposer) createHostingPod(
ctx context.Context,
ownerObject corev1api.ObjectReference,
exposeType string,
hostPath string,
operationTimeout time.Duration,
label map[string]string,
annotation map[string]string,
toleration []corev1api.Toleration,
selectedNode string,
resources corev1api.ResourceRequirements,
nodeOS string,
priorityClassName string,
privileged bool,
cachePVC *corev1api.PersistentVolumeClaim,
) (*corev1api.Pod, error) {
hostingPodName := ownerObject.Name
containerName := string(ownerObject.UID)
+11
View File
@@ -59,6 +59,7 @@ type podTemplateConfig struct {
repoMaintenanceJobConfigMap string
nodeAgentConfigMap string
itemBlockWorkerCount int
concurrentBackups int
forWindows bool
kubeletRootDir string
nodeAgentDisableHostPath bool
@@ -224,6 +225,12 @@ func WithItemBlockWorkerCount(itemBlockWorkerCount int) podTemplateOption {
}
}
func WithConcurrentBackups(concurrentBackups int) podTemplateOption {
return func(c *podTemplateConfig) {
c.concurrentBackups = concurrentBackups
}
}
func WithPriorityClassName(priorityClassName string) podTemplateOption {
return func(c *podTemplateConfig) {
c.priorityClassName = priorityClassName
@@ -337,6 +344,10 @@ func Deployment(namespace string, opts ...podTemplateOption) *appsv1api.Deployme
args = append(args, fmt.Sprintf("--item-block-worker-count=%d", c.itemBlockWorkerCount))
}
if c.concurrentBackups > 0 {
args = append(args, fmt.Sprintf("--concurrent-backups=%d", c.concurrentBackups))
}
deployment := &appsv1api.Deployment{
ObjectMeta: objectMeta(namespace, "velero"),
TypeMeta: metav1.TypeMeta{
+31 -15
View File
@@ -278,30 +278,45 @@ func GroupResources(resources *unstructured.UnstructuredList) *ResourceGroup {
return rg
}
// createResource attempts to create a resource in the cluster.
// If the resource already exists in the cluster, it's merely logged.
func createResource(r *unstructured.Unstructured, factory client.DynamicFactory, w io.Writer) error {
// createOrApplyResource attempts to create or apply a resource in the cluster.
// If apply is true, it uses server-side apply to update existing resources.
// If apply is false and the resource already exists in the cluster, it's merely logged.
func createOrApplyResource(r *unstructured.Unstructured, factory client.DynamicFactory, w io.Writer, apply bool) error {
id := fmt.Sprintf("%s/%s", r.GetKind(), r.GetName())
// Helper to reduce boilerplate message about the same object
log := func(f string, a ...any) {
format := strings.Join([]string{id, ": ", f, "\n"}, "")
fmt.Fprintf(w, format, a...)
log := func(f string) {
fmt.Fprintf(w, "%s: %s\n", id, f)
}
log("attempting to create resource")
c, err := CreateClient(r, factory, w)
if err != nil {
return err
}
if _, err := c.Create(r); apierrors.IsAlreadyExists(err) {
log("already exists, proceeding")
} else if err != nil {
return errors.Wrapf(err, "Error creating resource %s", id)
if apply {
log("attempting to apply resource")
// Set field manager for server-side apply and force to override conflicts
applyOpts := metav1.ApplyOptions{
FieldManager: "velero-cli",
Force: true,
}
if _, err := c.Apply(r.GetName(), r, applyOpts); err != nil {
return errors.Wrapf(err, "Error applying resource %s", id)
}
log("applied")
} else {
log("attempting to create resource")
if _, err := c.Create(r); apierrors.IsAlreadyExists(err) {
log("already exists, proceeding")
} else if err != nil {
return errors.Wrapf(err, "Error creating resource %s", id)
} else {
log("created")
}
}
log("created")
return nil
}
@@ -335,13 +350,14 @@ func CreateClient(r *unstructured.Unstructured, factory client.DynamicFactory, w
// An unstructured list of resources is sent, one at a time, to the server. These are assumed to be in the preferred order already.
// Resources will be sorted into CustomResourceDefinitions and any other resource type, and the function will wait up to 1 minute
// for CRDs to be ready before proceeding.
// If apply is true, it uses server-side apply to update existing resources.
// An io.Writer can be used to output to a log or the console.
func Install(dynamicFactory client.DynamicFactory, kbClient kbclient.Client, resources *unstructured.UnstructuredList, w io.Writer) error {
func Install(dynamicFactory client.DynamicFactory, kbClient kbclient.Client, resources *unstructured.UnstructuredList, w io.Writer, apply bool) error {
rg := GroupResources(resources)
//Install CRDs first
for _, r := range rg.CRDResources {
if err := createResource(r, dynamicFactory, w); err != nil {
if err := createOrApplyResource(r, dynamicFactory, w, apply); err != nil {
return err
}
}
@@ -357,7 +373,7 @@ func Install(dynamicFactory client.DynamicFactory, kbClient kbclient.Client, res
// Install all other resources
for _, r := range rg.OtherResources {
if err = createResource(r, dynamicFactory, w); err != nil {
if err = createOrApplyResource(r, dynamicFactory, w, apply); err != nil {
return err
}
}
+235 -1
View File
@@ -1,6 +1,8 @@
package install
import (
"bytes"
"errors"
"os"
"testing"
"time"
@@ -12,9 +14,11 @@ import (
corev1api "k8s.io/api/core/v1"
apiextv1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1"
apiextv1beta1 "k8s.io/apiextensions-apiserver/pkg/apis/apiextensions/v1beta1"
apierrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
"k8s.io/apimachinery/pkg/runtime"
"k8s.io/apimachinery/pkg/runtime/schema"
"sigs.k8s.io/controller-runtime/pkg/client/fake"
v1crds "github.com/vmware-tanzu/velero/config/crd/v1/crds"
@@ -53,7 +57,7 @@ func TestInstall(t *testing.T) {
require.NoError(t, appendUnstructured(resources, v1crds.CRDs[0]))
require.NoError(t, appendUnstructured(resources, Namespace("velero")))
assert.NoError(t, Install(factory, c, resources, os.Stdout))
assert.NoError(t, Install(factory, c, resources, os.Stdout, false))
}
func Test_crdsAreReady(t *testing.T) {
@@ -168,3 +172,233 @@ func TestNodeAgentWindowsIsReady(t *testing.T) {
require.NoError(t, err)
assert.True(t, ready)
}
func TestCreateOrApplyResourceError(t *testing.T) {
r := &unstructured.Unstructured{
Object: map[string]any{
"apiVersion": "v1",
"kind": "ConfigMap",
"metadata": map[string]any{
"name": "test-configmap",
"namespace": "velero",
},
},
}
dc := &test.FakeDynamicClient{}
expectedErr := errors.New("create error")
dc.On("Create", mock.Anything).Return(&unstructured.Unstructured{}, expectedErr)
factory := &test.FakeDynamicFactory{}
factory.On("ClientForGroupVersionResource", mock.Anything, mock.Anything, mock.Anything).Return(dc, nil)
var buf bytes.Buffer
err := createOrApplyResource(r, factory, &buf, false)
require.Error(t, err)
require.Contains(t, err.Error(), expectedErr.Error())
}
func TestCreateOrApplyResourceAlreadyExists(t *testing.T) {
r := &unstructured.Unstructured{
Object: map[string]any{
"apiVersion": "v1",
"kind": "ConfigMap",
"metadata": map[string]any{
"name": "test-configmap",
"namespace": "velero",
},
},
}
dc := &test.FakeDynamicClient{}
alreadyExistsErr := apierrors.NewAlreadyExists(schema.GroupResource{Resource: "configmaps"}, "test-configmap")
// We need to return a non-nil unstructured object even though it's not used
dc.On("Create", mock.Anything).Return(&unstructured.Unstructured{}, alreadyExistsErr)
factory := &test.FakeDynamicFactory{}
factory.On("ClientForGroupVersionResource", mock.Anything, mock.Anything, mock.Anything).Return(dc, nil)
var buf bytes.Buffer
err := createOrApplyResource(r, factory, &buf, false)
require.NoError(t, err)
}
func TestCreateOrApplyResourceClientError(t *testing.T) {
r := &unstructured.Unstructured{
Object: map[string]any{
"apiVersion": "v1",
"kind": "ConfigMap",
"metadata": map[string]any{
"name": "test-configmap",
"namespace": "velero",
},
},
}
factory := &test.FakeDynamicFactory{}
expectedErr := errors.New("client creation error")
// Return error from ClientForGroupVersionResource
factory.On("ClientForGroupVersionResource", mock.Anything, mock.Anything, mock.Anything).Return(&test.FakeDynamicClient{}, expectedErr)
var buf bytes.Buffer
err := createOrApplyResource(r, factory, &buf, false)
require.Error(t, err)
require.Contains(t, err.Error(), expectedErr.Error())
}
func TestCreateOrApplyResourceApplyError(t *testing.T) {
r := &unstructured.Unstructured{
Object: map[string]any{
"apiVersion": "v1",
"kind": "ConfigMap",
"metadata": map[string]any{
"name": "test-configmap",
"namespace": "velero",
},
},
}
dc := &test.FakeDynamicClient{}
expectedErr := errors.New("apply error")
// Mock Apply to return an error
dc.On("Apply", mock.Anything, mock.Anything, mock.Anything).Return(&unstructured.Unstructured{}, expectedErr)
factory := &test.FakeDynamicFactory{}
factory.On("ClientForGroupVersionResource", mock.Anything, mock.Anything, mock.Anything).Return(dc, nil)
var buf bytes.Buffer
err := createOrApplyResource(r, factory, &buf, true) // true for apply flag to use Apply
require.Error(t, err)
require.Contains(t, err.Error(), expectedErr.Error())
}
func TestInstallErrorAfterCreateClient(t *testing.T) {
// Create a test non-CRD resource
nonCRDResource := &unstructured.Unstructured{
Object: map[string]any{
"apiVersion": "v1",
"kind": "ConfigMap",
"metadata": map[string]any{
"name": "test-configmap",
},
},
}
resources := &unstructured.UnstructuredList{
Items: []unstructured.Unstructured{*nonCRDResource},
}
// Mock the factory to return a client that will succeed on ClientForGroupVersionResource
// but fail on Create
dc := &test.FakeDynamicClient{}
expectedErr := errors.New("create error after successful client creation")
dc.On("Create", mock.Anything).Return(&unstructured.Unstructured{}, expectedErr)
factory := &test.FakeDynamicFactory{}
factory.On("ClientForGroupVersionResource", mock.Anything, mock.Anything, mock.Anything).Return(dc, nil)
c := fake.NewClientBuilder().Build()
var buf bytes.Buffer
err := Install(factory, c, resources, &buf, false)
require.Error(t, err)
require.Contains(t, err.Error(), expectedErr.Error())
}
func TestInstallErrorOnCRDResource(t *testing.T) {
crdResource := &unstructured.Unstructured{
Object: map[string]any{
"apiVersion": "apiextensions.k8s.io/v1",
"kind": "CustomResourceDefinition",
"metadata": map[string]any{
"name": "test-crd",
},
},
}
resources := &unstructured.UnstructuredList{
Items: []unstructured.Unstructured{*crdResource},
}
dc := &test.FakeDynamicClient{}
expectedErr := errors.New("error creating CRD resource")
// We need to return a non-nil unstructured object even though it's not used
dc.On("Create", mock.Anything).Return(&unstructured.Unstructured{}, expectedErr)
factory := &test.FakeDynamicFactory{}
factory.On("ClientForGroupVersionResource", mock.Anything, mock.Anything, mock.Anything).Return(dc, nil)
c := fake.NewClientBuilder().Build()
var buf bytes.Buffer
err := Install(factory, c, resources, &buf, false)
require.Error(t, err)
require.Contains(t, err.Error(), expectedErr.Error())
}
func TestInstallWithApplyFlag(t *testing.T) {
// Create a test resource
testResource := &unstructured.Unstructured{
Object: map[string]any{
"apiVersion": "v1",
"kind": "ConfigMap",
"metadata": map[string]any{
"name": "test-configmap",
"namespace": "velero",
},
"data": map[string]any{
"key1": "value1",
},
},
}
resources := &unstructured.UnstructuredList{
Items: []unstructured.Unstructured{*testResource},
}
// Test case 1: Without apply flag (create)
{
dc := &test.FakeDynamicClient{}
// Expect Create to be called
dc.On("Create", mock.Anything).Return(testResource, nil)
// Apply should not be called
factory := &test.FakeDynamicFactory{}
factory.On("ClientForGroupVersionResource", mock.Anything, mock.Anything, mock.Anything).Return(dc, nil)
c := fake.NewClientBuilder().Build()
err := Install(factory, c, resources, os.Stdout, false)
require.NoError(t, err)
// Verify that Create was called and Apply was not
dc.AssertCalled(t, "Create", mock.Anything)
dc.AssertNotCalled(t, "Apply", mock.Anything, mock.Anything, mock.Anything)
}
// Test case 2: With apply flag
{
dc := &test.FakeDynamicClient{}
// Create should not be called
// Expect Apply to be called
dc.On("Apply", mock.Anything, mock.Anything, mock.Anything).Return(testResource, nil)
factory := &test.FakeDynamicFactory{}
factory.On("ClientForGroupVersionResource", mock.Anything, mock.Anything, mock.Anything).Return(dc, nil)
c := fake.NewClientBuilder().Build()
err := Install(factory, c, resources, os.Stdout, true)
require.NoError(t, err)
// Verify that Apply was called and Create was not
dc.AssertCalled(t, "Apply", mock.Anything, mock.Anything, mock.Anything)
dc.AssertNotCalled(t, "Create", mock.Anything)
}
}
+2
View File
@@ -271,6 +271,7 @@ type VeleroOptions struct {
RepoMaintenanceJobConfigMap string
NodeAgentConfigMap string
ItemBlockWorkerCount int
ConcurrentBackups int
KubeletRootDir string
NodeAgentDisableHostPath bool
ServerPriorityClassName string
@@ -362,6 +363,7 @@ func AllResources(o *VeleroOptions) *unstructured.UnstructuredList {
WithPodResources(o.PodResources),
WithKeepLatestMaintenanceJobs(o.KeepLatestMaintenanceJobs),
WithItemBlockWorkerCount(o.ItemBlockWorkerCount),
WithConcurrentBackups(o.ConcurrentBackups),
}
if o.ServerPriorityClassName != "" {
+70
View File
@@ -27,6 +27,11 @@ type ServerMetrics struct {
metrics map[string]prometheus.Collector
}
// Metrics returns the metrics map for testing purposes.
func (m *ServerMetrics) Metrics() map[string]prometheus.Collector {
return m.metrics
}
const (
metricNamespace = "velero"
podVolumeMetricsNamespace = "podVolume"
@@ -75,6 +80,14 @@ const (
DataDownloadFailureTotal = "data_download_failure_total"
DataDownloadCancelTotal = "data_download_cancel_total"
// repo maintenance metrics
repoMaintenanceSuccessTotal = "repo_maintenance_success_total"
repoMaintenanceFailureTotal = "repo_maintenance_failure_total"
// repoMaintenanceDurationSeconds tracks the distribution of maintenance job durations.
// Each completed job's duration is recorded in the appropriate bucket, allowing
// analysis of individual job performance and trending over time.
repoMaintenanceDurationSeconds = "repo_maintenance_duration_seconds"
// Labels
nodeMetricLabel = "node"
podVolumeOperationLabel = "operation"
@@ -82,6 +95,7 @@ const (
pvbNameLabel = "pod_volume_backup"
scheduleLabel = "schedule"
backupNameLabel = "backupName"
repositoryNameLabel = "repository_name"
// metrics values
BackupLastStatusSucc int64 = 1
@@ -333,6 +347,41 @@ func NewServerMetrics() *ServerMetrics {
},
[]string{scheduleLabel, backupNameLabel},
),
repoMaintenanceSuccessTotal: prometheus.NewCounterVec(
prometheus.CounterOpts{
Namespace: metricNamespace,
Name: repoMaintenanceSuccessTotal,
Help: "Total number of successful repo maintenance jobs",
},
[]string{repositoryNameLabel},
),
repoMaintenanceFailureTotal: prometheus.NewCounterVec(
prometheus.CounterOpts{
Namespace: metricNamespace,
Name: repoMaintenanceFailureTotal,
Help: "Total number of failed repo maintenance jobs",
},
[]string{repositoryNameLabel},
),
repoMaintenanceDurationSeconds: prometheus.NewHistogramVec(
prometheus.HistogramOpts{
Namespace: metricNamespace,
Name: repoMaintenanceDurationSeconds,
Help: "Time taken to complete repo maintenance jobs, in seconds",
Buckets: []float64{
toSeconds(1 * time.Minute),
toSeconds(5 * time.Minute),
toSeconds(10 * time.Minute),
toSeconds(15 * time.Minute),
toSeconds(30 * time.Minute),
toSeconds(1 * time.Hour),
toSeconds(2 * time.Hour),
toSeconds(3 * time.Hour),
toSeconds(4 * time.Hour),
},
},
[]string{repositoryNameLabel},
),
},
}
}
@@ -912,3 +961,24 @@ func (m *ServerMetrics) RegisterBackupLocationUnavailable(backupLocationName str
g.WithLabelValues(backupLocationName).Set(float64(0))
}
}
// RegisterRepoMaintenanceSuccess records a successful repo maintenance job.
func (m *ServerMetrics) RegisterRepoMaintenanceSuccess(repositoryName string) {
if c, ok := m.metrics[repoMaintenanceSuccessTotal].(*prometheus.CounterVec); ok {
c.WithLabelValues(repositoryName).Inc()
}
}
// RegisterRepoMaintenanceFailure records a failed repo maintenance job.
func (m *ServerMetrics) RegisterRepoMaintenanceFailure(repositoryName string) {
if c, ok := m.metrics[repoMaintenanceFailureTotal].(*prometheus.CounterVec); ok {
c.WithLabelValues(repositoryName).Inc()
}
}
// ObserveRepoMaintenanceDuration records the number of seconds a repo maintenance job took.
func (m *ServerMetrics) ObserveRepoMaintenanceDuration(repositoryName string, seconds float64) {
if h, ok := m.metrics[repoMaintenanceDurationSeconds].(*prometheus.HistogramVec); ok {
h.WithLabelValues(repositoryName).Observe(seconds)
}
}
+145
View File
@@ -372,3 +372,148 @@ func getHistogramCount(t *testing.T, vec *prometheus.HistogramVec, scheduleLabel
t.Fatalf("Histogram with schedule label '%s' not found", scheduleLabel)
return 0
}
// TestRepoMaintenanceMetrics verifies that repo maintenance metrics are properly recorded.
func TestRepoMaintenanceMetrics(t *testing.T) {
tests := []struct {
name string
repositoryName string
description string
}{
{
name: "maintenance job metrics for repository",
repositoryName: "default-restic-abcd",
description: "Metrics should be recorded with the repository name label",
},
{
name: "maintenance job metrics for different repository",
repositoryName: "velero-backup-repo-xyz",
description: "Metrics should be recorded with different repository name",
},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
m := NewServerMetrics()
// Test repo maintenance success metric
t.Run("RegisterRepoMaintenanceSuccess", func(t *testing.T) {
m.RegisterRepoMaintenanceSuccess(tc.repositoryName)
metric := getMaintenanceMetricValue(t, m.metrics[repoMaintenanceSuccessTotal].(*prometheus.CounterVec), tc.repositoryName)
assert.Equal(t, float64(1), metric, tc.description)
})
// Test repo maintenance failure metric
t.Run("RegisterRepoMaintenanceFailure", func(t *testing.T) {
m.RegisterRepoMaintenanceFailure(tc.repositoryName)
metric := getMaintenanceMetricValue(t, m.metrics[repoMaintenanceFailureTotal].(*prometheus.CounterVec), tc.repositoryName)
assert.Equal(t, float64(1), metric, tc.description)
})
// Test repo maintenance duration metric
t.Run("ObserveRepoMaintenanceDuration", func(t *testing.T) {
m.ObserveRepoMaintenanceDuration(tc.repositoryName, 300.5)
// For histogram, we check the count
metric := getMaintenanceHistogramCount(t, m.metrics[repoMaintenanceDurationSeconds].(*prometheus.HistogramVec), tc.repositoryName)
assert.Equal(t, uint64(1), metric, tc.description)
})
})
}
}
// TestMultipleRepoMaintenanceJobsAccumulate verifies that multiple repo maintenance jobs
// accumulate metrics under the same repository label.
func TestMultipleRepoMaintenanceJobsAccumulate(t *testing.T) {
m := NewServerMetrics()
repoName := "default-restic-test"
// Simulate multiple repo maintenance job executions
m.RegisterRepoMaintenanceSuccess(repoName)
m.RegisterRepoMaintenanceSuccess(repoName)
m.RegisterRepoMaintenanceSuccess(repoName)
m.RegisterRepoMaintenanceFailure(repoName)
m.RegisterRepoMaintenanceFailure(repoName)
// Record multiple durations
m.ObserveRepoMaintenanceDuration(repoName, 120.5)
m.ObserveRepoMaintenanceDuration(repoName, 180.3)
m.ObserveRepoMaintenanceDuration(repoName, 90.7)
// Verify accumulated metrics
successMetric := getMaintenanceMetricValue(t, m.metrics[repoMaintenanceSuccessTotal].(*prometheus.CounterVec), repoName)
assert.Equal(t, float64(3), successMetric, "All repo maintenance successes should be counted")
failureMetric := getMaintenanceMetricValue(t, m.metrics[repoMaintenanceFailureTotal].(*prometheus.CounterVec), repoName)
assert.Equal(t, float64(2), failureMetric, "All repo maintenance failures should be counted")
durationCount := getMaintenanceHistogramCount(t, m.metrics[repoMaintenanceDurationSeconds].(*prometheus.HistogramVec), repoName)
assert.Equal(t, uint64(3), durationCount, "All repo maintenance durations should be observed")
}
// Helper function to get metric value from a CounterVec with repository_name label
func getMaintenanceMetricValue(t *testing.T, vec prometheus.Collector, repositoryName string) float64 {
t.Helper()
ch := make(chan prometheus.Metric, 1)
vec.Collect(ch)
close(ch)
for metric := range ch {
dto := &dto.Metric{}
err := metric.Write(dto)
require.NoError(t, err)
// Check if this metric has the expected repository_name label
hasCorrectLabel := false
for _, label := range dto.Label {
if *label.Name == "repository_name" && *label.Value == repositoryName {
hasCorrectLabel = true
break
}
}
if hasCorrectLabel {
if dto.Counter != nil {
return *dto.Counter.Value
}
if dto.Gauge != nil {
return *dto.Gauge.Value
}
}
}
t.Fatalf("Metric with repository_name label '%s' not found", repositoryName)
return 0
}
// Helper function to get histogram count with repository_name label
func getMaintenanceHistogramCount(t *testing.T, vec *prometheus.HistogramVec, repositoryName string) uint64 {
t.Helper()
ch := make(chan prometheus.Metric, 1)
vec.Collect(ch)
close(ch)
for metric := range ch {
dto := &dto.Metric{}
err := metric.Write(dto)
require.NoError(t, err)
// Check if this metric has the expected repository_name label
hasCorrectLabel := false
for _, label := range dto.Label {
if *label.Name == "repository_name" && *label.Value == repositoryName {
hasCorrectLabel = true
break
}
}
if hasCorrectLabel && dto.Histogram != nil {
return *dto.Histogram.SampleCount
}
}
t.Fatalf("Histogram with repository_name label '%s' not found", repositoryName)
return 0
}
+26 -8
View File
@@ -116,6 +116,7 @@ type ObjectBackupStoreGetter interface {
type objectBackupStoreGetter struct {
credentialStore credentials.FileStore
secretStore credentials.SecretStore
}
// NewObjectBackupStoreGetter returns a ObjectBackupStoreGetter that can get a velero.BackupStore.
@@ -123,6 +124,15 @@ func NewObjectBackupStoreGetter(credentialStore credentials.FileStore) ObjectBac
return &objectBackupStoreGetter{credentialStore: credentialStore}
}
// NewObjectBackupStoreGetterWithSecretStore returns an ObjectBackupStoreGetter with SecretStore
// support for resolving caCertRef from Kubernetes Secrets.
func NewObjectBackupStoreGetterWithSecretStore(credentialStore credentials.FileStore, secretStore credentials.SecretStore) ObjectBackupStoreGetter {
return &objectBackupStoreGetter{
credentialStore: credentialStore,
secretStore: secretStore,
}
}
func (b *objectBackupStoreGetter) Get(location *velerov1api.BackupStorageLocation, objectStoreGetter ObjectStoreGetter, logger logrus.FieldLogger) (BackupStore, error) {
if location.Spec.ObjectStorage == nil {
return nil, errors.New("backup storage location does not use object storage")
@@ -160,7 +170,16 @@ func (b *objectBackupStoreGetter) Get(location *velerov1api.BackupStorageLocatio
objectStoreConfig["prefix"] = prefix
// Only include a CACert if it's specified in order to maintain compatibility with plugins that don't expect it.
if location.Spec.ObjectStorage.CACert != nil {
// Prefer caCertRef (from Secret) over inline caCert (deprecated).
if location.Spec.ObjectStorage.CACertRef != nil {
if b.secretStore != nil {
caCertString, err := b.secretStore.Get(location.Spec.ObjectStorage.CACertRef)
if err != nil {
return nil, errors.Wrap(err, "error getting CA certificate from secret")
}
objectStoreConfig["caCert"] = caCertString
}
} else if location.Spec.ObjectStorage.CACert != nil {
objectStoreConfig["caCert"] = string(location.Spec.ObjectStorage.CACert)
}
@@ -266,13 +285,12 @@ func (s *objectBackupStore) PutBackup(info BackupInfo) error {
// Since the logic for all of these files is the exact same except for the name and the contents,
// use a map literal to iterate through them and write them to the bucket.
var backupObjs = map[string]io.Reader{
s.layout.getPodVolumeBackupsKey(info.Name): info.PodVolumeBackups,
s.layout.getBackupVolumeSnapshotsKey(info.Name): info.VolumeSnapshots,
s.layout.getBackupItemOperationsKey(info.Name): info.BackupItemOperations,
s.layout.getBackupResourceListKey(info.Name): info.BackupResourceList,
s.layout.getCSIVolumeSnapshotClassesKey(info.Name): info.CSIVolumeSnapshotClasses,
s.layout.getBackupResultsKey(info.Name): info.BackupResults,
s.layout.getBackupVolumeInfoKey(info.Name): info.BackupVolumeInfo,
s.layout.getPodVolumeBackupsKey(info.Name): info.PodVolumeBackups,
s.layout.getBackupVolumeSnapshotsKey(info.Name): info.VolumeSnapshots,
s.layout.getBackupItemOperationsKey(info.Name): info.BackupItemOperations,
s.layout.getBackupResourceListKey(info.Name): info.BackupResourceList,
s.layout.getBackupResultsKey(info.Name): info.BackupResults,
s.layout.getBackupVolumeInfoKey(info.Name): info.BackupVolumeInfo,
}
for key, reader := range backupObjs {
+26
View File
@@ -1017,6 +1017,32 @@ func TestNewObjectBackupStoreGetterConfig(t *testing.T) {
"credentialsFile": "/tmp/credentials/secret-file",
},
},
{
name: "location with CACertRef is initialized with caCert from secret",
location: builder.ForBackupStorageLocation("", "").Provider(provider).Bucket(bucket).CACertRef(
builder.ForSecretKeySelector("cacert-secret", "ca.crt").Result(),
).Result(),
getter: NewObjectBackupStoreGetterWithSecretStore(
velerotest.NewFakeCredentialsFileStore("", nil),
velerotest.NewFakeCredentialsSecretStore("cacert-from-secret", nil),
),
wantConfig: map[string]string{
"bucket": "bucket",
"prefix": "",
"caCert": "cacert-from-secret",
},
},
{
name: "location with CACertRef and no SecretStore uses no caCert",
location: builder.ForBackupStorageLocation("", "").Provider(provider).Bucket(bucket).CACertRef(
builder.ForSecretKeySelector("cacert-secret", "ca.crt").Result(),
).Result(),
getter: NewObjectBackupStoreGetter(velerotest.NewFakeCredentialsFileStore("", nil)),
wantConfig: map[string]string{
"bucket": "bucket",
"prefix": "",
},
},
}
for _, tc := range tests {
@@ -33,6 +33,11 @@ import (
// up on demand. On the other hand, the volumeHelperImpl assume there
// is a VolumeHelper instance initialized before calling the
// ShouldPerformXXX functions.
//
// Deprecated: Use ShouldPerformSnapshotWithVolumeHelper instead for better performance.
// ShouldPerformSnapshotWithVolumeHelper allows passing a pre-created VolumeHelper with
// an internal PVC-to-Pod cache, which avoids O(N*M) complexity when there are many PVCs and pods.
// See issue #9179 for details.
func ShouldPerformSnapshotWithBackup(
unstructured runtime.Unstructured,
groupResource schema.GroupResource,
@@ -40,6 +45,35 @@ func ShouldPerformSnapshotWithBackup(
crClient crclient.Client,
logger logrus.FieldLogger,
) (bool, error) {
return ShouldPerformSnapshotWithVolumeHelper(
unstructured,
groupResource,
backup,
crClient,
logger,
nil, // no cached VolumeHelper, will create one
)
}
// ShouldPerformSnapshotWithVolumeHelper is like ShouldPerformSnapshotWithBackup
// but accepts an optional VolumeHelper. If vh is non-nil, it will be used directly,
// avoiding the overhead of creating a new VolumeHelper on each call.
// This is useful for BIA plugins that process multiple PVCs during a single backup
// and want to reuse the same VolumeHelper (with its internal cache) across calls.
func ShouldPerformSnapshotWithVolumeHelper(
unstructured runtime.Unstructured,
groupResource schema.GroupResource,
backup velerov1api.Backup,
crClient crclient.Client,
logger logrus.FieldLogger,
vh volumehelper.VolumeHelper,
) (bool, error) {
// If a VolumeHelper is provided, use it directly
if vh != nil {
return vh.ShouldPerformSnapshot(unstructured, groupResource)
}
// Otherwise, create a new VolumeHelper (original behavior for third-party plugins)
resourcePolicies, err := resourcepolicies.GetResourcePoliciesFromBackup(
backup,
crClient,
@@ -49,6 +83,7 @@ func ShouldPerformSnapshotWithBackup(
return false, err
}
//nolint:staticcheck // Intentional use of deprecated function for backwards compatibility
volumeHelperImpl := volumehelper.NewVolumeHelperImpl(
resourcePolicies,
backup.Spec.SnapshotVolumes,
@@ -0,0 +1,324 @@
/*
Copyright the Velero contributors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package volumehelper
import (
"testing"
"github.com/sirupsen/logrus"
"github.com/stretchr/testify/require"
corev1api "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
"k8s.io/apimachinery/pkg/runtime"
"github.com/vmware-tanzu/velero/internal/volumehelper"
velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
"github.com/vmware-tanzu/velero/pkg/kuberesource"
velerotest "github.com/vmware-tanzu/velero/pkg/test"
)
func TestShouldPerformSnapshotWithBackup(t *testing.T) {
tests := []struct {
name string
pvc *corev1api.PersistentVolumeClaim
pv *corev1api.PersistentVolume
backup *velerov1api.Backup
wantSnapshot bool
wantError bool
}{
{
name: "Returns true when snapshotVolumes not set",
pvc: &corev1api.PersistentVolumeClaim{
ObjectMeta: metav1.ObjectMeta{
Name: "test-pvc",
Namespace: "default",
},
Spec: corev1api.PersistentVolumeClaimSpec{
VolumeName: "test-pv",
},
Status: corev1api.PersistentVolumeClaimStatus{
Phase: corev1api.ClaimBound,
},
},
pv: &corev1api.PersistentVolume{
ObjectMeta: metav1.ObjectMeta{
Name: "test-pv",
},
Spec: corev1api.PersistentVolumeSpec{
PersistentVolumeSource: corev1api.PersistentVolumeSource{
CSI: &corev1api.CSIPersistentVolumeSource{
Driver: "test-driver",
},
},
ClaimRef: &corev1api.ObjectReference{
Namespace: "default",
Name: "test-pvc",
},
},
},
backup: &velerov1api.Backup{
ObjectMeta: metav1.ObjectMeta{
Name: "test-backup",
Namespace: "velero",
},
},
wantSnapshot: true,
wantError: false,
},
{
name: "Returns false when snapshotVolumes is false",
pvc: &corev1api.PersistentVolumeClaim{
ObjectMeta: metav1.ObjectMeta{
Name: "test-pvc",
Namespace: "default",
},
Spec: corev1api.PersistentVolumeClaimSpec{
VolumeName: "test-pv",
},
Status: corev1api.PersistentVolumeClaimStatus{
Phase: corev1api.ClaimBound,
},
},
pv: &corev1api.PersistentVolume{
ObjectMeta: metav1.ObjectMeta{
Name: "test-pv",
},
Spec: corev1api.PersistentVolumeSpec{
PersistentVolumeSource: corev1api.PersistentVolumeSource{
CSI: &corev1api.CSIPersistentVolumeSource{
Driver: "test-driver",
},
},
ClaimRef: &corev1api.ObjectReference{
Namespace: "default",
Name: "test-pvc",
},
},
},
backup: &velerov1api.Backup{
ObjectMeta: metav1.ObjectMeta{
Name: "test-backup",
Namespace: "velero",
},
Spec: velerov1api.BackupSpec{
SnapshotVolumes: boolPtr(false),
},
},
wantSnapshot: false,
wantError: false,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
// Create fake client with PV and PVC
client := velerotest.NewFakeControllerRuntimeClient(t, tt.pv, tt.pvc)
// Convert PVC to unstructured
pvcMap, err := runtime.DefaultUnstructuredConverter.ToUnstructured(tt.pvc)
require.NoError(t, err)
unstructuredPVC := &unstructured.Unstructured{Object: pvcMap}
logger := logrus.New()
// Call the function under test - this is the wrapper for third-party plugins
result, err := ShouldPerformSnapshotWithBackup(
unstructuredPVC,
kuberesource.PersistentVolumeClaims,
*tt.backup,
client,
logger,
)
if tt.wantError {
require.Error(t, err)
} else {
require.NoError(t, err)
require.Equal(t, tt.wantSnapshot, result)
}
})
}
}
func boolPtr(b bool) *bool {
return &b
}
func TestShouldPerformSnapshotWithVolumeHelper(t *testing.T) {
tests := []struct {
name string
pvc *corev1api.PersistentVolumeClaim
pv *corev1api.PersistentVolume
backup *velerov1api.Backup
wantSnapshot bool
wantError bool
}{
{
name: "Returns true with nil VolumeHelper when snapshotVolumes not set",
pvc: &corev1api.PersistentVolumeClaim{
ObjectMeta: metav1.ObjectMeta{
Name: "test-pvc",
Namespace: "default",
},
Spec: corev1api.PersistentVolumeClaimSpec{
VolumeName: "test-pv",
},
Status: corev1api.PersistentVolumeClaimStatus{
Phase: corev1api.ClaimBound,
},
},
pv: &corev1api.PersistentVolume{
ObjectMeta: metav1.ObjectMeta{
Name: "test-pv",
},
Spec: corev1api.PersistentVolumeSpec{
PersistentVolumeSource: corev1api.PersistentVolumeSource{
CSI: &corev1api.CSIPersistentVolumeSource{
Driver: "test-driver",
},
},
ClaimRef: &corev1api.ObjectReference{
Namespace: "default",
Name: "test-pvc",
},
},
},
backup: &velerov1api.Backup{
ObjectMeta: metav1.ObjectMeta{
Name: "test-backup",
Namespace: "velero",
},
},
wantSnapshot: true,
wantError: false,
},
}
for _, tt := range tests {
t.Run(tt.name, func(t *testing.T) {
// Create fake client with PV
client := velerotest.NewFakeControllerRuntimeClient(t, tt.pv, tt.pvc)
// Convert PVC to unstructured
pvcMap, err := runtime.DefaultUnstructuredConverter.ToUnstructured(tt.pvc)
require.NoError(t, err)
unstructuredPVC := &unstructured.Unstructured{Object: pvcMap}
logger := logrus.New()
// Call the function under test with nil VolumeHelper
// This exercises the fallback path that creates a new VolumeHelper per call
result, err := ShouldPerformSnapshotWithVolumeHelper(
unstructuredPVC,
kuberesource.PersistentVolumeClaims,
*tt.backup,
client,
logger,
nil, // Pass nil for VolumeHelper - exercises fallback path
)
if tt.wantError {
require.Error(t, err)
} else {
require.NoError(t, err)
require.Equal(t, tt.wantSnapshot, result)
}
})
}
}
// TestShouldPerformSnapshotWithNonNilVolumeHelper tests the ShouldPerformSnapshotWithVolumeHelper
// function when a pre-created VolumeHelper is passed. This exercises the cached path used
// by BIA plugins for better performance.
func TestShouldPerformSnapshotWithNonNilVolumeHelper(t *testing.T) {
pvc := &corev1api.PersistentVolumeClaim{
ObjectMeta: metav1.ObjectMeta{
Name: "test-pvc",
Namespace: "default",
},
Spec: corev1api.PersistentVolumeClaimSpec{
VolumeName: "test-pv",
},
Status: corev1api.PersistentVolumeClaimStatus{
Phase: corev1api.ClaimBound,
},
}
pv := &corev1api.PersistentVolume{
ObjectMeta: metav1.ObjectMeta{
Name: "test-pv",
},
Spec: corev1api.PersistentVolumeSpec{
PersistentVolumeSource: corev1api.PersistentVolumeSource{
CSI: &corev1api.CSIPersistentVolumeSource{
Driver: "test-driver",
},
},
ClaimRef: &corev1api.ObjectReference{
Namespace: "default",
Name: "test-pvc",
},
},
}
backup := &velerov1api.Backup{
ObjectMeta: metav1.ObjectMeta{
Name: "test-backup",
Namespace: "velero",
},
Spec: velerov1api.BackupSpec{
IncludedNamespaces: []string{"default"},
},
}
// Create fake client with PV and PVC
client := velerotest.NewFakeControllerRuntimeClient(t, pv, pvc)
logger := logrus.New()
// Create VolumeHelper using the internal function with namespace caching
vh, err := volumehelper.NewVolumeHelperImplWithNamespaces(
nil, // no resource policies for this test
nil, // snapshotVolumes not set
logger,
client,
false, // defaultVolumesToFSBackup
true, // backupExcludePVC
[]string{"default"},
)
require.NoError(t, err)
require.NotNil(t, vh)
// Convert PVC to unstructured
pvcMap, err := runtime.DefaultUnstructuredConverter.ToUnstructured(pvc)
require.NoError(t, err)
unstructuredPVC := &unstructured.Unstructured{Object: pvcMap}
// Call with non-nil VolumeHelper - exercises the cached path
result, err := ShouldPerformSnapshotWithVolumeHelper(
unstructuredPVC,
kuberesource.PersistentVolumeClaims,
*backup,
client,
logger,
vh, // Pass non-nil VolumeHelper - exercises cached path
)
require.NoError(t, err)
require.True(t, result, "Should return true for snapshot when snapshotVolumes not set")
}
+27 -6
View File
@@ -290,9 +290,19 @@ func getJobConfig(
if globalResult.PriorityClassName != "" {
result.PriorityClassName = globalResult.PriorityClassName
}
// Pod's labels are only read from global config, not per-repository
if len(globalResult.PodLabels) > 0 {
result.PodLabels = globalResult.PodLabels
}
// Pod's annotations are only read from global config, not per-repository
if len(globalResult.PodAnnotations) > 0 {
result.PodAnnotations = globalResult.PodAnnotations
}
}
logger.Debugf("Didn't find content for repository %s in cm %s", repo.Name, repoMaintenanceJobConfig)
logger.Debugf("Configuration content for repository %s is %+v", repo.Name, result)
return result, nil
}
@@ -580,18 +590,29 @@ func buildJob(
podLabels := map[string]string{
RepositoryNameLabel: velerolabel.ReturnNameOrHash(repo.Name),
}
for _, k := range util.ThirdPartyLabels {
if v := veleroutil.GetVeleroServerLabelValue(deployment, k); v != "" {
if config != nil && len(config.PodLabels) > 0 {
for k, v := range config.PodLabels {
podLabels[k] = v
}
} else {
for _, k := range util.ThirdPartyLabels {
if v := veleroutil.GetVeleroServerLabelValue(deployment, k); v != "" {
podLabels[k] = v
}
}
}
podAnnotations := map[string]string{}
for _, k := range util.ThirdPartyAnnotations {
if v := veleroutil.GetVeleroServerAnnotationValue(deployment, k); v != "" {
if config != nil && len(config.PodAnnotations) > 0 {
for k, v := range config.PodAnnotations {
podAnnotations[k] = v
}
} else {
for _, k := range util.ThirdPartyAnnotations {
if v := veleroutil.GetVeleroServerAnnotationValue(deployment, k); v != "" {
podAnnotations[k] = v
}
}
}
// Set arguments
+102 -1
View File
@@ -538,6 +538,45 @@ func TestGetJobConfig(t *testing.T) {
},
expectedError: nil,
},
{
name: "Configs only exist in global section should supersede specific config",
repoJobConfig: &corev1api.ConfigMap{
ObjectMeta: metav1.ObjectMeta{
Namespace: veleroNamespace,
Name: repoMaintenanceJobConfig,
},
Data: map[string]string{
GlobalKeyForRepoMaintenanceJobCM: "{\"keepLatestMaintenanceJobs\":1,\"podResources\":{\"cpuRequest\":\"50m\",\"cpuLimit\":\"100m\",\"memoryRequest\":\"50Mi\",\"memoryLimit\":\"100Mi\"},\"loadAffinity\":[{\"nodeSelector\":{\"matchExpressions\":[{\"key\":\"cloud.google.com/machine-family\",\"operator\":\"In\",\"values\":[\"n2\"]}]}}],\"priorityClassName\":\"global-priority\",\"podAnnotations\":{\"global-key\":\"global-value\"},\"podLabels\":{\"global-key\":\"global-value\"}}",
"test-default-kopia": "{\"podResources\":{\"cpuRequest\":\"100m\",\"cpuLimit\":\"200m\",\"memoryRequest\":\"100Mi\",\"memoryLimit\":\"200Mi\"},\"loadAffinity\":[{\"nodeSelector\":{\"matchExpressions\":[{\"key\":\"cloud.google.com/machine-family\",\"operator\":\"In\",\"values\":[\"e2\"]}]}}],\"priorityClassName\":\"specific-priority\",\"podAnnotations\":{\"specific-key\":\"specific-value\"},\"podLabels\":{\"specific-key\":\"specific-value\"}}",
},
},
expectedConfig: &velerotypes.JobConfigs{
KeepLatestMaintenanceJobs: &keepLatestMaintenanceJobs,
PodResources: &kube.PodResources{
CPURequest: "100m",
CPULimit: "200m",
MemoryRequest: "100Mi",
MemoryLimit: "200Mi",
},
LoadAffinities: []*kube.LoadAffinity{
{
NodeSelector: metav1.LabelSelector{
MatchExpressions: []metav1.LabelSelectorRequirement{
{
Key: "cloud.google.com/machine-family",
Operator: metav1.LabelSelectorOpIn,
Values: []string{"e2"},
},
},
},
},
},
PriorityClassName: "global-priority",
PodAnnotations: map[string]string{"global-key": "global-value"},
PodLabels: map[string]string{"global-key": "global-value"},
},
expectedError: nil,
},
}
for _, tc := range testCases {
@@ -938,12 +977,12 @@ func TestBuildJob(t *testing.T) {
deploy *appsv1api.Deployment
logLevel logrus.Level
logFormat *logging.FormatFlag
thirdPartyLabel map[string]string
expectedJobName string
expectedError bool
expectedEnv []corev1api.EnvVar
expectedEnvFrom []corev1api.EnvFromSource
expectedPodLabel map[string]string
expectedPodAnnotation map[string]string
expectedSecurityContext *corev1api.SecurityContext
expectedPodSecurityContext *corev1api.PodSecurityContext
expectedImagePullSecrets []corev1api.LocalObjectReference
@@ -1065,6 +1104,68 @@ func TestBuildJob(t *testing.T) {
expectedJobName: "",
expectedError: true,
},
{
name: "Valid maintenance job customized labels and annotations",
m: &velerotypes.JobConfigs{
PodResources: &kube.PodResources{
CPURequest: "100m",
MemoryRequest: "128Mi",
CPULimit: "200m",
MemoryLimit: "256Mi",
},
PodLabels: map[string]string{
"global-label-1": "global-label-value-1",
"global-label-2": "global-label-value-2",
},
PodAnnotations: map[string]string{
"global-annotation-1": "global-annotation-value-1",
"global-annotation-2": "global-annotation-value-2",
},
},
deploy: deploy2,
logLevel: logrus.InfoLevel,
logFormat: logging.NewFormatFlag(),
expectedError: false,
expectedJobName: "test-123-maintain-job",
expectedEnv: []corev1api.EnvVar{
{
Name: "test-name",
Value: "test-value",
},
},
expectedEnvFrom: []corev1api.EnvFromSource{
{
ConfigMapRef: &corev1api.ConfigMapEnvSource{
LocalObjectReference: corev1api.LocalObjectReference{
Name: "test-configmap",
},
},
},
{
SecretRef: &corev1api.SecretEnvSource{
LocalObjectReference: corev1api.LocalObjectReference{
Name: "test-secret",
},
},
},
},
expectedPodLabel: map[string]string{
"global-label-1": "global-label-value-1",
"global-label-2": "global-label-value-2",
RepositoryNameLabel: "test-123",
},
expectedPodAnnotation: map[string]string{
"global-annotation-1": "global-annotation-value-1",
"global-annotation-2": "global-annotation-value-2",
},
expectedSecurityContext: nil,
expectedPodSecurityContext: nil,
expectedImagePullSecrets: []corev1api.LocalObjectReference{
{
Name: "imagePullSecret1",
},
},
},
{
name: "Valid maintenance job with third party labels and BackupRepository name longer than 63",
m: &velerotypes.JobConfigs{
+4 -1
View File
@@ -109,7 +109,10 @@ func NewManager(
log: log,
}
mgr.providers[velerov1api.BackupRepositoryTypeRestic] = provider.NewResticRepositoryProvider(credentialFileStore, mgr.fileSystem, mgr.log)
mgr.providers[velerov1api.BackupRepositoryTypeRestic] = provider.NewResticRepositoryProvider(credentials.CredentialGetter{
FromFile: credentialFileStore,
FromSecret: credentialSecretStore,
}, mgr.fileSystem, mgr.log)
mgr.providers[velerov1api.BackupRepositoryTypeKopia] = provider.NewUnifiedRepoProvider(credentials.CredentialGetter{
FromFile: credentialFileStore,
FromSecret: credentialSecretStore,
+2 -2
View File
@@ -28,9 +28,9 @@ import (
"github.com/vmware-tanzu/velero/pkg/util/filesystem"
)
func NewResticRepositoryProvider(store credentials.FileStore, fs filesystem.Interface, log logrus.FieldLogger) Provider {
func NewResticRepositoryProvider(credGetter credentials.CredentialGetter, fs filesystem.Interface, log logrus.FieldLogger) Provider {
return &resticRepositoryProvider{
svc: restic.NewRepositoryService(store, fs, log),
svc: restic.NewRepositoryService(credGetter, fs, log),
}
}
+20 -5
View File
@@ -59,7 +59,7 @@ var getGCPCredentials = repoconfig.GetGCPCredentials
var getS3BucketRegion = repoconfig.GetAWSBucketRegion
type localFuncTable struct {
getStorageVariables func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error)
getStorageVariables func(*velerov1api.BackupStorageLocation, string, string, map[string]string, credentials.CredentialGetter) (map[string]string, error)
getStorageCredentials func(*velerov1api.BackupStorageLocation, credentials.FileStore) (map[string]string, error)
}
@@ -427,7 +427,7 @@ func (urp *unifiedRepoProvider) GetStoreOptions(param any) (map[string]string, e
return map[string]string{}, errors.Errorf("invalid parameter, expect %T, actual %T", RepoParam{}, param)
}
storeVar, err := funcTable.getStorageVariables(repoParam.BackupLocation, urp.repoBackend, repoParam.BackupRepo.Spec.VolumeNamespace, repoParam.BackupRepo.Spec.RepositoryConfig)
storeVar, err := funcTable.getStorageVariables(repoParam.BackupLocation, urp.repoBackend, repoParam.BackupRepo.Spec.VolumeNamespace, repoParam.BackupRepo.Spec.RepositoryConfig, urp.credentialGetter)
if err != nil {
return map[string]string{}, errors.Wrap(err, "error to get storage variables")
}
@@ -539,7 +539,7 @@ func getStorageCredentials(backupLocation *velerov1api.BackupStorageLocation, cr
// so we would accept only the options that are well defined in the internal system.
// Users' inputs should not be treated as safe any time.
// We remove the unnecessary parameters and keep the modules/logics below safe
func getStorageVariables(backupLocation *velerov1api.BackupStorageLocation, repoBackend string, repoName string, backupRepoConfig map[string]string) (map[string]string, error) {
func getStorageVariables(backupLocation *velerov1api.BackupStorageLocation, repoBackend string, repoName string, backupRepoConfig map[string]string, credGetter credentials.CredentialGetter) (map[string]string, error) {
result := make(map[string]string)
backendType := repoconfig.GetBackendType(backupLocation.Spec.Provider, backupLocation.Spec.Config)
@@ -603,8 +603,23 @@ func getStorageVariables(backupLocation *velerov1api.BackupStorageLocation, repo
result[udmrepo.StoreOptionOssBucket] = bucket
result[udmrepo.StoreOptionPrefix] = prefix
if backupLocation.Spec.ObjectStorage != nil && backupLocation.Spec.ObjectStorage.CACert != nil {
result[udmrepo.StoreOptionCACert] = base64.StdEncoding.EncodeToString(backupLocation.Spec.ObjectStorage.CACert)
if backupLocation.Spec.ObjectStorage != nil {
var caCertData []byte
// Try CACertRef first (new method), then fall back to CACert (deprecated)
if backupLocation.Spec.ObjectStorage.CACertRef != nil {
caCertString, err := credGetter.FromSecret.Get(backupLocation.Spec.ObjectStorage.CACertRef)
if err != nil {
return nil, errors.Wrap(err, "error getting CA certificate from secret")
}
caCertData = []byte(caCertString)
} else if backupLocation.Spec.ObjectStorage.CACert != nil {
caCertData = backupLocation.Spec.ObjectStorage.CACert
}
if caCertData != nil {
result[udmrepo.StoreOptionCACert] = base64.StdEncoding.EncodeToString(caCertData)
}
}
result[udmrepo.StoreOptionOssRegion] = strings.Trim(region, "/")
result[udmrepo.StoreOptionFsPath] = config["fspath"]
+45 -24
View File
@@ -465,7 +465,7 @@ func TestGetStorageVariables(t *testing.T) {
t.Run(tc.name, func(t *testing.T) {
getS3BucketRegion = tc.getS3BucketRegion
actual, err := getStorageVariables(&tc.backupLocation, tc.repoBackend, tc.repoName, tc.repoConfig)
actual, err := getStorageVariables(&tc.backupLocation, tc.repoBackend, tc.repoName, tc.repoConfig, velerocredentials.CredentialGetter{})
require.Equal(t, tc.expected, actual)
@@ -554,7 +554,7 @@ func TestGetStoreOptions(t *testing.T) {
BackupRepo: &velerov1api.BackupRepository{},
},
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error) {
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, errors.New("fake-error-2")
},
},
@@ -568,7 +568,7 @@ func TestGetStoreOptions(t *testing.T) {
BackupRepo: &velerov1api.BackupRepository{},
},
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error) {
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, nil
},
getStorageCredentials: func(*velerov1api.BackupStorageLocation, velerocredentials.FileStore) (map[string]string, error) {
@@ -637,7 +637,7 @@ func TestPrepareRepo(t *testing.T) {
repoService: new(reposervicenmocks.BackupRepoService),
credStoreReturn: "fake-password",
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error) {
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, errors.New("fake-store-option-error")
},
},
@@ -648,7 +648,7 @@ func TestPrepareRepo(t *testing.T) {
getter: new(credmock.SecretStore),
credStoreReturn: "fake-password",
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error) {
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, nil
},
getStorageCredentials: func(*velerov1api.BackupStorageLocation, velerocredentials.FileStore) (map[string]string, error) {
@@ -666,7 +666,7 @@ func TestPrepareRepo(t *testing.T) {
getter: new(credmock.SecretStore),
credStoreReturn: "fake-password",
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error) {
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, nil
},
getStorageCredentials: func(*velerov1api.BackupStorageLocation, velerocredentials.FileStore) (map[string]string, error) {
@@ -687,7 +687,7 @@ func TestPrepareRepo(t *testing.T) {
getter: new(credmock.SecretStore),
credStoreReturn: "fake-password",
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error) {
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, nil
},
getStorageCredentials: func(*velerov1api.BackupStorageLocation, velerocredentials.FileStore) (map[string]string, error) {
@@ -703,12 +703,33 @@ func TestPrepareRepo(t *testing.T) {
},
expectedErr: "cannot create new backup repo for read-only backup storage location velero/fake-bsl",
},
{
name: "create fail",
getter: new(credmock.SecretStore),
credStoreReturn: "fake-password",
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, nil
},
getStorageCredentials: func(*velerov1api.BackupStorageLocation, velerocredentials.FileStore) (map[string]string, error) {
return map[string]string{}, nil
},
},
repoService: new(reposervicenmocks.BackupRepoService),
retFuncCheck: func(ctx context.Context, repoOption udmrepo.RepoOptions) (bool, error) {
return false, nil
},
retFuncCreate: func(ctx context.Context, repoOption udmrepo.RepoOptions) error {
return errors.New("fake-error-1")
},
expectedErr: "error to create backup repo: fake-error-1",
},
{
name: "initialize error",
getter: new(credmock.SecretStore),
credStoreReturn: "fake-password",
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error) {
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, nil
},
getStorageCredentials: func(*velerov1api.BackupStorageLocation, velerocredentials.FileStore) (map[string]string, error) {
@@ -729,7 +750,7 @@ func TestPrepareRepo(t *testing.T) {
getter: new(credmock.SecretStore),
credStoreReturn: "fake-password",
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error) {
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, nil
},
getStorageCredentials: func(*velerov1api.BackupStorageLocation, velerocredentials.FileStore) (map[string]string, error) {
@@ -812,7 +833,7 @@ func TestForget(t *testing.T) {
getter: new(credmock.SecretStore),
credStoreReturn: "fake-password",
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error) {
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, nil
},
getStorageCredentials: func(*velerov1api.BackupStorageLocation, velerocredentials.FileStore) (map[string]string, error) {
@@ -836,7 +857,7 @@ func TestForget(t *testing.T) {
getter: new(credmock.SecretStore),
credStoreReturn: "fake-password",
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error) {
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, nil
},
getStorageCredentials: func(*velerov1api.BackupStorageLocation, velerocredentials.FileStore) (map[string]string, error) {
@@ -864,7 +885,7 @@ func TestForget(t *testing.T) {
getter: new(credmock.SecretStore),
credStoreReturn: "fake-password",
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error) {
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, nil
},
getStorageCredentials: func(*velerov1api.BackupStorageLocation, velerocredentials.FileStore) (map[string]string, error) {
@@ -962,7 +983,7 @@ func TestBatchForget(t *testing.T) {
getter: new(credmock.SecretStore),
credStoreReturn: "fake-password",
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error) {
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, nil
},
getStorageCredentials: func(*velerov1api.BackupStorageLocation, velerocredentials.FileStore) (map[string]string, error) {
@@ -986,7 +1007,7 @@ func TestBatchForget(t *testing.T) {
getter: new(credmock.SecretStore),
credStoreReturn: "fake-password",
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error) {
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, nil
},
getStorageCredentials: func(*velerov1api.BackupStorageLocation, velerocredentials.FileStore) (map[string]string, error) {
@@ -1015,7 +1036,7 @@ func TestBatchForget(t *testing.T) {
getter: new(credmock.SecretStore),
credStoreReturn: "fake-password",
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error) {
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, nil
},
getStorageCredentials: func(*velerov1api.BackupStorageLocation, velerocredentials.FileStore) (map[string]string, error) {
@@ -1124,7 +1145,7 @@ func TestInitRepo(t *testing.T) {
getter: new(credmock.SecretStore),
credStoreReturn: "fake-password",
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error) {
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, nil
},
getStorageCredentials: func(*velerov1api.BackupStorageLocation, velerocredentials.FileStore) (map[string]string, error) {
@@ -1142,7 +1163,7 @@ func TestInitRepo(t *testing.T) {
getter: new(credmock.SecretStore),
credStoreReturn: "fake-password",
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error) {
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, nil
},
getStorageCredentials: func(*velerov1api.BackupStorageLocation, velerocredentials.FileStore) (map[string]string, error) {
@@ -1218,7 +1239,7 @@ func TestConnectToRepo(t *testing.T) {
getter: new(credmock.SecretStore),
credStoreReturn: "fake-password",
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error) {
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, nil
},
getStorageCredentials: func(*velerov1api.BackupStorageLocation, velerocredentials.FileStore) (map[string]string, error) {
@@ -1236,7 +1257,7 @@ func TestConnectToRepo(t *testing.T) {
getter: new(credmock.SecretStore),
credStoreReturn: "fake-password",
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error) {
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, nil
},
getStorageCredentials: func(*velerov1api.BackupStorageLocation, velerocredentials.FileStore) (map[string]string, error) {
@@ -1310,7 +1331,7 @@ func TestBoostRepoConnect(t *testing.T) {
getter: new(credmock.SecretStore),
credStoreReturn: "fake-password",
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error) {
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, nil
},
getStorageCredentials: func(*velerov1api.BackupStorageLocation, velerocredentials.FileStore) (map[string]string, error) {
@@ -1337,7 +1358,7 @@ func TestBoostRepoConnect(t *testing.T) {
getter: new(credmock.SecretStore),
credStoreReturn: "fake-password",
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error) {
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, nil
},
getStorageCredentials: func(*velerov1api.BackupStorageLocation, velerocredentials.FileStore) (map[string]string, error) {
@@ -1363,7 +1384,7 @@ func TestBoostRepoConnect(t *testing.T) {
getter: new(credmock.SecretStore),
credStoreReturn: "fake-password",
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error) {
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, nil
},
getStorageCredentials: func(*velerov1api.BackupStorageLocation, velerocredentials.FileStore) (map[string]string, error) {
@@ -1450,7 +1471,7 @@ func TestPruneRepo(t *testing.T) {
getter: new(credmock.SecretStore),
credStoreReturn: "fake-password",
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error) {
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, nil
},
getStorageCredentials: func(*velerov1api.BackupStorageLocation, velerocredentials.FileStore) (map[string]string, error) {
@@ -1468,7 +1489,7 @@ func TestPruneRepo(t *testing.T) {
getter: new(credmock.SecretStore),
credStoreReturn: "fake-password",
funcTable: localFuncTable{
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string) (map[string]string, error) {
getStorageVariables: func(*velerov1api.BackupStorageLocation, string, string, map[string]string, velerocredentials.CredentialGetter) (map[string]string, error) {
return map[string]string{}, nil
},
getStorageCredentials: func(*velerov1api.BackupStorageLocation, velerocredentials.FileStore) (map[string]string, error) {
+35 -15
View File
@@ -31,18 +31,18 @@ import (
"github.com/vmware-tanzu/velero/pkg/util/filesystem"
)
func NewRepositoryService(store credentials.FileStore, fs filesystem.Interface, log logrus.FieldLogger) *RepositoryService {
func NewRepositoryService(credGetter credentials.CredentialGetter, fs filesystem.Interface, log logrus.FieldLogger) *RepositoryService {
return &RepositoryService{
credentialsFileStore: store,
fileSystem: fs,
log: log,
credGetter: credGetter,
fileSystem: fs,
log: log,
}
}
type RepositoryService struct {
credentialsFileStore credentials.FileStore
fileSystem filesystem.Interface
log logrus.FieldLogger
credGetter credentials.CredentialGetter
fileSystem filesystem.Interface
log logrus.FieldLogger
}
func (r *RepositoryService) InitRepo(bsl *velerov1api.BackupStorageLocation, repo *velerov1api.BackupRepository) error {
@@ -77,7 +77,7 @@ func (r *RepositoryService) DefaultMaintenanceFrequency() time.Duration {
}
func (r *RepositoryService) exec(cmd *restic.Command, bsl *velerov1api.BackupStorageLocation) error {
file, err := r.credentialsFileStore.Path(repokey.RepoKeySelector())
file, err := r.credGetter.FromFile.Path(repokey.RepoKeySelector())
if err != nil {
return err
}
@@ -88,17 +88,37 @@ func (r *RepositoryService) exec(cmd *restic.Command, bsl *velerov1api.BackupSto
// if there's a caCert on the ObjectStorage, write it to disk so that it can be passed to restic
var caCertFile string
if bsl.Spec.ObjectStorage != nil && bsl.Spec.ObjectStorage.CACert != nil {
caCertFile, err = restic.TempCACertFile(bsl.Spec.ObjectStorage.CACert, bsl.Name, r.fileSystem)
if err != nil {
return errors.Wrap(err, "error creating temp cacert file")
if bsl.Spec.ObjectStorage != nil {
var caCertData []byte
// Try CACertRef first (new method), then fall back to CACert (deprecated)
if bsl.Spec.ObjectStorage.CACertRef != nil {
caCertString, err := r.credGetter.FromSecret.Get(bsl.Spec.ObjectStorage.CACertRef)
if err != nil {
return errors.Wrap(err, "error getting CA certificate from secret")
}
caCertData = []byte(caCertString)
} else if bsl.Spec.ObjectStorage.CACert != nil {
caCertData = bsl.Spec.ObjectStorage.CACert
}
if caCertData != nil {
caCertFile, err = restic.TempCACertFile(caCertData, bsl.Name, r.fileSystem)
if err != nil {
return errors.Wrap(err, "error creating temp cacert file")
}
// ignore error since there's nothing we can do and it's a temp file.
defer os.Remove(caCertFile)
}
// ignore error since there's nothing we can do and it's a temp file.
defer os.Remove(caCertFile)
}
cmd.CACertFile = caCertFile
env, err := restic.CmdEnv(bsl, r.credentialsFileStore)
// CmdEnv uses credGetter.FromFile (not FromSecret) to get cloud provider credentials.
// FromFile materializes the BSL's Credential secret to a file path that cloud SDKs
// can read (e.g., AWS_SHARED_CREDENTIALS_FILE). This is different from caCertRef above,
// which uses FromSecret to read the CA certificate data directly into memory, then
// writes it to a temp file because restic CLI only accepts file paths (--cacert flag).
env, err := restic.CmdEnv(bsl, r.credGetter.FromFile)
if err != nil {
return err
}
@@ -103,6 +103,14 @@ func (p *volumeSnapshotRestoreItemAction) Execute(
// DeletionPolicy to Retain.
resetVolumeSnapshotAnnotation(&vs)
if vs.Spec.VolumeSnapshotClassName != nil {
// Delete VolumeSnapshotClass from the VolumeSnapshot.
// This is necessary to make the restore independent of the VolumeSnapshotClass.
vs.Spec.VolumeSnapshotClassName = nil
p.log.Debugf("Deleted VolumeSnapshotClassName from VolumeSnapshot %s/%s to make restore independent of VolumeSnapshotClass",
vs.Namespace, vs.Name)
}
vsMap, err := runtime.DefaultUnstructuredConverter.ToUnstructured(&vs)
if err != nil {
p.log.Errorf("Fail to convert VS %s to unstructured", vs.Namespace+"/"+vs.Name)
@@ -124,14 +124,20 @@ func TestVSExecute(t *testing.T) {
},
{
name: "Normal case, VSC should be created",
vs: builder.ForVolumeSnapshot("ns", "vsName").ObjectMeta(
builder.WithAnnotationsMap(
map[string]string{
velerov1api.VolumeSnapshotHandleAnnotation: "vsc",
velerov1api.DriverNameAnnotation: "pd.csi.storage.gke.io",
},
),
).SourceVolumeSnapshotContentName(newVscName).Status().BoundVolumeSnapshotContentName("vscName").Result(),
vs: builder.ForVolumeSnapshot("ns", "vsName").
ObjectMeta(
builder.WithAnnotationsMap(
map[string]string{
velerov1api.VolumeSnapshotHandleAnnotation: "vsc",
velerov1api.DriverNameAnnotation: "pd.csi.storage.gke.io",
},
),
).
SourceVolumeSnapshotContentName(newVscName).
VolumeSnapshotClass("vscClass").
Status().
BoundVolumeSnapshotContentName("vscName").
Result(),
restore: builder.ForRestore("velero", "restore").ObjectMeta(builder.WithUID("restoreUID")).Result(),
expectErr: false,
expectedVS: builder.ForVolumeSnapshot("ns", "test").SourceVolumeSnapshotContentName(newVscName).Result(),
@@ -108,6 +108,14 @@ func (p *volumeSnapshotContentRestoreItemAction) Execute(
return nil, errors.Errorf("fail to get snapshot handle from VSC %s status", vsc.Name)
}
if vsc.Spec.VolumeSnapshotClassName != nil {
// Delete VolumeSnapshotClass from the VolumeSnapshotContent.
// This is necessary to make the restore independent of the VolumeSnapshotClass.
vsc.Spec.VolumeSnapshotClassName = nil
p.log.Debugf("Deleted VolumeSnapshotClassName from VolumeSnapshotContent %s to make restore independent of VolumeSnapshotClass",
vsc.Name)
}
additionalItems := []velero.ResourceIdentifier{}
if csi.IsVolumeSnapshotContentHasDeleteSecret(&vsc) {
additionalItems = append(additionalItems,
@@ -55,13 +55,17 @@ func TestVSCExecute(t *testing.T) {
},
{
name: "Normal case, additional items should return ",
vsc: builder.ForVolumeSnapshotContent("test").ObjectMeta(builder.WithAnnotationsMap(
map[string]string{
velerov1api.PrefixedSecretNameAnnotation: "name",
velerov1api.PrefixedSecretNamespaceAnnotation: "namespace",
},
)).VolumeSnapshotRef("velero", "vsName", "vsUID").
Status(&snapshotv1api.VolumeSnapshotContentStatus{SnapshotHandle: &snapshotHandleName}).Result(),
vsc: builder.ForVolumeSnapshotContent("test").
ObjectMeta(builder.WithAnnotationsMap(
map[string]string{
velerov1api.PrefixedSecretNameAnnotation: "name",
velerov1api.PrefixedSecretNamespaceAnnotation: "namespace",
},
)).
VolumeSnapshotRef("velero", "vsName", "vsUID").
VolumeSnapshotClassName("vsClass").
Status(&snapshotv1api.VolumeSnapshotContentStatus{SnapshotHandle: &snapshotHandleName}).
Result(),
restore: builder.ForRestore("velero", "restore").ObjectMeta(builder.WithUID("restoreUID")).
NamespaceMappings("velero", "restore").Result(),
expectErr: false,
@@ -72,15 +76,17 @@ func TestVSCExecute(t *testing.T) {
Name: "name",
},
},
expectedVSC: builder.ForVolumeSnapshotContent(newVscName).ObjectMeta(builder.WithAnnotationsMap(
map[string]string{
velerov1api.PrefixedSecretNameAnnotation: "name",
velerov1api.PrefixedSecretNamespaceAnnotation: "namespace",
},
)).VolumeSnapshotRef("restore", newVscName, "").
expectedVSC: builder.ForVolumeSnapshotContent(newVscName).
ObjectMeta(builder.WithAnnotationsMap(
map[string]string{
velerov1api.PrefixedSecretNameAnnotation: "name",
velerov1api.PrefixedSecretNamespaceAnnotation: "namespace",
},
)).VolumeSnapshotRef("restore", newVscName, "").
Source(snapshotv1api.VolumeSnapshotContentSource{SnapshotHandle: &snapshotHandleName}).
DeletionPolicy(snapshotv1api.VolumeSnapshotContentRetain).
Status(&snapshotv1api.VolumeSnapshotContentStatus{SnapshotHandle: &snapshotHandleName}).Result(),
Status(&snapshotv1api.VolumeSnapshotContentStatus{SnapshotHandle: &snapshotHandleName}).
Result(),
},
{
name: "VSC exists in cluster, same as the normal case",
+61
View File
@@ -25,6 +25,7 @@ import (
"os/signal"
"path/filepath"
"reflect"
"slices"
"sort"
"strings"
"sync"
@@ -77,6 +78,7 @@ import (
"github.com/vmware-tanzu/velero/pkg/util/filesystem"
"github.com/vmware-tanzu/velero/pkg/util/kube"
"github.com/vmware-tanzu/velero/pkg/util/results"
"github.com/vmware-tanzu/velero/pkg/util/wildcard"
)
const ObjectStatusRestoreAnnotationKey = "velero.io/restore-status"
@@ -474,6 +476,12 @@ func (ctx *restoreContext) execute() (results.Result, results.Result) {
return warnings, errs
}
// Expand wildcard patterns in namespace includes/excludes if needed
if err := ctx.expandNamespaceWildcards(backupResources); err != nil {
errs.AddVeleroError(err)
return warnings, errs
}
// TODO: Remove outer feature flag check to make this feature a default in Velero.
if features.IsEnabled(velerov1api.APIGroupVersionsFeatureFlag) {
if ctx.backup.Status.FormatVersion >= "1.1.0" {
@@ -2378,6 +2386,59 @@ func (ctx *restoreContext) getSelectedRestoreableItems(resource string, original
return restorable, warnings, errs
}
// extractNamespacesFromBackup extracts all available namespaces from backup resources
func extractNamespacesFromBackup(backupResources map[string]*archive.ResourceItems) []string {
namespaceSet := make(map[string]struct{})
for _, resource := range backupResources {
for namespace := range resource.ItemsByNamespace {
if namespace != "" { // Skip cluster-scoped resources (empty namespace)
namespaceSet[namespace] = struct{}{}
}
}
}
namespaces := make([]string, 0, len(namespaceSet))
for ns := range namespaceSet {
namespaces = append(namespaces, ns)
}
return namespaces
}
// expandNamespaceWildcards expands wildcard patterns in namespace includes/excludes
// and updates the restore context with the expanded patterns and status
func (ctx *restoreContext) expandNamespaceWildcards(backupResources map[string]*archive.ResourceItems) error {
if !wildcard.ShouldExpandWildcards(ctx.restore.Spec.IncludedNamespaces, ctx.restore.Spec.ExcludedNamespaces) {
return nil
}
// If `*` is mentioned in restore excludes, something is wrong
if slices.Contains(ctx.restore.Spec.ExcludedNamespaces, "*") {
return errors.New("wildcard '*' is not allowed in restore excludes")
}
availableNamespaces := extractNamespacesFromBackup(backupResources)
expandedIncludes, expandedExcludes, err := wildcard.ExpandWildcards(
availableNamespaces,
ctx.restore.Spec.IncludedNamespaces,
ctx.restore.Spec.ExcludedNamespaces,
)
if err != nil {
return errors.Wrap(err, "error expanding wildcard patterns in namespace includes/excludes")
}
// Update namespace includes/excludes with expanded patterns
ctx.namespaceIncludesExcludes = collections.NewIncludesExcludes().
Includes(expandedIncludes...).
Excludes(expandedExcludes...)
selectedNamespaces := wildcard.GetWildcardResult(expandedIncludes, expandedExcludes)
ctx.log.Infof("Expanded namespace wildcards - includes: %v, excludes: %v, final: %v",
expandedIncludes, expandedExcludes, selectedNamespaces)
return nil
}
// removeRestoreLabels removes the restore name and the
// restored backup's name.
func removeRestoreLabels(obj metav1.Object) {

Some files were not shown because too many files have changed in this diff Show More