Update the community page to add the correct links to the community
meeting and meeting notes.
I also removed the reference to the Google group, as I confirmed the last
message was sent 2 years ago.
Signed-off-by: Daniel Jiang <daniel.jiang@broadcom.com>
* Add CI check for invalid characters in file paths
Go's module zip rejects filenames containing certain characters (shell
special chars like " ' * < > ? ` |, path separators : \, and non-letter
Unicode such as control/format characters). This caused a build failure
when a changelog file contained an invisible U+200E LEFT-TO-RIGHT MARK
(see PR #9552).
Add a GitHub Actions workflow that validates all tracked file paths on
every PR to catch these issues before they reach downstream consumers.
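A minimal Go sketch of the kind of check described above, assuming a simplified rule set: reject the ASCII characters listed in this message and any non-ASCII rune that is not a letter. It is not the workflow added by this commit, and Go's real rules (see `module.CheckFilePath` in golang.org/x/mod) are stricter. The function names and sample paths are hypothetical.

```go
package main

import (
	"fmt"
	"strings"
	"unicode"
)

// rejectedASCII lists the ASCII characters named in the commit message:
// shell special characters plus the path separators ':' and '\'.
const rejectedASCII = "\"'*<>?`|:\\"

// invalidPathRune reports whether r is unacceptable under the simplified
// rules of this sketch (an assumption, not Go's exact rule set).
func invalidPathRune(r rune) bool {
	if r <= unicode.MaxASCII {
		return unicode.IsControl(r) || strings.ContainsRune(rejectedASCII, r)
	}
	// Outside ASCII only letters are accepted, which rejects control and
	// format characters such as U+200E LEFT-TO-RIGHT MARK.
	return !unicode.IsLetter(r)
}

// firstInvalidRune returns the first offending rune in path, if any.
func firstInvalidRune(path string) (rune, bool) {
	for _, r := range path {
		if invalidPathRune(r) {
			return r, true
		}
	}
	return 0, false
}

func main() {
	// Hypothetical file names; the second ends with an invisible U+200E.
	paths := []string{
		"changelogs/unreleased/9552-example",
		"changelogs/unreleased/9552-example\u200e",
	}
	for _, p := range paths {
		if r, bad := firstInvalidRune(p); bad {
			fmt.Printf("%q: invalid character %U\n", p, r)
		} else {
			fmt.Printf("%q: ok\n", p)
		}
	}
}
```

Printing the code point with `%U` makes invisible characters visible in CI logs.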
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
* Fix changelog filenames containing invisible U+200E characters
Remove LEFT-TO-RIGHT MARK unicode characters from changelog filenames
that would cause Go module zip failures.
Generated with [Claude Code](https://claude.ai/code)
via [Happy](https://happy.engineering)
Co-Authored-By: Claude <noreply@anthropic.com>
Co-Authored-By: Happy <yesreply@happy.engineering>
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
---------
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
Co-authored-by: Claude Opus 4.6 <noreply@anthropic.com>
Co-authored-by: Happy <yesreply@happy.engineering>
The `getDataUpload` function in the CSI PVC backup plugin was
previously making a cluster-scoped list query to retrieve DataUpload
CRs. In environments with strict minimum-privilege RBAC, this would
fail with forbidden errors.
This explicitly passes the backup namespace into the `ListOptions`
when calling `crClient.List`, correctly scoping the queries to the
backup's namespace. Unit tests have also been updated to ensure
cross-namespace queries are rejected appropriately.
Signed-off-by: Adam Zhang <adam.zhang@broadcom.com>
Kubernetes 1.34 introduced VolumeGroupSnapshot v1beta2 API and
deprecated v1beta1. Distributions running K8s 1.34+ (e.g. OpenShift
4.21+) have removed v1beta1 VGS CRDs entirely, breaking Velero's
VGS functionality on those clusters.
This change bumps external-snapshotter/client/v8 from v8.2.0 to
v8.4.0 and migrates all VGS API usage from v1beta1 to v1beta2.
The v1beta2 API is structurally compatible - the Spec-level types
(GroupSnapshotHandles, VolumeGroupSnapshotContentSource) are
unchanged. The Status-level change (VolumeSnapshotHandlePairList
replaced by VolumeSnapshotInfoList) does not affect Velero as it
does not directly consume that type.
Fixes #9694
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
* Fix VolumeGroupSnapshot restore on Ceph RBD
This PR fixes two related issues affecting CSI snapshot restore on Ceph RBD:
1. VolumeGroupSnapshot restore fails because Ceph RBD populates
volumeGroupSnapshotHandle on pre-provisioned VSCs, but Velero doesn't
create the required VGSC during restore.
2. CSI snapshot restore fails because VolumeSnapshotClassName is removed
from restored VSCs, preventing the CSI controller from getting
credentials for snapshot verification.
Changes:
- Capture volumeGroupSnapshotHandle during backup as VS annotation
- Create stub VGSC during restore with matching handle in status
- Look up VolumeSnapshotClass by driver and set on restored VSC
Fixes #9512
Fixes #9515
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
* Add changelog for VGS restore fix
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
* Fix gofmt import order
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
* Add changelog for VGS restore fix
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
* Fix import alias corev1 to corev1api per lint config
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
* Fix: Add snapshot handles to existing stub VGSC and add unit tests
When multiple VolumeSnapshots from the same VolumeGroupSnapshot are
restored, they share the same VolumeGroupSnapshotHandle but have
different individual snapshot handles. This commit:
1. Fixes incomplete logic where existing VGSC wasn't updated with
new snapshot handles (addresses review feedback)
2. Fixes race condition where Create returning AlreadyExists would
skip adding the snapshot handle
3. Adds comprehensive unit tests for ensureStubVGSCExists (5 cases)
and addSnapshotHandleToVGSC (4 cases) functions
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
* Clean up stub VolumeGroupSnapshotContents during restore finalization
Add cleanup logic for stub VGSCs created during VolumeGroupSnapshot restore.
The stub VGSCs are temporary objects needed to satisfy CSI controller
validation during VSC reconciliation. Once all related VSCs become
ReadyToUse, the stub VGSCs are no longer needed and should be removed.
The cleanup runs in the restore finalizer controller's execute() phase.
Before deleting each VGSC, it polls until all related VolumeSnapshotContents
(correlated by snapshot handle) are ReadyToUse, with a timeout fallback.
Deletion failures and CRD-not-installed scenarios are treated as warnings
rather than errors to avoid failing the restore.
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
* Fix lint: remove unused nolint directive and simplify cleanupStubVGSC return
The cleanupStubVGSC function only produces warnings (not errors), so
simplify its return signature. Also remove the now-unused nolint:unparam
directive on execute() since warnings are no longer always nil.
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
---------
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
Restrict the listing of PodVolumeBackup resources to the specific
restore namespace in both the core restore controller and the pod
volume restore action plugin. This prevents "Forbidden" errors when
Velero is configured with namespace-scoped minimum privileges,
avoiding the need for cluster-scoped list permissions for
PodVolumeBackups.
Fixes: #9681
Signed-off-by: Adam Zhang <adam.zhang@broadcom.com>
The tag was previously set to latest. The latest tag, v0.23.3, already
uses Golang v1.26, while Velero main still uses v1.25, so the build failed.
To fix this, pin controller-runtime to v0.23.2.
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
* feat: support backup hooks on sidecars
Add support for configuring Kubernetes native
Sidecars as target containers for Backup Hooks
commands. This is purely a validation level
patch as the actual pods/exec API doesn't make
any distinction between standard and sidecar
containers.
Signed-off-by: Gabriele Fedi <gabriele.fedi@enterprisedb.com>
* test: extend unit tests
Signed-off-by: Gabriele Fedi <gabriele.fedi@enterprisedb.com>
* chore: changelog
Signed-off-by: Gabriele Fedi <gabriele.fedi@enterprisedb.com>
* style: fix linter issues
Signed-off-by: Gabriele Fedi <gabriele.fedi@enterprisedb.com>
---------
Signed-off-by: Gabriele Fedi <gabriele.fedi@enterprisedb.com>
* fix configmap lookup in non-default namespaces
o.Namespace is empty when Validate runs (Complete hasn't been called yet),
causing VerifyJSONConfigs to query the default namespace instead of the
intended one. Replace o.Namespace with f.Namespace() in all three ConfigMap
validation calls so the factory's already-resolved namespace is used.
Signed-off-by: Adam Zhang <adam.zhang@broadcom.com>
* switch the call order of validate/complete
Switch the call order of validate/complete, which accomplishes
the same effect.
Signed-off-by: Adam Zhang <adam.zhang@broadcom.com>
---------
Signed-off-by: Adam Zhang <adam.zhang@broadcom.com>
Starting with 1.18.1, Velero adds some default affinity to the backup/restore pod,
so the whole affinity can no longer be compared directly.
Instead, verify that the expected affinity is contained in the pod affinity.
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
The itemOperationTimeout field was missing from the Schedule API type
documentation even though it is supported in the Schedule CRD template.
This led users to believe the field was not available per-schedule.
Fixes #9598
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
* Fix DBR stuck when CSI snapshot no longer exists in cloud provider
During backup deletion, VolumeSnapshotContentDeleteItemAction creates a
new VSC with the snapshot handle from the backup and polls for readiness.
If the underlying snapshot no longer exists (e.g., deleted externally),
the CSI driver reports Status.Error but checkVSCReadiness() only checks
ReadyToUse, causing it to poll for the full 10-minute timeout instead of
failing fast. Additionally, the newly created VSC is never cleaned up on
failure, leaving orphaned resources in the cluster.
This commit:
- Adds Status.Error detection in checkVSCReadiness() to fail immediately
on permanent CSI driver errors (e.g., InvalidSnapshot.NotFound)
- Cleans up the dangling VSC when readiness polling fails
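The fail-fast behavior can be sketched in Go as follows. This is an illustration under assumptions, not the code in checkVSCReadiness(): `vscStatus` is a simplified stand-in for the real VolumeSnapshotContent status, and `checkReadiness` is a hypothetical name.

```go
package main

import (
	"errors"
	"fmt"
)

// vscStatus is a simplified, hypothetical stand-in for the status of a
// VolumeSnapshotContent.
type vscStatus struct {
	ReadyToUse *bool
	Error      *string // message reported by the CSI driver, if any
}

// checkReadiness returns (true, nil) when the content is ready, (false, nil)
// when polling should continue, and a non-nil error when polling should stop
// immediately because the driver reported an error.
func checkReadiness(s *vscStatus) (bool, error) {
	if s == nil {
		return false, nil // status not populated yet, keep polling
	}
	if s.Error != nil {
		return false, errors.New("snapshot content reported an error: " + *s.Error)
	}
	return s.ReadyToUse != nil && *s.ReadyToUse, nil
}

func main() {
	msg := "InvalidSnapshot.NotFound"
	done, err := checkReadiness(&vscStatus{Error: &msg})
	fmt.Println(done, err)
}
```

Returning an error from the poll condition ends the wait at once instead of running to the timeout, and the caller can then delete the VSC it created.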
Fixes #9579
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
* Add changelog for PR #9581
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
* Fix typo in pod_volume_test.go: colume -> volume
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
---------
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
* Issue #9544: Add test coverage and fix validation for MRAP ARN bucket names
S3 Multi-Region Access Point (MRAP) ARNs have the format:
arn:aws:s3::{account-id}:accesspoint/{mrap-alias}.mrap
These ARNs contain a '/' as part of the ARN path, which caused Velero's
BSL bucket validation to reject them with an error asking the user to
put the value in the Prefix field instead.
Fix the bucket name validation in objectBackupStoreGetter.Get() to
exempt ARNs (identified by the "arn:" prefix) from the slash check,
since slashes are a valid and required part of ARN syntax.
Add unit tests in object_store_mrap_test.go covering:
- A plain MRAP ARN as bucket name succeeds
- An MRAP ARN with a trailing slash is trimmed and accepted
Signed-off-by: Sabir Ali <testsabirweb@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
* Address review comments: fix changelog filename and import grouping
Signed-off-by: Sabir Ali <testsabirweb@gmail.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
* Restrict MRAP ARN bucket validation to arn:aws:s3: prefix
Per review, use HasPrefix(bucket, "arn:aws:s3:") instead of
HasPrefix(bucket, "arn:") so only S3 ARNs (e.g. MRAP) are exempt
from the slash check, not any ARN from other AWS services.
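A rough Go sketch of the resulting rule, for illustration only. `normalizeBucket` is a hypothetical helper rather than the code in objectBackupStoreGetter.Get(), the error text is made up, and the account ID and alias in the example are placeholders.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// normalizeBucket trims trailing slashes and rejects bucket names that still
// contain '/', unless the value is an S3 ARN (prefix "arn:aws:s3:").
// Hypothetical helper; the trimming behavior is an assumption.
func normalizeBucket(bucket string) (string, error) {
	bucket = strings.TrimRight(bucket, "/")
	if strings.Contains(bucket, "/") && !strings.HasPrefix(bucket, "arn:aws:s3:") {
		return "", errors.New("bucket must not contain '/'; put any path in the prefix field")
	}
	return bucket, nil
}

func main() {
	for _, b := range []string{
		"arn:aws:s3::123456789012:accesspoint/example.mrap/", // placeholder MRAP ARN
		"my-bucket/backups",
		"arn:aws:iam::123456789012:role/example", // non-S3 ARN, still rejected
	} {
		got, err := normalizeBucket(b)
		fmt.Printf("%q -> %q, err=%v\n", b, got, err)
	}
}
```

Matching on the narrower prefix keeps the slash check in force for ARNs of other AWS services.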
Signed-off-by: Sabir Ali <sabir.ali@spectrocloud.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
* Move MRAP bucket tests into TestNewObjectBackupStoreGetter
Consolidate MRAP ARN test cases into the existing table in
object_store_test.go and remove object_store_mrap_test.go.
Signed-off-by: Sabir Ali <sabir.ali@spectrocloud.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
---------
Signed-off-by: Sabir Ali <testsabirweb@gmail.com>
Signed-off-by: Sabir Ali <sabir.ali@spectrocloud.com>
Co-authored-by: Cursor <cursoragent@cursor.com>
Add a new Prometheus gauge metric that exposes the expected interval
between consecutive scheduled backups. This enables dynamic alerting
thresholds for each schedule's backups.
Signed-off-by: Quang Ngo <quang.ngo@canonical.com>
* Support all glob wildcard characters in namespace validation
Expand namespace validation to allow all valid glob pattern characters
(*, ?, {}, [], ,) by replacing them with valid characters during RFC 1123
validation. The actual glob pattern validation is handled separately by
the wildcard package.
Also add validation to reject unsupported characters (|, (), !) that are
not valid in glob patterns, and update terminology from "regex" to "glob"
for clarity since this implementation uses glob patterns, not regex.
Changes:
- Replace all glob wildcard characters in validateNamespaceName
- Add test coverage for valid glob patterns in includes/excludes
- Add test coverage for unsupported characters
- Reject exclamation mark (!) in wildcard patterns
- Clarify comments and error messages about glob vs regex
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Joseph <jvaikath@redhat.com>
* Changelog
Signed-off-by: Joseph <jvaikath@redhat.com>
* Add documentation: glob patterns are now accepted
Signed-off-by: Joseph <jvaikath@redhat.com>
* Error message fix
Signed-off-by: Joseph <jvaikath@redhat.com>
* Remove negation glob char test
Signed-off-by: Joseph <jvaikath@redhat.com>
* Add bracket pattern validation for namespace glob patterns
Extends wildcard validation to support square bracket patterns [] used in glob character classes. Validates bracket syntax including empty brackets, unclosed brackets, and unmatched brackets. Extracts ValidateNamespaceName as a public function to enable reuse in namespace validation logic.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Joseph <jvaikath@redhat.com>
* Reduce scope to *, ?, [ and ]
Signed-off-by: Joseph <jvaikath@redhat.com>
* Fix tests
Signed-off-by: Joseph <jvaikath@redhat.com>
* Add namespace glob patterns documentation page
Adds dedicated documentation explaining supported glob patterns
for namespace include/exclude filtering to help users understand
the wildcard syntax.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Joseph <jvaikath@redhat.com>
* Fix build-image Dockerfile envtest download
Replace inaccessible go.kubebuilder.io URL with setup-envtest and update envtest version to 1.33.0 to match Kubernetes v0.33.3 dependencies.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Joseph <jvaikath@redhat.com>
* kubebuilder binaries mv
Signed-off-by: Joseph <jvaikath@redhat.com>
* Reject brace patterns and update documentation
Add {, }, and , to unsupported characters list to explicitly reject
brace expansion patterns. Remove { from wildcard detection since these
patterns are not supported in the 1.18 release.
Update all documentation to show supported patterns inline (*, ?, [abc])
with clickable links to the detailed namespace-glob-patterns page.
Simplify YAML comments by removing non-clickable URLs.
Update tests to expect errors when brace patterns are used.
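The validation order described in these commits can be sketched in Go roughly as below. This is an assumption-laden illustration: `validateNamespacePattern` is a hypothetical name, the substitution character is arbitrary, and bracket syntax checks (empty or unclosed brackets) are left out because the wildcard package handles them separately.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

// dns1123Label matches an RFC 1123 label: lowercase alphanumerics and '-',
// starting and ending with an alphanumeric.
var dns1123Label = regexp.MustCompile(`^[a-z0-9]([-a-z0-9]*[a-z0-9])?$`)

// validateNamespacePattern is a hypothetical stand-in for the validation
// described above. It rejects unsupported characters, then substitutes the
// supported glob characters (*, ?, [, ]) so the rest can be checked as a label.
func validateNamespacePattern(pattern string) error {
	if strings.ContainsAny(pattern, "|()!{},") {
		return fmt.Errorf("pattern %q contains unsupported characters", pattern)
	}
	plain := strings.NewReplacer("*", "x", "?", "x", "[", "x", "]", "x").Replace(pattern)
	if len(plain) > 63 || !dns1123Label.MatchString(plain) {
		return fmt.Errorf("pattern %q is not a valid namespace name or glob", pattern)
	}
	return nil
}

func main() {
	for _, p := range []string{"prod-*", "ns-[abc]", "team-?", "{a,b}", "Prod"} {
		fmt.Printf("%-10s -> %v\n", p, validateNamespacePattern(p))
	}
}
```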
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Joseph <jvaikath@redhat.com>
* Document brace expansion as unsupported
Add {} and , to the unsupported patterns section to clarify that
brace expansion patterns like {a,b,c} are not supported.
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Joseph <jvaikath@redhat.com>
* Update tests to expect brace pattern rejection
Co-Authored-By: Claude Sonnet 4.5 <noreply@anthropic.com>
Signed-off-by: Joseph <jvaikath@redhat.com>
---------
Signed-off-by: Joseph <jvaikath@redhat.com>
Co-authored-by: Claude Sonnet 4.5 <noreply@anthropic.com>
Use typed error approach: Make GetPVForPVC return ErrPVNotFoundForPVC
when PV is not expected to be found (unbound PVC), then use errors.Is
to check for this error type. When a matching policy exists (e.g.,
pvcPhase: [Pending, Lost] with action: skip), apply the action without
error. When no policy matches, return the original error to preserve
default behavior.
Changes:
- Add ErrPVNotFoundForPVC sentinel error to pvc_pv.go
- Update ShouldPerformSnapshot to handle unbound PVCs with policies
- Update ShouldPerformFSBackup to handle unbound PVCs with policies
- Update item_backupper.go to handle Lost PVCs in tracking functions
- Remove checkPVCOnlySkip helper (no longer needed)
- Update tests to reflect new behavior
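The sentinel error pattern described above looks roughly like this in Go. Only the name ErrPVNotFoundForPVC comes from this commit; the error text, the simplified signatures, and the `skipUnbound` flag standing in for a matching volume policy are assumptions.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrPVNotFoundForPVC marks the expected "unbound PVC" case.
// The message text here is an assumption.
var ErrPVNotFoundForPVC = errors.New("PVC has no volume backing this claim")

// getPVForPVC is a simplified, hypothetical lookup: an empty volumeName
// models an unbound PVC.
func getPVForPVC(pvcName, volumeName string) (string, error) {
	if volumeName == "" {
		return "", fmt.Errorf("PVC %s: %w", pvcName, ErrPVNotFoundForPVC)
	}
	return volumeName, nil
}

// shouldPerformSnapshot applies a skip policy to unbound PVCs and otherwise
// returns the original error, preserving the default behavior.
func shouldPerformSnapshot(pvcName, volumeName string, skipUnbound bool) (bool, error) {
	_, err := getPVForPVC(pvcName, volumeName)
	if errors.Is(err, ErrPVNotFoundForPVC) && skipUnbound {
		return false, nil // policy matched: skip without error
	}
	if err != nil {
		return false, err
	}
	return true, nil
}

func main() {
	fmt.Println(shouldPerformSnapshot("data", "", true))
	fmt.Println(shouldPerformSnapshot("data", "", false))
	fmt.Println(shouldPerformSnapshot("data", "pv-1", false))
}
```

Because the sentinel is wrapped with `%w`, `errors.Is` still recognizes it after callers add context.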
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Ensure the RBAC resources are restored before pods.
This change helps avoid pod startup errors when a pod depends on RBAC resources.
For example, the Prometheus operator checks whether it has sufficient permissions
before launching its controllers. If the operator pod starts before the RBAC
resources are created, it does not launch the controllers and does not retry.
f7f07bcdfb/cmd/operator/main.go (L392-L400)
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
Remove 'self' from MIGRATE_FROM_VELERO_VERSION.
* Need to modify the BYOT case to specify MIGRATE_FROM_VELERO_VERSION.
Remove the CSI plugin installation check for versions no older than v1.14.
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
This commit implements VolumePolicy support for PVC Phase conditions, resolving
vmware-tanzu/velero#7233 where backups fail with "PVC has no volume backing this claim"
for Pending PVCs.
Changes made:
- Extended VolumePolicy API to support PVC phase conditions
- Added pvcPhaseCondition struct with matching logic
- Modified getMatchAction() to evaluate policies for unbound PVCs before returning errors
- Added case to GetMatchAction() to handle PVC-only scenarios (nil PV)
- Added comprehensive unit tests for PVC phase parsing and matching
Users can now skip Pending PVCs through volume policy configuration:
apiVersion: v1
kind: ConfigMap
metadata:
  name: volume-policy
  namespace: velero
data:
  policy.yaml: |
    version: v1
    volumePolicies:
    - conditions:
        pvcPhase: [Pending]
      action:
        type: skip
chore: rename changelog file to match PR #9166
Renamed changelogs/unreleased/7233-claude to changelogs/unreleased/9166-claude
to match the opened PR at https://github.com/vmware-tanzu/velero/pull/9166
docs: Add PVC phase condition support to VolumePolicy documentation
- Added pvcPhase field to YAML template example
- Documented pvcPhase as a supported condition in the list
- Added comprehensive examples for using PVC phase conditions
- Included examples for Pending, Bound, and Lost phases
- Demonstrated combining PVC phase with other conditions
Co-Authored-By: Tiger Kaovilai <kaovilai@users.noreply.github.com>
Ensure plugin init container names satisfy DNS-1123 label constraints
(max 63 chars). Long names are truncated with an 8-char hash suffix to
maintain uniqueness.
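One way to do this, sketched in Go under assumptions: the commit does not name the hash function, so SHA-256 is used here purely for illustration, and `safeContainerName` is a hypothetical name.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

const (
	maxLabelLen = 63 // DNS-1123 label limit
	hashLen     = 8  // length of the hex suffix
)

// safeContainerName returns name unchanged when it fits, otherwise a
// truncated prefix joined by '-' to an 8-character hash of the full name.
// The choice of SHA-256 is an assumption of this sketch.
func safeContainerName(name string) string {
	if len(name) <= maxLabelLen {
		return name
	}
	sum := sha256.Sum256([]byte(name))
	suffix := hex.EncodeToString(sum[:])[:hashLen]
	prefix := name[:maxLabelLen-hashLen-1] // leave room for "-" and the suffix
	return prefix + "-" + suffix
}

func main() {
	long := "velero-plugin-with-an-unusually-long-registry-and-image-name-for-testing"
	fmt.Println(len(long), len(safeContainerName(long)))
	fmt.Println(safeContainerName("velero-plugin"))
}
```

Hashing the full original name means two long names that share the same prefix are very unlikely to collide after truncation.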
Fixes: #9444
Signed-off-by: Michal Pryc <mpryc@redhat.com>
The nolint:staticcheck directives are not needed in the test file because
it calls NewVolumeHelperImpl within the same package, which doesn't
trigger deprecation warnings. Only cross-package calls need the directive.
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
The ShouldPerformSnapshotWithVolumeHelper function and tests intentionally
use NewVolumeHelperImpl (deprecated) for backwards compatibility with
third-party plugins. Add nolint:staticcheck to suppress the linter
warnings with explanatory comments.
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
Remove deprecated functions that were marked for removal per review:
- Remove GetPodsUsingPVC (replaced by GetPodsUsingPVCWithCache)
- Remove IsPVCDefaultToFSBackup (replaced by IsPVCDefaultToFSBackupWithCache)
- Remove associated tests for deprecated functions
- Add deprecation marker to NewVolumeHelperImpl
- Add deprecation marker to ShouldPerformSnapshotWithBackup
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
This commit addresses reviewer feedback on PR #9441 regarding
concurrent backup caching concerns. Key changes:
1. Added lazy per-namespace caching for the CSI PVC BIA plugin path:
- Added IsNamespaceBuilt() method to check if namespace is cached
- Added BuildCacheForNamespace() for lazy, per-namespace cache building
- Plugin builds cache incrementally as namespaces are encountered
2. Added NewVolumeHelperImplWithCache constructor for plugins:
- Accepts externally-managed PVC-to-Pod cache
- Follows pattern from PR #9226 (Scott Seago's design)
3. Plugin instance lifecycle clarification:
- Plugin instances are unique per backup (created via newPluginManager)
- Cleaned up via CleanupClients at backup completion
- No mutex or backup UID tracking needed
4. Test coverage:
- Added tests for IsNamespaceBuilt and BuildCacheForNamespace
- Added tests for NewVolumeHelperImplWithCache constructor
- Added test verifying cache usage for fs-backup determination
This maintains the O(N+M) complexity improvement from issue #9179
while addressing architectural concerns about concurrent access.
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
Address review feedback to have a global VolumeHelper instance per
plugin process instead of creating one on each ShouldPerformSnapshot
call.
Changes:
- Add volumeHelper, cachedForBackup, and mu fields to pvcBackupItemAction
struct for caching the VolumeHelper per backup
- Add getOrCreateVolumeHelper() method for thread-safe lazy initialization
- Update Execute() to use cached VolumeHelper via
ShouldPerformSnapshotWithVolumeHelper()
- Update filterPVCsByVolumePolicy() to accept VolumeHelper parameter
- Add ShouldPerformSnapshotWithVolumeHelper() that accepts optional
VolumeHelper for reuse across multiple calls
- Add NewVolumeHelperForBackup() factory function for BIA plugins
- Add comprehensive unit tests for both nil and non-nil VolumeHelper paths
This completes the fix for issue #9179 by ensuring the PVC-to-Pod cache
is built once per backup and reused across all PVC processing, avoiding
O(N*M) complexity.
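The lazy initialization described above can be sketched in Go as follows. The field names come from the change list; the field types, the backup UID key, and the `builds` counter (added only to show reuse) are assumptions.

```go
package main

import (
	"fmt"
	"sync"
)

// volumeHelper is a placeholder for the real helper and its PVC-to-Pod cache.
type volumeHelper struct{ backupUID string }

// pvcBackupItemAction mirrors the fields named in the change list; their
// types here are assumptions.
type pvcBackupItemAction struct {
	mu              sync.Mutex
	volumeHelper    *volumeHelper
	cachedForBackup string
	builds          int // illustration only: counts helper constructions
}

// getOrCreateVolumeHelper returns the cached helper for the given backup,
// or builds a new one when none exists or the backup has changed.
func (a *pvcBackupItemAction) getOrCreateVolumeHelper(backupUID string) *volumeHelper {
	a.mu.Lock()
	defer a.mu.Unlock()
	if a.volumeHelper != nil && a.cachedForBackup == backupUID {
		return a.volumeHelper
	}
	a.volumeHelper = &volumeHelper{backupUID: backupUID}
	a.cachedForBackup = backupUID
	a.builds++
	return a.volumeHelper
}

func main() {
	a := &pvcBackupItemAction{}
	for i := 0; i < 3; i++ {
		a.getOrCreateVolumeHelper("uid-1")
	}
	fmt.Println("helpers built:", a.builds)
}
```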
Fixes #9179
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
- Rename NewVolumeHelperImplWithCache to NewVolumeHelperImplWithNamespaces
- Move cache building logic from backup.go into volumehelper
- Return error from NewVolumeHelperImplWithNamespaces if cache build fails
- Remove fallback in main backup path - backup fails if cache build fails
- Update NewVolumeHelperImpl to call NewVolumeHelperImplWithNamespaces
- Add comments clarifying fallback is only used by plugins
- Update tests for new error return signature
This addresses review comments from @Lyndon-Li and @kaovilai:
- Cache building is now encapsulated in volumehelper
- No fallback in main backup path ensures predictable performance
- Code reuse between constructors
Fixes #9179
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
- Use ResolveNamespaceList() instead of GetIncludes() for more accurate
namespace resolution when building the PVC-to-Pod cache
- Refactor NewVolumeHelperImpl to call NewVolumeHelperImplWithCache with
nil cache parameter to avoid code duplication
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
Add test case to verify that the PVC-to-Pod cache is used even when
no volume policy is configured. When defaultVolumesToFSBackup is true,
the cache is used to find pods using the PVC to determine if fs-backup
should be used instead of snapshot.
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
Add TestVolumeHelperImplWithCache_ShouldPerformFSBackup to verify:
- Volume policy match with cache returns correct fs-backup decision
- Volume policy match with snapshot action skips fs-backup
- Fallback to direct lookup when cache is not built
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
Add TestVolumeHelperImplWithCache_ShouldPerformSnapshot to verify:
- Volume policy match with cache returns correct snapshot decision
- fs-backup via opt-out with cache properly skips snapshot
- Fallback to direct lookup when cache is not built
These tests verify the cache-enabled code path added in the previous
commit for improved volume policy performance.
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
The GetPodsUsingPVC function had O(N*M) complexity - for each PVC,
it listed ALL pods in the namespace and iterated through each pod.
With many PVCs and pods, this caused significant performance
degradation (2+ seconds per PV in some cases).
This change introduces a PVC-to-Pod cache that is built once per
backup and reused for all PVC lookups, reducing complexity from
O(N*M) to O(N+M).
Changes:
- Add PVCPodCache struct with thread-safe caching in podvolume pkg
- Add NewVolumeHelperImplWithCache constructor for cache support
- Build cache before backup item processing in backup.go
- Add comprehensive unit tests for cache functionality
- Graceful fallback to direct lookups if cache fails
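The idea behind the cache, as a minimal Go sketch with simplified, hypothetical types. It omits the locking that the real PVCPodCache provides and does not reflect its actual API.

```go
package main

import "fmt"

// pod is a simplified stand-in for a Kubernetes Pod: only the namespace,
// name, and the PVC claim names it mounts.
type pod struct {
	Namespace, Name string
	PVCs            []string
}

// pvcPodCache maps "namespace/claim" to the names of pods using that PVC.
type pvcPodCache map[string][]string

// buildCache makes one pass over all pods and their volumes.
func buildCache(pods []pod) pvcPodCache {
	cache := pvcPodCache{}
	for _, p := range pods {
		for _, claim := range p.PVCs {
			key := p.Namespace + "/" + claim
			cache[key] = append(cache[key], p.Name)
		}
	}
	return cache
}

// podsUsingPVC is a map lookup instead of a fresh pod list per PVC.
func (c pvcPodCache) podsUsingPVC(namespace, claim string) []string {
	return c[namespace+"/"+claim]
}

func main() {
	cache := buildCache([]pod{
		{Namespace: "app", Name: "web-0", PVCs: []string{"data"}},
		{Namespace: "app", Name: "web-1", PVCs: []string{"data", "logs"}},
	})
	fmt.Println(cache.podsUsingPVC("app", "data"))
	fmt.Println(cache.podsUsingPVC("app", "logs"))
}
```

Building the map costs one pass over the pods, and each of the PVC lookups is then a constant-time map access, which is where the O(N+M) figure comes from.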
Fixes #9179
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
This change enables BSL validation to work when using caCertRef
(Secret-based CA certificate) by resolving the certificate from
the Secret in velero core before passing it to the object store
plugin as 'caCert' in the config map.
This approach requires no changes to provider plugins since they
already understand the 'caCert' config key.
Changes:
- Add SecretStore to objectBackupStoreGetter struct
- Add NewObjectBackupStoreGetterWithSecretStore constructor
- Update Get method to resolve caCertRef from Secret
- Update server.go to use new constructor with SecretStore
- Add CACertRef builder method and unit tests
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude Opus 4.5 <noreply@anthropic.com>
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
- Introduced `CACertRef` field in `ObjectStorageLocation` to reference a Secret containing the CA certificate, replacing the deprecated `CACert` field.
- Implemented validation logic to ensure mutual exclusivity between `CACert` and `CACertRef`.
- Updated BSL controller and repository provider to handle the new certificate resolution logic.
- Enhanced CLI to support automatic certificate discovery from BSL configurations.
- Added unit and integration tests to validate new functionality and ensure backward compatibility.
- Documented migration strategy for users transitioning from inline certificates to Secret-based management.
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
Explain that the duration histogram tracks distribution of individual
job durations, not accumulated sums, to address reviewer concerns.
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
This commit addresses three review comments on PR #9321:
1. Keep sanitization in controller (response to @ywk253100)
- Maintaining centralized error handling for easier extension
- Azure-specific patterns detected and others passed through unchanged
2. Sanitize unavailableErrors array (@priyansh17)
- Now using sanitizeStorageError() for both unavailableErrors array
and location.Status.Message for consistency
3. Add SAS token scrubbing (@anshulahuja98)
- Scrubs Azure SAS token parameters to prevent credential leakage
- Redacts: sig, se, st, sp, spr, sv, sr, sip, srt, ss
- Example: ?sig=secret becomes ?sig=***REDACTED***
Added comprehensive test coverage for SAS token scrubbing with 4 new
test cases covering various scenarios.
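A possible shape for the scrubbing step, sketched in Go with the standard `regexp` package. The regular expression and the function name `scrubSASTokens` are assumptions, not necessarily what the controller uses.

```go
package main

import (
	"fmt"
	"regexp"
)

// sasParam matches any of the SAS query parameters listed above when it
// follows '?' or '&', capturing the "name=" part and consuming the value.
var sasParam = regexp.MustCompile(`([?&](?:sig|se|st|sp|spr|sv|sr|sip|srt|ss)=)[^&\s]*`)

// scrubSASTokens replaces each matched parameter value with a fixed marker.
func scrubSASTokens(msg string) string {
	return sasParam.ReplaceAllString(msg, "${1}***REDACTED***")
}

func main() {
	// example.invalid is a placeholder host.
	fmt.Println(scrubSASTokens("GET https://example.invalid/c?comp=list&sig=secret failed"))
}
```

Other query parameters, such as `comp=list` in the example, pass through unchanged.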
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
Azure storage errors include verbose HTTP response details and XML
in error messages, making the BSL status.message field cluttered
and hard to read. This change adds sanitization to extract only
the error code and meaningful message.
Before:
BackupStorageLocation "test" is unavailable: rpc error: code = Unknown
desc = GET https://...
RESPONSE 404: 404 The specified container does not exist.
ERROR CODE: ContainerNotFound
<?xml version="1.0"...>
After:
BackupStorageLocation "test" is unavailable: rpc error: code = Unknown
desc = ContainerNotFound: The specified container does not exist.
AWS and GCP error messages are preserved as-is since they don't
contain verbose HTTP responses.
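A Go sketch of how such a sanitizer could work, based only on the before/after example above. The regular expressions and the body of `sanitizeStorageError` are assumptions of this illustration, not the controller's actual logic.

```go
package main

import (
	"fmt"
	"regexp"
	"strings"
)

var (
	// Patterns inferred from the "Before" example; assumptions of this sketch.
	errorCodeRe = regexp.MustCompile(`ERROR CODE: (\w+)`)
	responseRe  = regexp.MustCompile(`RESPONSE \d+: \d+ ([^\n]+)`)
)

// sanitizeStorageError keeps the text up to "desc = " and replaces the rest
// with "<code>: <message>". Messages without both markers pass through as-is.
func sanitizeStorageError(msg string) string {
	code := errorCodeRe.FindStringSubmatch(msg)
	resp := responseRe.FindStringSubmatch(msg)
	if code == nil || resp == nil {
		return msg
	}
	prefix := ""
	if i := strings.Index(msg, "desc = "); i >= 0 {
		prefix = msg[:i+len("desc = ")]
	}
	return prefix + code[1] + ": " + strings.TrimSpace(resp[1])
}

func main() {
	// example.invalid is a placeholder host.
	raw := "rpc error: code = Unknown desc = GET https://example.invalid/c\n" +
		"RESPONSE 404: 404 The specified container does not exist.\n" +
		"ERROR CODE: ContainerNotFound\n<?xml version=\"1.0\"?>"
	fmt.Println(sanitizeStorageError(raw))
}
```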
Fixes #8368
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
- Rename metric constants from maintenance_job_* to repo_maintenance_*
- Update metric help text to clarify these are for repo maintenance
- Rename functions: RegisterMaintenanceJob* → RegisterRepoMaintenance*
- Update all test references to use new names
Addresses review comments from @Lyndon-Li on PR #9414
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
Adds three new Prometheus metrics to track backup repository
maintenance job execution:
- velero_maintenance_job_success_total: Counter for successful jobs
- velero_maintenance_job_failure_total: Counter for failed jobs
- velero_maintenance_job_duration_seconds: Histogram for job duration
Metrics use repository_name label to identify specific BackupRepositories.
Duration is recorded for both successful and failed jobs (when job runs),
but not when job fails to start.
Includes comprehensive unit and integration tests.
Fixes #9225
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
* Add wildcard status fields
Signed-off-by: Joseph <jvaikath@redhat.com>
* Implement wildcard namespace expansion in item collector
- Introduced methods to get active namespaces and expand wildcard includes/excludes in the item collector.
- Updated getNamespacesToList to handle wildcard patterns and return expanded lists.
- Added utility functions for setting includes and excludes in the IncludesExcludes struct.
- Created a new package for wildcard handling, including functions to determine when to expand wildcards and to perform the expansion.
This enhances the backup process by allowing more flexible namespace selection based on wildcard patterns.
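As an illustration of the expansion step, here is a minimal Go sketch that uses the standard library's `path.Match` as a stand-in for the wildcard package. The function names and the namespace list are hypothetical, and the real matching rules may differ.

```go
package main

import (
	"fmt"
	"path"
)

// matchAny reports whether ns matches at least one glob pattern.
func matchAny(patterns []string, ns string) (bool, error) {
	for _, p := range patterns {
		ok, err := path.Match(p, ns)
		if err != nil {
			return false, fmt.Errorf("bad pattern %q: %w", p, err)
		}
		if ok {
			return true, nil
		}
	}
	return false, nil
}

// selectNamespaces expands include and exclude patterns against the active
// namespaces and returns those that are included and not excluded.
func selectNamespaces(includes, excludes, active []string) ([]string, error) {
	var result []string
	for _, ns := range active {
		in, err := matchAny(includes, ns)
		if err != nil {
			return nil, err
		}
		out, err := matchAny(excludes, ns)
		if err != nil {
			return nil, err
		}
		if in && !out {
			result = append(result, ns)
		}
	}
	return result, nil
}

func main() {
	active := []string{"prod-web", "prod-db", "dev-web"}
	fmt.Println(selectNamespaces([]string{"prod-*"}, []string{"*-db"}, active))
}
```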
Signed-off-by: Joseph <jvaikath@redhat.com>
* Enhance wildcard expansion logic and logging in item collector
- Improved logging to include original includes and excludes when expanding wildcards.
- Updated the ShouldExpandWildcards function to check for wildcard patterns in excludes.
- Added comments for clarity in the expandWildcards function regarding pattern handling.
These changes enhance the clarity and functionality of the wildcard expansion process in the backup system.
Signed-off-by: Joseph <jvaikath@redhat.com>
* Add wildcard namespace fields to Backup CRD and update deepcopy methods
- Introduced `wildcardIncludedNamespaces` and `wildcardExcludedNamespaces` fields to the Backup CRD to support wildcard patterns for namespace inclusion and exclusion.
- Updated deepcopy methods to handle the new fields, ensuring proper copying of data during object manipulation.
These changes enhance the flexibility of namespace selection in backup operations, aligning with recent improvements in wildcard handling.
Signed-off-by: Joseph <jvaikath@redhat.com>
* Refactor Backup CRD to rename wildcard namespace fields
- Updated `BackupStatus` struct to rename `WildcardIncludedNamespaces` to `WildcardExpandedIncludedNamespaces` and `WildcardExcludedNamespaces` to `WildcardExpandedExcludedNamespaces` for clarity.
- Adjusted associated comments to reflect the new naming and ensure consistency in documentation.
- Modified deepcopy methods to accommodate the renamed fields, ensuring proper data handling during object manipulation.
These changes enhance the clarity and maintainability of the Backup CRD, aligning with recent improvements in wildcard handling.
Signed-off-by: Joseph <jvaikath@redhat.com>
* Fix
Signed-off-by: Joseph <jvaikath@redhat.com>
* Refactor where wildcard expansion happens
Signed-off-by: Joseph <jvaikath@redhat.com>
* Refactor Backup CRD and related components for expanded namespace handling
- Updated `BackupStatus` struct to rename fields for clarity: `WildcardExpandedIncludedNamespaces` and `WildcardExpandedExcludedNamespaces` are now `ExpandedIncludedNamespaces` and `ExpandedExcludedNamespaces`, respectively.
- Adjusted associated comments and deepcopy methods to reflect the new naming conventions.
- Removed the `getActiveNamespaces` function from the item collector, streamlining the namespace handling process.
- Enhanced logging during wildcard expansion to provide clearer insights into the process.
These changes improve the clarity and maintainability of the Backup CRD and enhance the functionality of namespace selection in backup operations.
Signed-off-by: Joseph <jvaikath@redhat.com>
* Refactor wildcard expansion logic in item collector and enhance testing
- Moved the wildcard expansion logic into a dedicated method, `expandNamespaceWildcards`, improving code organization and readability.
- Updated logging to provide detailed insights during the wildcard expansion process.
- Introduced comprehensive unit tests for wildcard handling, covering various scenarios and edge cases.
- Enhanced the `ShouldExpandWildcards` function to better identify wildcard patterns and validate inputs.
These changes improve the maintainability and robustness of the wildcard handling in the backup system.
Signed-off-by: Joseph <jvaikath@redhat.com>
* Enhance Restore CRD with expanded namespace fields and update logic
- Added `ExpandedIncludedNamespaces` and `ExpandedExcludedNamespaces` fields to the `RestoreStatus` struct to support expanded wildcard namespace handling.
- Updated the `DeepCopyInto` method to ensure proper copying of the new fields.
- Implemented logic in the restore process to expand wildcard patterns for included and excluded namespaces, improving flexibility in namespace selection during restores.
- Enhanced logging to provide insights into the expanded namespaces.
These changes improve the functionality and maintainability of the restore process, aligning with recent enhancements in wildcard handling.
Signed-off-by: Joseph <jvaikath@redhat.com>
* Refactor Backup and Restore CRDs to enhance wildcard namespace handling
- Renamed fields in `BackupStatus` and `RestoreStatus` from `ExpandedIncludedNamespaces` and `ExpandedExcludedNamespaces` to `IncludeWildcardMatches` and `ExcludeWildcardMatches` for clarity.
- Introduced a new field `WildcardResult` to record the final namespaces after applying wildcard logic.
- Updated the `DeepCopyInto` methods to accommodate the new field names and ensure proper data handling.
- Enhanced comments to reflect the changes and improve documentation clarity.
These updates improve the maintainability and clarity of the CRDs, aligning with recent enhancements in wildcard handling.
Signed-off-by: Joseph <jvaikath@redhat.com>
* Enhance wildcard namespace handling in Backup and Restore processes
- Updated `BackupRequest` and `Restore` status structures to include a new field `WildcardResult`, which captures the final list of namespaces after applying wildcard logic.
- Renamed existing fields to `IncludeWildcardMatches` and `ExcludeWildcardMatches` for improved clarity.
- Enhanced logging to provide detailed insights into the expanded namespaces and final results during backup and restore operations.
- Introduced a new utility function `GetWildcardResult` to streamline the selection of namespaces based on include/exclude criteria.
These changes improve the clarity and functionality of namespace selection in both backup and restore processes, aligning with recent enhancements in wildcard handling.
Signed-off-by: Joseph <jvaikath@redhat.com>
* Refactor namespace wildcard expansion logic in restore process
- Moved the wildcard expansion logic into a dedicated method, `expandNamespaceWildcards`, improving code organization and readability.
- Enhanced error handling and logging to provide detailed insights into the expanded namespaces during the restore operation.
- Updated the restore context with expanded namespace patterns and final results, ensuring clarity in the restore status.
These changes improve the maintainability and clarity of the restore process, aligning with recent enhancements in wildcard handling.
Signed-off-by: Joseph <jvaikath@redhat.com>
* Add checks for "*" in exclude
Signed-off-by: Joseph <jvaikath@redhat.com>
* Rebase
Signed-off-by: Joseph <jvaikath@redhat.com>
* Create NamespaceIncludesExcludes to get full NS listing for backup w/
Signed-off-by: Scott Seago <sseago@redhat.com>
Signed-off-by: Joseph <jvaikath@redhat.com>
* Add new NamespaceIncludesExcludes struct
Signed-off-by: Joseph <jvaikath@redhat.com>
* Move namespace expansion logic
Signed-off-by: Joseph <jvaikath@redhat.com>
* Update backup status with expansion
Signed-off-by: Joseph <jvaikath@redhat.com>
* Wildcard status update
Signed-off-by: Joseph <jvaikath@redhat.com>
* Skip ns check if wildcard expansion
Signed-off-by: Joseph <jvaikath@redhat.com>
* Move wildcard expansion to getResourceItems
Signed-off-by: Joseph <jvaikath@redhat.com>
* lint
Signed-off-by: Joseph <jvaikath@redhat.com>
* Changelog
Signed-off-by: Joseph <jvaikath@redhat.com>
* linting issues
Signed-off-by: Joseph <jvaikath@redhat.com>
* Remove wildcard restore to check if tests pass
Signed-off-by: Joseph <jvaikath@redhat.com>
* Fix namespace mapping test bug from lint fix
The previous commit (0a4aabcf4) attempted to fix linting issues by
using strings.Builder, but incorrectly wrote commas to a separate
builder and concatenated them at the end instead of between namespace
mappings.
This caused the namespace mapping string to be malformed:
Before: ns-1:ns-1-mapped,ns-2:ns-2-mapped
Bug: ns-1:ns-1-mappedns-2:ns-2-mapped,,
The malformed string was parsed as a single mapping with an invalid
namespace name containing a colon, causing Kubernetes to reject it:
"ns-1-mappedns-2:ns-2-mapped" is invalid
Fix by properly using strings.Builder to construct the mapping string
with commas between entries, addressing both the linting concern and
the functional bug.
Fixes the MultiNamespacesMappingResticTest and
MultiNamespacesMappingSnapshotTest failures.
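The separator placement described above can be sketched as follows. This is a minimal illustration, not the code from the commit: the function name `buildNamespaceMapping` and the sorted key order are assumptions made to keep the example deterministic.

```go
package main

import (
	"fmt"
	"sort"
	"strings"
)

// buildNamespaceMapping (hypothetical helper) renders source->target
// namespace pairs as "src1:dst1,src2:dst2". Keys are sorted so the
// output is deterministic.
func buildNamespaceMapping(mappings map[string]string) string {
	keys := make([]string, 0, len(mappings))
	for k := range mappings {
		keys = append(keys, k)
	}
	sort.Strings(keys)

	var b strings.Builder
	for i, k := range keys {
		if i > 0 {
			// The comma goes between entries, into the same builder,
			// and never after the last entry.
			b.WriteByte(',')
		}
		b.WriteString(k)
		b.WriteByte(':')
		b.WriteString(mappings[k])
	}
	return b.String()
}

func main() {
	fmt.Println(buildNamespaceMapping(map[string]string{
		"ns-1": "ns-1-mapped",
		"ns-2": "ns-2-mapped",
	}))
}
```

Writing the separator into the same builder, guarded by `i > 0`, avoids both the missing comma between entries and the trailing commas seen in the malformed string.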
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
Signed-off-by: Joseph <jvaikath@redhat.com>
* Fix wildcard namespace expansion edge cases
This commit fixes two bugs in the wildcard namespace expansion feature:
1. Empty wildcard results: When a wildcard pattern (e.g., "invalid*")
matched no namespaces, the backup would incorrectly back up ALL
namespaces instead of backing up nothing. This was because the empty
includes list was indistinguishable from "no filter specified".
Fix: Added wildcardExpanded flag to NamespaceIncludesExcludes to
track when wildcard expansion has occurred. When true and the
includes list is empty, ShouldInclude now correctly returns false.
2. Premature namespace filtering: An earlier attempt to fix bug #1
filtered namespaces too early in collectNamespaces, breaking
LabelSelector tests where namespaces should be included based on
resources within them matching the label selector.
Fix: Removed the premature filtering and rely on the existing
filterNamespaces call at the end of getAllItems, which correctly
handles both wildcard expansion and label selector scenarios.
The fixes ensure:
- Wildcard patterns matching nothing result in empty backups
- Label selectors still work correctly (namespace included if any
resource in it matches the selector)
- State is preserved across multiple ResolveNamespaceList calls
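A minimal sketch of the first fix, assuming a much simplified filter type: `namespaceFilter` is a hypothetical stand-in for NamespaceIncludesExcludes, and only the wildcardExpanded flag and the ShouldInclude name come from the description above.

```go
package main

import "fmt"

// namespaceFilter is a hypothetical, simplified stand-in for
// NamespaceIncludesExcludes; only the parts needed here are modeled.
type namespaceFilter struct {
	includes         map[string]bool
	excludes         map[string]bool
	wildcardExpanded bool // set once wildcard patterns have been expanded
}

// ShouldInclude reports whether ns passes the filter.
func (f *namespaceFilter) ShouldInclude(ns string) bool {
	if f.excludes[ns] {
		return false
	}
	if len(f.includes) == 0 {
		// An empty includes list normally means "no filter specified".
		// After wildcard expansion it means the patterns matched nothing.
		return !f.wildcardExpanded
	}
	return f.includes[ns]
}

func main() {
	noFilter := &namespaceFilter{}
	matchedNothing := &namespaceFilter{wildcardExpanded: true}
	fmt.Println(noFilter.ShouldInclude("default"))
	fmt.Println(matchedNothing.ShouldInclude("default"))
}
```

The flag is what distinguishes the two otherwise identical empty lists.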
🤖 Generated with [Claude Code](https://claude.com/claude-code)
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
Signed-off-by: Joseph <jvaikath@redhat.com>
* Run wildcard expansion during backup processing
Signed-off-by: Joseph <jvaikath@redhat.com>
* Lint fix
Signed-off-by: Joseph <jvaikath@redhat.com>
* Improve coverage
Signed-off-by: Joseph <jvaikath@redhat.com>
* gofmt fix
Signed-off-by: Joseph <jvaikath@redhat.com>
* Add wildcard details to describe backup status
Signed-off-by: Joseph <jvaikath@redhat.com>
* Revert "Remove wildcard restore to check if tests pass"
This reverts commit 4e22c2af855b71447762cb0a9fab7e7049f38a5f.
Signed-off-by: Joseph <jvaikath@redhat.com>
* Add restore describe for wildcard namespaces Revert restore wildcard removal
Signed-off-by: Joseph <jvaikath@redhat.com>
* Add coverage
Signed-off-by: Joseph <jvaikath@redhat.com>
* Lint
Signed-off-by: Joseph <jvaikath@redhat.com>
* Remove unintentional changes
Signed-off-by: Joseph <jvaikath@redhat.com>
* Remove wildcard status fields and mentions
Remove usage of wildcard fields for backup and restore status.
Signed-off-by: Joseph <jvaikath@redhat.com>
* Remove status update changelog line
Signed-off-by: Joseph <jvaikath@redhat.com>
* Rename getNamespaceIncludesExcludes
Signed-off-by: Scott Seago <sseago@redhat.com>
Signed-off-by: Scott Seago <sseago@redhat.com>
* Rewrite brace pattern validation
Signed-off-by: Joseph <jvaikath@redhat.com>
* Different var for internal loop
Signed-off-by: Joseph <jvaikath@redhat.com>
---------
Signed-off-by: Joseph <jvaikath@redhat.com>
Signed-off-by: Scott Seago <sseago@redhat.com>
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
Co-authored-by: Scott Seago <sseago@redhat.com>
Co-authored-by: Tiger Kaovilai <tkaovila@redhat.com>
Co-authored-by: Claude <noreply@anthropic.com>
Bump golangci-lint to v1.25.0, because golangci-lint has supported
Golang v1.25 since v1.24.0, and v1.26.x was not stable yet.
Align action pr-linter-check's golangci-lint version to v1.25.0
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
Add documentation explaining how volume policies are applied before
VGS grouping, including examples and troubleshooting guidance for the
multiple CSI drivers scenario.
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
VolumeGroupSnapshots were querying all PVCs with matching labels
directly from the cluster without respecting volume policies. This
caused errors when labeled PVCs included both CSI and non-CSI volumes,
or volumes from different CSI drivers that were excluded by policies.
This change filters PVCs by volume policy before VGS grouping,
ensuring only PVCs that should be snapshotted are included in the
group. A warning is logged when PVCs are excluded from VGS due to
volume policy.
Fixes #9344
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
* Fix the Job build error when the BackupRepository name is longer than 63 characters.
Fix the Job build error.
Account for the name length limitation in the job list code.
Use a hash to replace the GetValidName function.
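One way to keep such a name within the 63-character limit is sketched below. The helper name `labelSafeName`, the choice of SHA-256, and the truncation are all assumptions for illustration; the commit does not state which hash or format Velero uses.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strings"
)

// maxLabelLen is the 63-character limit mentioned in the commit message.
const maxLabelLen = 63

// labelSafeName (hypothetical helper) returns name unchanged when it
// fits, and otherwise a hex-encoded SHA-256 digest truncated to the limit.
func labelSafeName(name string) string {
	if len(name) <= maxLabelLen {
		return name
	}
	sum := sha256.Sum256([]byte(name))
	// The hex digest is 64 characters long, so it is cut to 63.
	return hex.EncodeToString(sum[:])[:maxLabelLen]
}

func main() {
	short := "default-repo-kopia"
	long := strings.Repeat("very-long-repository-name-", 4)
	fmt.Println(labelSafeName(short))
	fmt.Println(len(labelSafeName(long)))
}
```

Unlike plain truncation, hashing keeps two long names that share a common prefix distinct.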
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
* Use ref_name to replace ref.
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
---------
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
Update test expectations to include createdName field for resources
with action 'created'. Also ensure namespaces track their created
names when created via EnsureNamespaceExistsAndIsReady.
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
When restoring resources with GenerateName, Kubernetes assigns the actual name
after creation, but Velero only tracked the original name from the backup in
itemKey. This caused volume information collection to fail when trying to fetch
PVCs using the original name instead of the actual created name.
Example:
- Original PVC name from backup: "test-vm-disk-1"
- Actual created PVC name: "test-vm-backup-2025-10-27-test-vm-disk-1-mdjkd"
- Volume info tried to fetch: "test-vm-disk-1" → Failed with "not found"
This affects any plugin or workflow using GenerateName during restore:
- kubevirt-velero-plugin (VMFR use case with PVC collision avoidance)
- Custom restore item actions using generateName
- Secrets/ConfigMaps restored with generateName
Changes:
1. Add createdName field to restoredItemStatus struct (pkg/restore/request.go)
2. Capture actual name from createdObj.GetName() (pkg/restore/restore.go:1520)
3. Use createdName in RestoredResourceList() when available (pkg/restore/request.go:93-95)
This fix is backwards compatible:
- createdName defaults to empty string
- When empty, falls back to itemKey.name (original behavior)
- Only populated for GenerateName resources where needed
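The fallback can be sketched as below. The field names createdName and itemKey.name follow the commit message; the surrounding types are simplified and the helper `effectiveName` is hypothetical.

```go
package main

import "fmt"

// itemKey and restoredItemStatus are simplified stand-ins for the
// types in pkg/restore/request.go; only the relevant fields are shown.
type itemKey struct {
	resource  string
	namespace string
	name      string
}

type restoredItemStatus struct {
	action      string
	createdName string // actual name assigned by Kubernetes; empty if not tracked
}

// effectiveName (hypothetical helper) prefers the created name and
// falls back to the original name from the backup.
func effectiveName(key itemKey, status restoredItemStatus) string {
	if status.createdName != "" {
		return status.createdName
	}
	return key.name
}

func main() {
	key := itemKey{resource: "persistentvolumeclaims", namespace: "vms", name: "test-vm-disk-1"}
	generated := restoredItemStatus{action: "created", createdName: "test-vm-backup-2025-10-27-test-vm-disk-1-mdjkd"}
	plain := restoredItemStatus{action: "created"}
	fmt.Println(effectiveName(key, generated))
	fmt.Println(effectiveName(key, plain))
}
```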
Fixes volume information collection errors like:
"Failed to get PVC" error="persistentvolumeclaims \"<original-name>\" not found"
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
When restoring resources with GenerateName (where name is empty and K8s
assigns the actual name), the managed fields patch was failing with error
"name is required" because it was using obj.GetName() which returns empty
for GenerateName resources.
The fix uses createdObj.GetName() instead, which contains the actual name
assigned by Kubernetes after resource creation.
This affects any resource using GenerateName for restore, including:
- PersistentVolumeClaims restored by kubevirt-velero-plugin
- Secrets and ConfigMaps created with generateName
- Any custom resources using generateName
Changes:
- Line 1707: Use createdObj.GetName() instead of obj.GetName() in Patch call
- Lines 1702, 1709, 1713, 1716: Use createdObj in error/info messages for accuracy
This is a backwards-compatible fix since:
- For resources WITHOUT generateName: obj.GetName() == createdObj.GetName()
- For resources WITH generateName: createdObj.GetName() has the actual name
The managed fields patch was already correctly using createdObj (lines 1698-1700),
only the Patch() call was incorrectly using obj.
Fixes restore status showing FinalizingPartiallyFailed with "name is required"
error when restoring resources with GenerateName.
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
- Expanded the design to include detailed implementation steps for wildcard expansion in both backup and restore operations.
- Added new status fields to the backup and restore CRDs to track expanded wildcard namespaces.
- Clarified the approach to ensure backward compatibility with existing `*` behavior.
- Addressed limitations and provided insights on restore operations handling wildcard-expanded backups.
This update aims to provide a comprehensive and clear framework for implementing wildcard namespace support in Velero.
Signed-off-by: Joseph <jvaikath@redhat.com>
- Clarified the use of the standalone `*` character in namespace specifications.
- Ensured consistent formatting for `*` throughout the document.
- Maintained focus on backward compatibility and limitations regarding wildcard usage.
This update enhances the clarity and consistency of the design document for implementing wildcard namespace support in Velero.
Signed-off-by: Joseph <jvaikath@redhat.com>
- Updated the abstract to clarify the current limitations of namespace specifications in Velero.
- Expanded the goals section to include specific objectives for implementing wildcard patterns in `--include-namespaces` and `--exclude-namespaces`.
- Detailed the high-level design and implementation steps, including the addition of new status fields in the backup CRD and the creation of a utility package for wildcard expansion.
- Addressed backward compatibility and known limitations regarding the use of wildcards alongside the existing "*" character.
This update aims to provide a comprehensive overview of the proposed changes for improved namespace selection flexibility.
Signed-off-by: Joseph <jvaikath@redhat.com>
stale-issue-message:"This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days. If a Velero team member has requested log or more information, please provide the output of the shared commands."
Below is a list of adopters of Velero in **production environments** that have
@@ -68,6 +69,9 @@ Replicated uses the Velero open source project to enable snapshots in [KOTS][101
**[Microsoft Azure][105]**<br>
[Azure Backup for AKS][106] is an Azure native, Kubernetes aware, Enterprise ready backup for containerized applications deployed on Azure Kubernetes Service (AKS). AKS Backup utilizes Velero to perform backup and restore operations to protect stateful applications in AKS clusters.<br>
**[Broadcom][107]**<br>
[VMware Cloud Foundation][108] (VCF) offers built-in [vSphere Kubernetes Service][109] (VKS), a Kubernetes runtime that includes a CNCF certified Kubernetes distribution, to deploy and manage containerized workloads. VCF empowers platform engineers with native [Kubernetes multi-cluster management][110] capability for managing Kubernetes (K8s) infrastructure at scale. VCF utilizes Velero for Kubernetes data protection, enabling platform engineers to back up and restore containerized workload manifests and persistent volumes, helping to increase the resiliency of stateful applications in VKS clusters.
## Adding your organization to the list of Velero Adopters
If you are using Velero and would like to be included in the list of `Velero Adopters`, add an SVG version of your logo to the `site/static/img/adopters` directory in this repo and submit a [pull request][3] with your change. Name the image file something that reflects your company (e.g., if your company is called Acme, name the image acme.svg). See this [PR][4] for an example.
@@ -125,3 +129,8 @@ If you would like to add your logo to a future `Adopters of Velero` section on [
In v1.18, Velero can process multiple backups concurrently. This is a significant usability improvement, especially in multi-tenant or multi-user cases, where backups submitted by different users can run simultaneously without interfering with each other.
Check design https://github.com/vmware-tanzu/velero/blob/main/design/Implemented/concurrent-backup-processing.md for more details.
#### Cache volume for data movers
In v1.18, Velero allows users to configure cache volumes for data mover pods during restore for CSI snapshot data movement and fs-backup. This brings the following benefits:
- Solve the problem that data mover pods fail to run when the pod's ephemeral disk is limited
- Solve the problem that multiple data mover pods fail to run concurrently on one node when the node's ephemeral disk is limited
- Working together with the backup repository's cache limit configuration, a cache volume of appropriate size helps to improve the restore throughput
Check design https://github.com/vmware-tanzu/velero/blob/main/design/Implemented/backup-repo-cache-volume.md for more details.
#### Incremental size for data movers
In v1.18, Velero allows users to observe the incremental size of data mover backups for CSI snapshot data movement and fs-backup, so that users can see the data reduction achieved by incremental backup.
#### Wildcard support for namespaces
In v1.18, Velero allows glob-style wildcard patterns in namespace filters during backup and restore, so that users can select namespaces in batches.
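To illustrate how such patterns select namespaces, here is a minimal sketch using the Go standard library's `path.Match`. It is not Velero's implementation: the function `expandPatterns` is hypothetical, and `path.Match` handles `*`, `?` and character classes but not brace patterns such as `{a,b,c}`.

```go
package main

import (
	"fmt"
	"path"
	"sort"
)

// expandPatterns (hypothetical helper) returns the sorted, de-duplicated
// namespaces that match at least one of the glob patterns.
func expandPatterns(patterns, namespaces []string) ([]string, error) {
	seen := map[string]bool{}
	var matched []string
	for _, p := range patterns {
		for _, ns := range namespaces {
			ok, err := path.Match(p, ns)
			if err != nil {
				return nil, fmt.Errorf("invalid pattern %q: %w", p, err)
			}
			if ok && !seen[ns] {
				seen[ns] = true
				matched = append(matched, ns)
			}
		}
	}
	sort.Strings(matched)
	return matched, nil
}

func main() {
	namespaces := []string{"app-web", "app-api", "db-1", "db-10", "kube-system"}
	matched, err := expandPatterns([]string{"app-*", "db-?"}, namespaces)
	if err != nil {
		fmt.Println("error:", err)
		return
	}
	fmt.Println(matched)
}
```

Here `db-?` matches `db-1` but not `db-10`, because `?` stands for exactly one character.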
#### VolumePolicy for PVC phase
In v1.18, Velero VolumePolicy supports actions by PVC phase, which helps users apply special handling to PVCs in a specific phase, e.g., skip PVCs in Pending/Lost status from the backup.
#### Scalability and Resiliency improvements
##### Prevent Velero server OOM Kill for large backup repositories
In v1.18, some backup repository operations are deferred and executed outside the Velero server, so the Velero server is not OOM killed.
#### Performance improvement for VolumePolicy
In v1.18, VolumePolicy is enhanced for large numbers of pods/PVCs, so performance is significantly improved.
#### Events for data mover pod diagnostic
In v1.18, events are recorded in the data mover pod diagnostics, which allows users to see more information for troubleshooting when a data mover pod fails.
### Runtime and dependencies
Golang runtime: 1.25.7
kopia: 0.22.3
### Limitations/Known issues
### Breaking changes
#### Deprecation of PVC selected node feature
According to the [Velero deprecation policy](https://github.com/vmware-tanzu/velero/blob/main/GOVERNANCE.md#deprecation-policy), the PVC selected node feature is deprecated in v1.18. Velero handles the PVC's selected-node annotation appropriately, so users don't need to do anything in particular.
### All Changes
* Remove backup from running list when backup fails validation (#9498, @sseago)
* Maintenance Job only uses the first element of the LoadAffinity array (#9494, @blackpiglet)
* Fix issue #9478, add diagnose info on expose peek fails (#9481, @Lyndon-Li)
* Add Role, RoleBinding, ClusterRole, and ClusterRoleBinding in restore sequence. (#9474, @blackpiglet)
* Add maintenance job and data mover pod's labels and annotations setting. (#9452, @blackpiglet)
* Implement concurrency control for cache of native VolumeSnapshotter plugin. (#9281, @0xLeo258)
* Fix issue #7904, remove the code and doc for PVC node selection (#9269, @Lyndon-Li)
* Fix schedule controller to prevent backup queue accumulation during extended blocking scenarios by properly handling empty backup phases (#9264, @shubham-pampattiwar)
* Fix repository maintenance jobs to inherit allowlisted tolerations from Velero deployment (#9256, @shubham-pampattiwar)
* Implement wildcard namespace pattern expansion for backup namespace includes/excludes. This change adds support for wildcard patterns (*, ?, [abc], {a,b,c}) in namespace includes and excludes during backup operations (#9255, @Joeavaikath)
* Protect VolumeSnapshot field from race condition during multi-thread backup (#9248, @0xLeo258)
* Update AzureAD Microsoft Authentication Library to v1.5.0 (#9244, @priyansh17)
* Get pod list once per namespace in pvc IBA (#9226, @sseago)
Fix VolumeGroupSnapshot restore failure with Ceph RBD CSI driver by creating stub VolumeGroupSnapshotContent during restore and looking up VolumeSnapshotClass by driver for credential support
Fix issue #9659, in the case that PVB/PVR/DU/DD is cancelled before the data path is really started, call EndEvent to prevent the data mover pod from crashing because of delayed event distribution
Add an `--apply` flag to the install command that enables applying existing resources rather than creating them. This can be useful as part of the upgrade process for existing installations.
## Background
The current Velero install command creates resources but doesn't provide a direct way to apply updates to an existing installation.
Users attempting to run the install command on an existing installation receive "already exists" messages.
Upgrade steps for existing installs typically involve a three (or more) step process to apply updated CRDs (using `--dry-run` and piping to `kubectl apply`) and then updating/setting images on the Velero deployment and node-agent.
## Goals
- Provide a simple flag to enable applying resources on an existing Velero installation.
- Use server-side apply to update existing resources rather than attempting to create them.
- Maintain consistency with the regular install flow.
## Non Goals
- Implement special logic for specific version-to-version upgrades (i.e. resource deletion, etc).
- Add complex upgrade validation or pre/post-upgrade hooks.
- Provide rollback capabilities.
## High-Level Design
The `--apply` flag will be added to the Velero install command.
When this flag is set, the installation process will use server-side apply to update existing resources instead of using create on new resources.
This flag can be used as _part_ of the upgrade process, but will not always fully handle an upgrade.
## Detailed Design
The implementation adds a new boolean flag `--apply` to the install command.
This flag will be passed through to the underlying install functions where the resource creation logic resides.
When the flag is set to true:
- The `createOrApplyResource` function will use server-side apply with field manager "velero-cli" and `force=true` to update resources.
- Resources will be applied in the same order as they would be created during installation.
- Custom Resource Definitions will still be processed first, and the system will wait for them to be established before continuing.
The server-side apply approach with `force=true` ensures that resources are updated even if there are conflicts with the last applied state.
This provides a best-effort mechanism to apply resources that follows the same flow as installation but updates resources instead of creating them.
No special handling is added for specific versions or resource structures, making this a general-purpose mechanism for applying resources.
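The branching described above can be sketched as follows. Only the function name `createOrApplyResource`, the field manager "velero-cli" and `force=true` come from the design; the `resourceClient` interface and the recording fake are hypothetical and stand in for the real Kubernetes client.

```go
package main

import "fmt"

// fieldManager is the field manager named in the design.
const fieldManager = "velero-cli"

// resourceClient is a hypothetical abstraction over the Kubernetes
// client; the real code works on typed or unstructured objects.
type resourceClient interface {
	Create(name string) error
	Apply(name, fieldManager string, force bool) error
}

// createOrApplyResource uses server-side apply with force=true when
// apply is set, and a plain create otherwise.
func createOrApplyResource(c resourceClient, name string, apply bool) error {
	if apply {
		return c.Apply(name, fieldManager, true)
	}
	return c.Create(name)
}

// recordingClient is a fake that records the calls it receives.
type recordingClient struct{ calls []string }

func (r *recordingClient) Create(name string) error {
	r.calls = append(r.calls, "create "+name)
	return nil
}

func (r *recordingClient) Apply(name, manager string, force bool) error {
	r.calls = append(r.calls, fmt.Sprintf("apply %s manager=%s force=%t", name, manager, force))
	return nil
}

func main() {
	c := &recordingClient{}
	// Resources are processed in the same order as a regular install.
	for _, name := range []string{"crd/backups.velero.io", "deployment/velero"} {
		if err := createOrApplyResource(c, name, true); err != nil {
			fmt.Println("error:", err)
			return
		}
	}
	for _, call := range c.calls {
		fmt.Println(call)
	}
}
```

Keeping the decision in one function means the install flow and resource ordering stay identical in both modes.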
## Alternatives Considered
1. Creating a separate `upgrade` command that would duplicate much of the install command logic.
- Rejected due to code duplication and maintenance overhead.
2. Implementing version-specific upgrade logic to handle breaking changes between versions.
- Rejected as overly complex and difficult to maintain across multiple version paths.
- This could be considered again in the future, but is not in the scope of the current design.
3. Adding automatic detection of existing resources and switching to apply mode.
- Rejected as it could lead to unexpected behavior and confusion if users unintentionally apply changes to existing resources.
## Security Considerations
The apply flag maintains the same security profile as the install command.
No additional permissions are required beyond what is needed for resource creation.
The use of `force=true` with server-side apply could potentially override manual changes made to resources, but this is a necessary trade-off to ensure apply is successful.
## Compatibility
This enhancement is compatible with all existing Velero installations as it is a new opt-in flag.
It does not change any resource formats or API contracts.
The apply process is best-effort and does not guarantee compatibility between arbitrary versions of Velero.
Users should still consult release notes for any breaking changes that may require manual intervention.
This flag could be adopted by the helm chart, specifically for CRD updates, to simplify the CRD update job.
## Implementation
The implementation involves:
1. Adding support for `Apply` to the existing Kubernetes client code.
1. Adding the `--apply` flag to the install command options.
1. Changing `createResource` to `createOrApplyResource` and updating it to use server-side apply when the `apply` boolean is set.
The implementation is straightforward and follows existing code patterns.
No migration of state or special handling of specific resources is required.
**Backup Storage**: The storage to store the backup data. Check [Unified Repository design][1] for details.
**Backup Repository**: Backup repository is layered between BR data movers and Backup Storage to provide BR related features that are introduced in [Unified Repository design][1].
**Velero Generic Data Path (VGDP)**: VGDP is the collection of modules that are introduced in [Unified Repository design][1]. Velero uses these modules to finish data transfer for various purposes (i.e., PodVolume backup/restore, Volume Snapshot Data Movement). VGDP modules include uploaders and the backup repository.
**Data Mover Pods**: Intermediate pods which hold VGDP and complete the data transfer. See [VGDP Micro Service for Volume Snapshot Data Movement][2] and [VGDP Micro Service For fs-backup][3] for details.
**Repository Maintenance Pods**: Pods for [Repository Maintenance Jobs][4], which hold VGDP to run repository maintenance.
## Background
According to the [Unified Repository design][1], Velero uses selectable backup repositories for various backup/restore methods, i.e., fs-backup, volume snapshot data movement, etc. Some backup repositories may need to cache data on the client side for various repository operations, so as to accelerate the execution.
In the existing [Backup Repository Configuration][5], we allow users to configure the cache data size (`cacheLimitMB`). However, the cache data is still stored in the root file system of data mover pods/repository maintenance pods, and therefore in the root file system of the node. This is not good enough, for these reasons:
- In many distributions, the node's system disk size is predefined, non-configurable and limited, e.g., the system disk size may be 20G or less
- Velero supports concurrent data movements in each node. The cache in each of the concurrent data mover pods could quickly exhaust the system disk and cause problems like pod eviction, failure of pod creation, degradation of Kubernetes QoS, etc.
We need to allow users to prepare a dedicated location, e.g., a dedicated volume, for the cache.
Not all backup repositories or backup repository operations require a cache, so we need to define when and how the cache is used.
## Goals
- Create a mechanism for users to configure cache volumes for various pods running VGDP
- Design the workflow to assign the cache volume pod path to backup repositories
- Describe when and how the cache volume is used
## Non-Goals
- The solution is based on [Unified Repository design][1], [VGDP Micro Service for Volume Snapshot Data Movement][2] and [VGDP Micro Service For fs-backup][3], legacy data paths are not supported. E.g., when a pod volume restore (PVR) runs with legacy Restic path, if any data is cached, the cache still resides in the root file system.
## Solution
### Cache Data
Depending on the backup repository, cache data may include payload data or repository metadata, e.g., indexes to the payload data chunks.
Payload data is highly related to the backup data, and normally takes the majority of the repository data as well as the cache data.
Repository metadata is related to the backup repository's chunking algorithm, data chunk mapping method, etc., so its size is not proportional to the backup data size.
On the other hand, for some backup repositories, in extreme cases, the repository metadata may be significantly large. E.g., Kopia's indexes are per chunk, so if there is a huge number of small files in the repository, Kopia's index data may be at the same level as, or even larger than, the payload data.
However, in cases where repository metadata becomes the majority, other bottlenecks may emerge and the concurrency of data movers may be significantly constrained, so the requirement for cache volumes may go away.
Therefore, for now we only consider the cache volume requirement for payload data, and leave the consideration for metadata as a future enhancement.
### Scenarios
Backup repository cache varies with the backup repository and the backup repository operation during VGDP runs. Below are the scenarios when VGDP runs:
- Data Upload for Backup: this is the process to upload/write the backup data into the backup repository, e.g., DataUpload or PodVolumeBackup. The pieces of data are written almost directly to the repository, sometimes with a small group staying briefly in the local place. That is to say, there should not be large scale data cached for this scenario, so we don't prepare a dedicated cache for this scenario.
- Repository Maintenance: Repository maintenance most often visits the backup repository's metadata and sometimes it needs to visit the file system directories from the backed up data. On the other hand, it is not practical to run concurrent maintenance jobs in one node. So the cache data is not large and does not affect the root file system too much. Therefore, we don't need to prepare a dedicated cache for this scenario.
- Data Download for Restore: this is the process to download/read the backup data from the backup repository during restore, e.g., DataDownload or PodVolumeRestore. For backup repositories whose data is stored in remote backup storage (e.g., Kopia repository stores data in remote object stores), large amounts of data are cached locally to accelerate the restore. Therefore, we need dedicated cache volumes for this scenario.
- Backup Deletion: During this scenario, backup repository is connected, metadata is enumerated to find the repository snapshot representing the backup data. That is to say, only metadata is cached if any. Therefore, dedicated cache volumes are not required in this scenario.
The above analyses are based on the common behavior of backup repositories and do not consider the case where backup repository metadata takes the majority or a significant proportion of the cache data.
As a conclusion of the analyses, we will create dedicated cache volumes for restore scenarios.
Other scenarios can be added later according to future changes/requirements. The mechanism to expose and connect the cache volumes should work for all scenarios. E.g., if we need to consider the backup repository metadata case, we may need cache volumes for backup and repository maintenance as well; then we can just reuse the same cache volume provisioning and connection mechanism for the backup and repository maintenance scenarios.
### Cache Data and Lifecycle
If available, one cache volume is dedicated to one data mover pod. That is, the cached data is destroyed when the data mover pod completes. Then the backup repository instance also closes.
Cache data is fully managed by the specific backup repository. So the backup repository may also have its own way to GC the cache data.
That is to say, cache data GC may be launched by the backup repository instance while the data mover pod is running; then the remaining data is automatically destroyed when the data mover pod and the cache PVC are destroyed (the cache PVC's `reclaimPolicy` is always `Delete`, so once the cache PVC is destroyed, the volume will also be destroyed). So no special logic is needed for cache data GC.
### Data Size
Cache volumes take storage space and cluster resources (PVC, PV). Therefore, cache volumes should be created only when necessary and should have a reasonable size based on the cache data size:
- It is not a good bargain to have cache volumes for small backups; small backups will use the resident cache location (the cache location in the root file system)
- The cache data size has a limit; the existing `cacheLimitMB` is used for this purpose. E.g., it could be set to 1024 for a 1TB backup, which means 1GB of data is cached and old cache data exceeding this size will be cleared. Therefore, it is meaningless to set the cache volume size much larger than `cacheLimitMB`
### Cache Volume Size
The cache volume size is calculated from the factors below (for Restore scenarios):
- **Limit**: The limit of the cache data, represented by `cacheLimitMB`; the default value is 5GB
- **backupSize**: The size of the backup, used as a reference to evaluate whether to create a cache volume. It doesn't mean the backup size always decides the cache data size; it is just a reference to evaluate the scale of the backup, since small scale backups may need little cache data. Sometimes, backupSize is irrelevant to the size of cache data; in this case, ResidentThreshold should not be set and Limit will be used directly. It is unlikely that backupSize is unavailable, but if that happens, ResidentThreshold is ignored and Limit will be used directly.
- **ResidentThreshold**: The minimum backup size for which a cache volume is created
- **InflationPercentage**: Considering the overhead of the file system and the possible delay of the cache cleanup, there should be an inflation for the final volume size vs. the logical size, otherwise, the cache volume may be overrun. This inflation percentage is hardcoded, e.g., 20%.
Finally, the `cacheVolumeSize` will be rounded up to GiB considering the UX friendliness, storage friendliness and management friendliness.
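The sizing rules above can be sketched as follows. This is only an illustration, not Velero's implementation: the function name `cacheVolumeSize`, the use of bytes as the unit, and the exact way the factors are combined are assumptions made for the example. Only the factors themselves (Limit, backupSize, ResidentThreshold, an inflation such as 20%, and rounding up to GiB) come from the text.

```go
package main

const (
	gib                 = int64(1) << 30
	inflationPercentage = int64(20) // hardcoded inflation from the text, e.g., 20%
)

// cacheVolumeSize returns the cache volume size in bytes, or 0 when no cache
// volume should be created. All sizes are in bytes; a backupSize or
// residentThreshold of 0 means "not available" or "not set".
// Hypothetical helper: how the factors are combined is an assumption.
func cacheVolumeSize(limit, backupSize, residentThreshold int64) int64 {
	if limit <= 0 {
		return 0
	}
	// The threshold only applies when both it and the backup size are known.
	if residentThreshold > 0 && backupSize > 0 && backupSize < residentThreshold {
		return 0 // small backup: stay on the resident cache location
	}
	// Inflate for file system overhead and delayed cache cleanup.
	inflated := limit + limit*inflationPercentage/100
	// Round up to a whole number of GiB.
	return (inflated + gib - 1) / gib * gib
}
```

With a 5GiB limit and a backup above the threshold, this sketch yields a 6GiB volume; when the backup size is unknown, the threshold is skipped and the limit alone decides the size.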
### PVC/PV
The PVC for a cache volume is created in the Velero namespace and a storage class is required for the cache PVC. The PVC's accessMode is `ReadWriteOnce` and volumeMode is `FileSystem`, so the storage class provided should support this specification. Otherwise, if the storage class doesn't support either of the specifications, the data mover pod may hang in the `Pending` state until a data movement timeout (e.g., `prepareTimeout`) expires, and the data movement will finally fail.
It is not expected that the cache volume is retained after data mover pod is deleted, so the `reclaimPolicy` for the storageclass must be `Delete`.
To detect the problems in the storageclass and fail earlier, a validation is applied to the storageclass and once the validation fails, the cache configuration will be ignored, so the data mover pod will be created without a cache volume.
### Cache Volume Configurations
Below configurations are introduced:
- **residentThresholdMB**: the minimum data size (in MB) to be processed (if available) for which a cache volume is created
- **cacheStorageClass**: the name of the storage class to provision the cache PVC
Unlike `cacheLimitMB`, which is set on and affects the backup repository, the above two configurations are actually data mover configurations for how to create cache volumes for data mover pods; and the two configurations don't need to be per backup repository. So we add them to the node-agent configuration.
### Sample
Below are some examples of the node-agent configMap with the configurations:
Sample-1:
```json
{
  "cacheVolume": {
    "storageClass": "sc-1",
    "residentThresholdMB": 1024
  }
}
```
Sample-2:
```json
{
  "cacheVolume": {
    "storageClass": "sc-1"
  }
}
```
Sample-3:
```json
{
  "cacheVolume": {
    "residentThresholdMB": 1024
  }
}
```
**sample-1**: This is a valid configuration. Restores with backup data size larger than 1GB will be assigned a cache volume using storage class `sc-1`.
**sample-2**: This is a valid configuration. Data mover pods are always assigned a cache volume using storage class `sc-1`.
**sample-3**: This is not a valid configuration because the storage class is absent. Velero gives up creating a cache volume.
To create the configMap, users need to save something like the above samples to a JSON file and then run the command below:
```
kubectl create cm <ConfigMap name> -n velero --from-file=<json file name>
```
The cache volume configurations are read by the node-agent server, so users also need to specify the `--node-agent-configmap` parameter for `velero node-agent`.
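The validity rules described for the three samples can be sketched as below. This is a minimal illustration under stated assumptions: the Go type and function names (`cacheVolumeConfig`, `nodeAgentConfig`, `parseCacheVolumeConfig`) are hypothetical, and only the JSON keys are taken from the samples.

```go
package main

import (
	"encoding/json"
	"errors"
)

// cacheVolumeConfig mirrors the "cacheVolume" object in the samples.
// The Go names are hypothetical; only the JSON keys come from the samples.
type cacheVolumeConfig struct {
	StorageClass        string `json:"storageClass"`
	ResidentThresholdMB int64  `json:"residentThresholdMB"`
}

// nodeAgentConfig holds only the part of the configMap payload used here.
type nodeAgentConfig struct {
	CacheVolume *cacheVolumeConfig `json:"cacheVolume"`
}

// parseCacheVolumeConfig decodes the configMap payload and returns the cache
// volume settings, or nil when no cache volume is configured.
func parseCacheVolumeConfig(data []byte) (*cacheVolumeConfig, error) {
	var cfg nodeAgentConfig
	if err := json.Unmarshal(data, &cfg); err != nil {
		return nil, err
	}
	if cfg.CacheVolume == nil {
		return nil, nil
	}
	// Sample-3: without a storage class the configuration is not valid.
	if cfg.CacheVolume.StorageClass == "" {
		return nil, errors.New("cacheVolume.storageClass is required")
	}
	// Sample-2: a zero threshold means a cache volume is always assigned.
	return cfg.CacheVolume, nil
}
```

In this sketch the caller would treat an error the same way the text describes for an invalid configuration, that is, by continuing without a cache volume.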
## Detailed Design
### Backup and Restore
The restore needs to know the backup size in order to calculate the cache volume size, so some new fields are added to the DataDownload and PodVolumeRestore CRDs.
`snapshotSize` field is also added to DataDownload and PodVolumeRestore's `spec`:
```yaml
spec:
  snapshotID:
    description: SnapshotID is the ID of the Velero backup snapshot to
      be restored from.
    type: string
  snapshotSize:
    description: SnapshotSize is the logical size of the snapshot.
    format: int64
    type: integer
```
`snapshotSize` represents the total size of the backup; during restore, the value is transferred from DataUpload/PodVolumeBackup's `Status.Progress.TotalBytes` to DataDownload/PodVolumeRestore.
It is unlikely that `Status.Progress.TotalBytes` from DataUpload/PodVolumeBackup is unavailable, but if it happens, according to the above formula, `residentThresholdMB` is ignored and the cache volume size is calculated directly from the cache limit for the corresponding backup repository.
### Exposer
Cache volume configurations are retrieved by node-agent and passed through DataDownload/PodVolumeRestore to GenericRestore exposer/PodVolume exposer.
The exposers are responsible for calculating the cache volume size, creating cache PVCs and mounting them to the restorePods.
If the calculated cache volume size is 0, or any of the critical parameters is missing (e.g., cache volume storage class), the exposers ignore the cache volume configuration and continue with creating restorePods without cache volumes, so the result of the restore is not affected.
Exposers mount the cache volume to a predefined directory and pass the directory to the data mover pods through the `cache-volume-path` parameter.
Below data structure is added to the exposers' expose parameters:
```go
type GenericRestoreExposeParam struct {
	// RestoreSize specifies the data size for the volume to be restored
	RestoreSize int64
	// CacheVolume specifies the info for cache volumes
	CacheVolume *CacheVolumeInfo
}

type PodVolumeExposeParam struct {
	// RestoreSize specifies the data size for the volume to be restored
	RestoreSize int64
	// CacheVolume specifies the info for cache volumes
	CacheVolume *repocache.CacheConfigs
}

type CacheConfigs struct {
	// StorageClass specifies the storage class for cache volumes
	StorageClass string
	// Limit specifies the maximum size of the cache data
	Limit int64
	// ResidentThreshold specifies the minimum size of the cache data to create a cache volume
	ResidentThreshold int64
}
```
### Data Mover Pods
Data mover pods retrieve the cache volume directory from `cache-volume-path` parameter and pass it to Unified Repository.
If the directory is empty, Unified Repository uses the resident location for data cache, that is, the root file system.
### Kopia Repository
Kopia repository supports cache directory configuration for both metadata and data. The existing `SetupConnectOptions` is modified to customize the `CacheDirectory`:
This design document describes the enhancement of BackupStorageLocation (BSL) certificate management in Velero, introducing a Secret-based certificate reference mechanism (`caCertRef`) alongside the existing inline certificate field (`caCert`). This enhancement provides a more secure, Kubernetes-native approach to certificate management while enabling future CLI improvements for automatic certificate discovery.
## Background
Currently, Velero supports TLS certificate verification for object storage providers through an inline `caCert` field in the BSL specification. While functional, this approach has several limitations:
- **Security**: Certificates are stored directly in the BSL YAML, potentially exposing sensitive data
- **Management**: Certificate rotation requires updating the BSL resource itself
- **CLI Usability**: Users must manually specify certificates when using CLI commands
- **Size Limitations**: Large certificate bundles can make BSL resources unwieldy
Issue #9097 and PR #8557 highlight the need for improved certificate management that addresses these concerns while maintaining backward compatibility.
## Goals
- Provide a secure, Secret-based certificate storage mechanism
- Maintain full backward compatibility with existing BSL configurations
- Enable future CLI enhancements for automatic certificate discovery
- Simplify certificate rotation and management
- Provide clear migration path for existing users
## Non-Goals
- Removing support for inline certificates immediately
- Changing the behavior of existing BSL configurations
- Implementing client-side certificate validation
- Supporting certificates from ConfigMaps or other resource types
## High-Level Design
### API Changes
#### New Field: CACertRef
```go
type ObjectStorageLocation struct {
	// Existing field (now deprecated)
	// +optional
	// +kubebuilder:deprecatedversion:warning="caCert is deprecated, use caCertRef instead"
			return nil, errors.Wrap(err, "error getting CA certificate from secret")
		}
		return []byte(certString), nil
	}
	// Fall back to caCert (deprecated)
	if bsl.Spec.ObjectStorage.CACert != nil {
		return bsl.Spec.ObjectStorage.CACert, nil
	}
	return nil, nil
}
```
### CLI Certificate Discovery Integration
#### Background: PR #8557 Implementation
PR #8557 ("CLI automatically discovers and uses cacert from BSL") was merged in August 2025, introducing automatic CA certificate discovery from BackupStorageLocation for Velero CLI download operations. This eliminated the need for users to manually specify the `--cacert` flag when performing operations like `backup describe`, `backup download`, `backup logs`, and `restore logs`.
#### Current Implementation (Post PR #8557)
The CLI now automatically discovers certificates from BSL through the `pkg/cmd/util/cacert/bsl_cacert.go` module:
```go
// Current implementation only supports inline caCert
```
This design provides a secure, Kubernetes-native approach to certificate management in Velero while maintaining backward compatibility. It establishes the foundation for enhanced CLI functionality and improved user experience, addressing the concerns raised in issue #9097 and enabling the features proposed in PR #8557.
The phased approach ensures smooth migration for existing users while delivering immediate security benefits for new deployments.
Velero currently treats namespace patterns with glob characters as literal strings. This design adds wildcard expansion to support flexible namespace selection using patterns like `app-*` or `test-{dev,staging}`.
## Background
Requested in [#1874](https://github.com/vmware-tanzu/velero/issues/1874) for more flexible namespace selection.
## Goals
- Support glob pattern expansion in namespace includes/excludes
- Maintain backward compatibility with existing `*` behavior
## Non-Goals
- Complex regex patterns beyond basic globs
## High-Level Design
Wildcard expansion occurs early in both backup and restore flows, converting patterns to literal namespace lists before normal processing.
### Backup Flow
Expansion happens in `getResourceItems()` before namespace collection:
1. Check if wildcards exist using `ShouldExpandWildcards()`
2. Expand patterns against active cluster namespaces
3. Replace includes/excludes with expanded literal namespaces
4. Continue with normal backup processing
### Restore Flow
Expansion occurs in `execute()` after parsing backup contents:
1. Extract available namespaces from backup tar
2. Expand patterns against backup namespaces (not cluster namespaces)
3. Update restore context with expanded namespaces
4. Continue with normal restore processing
This ensures restore wildcards match actual backup contents, not current cluster state.
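The expansion step shared by both flows can be sketched as below. This is an illustration under assumptions, not the design's implementation: `hasWildcard` and `expandNamespaces` are hypothetical names, and the sketch uses the standard library's `path.Match`, which handles `*`, `?` and `[...]` but not brace patterns such as `test-{dev,staging}`. For backup, `candidates` would be the active cluster namespaces; for restore, the namespaces found in the backup.

```go
package main

import (
	"path"
	"sort"
	"strings"
)

// hasWildcard reports whether a namespace pattern contains glob characters
// understood by path.Match. A lone "*" is left untouched in this sketch so
// that the existing "all namespaces" behavior is preserved.
func hasWildcard(pattern string) bool {
	return pattern != "*" && strings.ContainsAny(pattern, "*?[")
}

// expandNamespaces replaces wildcard patterns with the matching names from
// candidates; literal entries are kept as they are. The result is sorted
// and free of duplicates.
func expandNamespaces(patterns, candidates []string) ([]string, error) {
	seen := map[string]bool{}
	for _, p := range patterns {
		if !hasWildcard(p) {
			seen[p] = true
			continue
		}
		for _, ns := range candidates {
			ok, err := path.Match(p, ns)
			if err != nil {
				return nil, err // malformed pattern
			}
			if ok {
				seen[ns] = true
			}
		}
	}
	out := make([]string, 0, len(seen))
	for ns := range seen {
		out = append(out, ns)
	}
	sort.Strings(out)
	return out, nil
}
```

Passing the candidate list as a parameter keeps the same function usable for both flows, which is the property the design relies on when it expands restore patterns against backup contents rather than cluster state.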
## Detailed Design
### Status Fields
Add wildcard expansion tracking to backup and restore CRDs:
```go
type WildcardNamespaceStatus struct {
	// IncludeWildcardMatches records namespaces that matched include patterns
```
**Backup Storage**: The storage to store the backup data. Check [Unified Repository design][1] for details.
**Backup Repository**: Backup repository is layered between BR data movers and Backup Storage to provide the BR related features that are introduced in [Unified Repository design][1].
**Velero Generic Data Path (VGDP)**: VGDP is the collection of modules that is introduced in [Unified Repository design][1]. Velero uses these modules to finish data transfer for various purposes (e.g., PodVolume backup/restore, Volume Snapshot Data Movement). VGDP modules include uploaders and the backup repository.
**Velero Built-in Data Mover (VBDM)**: VBDM, which is introduced in [Volume Snapshot Data Movement design][2] and [Unified Repository design][1], is the built-in data mover shipped along with Velero. It includes Velero data mover controllers and VGDP.
**Data Mover Pods**: Intermediate pods which hold VGDP and complete the data transfer. See [VGDP Micro Service for Volume Snapshot Data Movement][3] for details.
**Change Block Tracking (CBT)**: CBT is the mechanism to track changed blocks, so that backups could back up the changed data only. CBT is usually provided by the computing/storage platform.
**TCO**: Total Cost of Ownership. This is a general criterion for products/solutions, but it also means a lot for BR solutions. For example, this means what kind of backup storage (and its cost) it requires, the retention policy of backup copies, the ways to remove backup data redundancy, etc.
**PodVolume Backup**: This is the Velero backup method which accesses the data from live file system, see [Kopia Integration design][1] for how it works.
**CAOS and CABS**: Content-Addressable Object Storage and Content-Addressable Block Storage, they are the parts from Kopia repository, see [Kopia Architecture][5].
## Background
Kubernetes supports two kinds of volume mode, `FileSystem` and `Block`, for persistent volumes. Under the hood, the storage could use block storage to provision either `FileSystem` mode or `Block` mode volumes; and the storage could use file storage to provision `FileSystem` mode volumes.
Volumes provisioned by block storage could be backed up/restored from the block level, regardless of the volume mode of the persistent volume.
On the other hand, as long as the data could be accessed from the file system, a backup/restore could be conducted from the file system level. That is to say `FileSystem` mode volumes could be backed up/restored from the file system level, regardless of the backend storage type.
Then if a `FileSystem` mode volume is provisioned by a block storage, the volume could be backed up/restored either from the file system level or block level.
For Velero, [CSI Snapshot Data Movement][2] which is implemented by VBDM, ships a file system uploader, so the backup/restore is done from file system only.
When possible, block level backup/restore is better than file system level backup/restore:
- Block level backup could leverage CBT to process a minimal amount of data, so it significantly reduces the overhead to network, backup repository and backup storage. As a result, TCO is significantly reduced.
- Block level backup/restore is performant in throughput and resource consumption, because it doesn't need to handle the complexity of the file system, especially when there is a huge number of small files in the file system.
- Block level backup/restore is less OS dependent because the uploader doesn't need the OS to be aware of the file system in the volume.
At present, [Kubernetes CBT API][4] is mature and close to Beta stage. Many platforms/storage systems support it or are going to support it.
Therefore, it is very important for Velero to deliver block level backup/restore and recommend that users use it over the file system data mover as long as:
- The volume is backed by block storage so block level access is possible
- The platform supports CBT
Meanwhile, file system level backup/restore is still valuable for below scenarios:
- The volume is backed by file storage, e.g., AWS EFS, Azure File, CephFS, VKS File Volume, etc.
- The volume is backed by block storage but CBT is not available
- The volume doesn't support CSI snapshot, so Velero PodVolume Backup method is used
There are rich features delivered with VGDP, VBDM and [VGDP micro service][3], to reuse these features, block data mover should be built based on these modules.
Velero VBDM supports Linux and Windows nodes; however, Windows containers don't support block mode volumes, so backing up/restoring from Windows nodes is not supported until Windows containers remove this limitation. As a result, if there are both Linux and Windows nodes in the cluster, the block data mover can only run on Linux nodes.
Both the Kubernetes CBT service and Velero work within the boundary of the cluster. Even though the backend storage may be shared by multiple clusters, Velero can only protect workloads in the same cluster where it is running.
## Goals
Add a block data mover to VBDM and support block level backup/restore for [CSI Snapshot Data Movement][2], which includes:
- Support block level full backup for both `FileSystem` and `Block` mode volumes
- Support block level incremental backup for both `FileSystem` and `Block` mode volumes
- Support block level restore from full/incremental backup for both `FileSystem` and `Block` mode volumes
- Support block level backup/restore for both Linux and Windows workloads from Linux cluster nodes
- Support all existing features, i.e., load concurrency, node selection, cache volume, deduplication, compression, encryption, etc. for the block data mover
- Support volumes processed from file system level and block level in the same backup/restore
## Non-Goals
- PodVolume Backup does the backup/restore from file system level only, so block level backup/restore is not supported
- Volumes that are backed by file system storages, can only be backed up/restored from file system level, so block level backup/restore is not supported
- Backing up/restoring from Windows nodes is not supported
- Block level incremental backup requires special capabilities of the backup repository, and Velero [Unified Repository][1] supports multiple kinds of backup repositories. The current design focuses on Kopia repository only; block level incremental backup support of other repositories will be considered when the specific backup repository is integrated into [Velero Unified Repository][1]
## Architecture
### Data Path
Below shows the architecture of VGDP when integrating to Unified Repository (implemented by Kopia repository).
A new block data mover will be added alongside the existing file system data mover; both data movers read/write data from/to the same backup repository through the Unified Repo interface.
The Unified Repo interface and the backup repository need to be enhanced to support incremental backups.

For more details of VGDP architecture, see [Unified Repository design][1], [Volume Snapshot Data Movement design][2] and [VGDP Micro Service for Volume Snapshot Data Movement][3].
### Backup
Below is the architecture for block data mover backup which is developed based on the existing VBDM:

The existing VBDM is reused, below are the major changes based on the existing VBDM:
**Exposer**: Exposer needs to create block mode backupPVC all the time regardless of the sourcePVC mode.
**CBT**: This is a new layer to retrieve, transform and store the changed blocks, it interacts with CSI SnapshotMetadataService through gRPC.
**Uploader**: A new block uploader is added. It interacts with the CBT layer, and holds special logic to read data from block devices performantly and to write incremental data to Unified Repository.
**Extended Kopia repo**: A new Incremental Aware Object Extension is added to Kopia's CAOS, so as to support incremental data write. Other parts of Kopia repository, including the existing CAOS and CABS, are not changed.
### Restore
Below is the architecture for block data mover restore, which is developed based on the existing VBDM:

The existing VBDM is reused, below are the major changes based on the existing VBDM:
**Exposer**: While the restorePV is in block mode, exposer needs to rebind the restorePV to a targetPVC in either file system mode or block mode.
**Uploader**: The same block uploader holds special logic to write data to block devices performantly and to read data from the backup chain in Unified Repository.
For more details of VBDM, see [Volume Snapshot Data Movement design][2].
## Detailed Design
### Selectable Data Mover Type
#### Per Backup Selection
At present, the backup accepts a `DataMover` parameter and when its value is empty or `velero`, VBDM is used.
After block data mover is introduced, VBDM will have two types of data movers, Velero file system data mover and Velero block data mover.
A new type string `velero-block` is introduced for Velero block data mover, that is, when `DataMover` is set as `velero-block`, Velero block data mover is used.
Another new value `velero-fs` is introduced for Velero file system data mover, that is, when `DataMover` is set as `velero-fs`, Velero file system data mover is used.
For backwards compatibility, `velero` is preserved as a valid value. It refers to the default data mover, and the default data mover may change among releases. At present, Velero file system data mover is the default data mover; we can change the default one to Velero block data mover in future releases.
#### Volume Policy
It is a valid case that users have multiple volumes in a single backup, while they want to use Velero file system data mover for some of the volumes and use Velero block data mover for some others.
To meet this requirement, a combined solution of Per Backup Selection and Volume Policy is used.
`action.parameters` is used to provide extra information of the action. This is an ideal place to differentiate Velero file system data mover and Velero block data mover.
Therefore, Velero built-in data mover will support a `dataMover` key in `parameters`, with the value either `velero-fs` or `velero-block`. `velero-fs` and `velero-block` have the same meaning as in Per Backup Selection.
As an example, here is how a user might use both `velero-block` and `velero-fs` in a single backup:
- Users set `DataMover` parameter for the backup as `velero-block`
- Users add a record into Volume Policy, set `conditions` to filter the volumes they want to back up through Velero file system data mover, set `action.type` to `snapshot` and insert a record into `action.parameters` as `dataMover:velero-fs`
In this way, all volumes matched by `conditions` will be backed up with Velero file system data mover, while the others will fall back to the per backup method, Velero block data mover.
Vice versa, users could set the per backup method as file system data mover and select volumes for Velero block data mover.
The selected data mover for each volume should be recorded to `volumeInfo.json`.
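The precedence between Volume Policy and Per Backup Selection described above can be sketched as below. This is a minimal illustration: `resolveDataMover` and the constant names are hypothetical, and the treatment of an empty or `velero` value as the file system data mover reflects the default stated for the present release.

```go
package main

const (
	dataMoverDefault = "velero"
	dataMoverFS      = "velero-fs"
	dataMoverBlock   = "velero-block"
)

// resolveDataMover picks the data mover for one volume.
// policyParam is the `dataMover` value from the matching volume policy action
// ("" when no policy matches); backupParam is the backup's DataMover value.
// Hypothetical helper, not taken from the Velero code base.
func resolveDataMover(policyParam, backupParam string) string {
	selected := backupParam
	if policyParam != "" {
		selected = policyParam // a per-volume policy overrides the per-backup value
	}
	switch selected {
	case "", dataMoverDefault:
		return dataMoverFS // file system data mover is the default at present
	default:
		return selected
	}
}
```

The returned value is what a sketch like this would record per volume, in the spirit of the `volumeInfo.json` requirement above.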
### Controllers
Backup controller and Restore controller are kept as is, async operations are still used to interact with VBDM with block data mover.
DataUpload controller and DataDownload controller are almost kept as is, with some minor changes to handle the data mover type and backup type appropriately and convey it to the exposers. With [VGDP Micro Service][3], the controllers are almost isolated from VGDP, so no major changes are required.
### Exposer
#### CSI Snapshot Exposer
The existing CSI Snapshot Exposer is reused with some changes to decide the backupPVC volume mode by access mode. Specifically, for Velero block data mover, access mode is always `Block`, so the backupPVC volume mode is always `Block`.
Once the backupPVC is created with correct volume mode, the existing code could create the backupPod and mount the backupPVC appropriately.
#### Generic Restore Exposer
The existing Generic Restore Exposer is reused, but the workflow needs some changes.
For block data mover, the restorePV is in Block mode all the time, whereas, the targetPVC may be in either file system mode or block mode.
However, Kubernetes doesn't allow binding a PV to a PVC with a mismatched volume mode.
Therefore, the workflow of ***Finish Volume Readiness*** as introduced in [Volume Snapshot Data Movement design][2] is changed as below:
- When restore completes and restorePV is created, set restorePV's reclaim policy (`persistentVolumeReclaimPolicy`) to `Retain`
- Create another rebindPV that copies restorePV's `volumeHandle` but whose `volumeMode` matches the targetPVC
- Delete restorePV
- Set the rebindPV's claim reference (the ```claimRef``` field) to targetPVC
- Add the ```velero.io/dynamic-pv-restore``` label to the rebindPV
In this way, the targetPVC will be bound immediately by Kubernetes to rebindPV.
These changes work for file system data mover as well, so the old workflow will be replaced, only the new workflow is kept.
### VGDP
Below is the VGDP workflow during backup:

Below is the VGDP workflow during restore:

#### Unified Repo
For block data mover, one Unified Repo Object is created for each volume, and some metadata is also saved into Unified Repo to describe the volume.
During the backup, the write is conducted in a skippable-write manner:
- For the data range that the write does not skip, object is written with the real data
- For the data range that is skipped, the data is either filled as ZERO or cloned from the parent object. Specifically, for a full backup, data is filled as ZERO; for an incremental backup, data is cloned from the parent object
To support incremental backup, the `ObjectWriter` interface needs to be extended to support `io.WriterAt`, so that the uploader could write in a skippable-write manner:
```go
type ObjectWriter interface {
	io.WriteCloser
	io.WriterAt

	// Seeker is used in the cases that the object is not written sequentially
	io.Seeker

	// Checkpoint is periodically called to preserve the state of data written to the repo so far.
	// Checkpoint returns a unified identifier that represent the current state.
	// An empty ID could be returned on success if the backup repository doesn't support this.
	Checkpoint() (ID, error)

	// Result waits for the completion of the object write.
	// Result returns the object's unified identifier after the write completes.
	Result() (ID, error)
}
```
To clone data from parent object, the caller needs to specify the parent object. To support this, `ObjectWriteOptions` is extended with `ParentObject`.
The existing `AccessMode` could be used to indicate the data access type, either file system or block:
```go
// ObjectWriteOptions defines the options when creating an object for write
type ObjectWriteOptions struct {
	FullPath     string // Full logical path of the object
	DataType     int    // OBJECT_DATA_TYPE_*
	Description  string // A description of the object, could be empty
	Prefix       ID     // A prefix of the name used to save the object
	AccessMode   int    // OBJECT_DATA_ACCESS_*
	BackupMode   int    // OBJECT_DATA_BACKUP_*
	AsyncWrites  int    // Num of async writes for the object, 0 means no async write
	ParentObject ID     // the parent object based on which incremental write will be done
}
```
To support non-Kopia uploader to save snapshots to Unified Repo, snapshot related methods will be added to `BackupRepo` interface:
To support non-Kopia uploader to save metadata, which is used to describe the backed up objects, some metadata related methods will be added to `BackupRepo` interface:
```go
// WriteMetadata writes metadata to the repo, metadata is used to describe data, e.g., file system
```
For Incremental Aware Object Extension, one object represents one volume.
For full backup, the skipped areas will be written as all ZERO by Incremental Aware Object Extension, since Kopia repository's interface doesn't support skippable write. But it is fine, the ZERO data will be deduplicated by Kopia repository so nothing is actually written to the backup storage.
For incremental backup, Incremental Aware Object Extension clones the table entries from the parent object for the skipped areas; for the written areas, Incremental Aware Object Extension writes the data to Kopia repository and generates new entries. Finally, Incremental Aware Object Extension generates a new block address table for the incremental object which covers its entire logical space.
Incremental Aware Object Extension is automatically activated for block mode data access as set by `AccessMode` of `ObjectWriteOptions`.
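The way the block address table is assembled for full and incremental backups can be modelled with a toy sketch. Everything here is an assumption made for illustration: table entries are plain strings standing in for content IDs, `buildTable` and `zeroEntry` are hypothetical names, and filling chunks beyond the parent's length with the zero entry is one possible way to treat a grown volume, not something the design specifies.

```go
package main

// zeroEntry stands for the content ID of an all-zero chunk (hypothetical value).
const zeroEntry = "zero"

// buildTable returns the block address table of a new object with n chunks.
// written maps a chunk index to the content ID of data written in this backup.
// Skipped chunks are cloned from parent when it has them (incremental backup)
// and set to zeroEntry otherwise (full backup, or chunks the parent lacks).
func buildTable(n int, written map[int]string, parent []string) []string {
	table := make([]string, n)
	for i := range table {
		if id, ok := written[i]; ok {
			table[i] = id // new data written by the uploader
		} else if i < len(parent) {
			table[i] = parent[i] // clone the entry, no data is copied
		} else {
			table[i] = zeroEntry // deduplicated all-zero chunk
		}
	}
	return table
}
```

The incremental case only copies table entries, never chunk data, and the resulting table still covers the entire logical space of the object.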
#### Deduplication
The Incremental Aware Object Extension uses a fixed-size splitter for deduplication. This is good enough for block level backup, for the following reasons:
- Unlike a file, a disk write never inserts data into the middle of the disk; it only does in-place updates or appends. So the data never shifts between two disks or between two backups of the same disk
- File system IO to disk is generally aligned to a specific size, e.g., 4KB for NTFS and ext4. As long as the chunk size is a multiple of this size, it effectively reduces the cases where one IO invalidates two deduplication chunks
- For the usage cases that the disk is used as raw block device without a file system, the IO is still conducted by aligning to a specific boundary
The chunk size is intentionally chosen as 1MB, for the following reasons:
- 1MB is a multiple of 4KB for file systems and of common block sizes for raw block device usages
- 1MB is the start boundary of partitions for modern operating systems, for both MBR and GPT, so partition metadata could be isolated to a separate chunk
- The more chunks there are, the more indexes there are in the repository; 1MB is a moderate value with regard to the index overhead for Kopia repository
#### Benefits
Since the existing block address table (BAT) of CAOS is reused and kept as is, it brings the benefits below:
- All the entries are still managed by Kopia CAOS, so Velero doesn't need to keep any extra data
- The objects written by Velero block uploader are still recognizable by Kopia, for both full backup and incremental backup
- The existing data management in Kopia repository still works for objects generated by Velero block uploader, e.g., snapshot GC, repository maintenance, etc.
Most importantly, this solution is super performant:
- During incremental write, it doesn't copy any data from the parent object, instead, it only clones object block address entries
- During backup deletion, it doesn't need to move any data, it only deletes the BAT for the object
#### Uploader behavior
The block uploader's skippable write must also be aligned to this 1MB boundary, because Incremental Aware Object Extension needs to clone the entries that have been skipped from the parent object.
File system uploader still uses variable-sized deduplication. It is fine to keep data from the two uploaders in the same Kopia repository, though normally they won't be mutually deduplicated.
Volumes could be resized, and the volume size may not be aligned to the 1MB boundary. The uploader needs to handle the resize appropriately since Incremental Aware Object Extension cannot copy a BAT entry partially.
#### CBT Layer
CBT provides below functionalities:
1. For a full backup, it provides the allocated data ranges. E.g., for a 1TB volume, there may be only 1MB of files; with this functionality, the uploader could skip the ranges without real data
2. For an incremental backup, it provides the changed data ranges based on the provided parent snapshot. In this way, the uploader could skip the unchanged data and achieve an incremental backup
For case 1, the uploader calls Unified Repo Object's `WriteAt` method with the offset for the allocated data; ranges ahead of the offset will be filled as ZERO by unified repository.
For case 2, the uploader calls Unified Repo Object's `WriteAt` method with the offset for the changed data; ranges ahead of the offset will be cloned from the parent object by unified repository.
A changeId is stored with each backup; the next backup retrieves the parent snapshot's changeId and uses it to retrieve the CBT.
The CBT retrieved from the Kubernetes API is a list of `BlockMetadata`; each range could be of fixed or variable size.
Block uploader needs to maintain its own granularity that is friendly to its backup repository and uploader, as mentioned above.
From the Kubernetes API, `GetMetadataAllocated` or `GetMetadataDelta` is called in a loop until all `BlockMetadata` are retrieved.
On the other hand, considering the complexity in the uploader, e.g., multiple streams between read and write, the workflow should be driven by the uploader instead of the CBT iterator. Therefore, in practice, all the allocated/changed blocks should be retrieved and preserved before being passed to the uploader.
In addition, directly saving the `BlockMetadata` list would be memory consuming.
With all the above considerations, a `Bitmap` data structure is used to save the allocated/changed blocks, called the CBT Bitmap.
The CBT Bitmap chunk size could be set to 1MB or a multiple of it, but a larger chunk size would amplify the backup size, so 1MB will be used.
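A minimal sketch of such a bitmap, assuming one bit per 1MB chunk. The `blockRange` type is a simplified stand-in for `BlockMetadata` (byte offset plus size), and `cbtBitmap`, `Mark` and `IsSet` are hypothetical names used only for illustration.

```go
package main

import "fmt"

const chunkSize = 1 << 20 // 1MB per bitmap bit

// blockRange is a simplified stand-in for one BlockMetadata entry.
type blockRange struct {
	ByteOffset int64
	SizeBytes  int64
}

// cbtBitmap keeps one bit per 1MB chunk of the volume.
type cbtBitmap struct {
	bits   []uint64
	chunks int64
}

func newCBTBitmap(volumeSize int64) *cbtBitmap {
	chunks := (volumeSize + chunkSize - 1) / chunkSize
	return &cbtBitmap{bits: make([]uint64, (chunks+63)/64), chunks: chunks}
}

// Mark sets every chunk that the range overlaps, so a range of any size or
// alignment is widened to whole 1MB chunks.
func (b *cbtBitmap) Mark(r blockRange) {
	if r.ByteOffset < 0 || r.SizeBytes <= 0 {
		return
	}
	first := r.ByteOffset / chunkSize
	last := (r.ByteOffset + r.SizeBytes - 1) / chunkSize
	for c := first; c <= last && c < b.chunks; c++ {
		b.bits[c/64] |= uint64(1) << uint(c%64)
	}
}

// IsSet reports whether the chunk was marked as allocated/changed.
func (b *cbtBitmap) IsSet(chunk int64) bool {
	if chunk < 0 || chunk >= b.chunks {
		return false
	}
	return b.bits[chunk/64]&(uint64(1)<<uint(chunk%64)) != 0
}

func main() {
	bm := newCBTBitmap(4 * chunkSize)
	// A 1MB range starting at 1.5MB overlaps chunks 1 and 2.
	bm.Mark(blockRange{ByteOffset: chunkSize + chunkSize/2, SizeBytes: chunkSize})
	for c := int64(0); c < 4; c++ {
		fmt.Println(c, bm.IsSet(c))
	}
}
```

With one bit per 1MB chunk, a 1TB volume needs about 128KB of bitmap, regardless of how many ranges the CBT returns.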
Finally, the interactions among the CSI Snapshot Metadata Service, the CBT Layer and the Uploader are as below:

In this way, the CBT layer and the uploader are decoupled, and the CBT Bitmap serves as a northbound parameter of the uploader.
#### Block Uploader
The block uploader consists of a reader and a writer that run asynchronously.
During backup, the reader reads data from the block device and refers to the CBT Bitmap for allocated/changed blocks; the writer writes data to the Unified Repo.
During restore, the reader reads data from the Unified Repo; the writer writes data to the block device.
The reader and writer are connected by a ring buffer, that is, the reader pushes block data to the ring buffer and the writer gets data from the ring buffer and writes it to the target.
To improve performance, the block device is opened with direct IO, so that no data goes through the system cache unnecessarily.
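The reader/writer arrangement can be sketched as below. This is an illustration under stated assumptions, not the uploader's code: a buffered Go channel stands in for the ring buffer, an `io.ReaderAt` stands in for the block device (direct IO is not modeled), and `copyBlocks` and `copyBytes` are hypothetical names.

```go
package main

import (
	"bytes"
	"context"
	"fmt"
	"io"
)

// block is one unit passed from the reader to the writer.
type block struct {
	offset int64
	data   []byte
}

// copyBlocks runs one reader goroutine and one writer (the caller's
// goroutine), connected by a bounded channel of the given depth.
func copyBlocks(ctx context.Context, src io.ReaderAt, size int64, blockSize, depth int, write func(block) error) error {
	if blockSize <= 0 || depth < 0 {
		return fmt.Errorf("invalid blockSize %d or depth %d", blockSize, depth)
	}
	ctx, cancel := context.WithCancel(ctx)
	defer cancel() // stops the reader if the writer returns early

	ring := make(chan block, depth)
	readErr := make(chan error, 1)

	go func() {
		defer close(ring)
		for off := int64(0); off < size; off += int64(blockSize) {
			n := int64(blockSize)
			if size-off < n {
				n = size - off
			}
			buf := make([]byte, n)
			if got, err := src.ReadAt(buf, off); err != nil && !(err == io.EOF && got == len(buf)) {
				readErr <- err
				return
			}
			select {
			case ring <- block{offset: off, data: buf}:
			case <-ctx.Done(): // cancellation checkpoint on the reader side
				readErr <- ctx.Err()
				return
			}
		}
		readErr <- nil
	}()

	for b := range ring {
		if err := write(b); err != nil {
			return err
		}
	}
	return <-readErr
}

// copyBytes is a small helper that copies an in-memory buffer through the pipeline.
func copyBytes(data []byte, blockSize, depth int) ([]byte, error) {
	out := make([]byte, len(data))
	err := copyBlocks(context.Background(), bytes.NewReader(data), int64(len(data)), blockSize, depth,
		func(b block) error {
			copy(out[b.offset:], b.data)
			return nil
		})
	return out, err
}

func main() {
	out, err := copyBytes([]byte("0123456789"), 4, 2)
	fmt.Println(string(out), err)
}
```

The bounded channel gives the same back-pressure as a ring buffer: the reader blocks once `depth` blocks are waiting, so memory use stays fixed while reads and writes overlap.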
During restore, to optimize the write throughput and storage usage, zero blocks should be either skipped (for restoring to a new volume) or unmapped (for restoring to an existing volume). To cover both cases in a unified way, the SCSI command `WRITE_SAME` is used. The logic is as below:
- Detect whether a block read from the backup contains all zero data
- If true, the uploader sends the `WRITE_SAME` SCSI command by calling the `BLKZEROOUT` ioctl
- If the call fails, the uploader falls back to the conservative way of writing all zero bytes to the disk
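The three steps above can be sketched as follows. The `blockTarget` interface and its `ZeroOut` method are hypothetical; `ZeroOut` stands in for a wrapper around the `BLKZEROOUT` ioctl, which is not called here so that the sketch runs without a block device.

```go
package main

import (
	"errors"
	"fmt"
)

// isZero reports whether the block contains only zero bytes.
func isZero(b []byte) bool {
	for _, v := range b {
		if v != 0 {
			return false
		}
	}
	return true
}

// blockTarget abstracts the restore target. ZeroOut is a hypothetical
// wrapper for the BLKZEROOUT ioctl on a real block device.
type blockTarget interface {
	WriteAt(p []byte, off int64) (int, error)
	ZeroOut(off, length int64) error
}

// restoreBlock tries the zero-out path for all-zero blocks and falls back
// to a plain write if the block has data or the zero-out call fails.
func restoreBlock(t blockTarget, off int64, data []byte) error {
	if isZero(data) {
		if err := t.ZeroOut(off, int64(len(data))); err == nil {
			return nil
		}
	}
	_, err := t.WriteAt(data, off)
	return err
}

// fakeDevice is an in-memory target used only for this illustration.
type fakeDevice struct {
	buf         []byte
	failZeroOut bool
	zeroCalls   int
	writeCalls  int
}

func (d *fakeDevice) WriteAt(p []byte, off int64) (int, error) {
	d.writeCalls++
	return copy(d.buf[off:], p), nil
}

func (d *fakeDevice) ZeroOut(off, length int64) error {
	d.zeroCalls++
	if d.failZeroOut {
		return errors.New("zero-out not supported")
	}
	for i := off; i < off+length; i++ {
		d.buf[i] = 0
	}
	return nil
}

func main() {
	dev := &fakeDevice{buf: []byte{1, 1, 1, 1, 1, 1, 1, 1}, failZeroOut: true}
	err := restoreBlock(dev, 0, make([]byte, 4))
	fmt.Println(dev.buf, dev.zeroCalls, dev.writeCalls, err)
}
```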
The uploader implementation is OS dependent, but since Windows containers don't support block volumes, the current implementation is for Linux only.
#### ChangeId
ChangeId identifies the base that the CBT is generated from; it must strictly map to the parent snapshot in the repository. Otherwise, there will be data corruption in the incremental backup.
Therefore, ChangeId is saved together with the repository snapshot.
The data mover always queries the parent snapshot from the Unified Repo together with the ChangeId. In this way, no mismatch can happen.
Inside the uploader, the upper layer (DataUpload controller) could also provide the ChangeId as a double confirmation. The received ChangeId is then checked against the one in the provided snapshot.
For the Kubernetes API, changeId is represented by `BaseSnapshotId`.
changeId retrieval is storage specific. Generally, it is retrieved from the `SnapshotHandle` of the VolumeSnapshotContent object; however, storages may also refer to other places to retrieve the changeId.
That is, `SnapshotHandle` and changeId may be two different values; in this case, both values need to be preserved.
#### Volume Snapshot Retention
Storages/CSI drivers may support the changeId differently based on the storage's capabilities:
1. In order to calculate the changes, some storages require that the parent snapshot mapped to the changeId still exists at the time `GetMetadataDelta` is called; in that case, the parent snapshot can NOT be deleted as long as there are incremental backups based on it.
2. Some storages don't require the parent snapshot itself when calculating changes; in that case, the parent snapshot could be deleted immediately after the parent backup completes.
The existing exposer works as is for storages that don't need the parent snapshot when calculating changes, since the snapshot is always deleted when the backup completes.
However, for storages that require the parent snapshot to exist, the snapshot must be retained, so the exposer needs the changes below:
- At the end of each backup, set the current VolumeSnapshot's `deletionPolicy` to `Retain`, so that when the VolumeSnapshot is deleted at the end of the backup, the current snapshot is retained in the storage
- `GetMetadataDelta` is called with `BaseSnapshotId` set to the preserved changeId
- When deleting a backup, a VolumeSnapshot-VolumeSnapshotContent pair is rebuilt with `deletionPolicy` set to `Delete` and `snapshotHandle` set to the preserved one
- Then the rebuilt VolumeSnapshot is deleted so that the volume snapshot is deleted from the storage
There is no way to automatically detect which way a specific volume supports, so an interface is exposed to users to set the volume snapshot retention method.
The interface could be added to the `Action.Parameters` of Volume Policy. By default, the Velero block data mover does not retain the volume snapshot; if users specify the `RetainSnapshot` parameter, the snapshot is retained as described above.
In this way, users could specify, for example: for storage class "xxx" or CSI driver "yyy", back up through CSI snapshot with the Velero block data mover and retain the snapshot.
#### Incremental Size
At the end of the backup, the incremental size is also returned by the uploader, the same as for the Velero file system uploader. The size indicates how much data is unique and therefore processed by the uploader, based on the provided CBT.
### Fallback to Full Backup
There are some occasions where the incremental backup cannot continue, so the data mover falls back to a full backup:
- `GetMetadataAllocated` or `GetMetadataDelta` returns an error
- ChangeId is missing
- Parent snapshot is missing
When the fallback happens, the volume will be fully backed up at block level, but because of the data deduplication in the backup repository, the unallocated/unchanged data would probably be deduplicated.
During restore, the volume will also be fully restored. The zero-block handling mentioned above still applies, so write IO for unallocated data would probably be eliminated.
Fallback handles exceptional cases; for most backups/restores, fallback is not expected.
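The fallback conditions listed in this section can be expressed as a single decision function. This is a sketch only: `incrementalInputs`, `decideMode` and the mode constants are hypothetical names, and the order in which the conditions are checked is an assumption.

```go
package main

import (
	"errors"
	"fmt"
)

type backupMode string

const (
	modeFull        backupMode = "full"
	modeIncremental backupMode = "incremental"
)

// incrementalInputs gathers what an incremental backup depends on.
type incrementalInputs struct {
	ParentSnapshotID string // empty when no parent snapshot was found
	ChangeID         string // empty when the parent carries no changeId
	CBTErr           error  // error from GetMetadataAllocated/GetMetadataDelta, if any
}

// errCBTUnavailable is a sample error used by the demo below.
var errCBTUnavailable = errors.New("snapshot metadata service unavailable")

// decideMode returns the mode to run and, for a fallback, the reason.
func decideMode(in incrementalInputs) (backupMode, string) {
	switch {
	case in.ParentSnapshotID == "":
		return modeFull, "parent snapshot is missing"
	case in.ChangeID == "":
		return modeFull, "changeId is missing"
	case in.CBTErr != nil:
		return modeFull, "CBT query failed: " + in.CBTErr.Error()
	default:
		return modeIncremental, ""
	}
}

func main() {
	mode, reason := decideMode(incrementalInputs{ParentSnapshotID: "snap-1", ChangeID: "chg-1", CBTErr: errCBTUnavailable})
	fmt.Println(mode, reason)
}
```

Returning the reason alongside the mode makes it straightforward to log why a fallback happened.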
### Irregular Volume Size
As mentioned above, during incremental backup, block uploader IO must be aligned to the deduplication chunk size (1MB); on the other hand, there is no hard requirement for users' volume sizes to be aligned.
To support volumes with irregular sizes, the measures below are taken:
- Volume objects in the repository are always aligned to 1MB
- If the volume size is irregular, zero bytes are padded to the tail of the volume object
- The real size is recorded in the repository snapshot
- During restore, only the real size of data is restored
The padding must always be zero bytes.
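As a worked illustration of the padding arithmetic, the sketch below computes the stored (aligned) size, the padding, and how many bytes of a block should actually be written during restore. The `volumeObject` type and its methods are hypothetical names.

```go
package main

import "fmt"

const alignment = 1 << 20 // 1MB

// roundUpTo1M rounds n up to the next 1MB boundary.
func roundUpTo1M(n int64) int64 {
	return (n + alignment - 1) / alignment * alignment
}

// volumeObject records both the aligned size kept in the repository and
// the real volume size recorded in the repository snapshot.
type volumeObject struct {
	storedSize int64
	realSize   int64
}

func planObject(realSize int64) volumeObject {
	return volumeObject{storedSize: roundUpTo1M(realSize), realSize: realSize}
}

// padding is the number of zero bytes appended to the tail of the object.
func (v volumeObject) padding() int64 {
	return v.storedSize - v.realSize
}

// restoreLen returns how many bytes of a block starting at offset should be
// written to the volume, so that padding is never restored.
func (v volumeObject) restoreLen(offset, blockLen int64) int64 {
	if offset >= v.realSize {
		return 0
	}
	if offset+blockLen > v.realSize {
		return v.realSize - offset
	}
	return blockLen
}

func main() {
	v := planObject(3*alignment + 5) // a volume 5 bytes past a 1MB boundary
	fmt.Println(v.storedSize, v.padding(), v.restoreLen(3*alignment, alignment))
}
```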
### Volume Size Change
Incremental backup could continue when the volume is resized.
The block uploader supports writing disks of arbitrary size.
The volume resize cases don't need to be handled case by case.
Instead, when a volume resize happens, the block uploader handles it in the following way:
- Loop with CBT
- Read data between RoundDownTo1M(newSize) and newSize to get the tail data
- If there is no tail data, which means the volume size is aligned to 1MB, then call `WriteAt(newSize, nil)`
- Otherwise, call `WriteAt(RoundDownTo1M(newSize), taildata)`, where `taildata` is also padded to 1MB
That is to say:
- If the CBT covers the tail of the volume, looping with the CBT is enough for both the shrink and expand cases
- Otherwise, if the volume is expanded, `WriteAt` guarantees to clone the appropriate object entries from the parent object and append zero data for the expanded areas. In particular, if the parent volume is not of regular size, the zero padding bytes are also reused. Therefore, the parent object's padding bytes must be zero
- If the volume is shrunk, writing the tail data makes sure zero bytes are padded to the new volume object instead of inheriting non-zero data from the parent object
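The tail handling after the CBT loop can be sketched as a function that computes the arguments of the final `WriteAt` call. `tailWrite` and the `readTail` callback are hypothetical names; `readTail` stands in for reading the bytes between `RoundDownTo1M(newSize)` and `newSize` from the block device.

```go
package main

import "fmt"

const oneMB = 1 << 20

// roundDownTo1M rounds n down to the previous 1MB boundary.
func roundDownTo1M(n int64) int64 {
	return n / oneMB * oneMB
}

// tailWrite returns the offset and data for the final WriteAt call.
// For an aligned size there is no tail, so data is nil and the offset is
// newSize itself; otherwise the tail is zero-padded to a full 1MB block.
func tailWrite(newSize int64, readTail func(off, n int64) []byte) (int64, []byte) {
	start := roundDownTo1M(newSize)
	if start == newSize {
		return newSize, nil
	}
	padded := make([]byte, oneMB) // zero-initialized, so the padding is zero
	copy(padded, readTail(start, newSize-start))
	return start, padded
}

func main() {
	off, data := tailWrite(2*oneMB+3, func(off, n int64) []byte {
		return []byte{7, 7, 7} // pretend these are the last 3 bytes of the volume
	})
	fmt.Println(off, len(data), data[:4])
}
```

Because the padded buffer starts out zeroed, a shrunk volume always ends with zero padding rather than inheriting stale bytes from the parent object.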
### Cancellation
The existing cancellation mechanism is reused, so there is no change outside of the block uploader.
Inside the uploader, cancellation checkpoints are embedded in the uploader reader and writer, so that execution can quit in a reasonable time once cancellation happens.
### Parallelism
Parallelism among data movers reuses the existing mechanism (load concurrency).
Inside the data mover, the uploader reader and writer always run in parallel. There is always exactly one reader and one writer.
Sequential read/write of the volume is always optimized, and there is no evidence that multiple readers/writers are beneficial.
### Progress Report
Progress report outside of the data mover will reuse the existing mechanism.
Inside the data mover, the progress update is embedded in the uploader writer.
The progress struct is kept as is; the Velero block data mover still supports `TotalBytes` and `BytesDone`:
```go
type Progress struct {
	TotalBytes int64 `json:"totalBytes,omitempty"`
	BytesDone  int64 `json:"doneBytes,omitempty"`
}
```
At the end of the backup, the progress for the block data mover provides the same `GetIncrementalSize`, which reports the incremental size of the backup, so that the incremental size is reported to users in the same way as for the file system data mover.
### Selectable Backup Type
For many reasons, a periodic full backup is required:
- From the user-experience perspective, a periodic full backup is required to ensure data integrity among the incremental backups, e.g., every week or every month
Therefore, the backup type (full/incremental) should be supported in Velero's manual backups and backup schedules.
The backup type will also be added to `volumeInfo.json` for observability purposes.
Backup TTL is still used for users to specify a backup's retention time. By default, both full and incremental backups have a 30-day retention, even though this is not so reasonable for full backups. This could be enhanced when Velero supports a sophisticated retention policy.
As a workaround, users could create two schedules for the same backup scope: one for full backups, with lower frequency and a longer backup TTL; the other for incremental backups, with normal frequency and a shorter backup TTL.
#### File System Data Mover
At present, the Velero file system data mover doesn't support a selectable backup type; instead, incremental backups are always conducted whenever possible.
From the user-experience perspective, this is not reasonable.
Therefore, to solve this problem and to align it with the Velero block data mover, the Velero file system data mover will support backup type as well.
The data path of the Velero file system data mover already supports it, so we only need to expose this functionality to users.
### Backup Describe
The backup type should be added to the backup description; it appears in two places:
- The `backupType` in the Backup CR. This is the backup type selected by users
- The backup type recorded in `volumeInfo.json`, which is the actual type taken by the backup
With these two values, users are able to know the actual backup type and also whether a fallback happened.
The `DataMover` item in the existing backup description should be updated to reflect the actual data mover completing the backup; this information could be retrieved from `volumeInfo.json`.
### Backup Sync
No more data is required for sync, so Backup Sync is kept as is.
### Backup Deletion
As mentioned above, no data is moved when deleting a repo snapshot for the Velero block data mover, so Backup Deletion is kept as is with regard to repo snapshots. For the volume snapshot retention case, the backup deletion logic will be modified accordingly to delete the retained snapshots.
### Restarts
Restarts mechanism is reused without any change.
### Logging
Logging mechanism is not changed.
### Backup CRD
A `backupType` field is added to the Backup CRD; two values are supported: `full` and `incremental`.
`full` tells the data mover to take a full backup.
`incremental`, which is the default value, tells the data mover to take an incremental backup.
A `parentSnapshot` field is added to the DataUpload CRD; the values below are supported:
- `""`: it falls back to `auto`
- `auto`: the data mover finds the most recent snapshot of the same volume from the Unified Repository and uses it as the parent
- `none`: the data mover is not assigned a parent snapshot, so it runs a full backup
- a specific snapshotID: the data mover uses the specific snapshotID to find the parent snapshot. If it cannot be found, the data mover falls back to a full backup
The last option is kept in reserve; it will not be used for now and may be useful when Velero supports a sophisticated retention policy. This means that, for now, Velero always uses the most recent backup as the parent.
When `backupType` of the Backup is `full`, the data mover controller sets `parentSnapshot` of the DataUpload to `none`.
When `backupType` of the Backup is `incremental`, the data mover controller sets `parentSnapshot` of the DataUpload to `auto`. `""` is kept only for backwards compatibility.
```yaml
spec:
  description: DataUploadSpec is the specification for a DataUpload.
  properties:
    parentSnapshot:
      description: |-
        ParentSnapshot specifies the parent snapshot that current backup is based on.
        If its value is "" or "auto", the data mover finds the recent backup of the same volume as parent.
        If its value is "none", the data mover will do a full backup
        If its value is a specific snapshotID, the data mover finds the specific snapshot as parent.
      type: string
```
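The mapping between `backupType` and `parentSnapshot` described above can be sketched as two small functions, one for the controller side and one for the data mover side. The function and type names (`parentSnapshotFor`, `resolveParent`, `parentChoice`) are hypothetical and only illustrate the rules in this section.

```go
package main

import "fmt"

// parentSnapshotFor is the controller-side rule: a full backup gets "none",
// anything else (including the default, incremental) gets "auto".
func parentSnapshotFor(backupType string) string {
	if backupType == "full" {
		return "none"
	}
	return "auto"
}

// parentChoice is what the data mover does with a parentSnapshot value.
type parentChoice struct {
	FullBackup bool   // run a full backup without a parent
	UseLatest  bool   // look up the most recent snapshot of the same volume
	SnapshotID string // look up this specific snapshot
}

// resolveParent interprets the parentSnapshot field of a DataUpload.
func resolveParent(parentSnapshot string) parentChoice {
	switch parentSnapshot {
	case "", "auto": // "" is kept for backwards compatibility
		return parentChoice{UseLatest: true}
	case "none":
		return parentChoice{FullBackup: true}
	default:
		return parentChoice{SnapshotID: parentSnapshot}
	}
}

func main() {
	for _, bt := range []string{"full", "incremental", ""} {
		ps := parentSnapshotFor(bt)
		fmt.Printf("backupType=%q parentSnapshot=%q choice=%+v\n", bt, ps, resolveParent(ps))
	}
}
```

If a specific snapshot ID or the latest snapshot cannot be found, the data mover would still fall back to a full backup, as described in the fallback section.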
### DataDownload CRD
No change is required to DataDownload CRD.
## Plugin Data Movers
The current design doesn't break anything for plugin data movers.
The enhancement in VolumePolicy could also be used for plugin data movers. That is, users could select a plugin data mover through VolumePolicy in the same way as Velero's built-in data movers.
## Installation
No change to Installation.
## Upgrade
No impact on Upgrade. The new fields in the CRDs are all optional and have backwards-compatible values.
## CLI
Backup type parameter is added to Velero CLI as below:
```
velero backup create --full
velero schedule create --full
```
When the parameter is not specified, by default, Velero goes with incremental backups.
In addition, since the VolumeHelper interface is expected to be called by external plugins, the interface (but not the implementation) should be moved from `internal/volumehelper` to `pkg/util/volumehelper`.
In `pkg/plugin/utils/volumehelper/volume_policy_helper.go`, a new helper func will be added which delegates to the internal `volumehelper.NewVolumeHelperImplWithNamespaces`:
```go
func NewVolumeHelper(
	volumePolicy *resourcepolicies.Policies,
	snapshotVolumes *bool,
	logger logrus.FieldLogger,
	client crclient.Client,
	defaultVolumesToFSBackup bool,
	backupExcludePVC bool,
	namespaces []string,
) (VolumeHelper, error) {
```
## Alternative Considered
An alternate approach was to create a new server arg to allow
user-defined parameters. That was rejected in favor of this approach,
as the explicitly-supported "custom" option integrates more easily
	// snapshot action can have 3 different meaning based on velero configuration and backup spec - cloud provider based snapshots, local csi snapshots and datamover snapshots
	Snapshot VolumeActionType = "snapshot"
	// custom action is used to identify a volume that will be handled by an external plugin. Velero will not snapshot or use fs-backup if action=="custom"
	Custom VolumeActionType = "custom"
)

// Action defined as one action for a specific way of backup
			name:          "match with empty phases list (always match)",
			condition:     &pvcPhaseCondition{phases: []string{}},
			volume:        &structuredVolume{pvcPhase: "Pending"},
			expectedMatch: true,
		},
		{
			name:          "match with nil phases list (always match)",
			condition:     &pvcPhaseCondition{phases: nil},
			volume:        &structuredVolume{pvcPhase: "Pending"},
			expectedMatch: true,
		},
	}

	for _, tc := range tests {
		t.Run(tc.name, func(t *testing.T) {
			result := tc.condition.match(tc.volume)
			assert.Equal(t, tc.expectedMatch, result)
		})
	}
}
Some files were not shown because too many files have changed in this diff