Commit Graph

1620 Commits

Author SHA1 Message Date
Clément Nussbaumer
248a840918 feat: Permit specifying annotations for the BackupPVC
Signed-off-by: Clément Nussbaumer <clement.nussbaumer@postfinance.ch>
2025-08-29 10:10:41 +02:00
Lyndon-Li
d952cfbb25 add 1.17 changelog
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-08-27 17:43:42 +08:00
Lyndon-Li
e581de1fe1 add 1.17 changelog
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-08-27 15:24:34 +08:00
Xun Jiang
c62a486765 Add ConfigMap parameters validation for install CLI and server start.
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2025-08-22 20:31:38 +08:00
Xun Jiang/Bruce Jiang
3e77413897 Merge pull request #9175 from kaovilai/issue4201
Add priorityclasses to high priority restore list
2025-08-18 15:58:31 +08:00
Tiger Kaovilai
84b33efc2e Add priorityclasses to high priority restore list
Fixes #4201: Ensure PriorityClasses are restored before pods that
reference them, preventing restoration failures when pods depend on
custom PriorityClasses.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
2025-08-11 19:24:58 -05:00
Priyansh Choudhary
e471e0f561 Updated changelog
Signed-off-by: Priyansh Choudhary <im1706@gmail.com>
2025-08-11 15:08:33 +05:30
Priyansh Choudhary
560df6edc3 Implement context-based logging utilities for UDM repositories
Signed-off-by: Priyansh Choudhary <im1706@gmail.com>
2025-08-11 13:42:13 +05:30
Wenkai Yin(尹文开)
3b15cea27c Merge pull request #9165 from Lyndon-Li/issue-fix-9140
Issue 9140: add NoExecute toleration for Windows
2025-08-08 13:12:24 +08:00
Xun Jiang
ec99b50970 Remove the repository maintenance job parameters from velero server.
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2025-08-07 23:25:22 +08:00
Lyndon-Li
1e800906c2 issue 9140: add NoExecute toleration for Windows
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-08-07 16:43:07 +08:00
lyndon-li
ae29030917 Merge branch 'main' into implement8869
2025-08-06 13:45:35 +08:00
Tiger Kaovilai
35d2cc0890 Add priority class support for Velero server and node-agent
- Add --server-priority-class-name and --node-agent-priority-class-name flags to velero install command
- Configure data mover pods (PVB/PVR/DataUpload/DataDownload) to use priority class from node-agent-configmap
- Configure maintenance jobs to use priority class from repo-maintenance-job-configmap (global config only)
- Add priority class validation with ValidatePriorityClass and GetDataMoverPriorityClassName utilities
- Update e2e tests to include PriorityClass testing utilities
- Move priority class design document to Implemented folder
- Add comprehensive unit tests for all priority class implementations
- Update documentation for priority class configuration
- Add changelog entry for #8883

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>

remove unused test utils

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>

feat: add unit test for getting priority class name in maintenance jobs

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>

doc update

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>

feat: add priority class validation for repository maintenance jobs

- Add ValidatePriorityClassWithClient function to validate priority class existence
- Integrate validation in maintenance.go when creating maintenance jobs
- Update tests to cover the new validation functionality
- Return boolean from ValidatePriorityClass to allow fallback behavior

This ensures maintenance jobs don't fail due to non-existent priority classes,
following the same pattern used for data mover pods.

Addresses feedback from:
https://github.com/vmware-tanzu/velero/pull/8883#discussion_r2238681442

Refs #8869

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>

refactor: clean up priority class handling for data mover pods

- Fix comment in node_agent.go to clarify PriorityClassName is only for data mover pods
- Simplify server.go to use dataPathConfigs.PriorityClassName directly
- Remove redundant priority class logging from controllers as it's already logged during server startup
- Keep logging centralized in the node-agent server initialization

This reduces code duplication and clarifies the scope of priority class configuration.

🤖 Generated with [Claude Code](https://claude.ai/code)

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>

refactor: remove GetDataMoverPriorityClassName from kube utilities

Remove GetDataMoverPriorityClassName function and its tests as priority
class is now read directly from dataPathConfigs instead of parsing from
ConfigMap. This simplifies the codebase by eliminating the need for
indirect ConfigMap parsing.

Refs #8869

🤖 Generated with [Claude Code](https://claude.ai/code)

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>

refactor: remove priority class validation from install command

Remove priority class validation during install as it's redundant
since validation already occurs during server startup. Users cannot
see console logs during install, making the validation warnings
ineffective at this stage.

The validation remains in place during server and node-agent startup
where it's more appropriate and visible to users.

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
Co-Authored-By: Claude <noreply@anthropic.com>
2025-08-06 01:36:22 -04:00
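
Commit 35d2cc0890 describes a validation step that returns a boolean so callers can fall back instead of failing. A minimal sketch of that pattern follows. The `lookupFunc` type stands in for a Kubernetes API lookup, and falling back to an empty priority class name is an assumption made here for illustration; the real `ValidatePriorityClass` signature is not shown in the log.

```go
package main

import (
	"errors"
	"fmt"
)

// lookupFunc is a hypothetical stand-in for a Kubernetes API call that
// returns an error when the named PriorityClass does not exist.
type lookupFunc func(name string) error

var errNotFound = errors.New("priority class not found")

// validatePriorityClass reports whether name is usable. An empty name is
// treated as valid because there is nothing to check.
func validatePriorityClass(name string, lookup lookupFunc) bool {
	if name == "" {
		return true
	}
	if err := lookup(name); err != nil {
		fmt.Printf("warning: priority class %q is unusable: %v\n", name, err)
		return false
	}
	return true
}

// effectivePriorityClass returns name when it validates and "" otherwise,
// so the workload is still created, just without a priority class.
// The empty-string fallback is an assumption of this sketch.
func effectivePriorityClass(name string, lookup lookupFunc) string {
	if validatePriorityClass(name, lookup) {
		return name
	}
	return ""
}

func main() {
	known := map[string]bool{"velero-high": true} // illustrative cluster state
	lookup := func(name string) error {
		if known[name] {
			return nil
		}
		return errNotFound
	}
	fmt.Printf("%q\n", effectivePriorityClass("velero-high", lookup))
	fmt.Printf("%q\n", effectivePriorityClass("missing", lookup))
}
```

Returning a boolean rather than an error keeps the decision with the caller, which matches the stated goal that maintenance jobs should not fail because of a non-existent priority class.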
Daniel Jiang
249d8f581a Add include/exclude policy to resources policy
fixes #8610

This commit extends the resources policy so that users can define
resource include/exclude filters in the policy and reuse them across different backups.

Signed-off-by: Daniel Jiang <daniel.jiang@broadcom.com>
2025-08-05 15:16:59 +08:00
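
A reusable include/exclude filter of the kind commit 249d8f581a describes could look roughly like the sketch below. The `filterPolicy` type, its field names, and the matching rules (excludes win, an empty include list means everything, `*` matches all) are assumptions for illustration and are not taken from the commit.

```go
package main

import "fmt"

// filterPolicy is a hypothetical, simplified policy holding resource filters.
type filterPolicy struct {
	IncludedResources []string
	ExcludedResources []string
}

// contains reports whether list names item directly or through "*".
func contains(list []string, item string) bool {
	for _, v := range list {
		if v == "*" || v == item {
			return true
		}
	}
	return false
}

// shouldInclude applies the assumed rules: an excluded resource is always
// dropped; otherwise it is kept when the include list is empty or names it.
func (p filterPolicy) shouldInclude(resource string) bool {
	if contains(p.ExcludedResources, resource) {
		return false
	}
	return len(p.IncludedResources) == 0 || contains(p.IncludedResources, resource)
}

func main() {
	policy := filterPolicy{
		IncludedResources: []string{"deployments", "secrets"},
		ExcludedResources: []string{"secrets"},
	}
	for _, r := range []string{"deployments", "secrets", "pods"} {
		fmt.Printf("%s: %v\n", r, policy.shouldInclude(r))
	}
}
```

Because the filter lives in one policy value, the same value can be handed to several backups instead of repeating the lists on each one.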
Xun Jiang/Bruce Jiang
9cb421c26f Fix the dd and du's node affinity issue. (#9130)
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2025-08-04 16:21:35 -04:00
Daniel Jiang
850109abe4 Merge pull request #8557 from kaovilai/cacertcli-auto
CLI automatically discovers and uses cacert from BSL
2025-08-04 14:08:08 +08:00
Xun Jiang/Bruce Jiang
82e35a58dd Merge pull request #9135 from shubham-pampattiwar/keep-maint-jobs
Add ConfigMap support for keepLatestMaintenanceJobs
2025-08-02 11:41:21 +08:00
Shubham Pampattiwar
d8f222c83f Add ConfigMap support for keepLatestMaintenanceJobs
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

add changelog file

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

lint fix

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-07-31 16:33:46 -07:00
Shubham Pampattiwar
a3bfbe0d7a Add VolumeGroupSnapshot docs
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

Add link to main docs page

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

remove diagram file

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

add changelog file

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

explain all the VGS workflows

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

Update pre-reqs

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

update troubleshooting section

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-07-31 13:55:36 -07:00
Xun Jiang
c84aab7f6f Remove the WaitUntilVSCHandleIsReady from vs BIA.
Because the PVC BIA already runs WaitUntilVSCHandleIsReady,
there is no need to do the same work in the VS BIA.

Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2025-07-31 15:35:05 +08:00
Tiger Kaovilai
f4233c0f9f CLI automatically discovers and uses cacert from BSL for download requests
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>

feat: Add CA cert fallback when caCertFile fails in download requests

- Fallback to BSL cert when caCertFile cannot be opened
- Combine certificate handling blocks to reuse CA pool initialization
- Add comprehensive unit tests for fallback behavior

This improves robustness by allowing downloads to proceed with BSL CA cert
when the provided CA cert file is unavailable or unreadable.

🤖 Generated with [Claude Code](https://claude.ai/code)

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
Co-Authored-By: Claude <noreply@anthropic.com>
2025-07-29 22:25:52 -04:00
Felix Prasse
8678ea28ee Keep manager label for VSC
If distributed snapshotting is enabled in the external snapshotter, a manager label is added to the volume snapshot content. When exposing the snapshot, Velero needs to keep this label; otherwise the exposed snapshot will never become ready.

Signed-off-by: Felix Prasse <1330854+flx5@users.noreply.github.com>
2025-07-30 08:06:22 +08:00
Xun Jiang/Bruce Jiang
36cde48ae8 Merge pull request #8979 from Lyndon-Li/vgdp-for-fs-backup-design
Design for VGDP MS for fs-backup
2025-07-29 14:16:00 +08:00
lyndon-li
40210198c6 Merge pull request #9117 from Lyndon-Li/issue-fix-9065
Issue 9095: update restore doc for PVC selected-node
2025-07-29 11:28:53 +08:00
lyndon-li
7fe8e0b571 Merge pull request #9118 from Lyndon-Li/issue-fix-9065-1
Issue 9065: add doc for node-agent prepare queue length
2025-07-29 10:53:44 +08:00
lyndon-li
f5999d6c37 Merge branch 'main' into issue-fix-9065
2025-07-28 15:01:40 +08:00
Amos Mastbaum
687dcf69e7 csi pvc backup action
Signed-off-by: Amos Mastbaum <68001528+amastbau@users.noreply.github.com>

Update pvc_action.go

Signed-off-by: Amos Mastbaum <68001528+amastbau@users.noreply.github.com>

Update pvc_action.go

Signed-off-by: Amos Mastbaum <68001528+amastbau@users.noreply.github.com>

Adding missing test coverage + log message as suggested

Signed-off-by: Amos Mastbaum <68001528+amastbau@users.noreply.github.com>

Adding missing test coverage + log message as suggested

Signed-off-by: Amos Mastbaum <68001528+amastbau@users.noreply.github.com>
2025-07-28 14:57:02 +08:00
Lyndon-Li
f242c12309 Merge branch 'main' into issue-fix-9065-1
2025-07-28 14:53:07 +08:00
Xun Jiang/Bruce Jiang
21fa637f17 Merge pull request #9112 from Lyndon-Li/fs-backup-doc-refactor
Refactor fs-backup doc
2025-07-28 14:38:20 +08:00
Lyndon-Li
1cd2a228ad issue 9065: add doc for node-agent prepare queue length
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-07-28 14:14:22 +08:00
Lyndon-Li
09946bbbe5 issue 9065: update restore doc for PVC selected-node
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-07-28 13:37:38 +08:00
Xun Jiang/Bruce Jiang
fb6ff2aa66 Merge pull request #9113 from Lyndon-Li/csi-snapshot-data-movement-doc-update
CSI snapshot data movement doc update
2025-07-28 11:05:12 +08:00
Wenkai Yin(尹文开)
63ebd4e51b Return error if timeout when checking server version (#9111)
Return error if timeout when checking server version

Fixes #8620

Signed-off-by: Wenkai Yin(尹文开) <yinw@vmware.com>
2025-07-25 12:31:55 -04:00
Lyndon-Li
191b943906 refactor fs-backup doc
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-07-25 18:28:39 +08:00
Lyndon-Li
ea21a49636 update CSI snapshot data movement doc for host path disable
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-07-25 18:17:08 +08:00
Tiger Kaovilai
1daa685e7d Make ResticIdentifier optional for kopia repositories (#8987)
The ResticIdentifier field in BackupRepository is only relevant for restic
repositories. For kopia repositories, this field is unused and should be
omitted. This change:

- Adds omitempty tag to ResticIdentifier field in BackupRepository CRD
- Updates controller to only populate ResticIdentifier for restic repos
- Adds tests to verify behavior for both restic and kopia repository types

This ensures backward compatibility while properly handling kopia repositories
that don't require a restic-compatible identifier.

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
2025-07-24 22:25:09 -04:00
Xun Jiang
a61a073aea Avoid checking the VS and VSC status in the backup finalizing phase.
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2025-07-24 13:28:05 +08:00
Shubham Pampattiwar
aa2e09c69e Update Backup describe string for DefaultVolumesToFSBackup flag (#9105)
add changelog file

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-07-23 17:55:28 -04:00
Xun Jiang/Bruce Jiang
770ff142d7 Add imagePullSecrets inheritance for VGDP pod and maintenance job. (#9096)
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2025-07-23 13:55:16 -04:00
Shubham Pampattiwar
60a6c7384f Fix missing defaultVolumesToFsBackup flag output in Velero describe backup cmd (#9056)
add changelog file

Show defaultVolumesToFsBackup in describe only when set by the user

minor ut fix

minor fix

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-07-23 09:59:51 -04:00
lyndon-li
9b721a8251 Merge branch 'main' into issue-fix-9077
2025-07-23 15:05:22 +08:00
lyndon-li
48033b2e3b Merge pull request #9098 from Lyndon-Li/bump-up-kopia-0.21.1
Bump up Kopia to v0.21.1
2025-07-23 15:03:32 +08:00
longxiucai
8ce513ca07 Enable parameterized kubelet mount path during node-agent installation (#9074)
Enable parameterized kubelet mount path during node-agent installation

Signed-off-by: longyuxiang <longyuxiang@kylinos.cn>
2025-07-23 14:50:16 +08:00
Lyndon-Li
61238ee0ae issue 9077: don't block backup deletion on list VS error
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-07-23 11:32:18 +08:00
Lyndon-Li
e6377ff2fd Merge branch 'main' into bump-up-kopia-0.21.1
2025-07-22 13:42:37 +08:00
Lyndon-Li
b5502330e5 bump up kopia to v0.21.1
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-07-22 13:40:12 +08:00
Shubham Pampattiwar
a73a150d98 Accommodate VGS workflows in PVC CSI plugin
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

Add changelog file

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

make update

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

lint fix

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

add unit tests for getVSForPVC func

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

Use v1beta1 instead of v1 v1alpha1

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

go mod tidy

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

update updateVGSCreatedVS func to use retry on conflict

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

make update minor fix

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

fix ut assert

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

Address PR feedback

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

minor updates

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

remove unused func and add todo for dep upgrades

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-07-21 11:36:40 -07:00
lyndon-li
fd8c95baf8 Issue 9053: remove selected-node annotation during PVC restore (#9076)
issue 9053: remove selected-node annotation during PVC restore

Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-07-21 12:33:07 +08:00
Tiger Kaovilai
2b787f5d3d PVR action to remove restore-wait init container on restore (#8880)
This PR fixes issue #8870 where Velero was unnecessarily adding the restore-wait init container when restoring pods with volumes that were backed up using native datamover or CSI.

When restoring pods with volumes, Velero was always adding the restore-wait init container, even when the volumes were backed up using native datamover or CSI and didn't need file system restores. This was causing unnecessary overhead and potential issues.

PVR action to remove restore-wait init container on restore

Changes:
- Remove ALL existing restore-wait init containers before deciding whether to add a new one
- This covers both scenarios: when no file system restore is needed AND when preventing duplicates
- Simplify the add logic since we've already cleaned up existing containers
- Add better logging to show how many containers were removed

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
2025-07-21 11:03:42 +08:00
lyndon-li
06d305ea47 Issue 8344: constrain data path expose (#9064)
* issue 8344: constrain data path exposure.

Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-07-18 13:32:45 +08:00