Commit Graph

428 Commits

Author SHA1 Message Date
Shubham Pampattiwar
f0c97c489d Merge pull request #9414 from shubham-pampattiwar/add-maintenance-job-metrics
Add Prometheus metrics for maintenance jobs
2025-12-08 09:23:44 -08:00
Scott Seago
7e4797f588 Track running backup count via BackupTracker
This avoids an unnecessary apiserver List call when
the backup reconciler is already at capacity.

Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 17:23:47 -05:00
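A minimal sketch of the idea in the commit above: keep the set of running backups in memory so the capacity check needs no apiserver List call. The names here (runningTracker, RunningCount, atCapacity, concurrentBackups as a plain int) are hypothetical and are not taken from Velero's actual BackupTracker.

```go
package main

import (
	"fmt"
	"sync"
)

// runningTracker is a hypothetical in-memory tracker, not Velero's actual
// BackupTracker. It records which backups are currently running.
type runningTracker struct {
	mu      sync.RWMutex
	running map[string]struct{}
}

func newRunningTracker() *runningTracker {
	return &runningTracker{running: make(map[string]struct{})}
}

// key builds a namespace/name identifier for a backup.
func key(namespace, name string) string { return namespace + "/" + name }

// Add marks a backup as running. Adding the same backup twice is a no-op.
func (r *runningTracker) Add(namespace, name string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	r.running[key(namespace, name)] = struct{}{}
}

// Delete marks a backup as no longer running.
func (r *runningTracker) Delete(namespace, name string) {
	r.mu.Lock()
	defer r.mu.Unlock()
	delete(r.running, key(namespace, name))
}

// RunningCount returns the number of backups currently tracked.
func (r *runningTracker) RunningCount() int {
	r.mu.RLock()
	defer r.mu.RUnlock()
	return len(r.running)
}

// atCapacity answers the capacity question from memory, so a caller can
// return early without listing Backup objects from the API server.
func atCapacity(r *runningTracker, concurrentBackups int) bool {
	return r.RunningCount() >= concurrentBackups
}

func main() {
	tracker := newRunningTracker()
	tracker.Add("velero", "backup-a")
	tracker.Add("velero", "backup-b")
	fmt.Println(atCapacity(tracker, 2)) // at the limit of 2

	tracker.Delete("velero", "backup-a")
	fmt.Println(atCapacity(tracker, 2)) // one slot free again
}
```

A reconciler built this way would check atCapacity first and requeue when it returns true. How Velero wires this into the backup reconciler is not shown in the commit message.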
Scott Seago
fe799d7546 feat: Add concurrent backups configuration to backup reconciler
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:28:08 -05:00
Scott Seago
d91d50f696 feat: Add concurrentBackups to backupQueueReconciler
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:28:08 -05:00
Scott Seago
9dfa108579 feat: initialize backup queue controller
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:28:08 -05:00
Scott Seago
4cac891fb9 refactor: Extract backup-queue controller name to constant
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:28:08 -05:00
Scott Seago
5d02af3ce3 feat: Create backup queue controller and add to disableable list
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:28:08 -05:00
Scott Seago
cfc12dc6bf feat: Add install arg and config for concurrent backups
Co-authored-by: aider (gemini/gemini-2.5-pro) <aider@aider.chat>
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-12-02 16:28:08 -05:00
Shubham Pampattiwar
fdf439963c Add Prometheus metrics for maintenance jobs
Adds three new Prometheus metrics to track backup repository
maintenance job execution:

- velero_maintenance_job_success_total: Counter for successful jobs
- velero_maintenance_job_failure_total: Counter for failed jobs
- velero_maintenance_job_duration_seconds: Histogram for job duration

Metrics use repository_name label to identify specific BackupRepositories.
Duration is recorded for both successful and failed jobs (when job runs),
but not when job fails to start.

Includes comprehensive unit and integration tests.

Fixes #9225

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-12-02 11:36:15 -08:00
Xun Jiang
c62a486765 Add ConfigMap parameters validation for install CLI and server start.
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2025-08-22 20:31:38 +08:00
Tiger Kaovilai
84b33efc2e Add priorityclasses to high priority restore list
Fixes #4201: Ensure PriorityClasses are restored before pods that
reference them, preventing restoration failures when pods depend on
custom PriorityClasses.

🤖 Generated with [Claude Code](https://claude.ai/code)

Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
2025-08-11 19:24:58 -05:00
Xun Jiang
ec99b50970 Remove the repository maintenance job parameters from velero server.
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2025-08-07 23:25:22 +08:00
Lyndon-Li
34f8b73507 bump up kopia to v0.21.1
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-07-22 15:56:04 +08:00
Shubham Pampattiwar
a73a150d98 Accommodate VGS workflows in PVC CSI plugin
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

Add changelog file

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

make update

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

lint fix

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

add unit tests for getVSForPVC func

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

Use v1beta1 instead of v1 v1alpha1

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

go mod tidy

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

update updateVGSCreatedVS func to use retry on conflict

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

make update minor fix

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

fix ut assert

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

Address PR feedback

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

minor updates

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

remove unused func and add todo for dep upgrades

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-07-21 11:36:40 -07:00
Lyndon-Li
88ec5fa193 issue 8813: remove restic from the valid uploader type
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-07-07 15:55:24 +08:00
Daniel Jiang
a550910f36 Add Gauge metric for BSL availability
The label of the gauge is the name of BSL

Signed-off-by: Daniel Jiang <daniel.jiang@broadcom.com>
2025-07-03 17:36:19 +08:00
lyndon-li
18f817295c Merge branch 'main' into vgdp-ms-cancel-pvb-pvr
2025-06-24 10:52:54 +08:00
Matthieu MOREL
07ea14962c fix require-error rule from testifylint
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-06-23 15:39:54 +00:00
Lyndon-Li
cded6bd207 cancel pvb/pvr on velero server restarts
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-06-23 15:29:09 +08:00
Shubham Pampattiwar
97a4d62d3c Extend PVCAction itemblock plugin to support grouping PVCs under VolumeGroupSnapshot label
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

Add changelog file

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

Update VGS label key and address PR feedback

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

update log level to debug for edge cases

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

Change VGS label key constant location

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>

run make update

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-06-10 07:01:45 -07:00
Lyndon-Li
5ccf22e0b0 Merge branch 'main' into vgdp-ms-pvb-data-path
2025-06-03 13:26:52 +08:00
Lyndon-Li
92c72b1a63 data path for vgdp ms pvb
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-06-03 13:25:48 +08:00
Shubham Pampattiwar
d2c6b6bc3e Add support for configuring VGS label key (#8938)
add changelog file

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-05-30 11:03:47 -04:00
Matthieu MOREL
c6a420bd3a chore: define common aliases for k8s packages (#8672)
* chore: define common alias for k8s packages

Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>

* Update .golangci.yaml

Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>

* Update .golangci.yaml

Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>

* Update .golangci.yaml

Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>

---------

Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-04-22 06:14:47 -04:00
Scott Seago
fe14a2c934 Move pvc annotation removal from CSI RIA to regular PVC RIA
Combine existing PVC non-CSI RIAs and move annotation
removal out of the CSI plugin to fix issues with
CSI volumes when using fs-backup

Signed-off-by: Scott Seago <sseago@redhat.com>
2025-03-05 15:55:55 -05:00
Xun Jiang
6b7dd12bf7 Modify VS and VSC restore actions.
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2025-02-25 10:44:45 +08:00
Xun Jiang
620a116e7f Modify CSI related DeleteItemActions.
Remove the VS DIA.
Modify the VSC DIA: create then delete the VSC.

Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2025-02-19 14:36:59 +08:00
Tiger Kaovilai
a3cee616dc Upgrade go.mod k8s.io/ go.mod to v0.31.3 and set klog.SetLogger() for client-go (#8450)
Also bumped to support upgraded k8s.io/ deps.
- controller-gen to v0.16.5
- sigs.k8s.io/controller-runtime v0.19.2

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
2025-02-17 15:05:10 -05:00
Matthieu MOREL
cbba3bdde7 chore: enable use-any from revive
Signed-off-by: Matthieu MOREL <matthieu.morel35@gmail.com>
2025-01-17 07:58:10 +01:00
lyndon-li
5b1738abf8 Merge pull request #8580 from Lyndon-Li/recall-repo-maintenance-history-on-restart
Recall repo maintenance history on restart
2025-01-17 14:08:27 +08:00
Daniel Jiang
dc02caf2b0 Skip patching the PV in finalization for failed operation
This commit changes the restore finalizer controller so that it checks
the item operation status of a PVC before patching the PV that is bound
to it. If the operation was not successful, it skips patching the PV.

Signed-off-by: Daniel Jiang <daniel.jiang@broadcom.com>
2025-01-09 01:42:50 +08:00
Lyndon-Li
4ce7361f5a recall repo maintenance history on restart
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-01-07 12:58:43 +08:00
Lyndon-Li
ceeab10b6e Merge branch 'main' into recall-repo-maintenance-history-on-restart
2025-01-06 17:21:52 +08:00
Lyndon-Li
6b73a256d5 recall repo maintenance history on restart
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-01-06 17:11:03 +08:00
Lyndon-Li
db69829fd7 repo maintenance job out of repo manager
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-01-06 16:25:33 +08:00
Wenkai Yin(尹文开)
eb5230e12f Merge restore helper image into Velero server image

Fixes #8484

Signed-off-by: Wenkai Yin(尹文开) <yinw@vmware.com>
2025-01-03 14:12:23 +08:00
Lyndon-Li
3504546ba9 Merge branch 'main' into fail-fs-backup-on-windows-nodes
2024-12-20 13:20:01 +08:00
Lyndon-Li
a711b1067b fail fs-backup for windows nodes
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2024-12-18 10:46:00 +08:00
Lyndon-Li
11cd6d922b hybrid deploy
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2024-12-17 13:05:46 +08:00
Scott Seago
6588141090 Add --item-block-worker-count flag to velero install and server
Signed-off-by: Scott Seago <sseago@redhat.com>
2024-11-07 10:58:36 -05:00
Wenkai Yin(尹文开)
07847925fe Use aggregated discovery API to discover API groups and resources

Fixes #7526

Signed-off-by: Wenkai Yin(尹文开) <yinw@vmware.com>
2024-10-28 13:59:16 +08:00
Wenkai Yin(尹文开)
390ac497bb Add the Carvel package related resources to the restore priority list

Signed-off-by: Wenkai Yin(尹文开) <yinw@vmware.com>
2024-09-19 16:47:00 +08:00
Tiger Kaovilai
3f9c2dc789 Reduces ~140 indirect imports for plugin/framework importers (#8208)
* Avoid plugin framework importers from needing cloud provider imports

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
2024-09-13 10:21:51 +08:00
Xun Jiang/Bruce Jiang
efcf836d16 Merge pull request #8201 from blackpiglet/update_velero_install_parameter
Add the ConfigMap-specified parameters into velero install CLI
2024-09-12 13:08:56 +08:00
Xun Jiang
68f3545424 Add the ConfigMap-specified parameters into velero install CLI.
Rename backup-repository-config to backup-repository-configmap.
Rename repo-maintenance-job-config to repo-maintenance-job-configmap.
Rename node-agent-config to node-agent-configmap.
Add those three parameters to `velero install` CLI.
Modify the design and the site documents.

Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2024-09-12 11:24:14 +08:00
Tiger Kaovilai
c643ee5fd4 Retry completion status patch for backup and restore resources
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>

update to design #8063

Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
2024-09-10 17:01:14 -04:00
Xun Jiang
26cc41f26d Implement the Repo maintenance Job configuration design.
Remove the resource parameters from the velero server CLI.

Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2024-09-09 22:42:56 +08:00
Wenkai Yin(尹文开)
dc6eeafe98 Pass Velero server command args to the plugins

Fixes #7806

Signed-off-by: Wenkai Yin(尹文开) <yinw@vmware.com>
2024-09-04 13:43:27 +08:00
lyndon-li
8fde4a017d Merge pull request #8054 from sseago/iba-plugins
Iba plugins
2024-08-15 10:21:25 +08:00
lyndon-li
07c03a8919 Merge pull request #8085 from Lyndon-Li/data-mover-ms-node-agent-resume
Data mover micro service node agent resume
2024-08-14 00:14:47 +08:00