Compare commits

..

41 Commits

Author SHA1 Message Date
Xun Jiang
8704b4d7f8 Add Windows support for release dev branch.
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2025-10-31 11:45:21 +08:00
lyndon-li
4ce4a4803d Merge pull request #9376 from Lyndon-Li/release-1.17
issue 9365: prevent multiple update of PVR
2025-10-29 15:51:18 +08:00
Lyndon-Li
ec7fe10816 issue 9365: prevent multiple update of PVR
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-10-29 15:01:33 +08:00
Wenkai Yin(尹文开)
3ae7183473 Merge pull request #9371 from blackpiglet/1.17.1_bump
Bump base image and Golang version for v1.17.1
2025-10-28 17:58:42 +08:00
Xun Jiang
bd4c53d13e Bump base image and Golang version for v1.17.1
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2025-10-28 15:36:01 +08:00
lyndon-li
988bfa55d4 Merge pull request #9341 from Lyndon-Li/release-1.17
[1.17] issue 9332: make bytesDone correct for incremental backup
2025-10-17 11:08:40 +08:00
Lyndon-Li
71ad893618 issue 9332: make bytesDone correct for incremental backup
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-10-17 10:45:41 +08:00
Xun Jiang/Bruce Jiang
1f32333aaa VerifyJSONConfigs verify every elements in Data. (#9303)
Add error message in the velero install CLI output if VerifyJSONConfigs fail.

Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2025-10-04 23:45:26 -04:00
lyndon-li
8ad7827f05 Merge pull request #9300 from sseago/privileged-fs-backup-pods-1.17
[release-1.17] Privileged fs backup pods 1.17
2025-09-28 11:37:06 +08:00
lyndon-li
d0c176077b Merge branch 'release-1.17' into privileged-fs-backup-pods-1.17 2025-09-28 10:54:14 +08:00
Scott Seago
1ca4c54c60 Add option for privileged fs-backup pod
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-09-26 13:40:39 -04:00
Shubham Pampattiwar
d2eafe63ed Fix maintenance jobs toleration inheritance from Velero deployment (#9299)
fix codespell and add changelog file


(cherry picked from commit 5ba00dfb09)

update changelog filename



update changelog

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-09-26 11:14:23 -04:00
lyndon-li
14e2e25801 Merge pull request #9297 from Lyndon-Li/release-1.17
[1.17] backupPVC to different node
2025-09-25 13:33:28 +08:00
Lyndon-Li
cf9e7c5fcb backupPVC to different node
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-09-25 11:21:18 +08:00
Lyndon-Li
2e00746550 backupPVC to different node
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-09-25 11:18:34 +08:00
Shubham Pampattiwar
99b2c57511 Merge pull request #9292 from Lyndon-Li/release-1.17
[1.17] Issue #9247: Protect VolumeSnapshot field from race condition
2025-09-23 06:45:44 -07:00
0xLeo258
d1f7f152b7 Add built-in mutex for SynchronizedVSList && Update unit tests
Signed-off-by: 0xLeo258 <noixe0312@gmail.com>
2025-09-23 13:56:16 +08:00
0xLeo258
d82af8d8b5 add changelog
Signed-off-by: 0xLeo258 <noixe0312@gmail.com>
2025-09-23 13:51:32 +08:00
0xLeo258
a46b86fa29 fix9247: Protect VolumeSnapshot field
Signed-off-by: 0xLeo258 <noixe0312@gmail.com>
2025-09-23 13:51:21 +08:00
lyndon-li
96d5fb7210 Merge pull request #9290 from Lyndon-Li/release-1.17
[1.17] Issue #9234: Fix plugin reentry with safe VolumeSnapshotterCache
2025-09-23 13:50:22 +08:00
0xLeo258
c71e065863 add changelog
Signed-off-by: 0xLeo258 <noixe0312@gmail.com>
2025-09-23 13:19:36 +08:00
0xLeo258
60338d9740 fix 9234: Add safe VolumeSnapshotterCache
Signed-off-by: 0xLeo258 <noixe0312@gmail.com>
2025-09-23 13:15:19 +08:00
lyndon-li
afa71e9e03 Merge pull request #9277 from shubham-pampattiwar/fix-backup-q-accum-cp
Fix Schedule Backup Queue Accumulation During Extended Blocking Scenarios
2025-09-19 11:59:11 +08:00
lyndon-li
fc877dd2dc Merge branch 'release-1.17' into fix-backup-q-accum-cp 2025-09-19 11:30:11 +08:00
lyndon-li
bb147b972b Merge pull request #9285 from priyansh17/release-1.17
Update AzureAD Microsoft Authentication Library to v1.5.0 (#9244)
2025-09-19 11:26:17 +08:00
lyndon-li
079394cd4f Merge branch 'release-1.17' into release-1.17 2025-09-19 10:54:48 +08:00
lyndon-li
fc4394964f Merge pull request #9282 from kaovilai/bitnamiminio-1.17
1.17: Fix E2E tests: Build MinIO from Bitnami Dockerfile to replace deprecated image
2025-09-19 10:54:15 +08:00
Priyansh Choudhary
85f2f23076 Added changelog
Signed-off-by: Priyansh Choudhary im1706@gmail.com
Signed-off-by: Priyansh Choudhary <im1706@gmail.com>
2025-09-19 03:20:14 +05:30
Priyansh Choudhary
aa71b53490 Update AzureAD Microsoft Authentication Library to v1.5.0 (#9244)
Signed-off-by: Priyansh Choudhary <im1706@gmail.com>
2025-09-19 03:20:13 +05:30
Tiger Kaovilai
30cf11a6b1 Fix E2E tests: Build MinIO from Bitnami Dockerfile to replace deprecated image
The Bitnami MinIO image bitnami/minio:2021.6.17-debian-10-r7 is no longer
available on Docker Hub, causing E2E tests to fail.

This change implements a solution to build the MinIO image locally from
Bitnami's public Dockerfile and cache it for subsequent runs:
- Fetches the latest commit hash of the Bitnami MinIO Dockerfile
- Uses GitHub Actions cache to store/retrieve built images
- Only rebuilds when the upstream Dockerfile changes
- Maintains compatibility with existing environment variables

Fixes #9279

🤖 Generated with [Claude Code](https://claude.ai/code)

Update .github/workflows/e2e-test-kind.yaml

Signed-off-by: Tiger Kaovilai <passawit.kaovilai@gmail.com>
Co-Authored-By: Claude <noreply@anthropic.com>
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
2025-09-18 08:53:28 -04:00
Shubham Pampattiwar
f404ff207d Fix Schedule Backup Queue Accumulation During Extended Blocking Scenarios
Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
(cherry picked from commit 59289fba76)

add changelog file

Signed-off-by: Shubham Pampattiwar <spampatt@redhat.com>
2025-09-17 09:44:44 -07:00
lyndon-li
690b074891 Merge pull request #9266 from sseago/iba-perf-1.17
[release-1.17] Get pod list once per namespace in pvc IBA
2025-09-17 11:15:16 +08:00
Scott Seago
c188c454d7 Get pod list once per namespace in pvc IBA
Signed-off-by: Scott Seago <sseago@redhat.com>
2025-09-16 17:39:30 -04:00
Wenkai Yin(尹文开)
18a690d69e Merge pull request #9220 from kaovilai/9173-release-1.17
release-1.17: feat: Permit specifying annotations for the BackupPVC #9173
2025-09-15 17:05:10 +08:00
lyndon-li
6986cde4d3 Merge branch 'release-1.17' into 9173-release-1.17 2025-09-15 15:46:44 +08:00
lyndon-li
3172d9f99c Merge pull request #9228 from vmware-tanzu/bump_k8s_lib_to_1.33
Bump k8s library to v1.33.
2025-09-09 17:33:34 +08:00
Xun Jiang
c34865a5fc Bump k8s library to v1.33.
Replace deprecated EventExpansion method with WithContext methods.
Modify UTs.
Align the E2E ginkgo CLI version with go.mod

Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2025-09-08 20:37:51 +08:00
Clément Nussbaumer
344b09a582 test: fix backuppvc annotations test case
Signed-off-by: Clément Nussbaumer <clement.nussbaumer@postfinance.ch>
2025-08-29 08:57:12 -05:00
Clément Nussbaumer
53c46b01c7 feat: Permit specifying annotations for the BackupPVC
Signed-off-by: Clément Nussbaumer <clement.nussbaumer@postfinance.ch>
2025-08-29 08:57:12 -05:00
lyndon-li
a10f413cab Merge pull request #9216 from Lyndon-Li/release-1.17
Pin version of golang and base image
2025-08-28 15:34:43 +08:00
Lyndon-Li
de1cd6dcbf pin version of golang and base image
Signed-off-by: Lyndon-Li <lyonghui@vmware.com>
2025-08-28 15:04:51 +08:00
40 changed files with 50 additions and 1154 deletions

View File

@@ -8,26 +8,18 @@ on:
- "design/**"
- "**/*.md"
jobs:
get-go-version:
uses: ./.github/workflows/get-go-version.yaml
with:
ref: ${{ github.event.pull_request.base.ref }}
# Build the Velero CLI and image once for all Kubernetes versions, and cache it so the fan-out workers can get it.
build:
runs-on: ubuntu-latest
needs: get-go-version
outputs:
minio-dockerfile-sha: ${{ steps.minio-version.outputs.dockerfile_sha }}
steps:
- name: Check out the code
uses: actions/checkout@v5
- name: Set up Go version
uses: actions/setup-go@v6
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version: ${{ needs.get-go-version.outputs.version }}
go-version-file: 'go.mod'
# Look for a CLI that's made for this PR
- name: Fetch built CLI
id: cli-cache
@@ -105,7 +97,6 @@ jobs:
needs:
- build
- setup-test-matrix
- get-go-version
runs-on: ubuntu-latest
strategy:
matrix: ${{fromJson(needs.setup-test-matrix.outputs.matrix)}}
@@ -113,12 +104,10 @@ jobs:
steps:
- name: Check out the code
uses: actions/checkout@v5
- name: Set up Go version
uses: actions/setup-go@v6
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version: ${{ needs.get-go-version.outputs.version }}
go-version-file: 'go.mod'
# Fetch the pre-built MinIO image from the build job
- name: Fetch built MinIO Image
uses: actions/cache@v4

View File

@@ -1,38 +0,0 @@
on:
workflow_call:
inputs:
ref:
description: "The target branch's ref"
required: true
type: string
outputs:
version:
description: "The expected Go version"
value: ${{ jobs.extract.outputs.version }}
jobs:
extract:
runs-on: ubuntu-latest
outputs:
version: ${{ steps.pick-version.outputs.version }}
steps:
- name: Dump github context
env:
GITHUB_CONTEXT: ${{ toJson(github) }}
run: echo "$GITHUB_CONTEXT"
- name: Check out the code
uses: actions/checkout@v5
- id: pick-version
run: |
if [ "${{ inputs.ref }}" == "main" ]; then
version=$(grep '^go ' go.mod | awk '{print $2}' | cut -d. -f1-2)
else
goDirectiveVersion=$(grep '^go ' go.mod | awk '{print $2}')
toolChainVersion=$(grep '^toolchain ' go.mod | awk '{print $2}')
version=$(printf "%s\n%s\n" "$goDirectiveVersion" "$toolChainVersion" | sort -V | tail -n1)
fi
echo "version=$version"
echo "version=$version" >> $GITHUB_OUTPUT

View File

@@ -1,26 +1,18 @@
name: Pull Request CI Check
on: [pull_request]
jobs:
get-go-version:
uses: ./.github/workflows/get-go-version.yaml
with:
ref: ${{ github.event.pull_request.base.ref }}
build:
name: Run CI
needs: get-go-version
runs-on: ubuntu-latest
strategy:
fail-fast: false
steps:
- name: Check out the code
uses: actions/checkout@v5
- name: Set up Go version
uses: actions/setup-go@v6
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version: ${{ needs.get-go-version.outputs.version }}
go-version-file: 'go.mod'
- name: Make ci
run: make ci
- name: Upload test coverage

View File

@@ -7,24 +7,16 @@ on:
- "design/**"
- "**/*.md"
jobs:
get-go-version:
uses: ./.github/workflows/get-go-version.yaml
with:
ref: ${{ github.event.pull_request.base.ref }}
build:
name: Run Linter Check
runs-on: ubuntu-latest
needs: get-go-version
steps:
- name: Check out the code
uses: actions/checkout@v5
- name: Set up Go version
uses: actions/setup-go@v6
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version: ${{ needs.get-go-version.outputs.version }}
go-version-file: 'go.mod'
- name: Linter check
uses: golangci/golangci-lint-action@v8
with:

View File

@@ -9,24 +9,17 @@ on:
- '*'
jobs:
get-go-version:
uses: ./.github/workflows/get-go-version.yaml
with:
ref: ${{ github.ref }}
build:
name: Build
runs-on: ubuntu-latest
needs: get-go-version
steps:
- name: Check out the code
uses: actions/checkout@v5
- name: Set up Go version
uses: actions/setup-go@v6
- name: Set up Go
uses: actions/setup-go@v5
with:
go-version: ${{ needs.get-go-version.outputs.version }}
go-version-file: 'go.mod'
- name: Set up QEMU
id: qemu
uses: docker/setup-qemu-action@v3

View File

@@ -7,7 +7,7 @@ jobs:
stale:
runs-on: ubuntu-latest
steps:
- uses: actions/stale@v10.0.0
- uses: actions/stale@v9.1.0
with:
repo-token: ${{ secrets.GITHUB_TOKEN }}
stale-issue-message: "This issue is stale because it has been open 60 days with no activity. Remove stale label or comment or this will be closed in 14 days. If a Velero team member has requested log or more information, please provide the output of the shared commands."

View File

@@ -13,7 +13,7 @@
# limitations under the License.
# Velero binary build section
FROM --platform=$BUILDPLATFORM golang:1.24-bookworm AS velero-builder
FROM --platform=$BUILDPLATFORM golang:1.24.9-bookworm AS velero-builder
ARG GOPROXY
ARG BIN
@@ -49,7 +49,7 @@ RUN mkdir -p /output/usr/bin && \
go clean -modcache -cache
# Restic binary build section
FROM --platform=$BUILDPLATFORM golang:1.24-bookworm AS restic-builder
FROM --platform=$BUILDPLATFORM golang:1.24.9-bookworm AS restic-builder
ARG GOPROXY
ARG BIN
@@ -73,7 +73,7 @@ RUN mkdir -p /output/usr/bin && \
go clean -modcache -cache
# Velero image packing section
FROM paketobuildpacks/run-jammy-tiny:latest
FROM paketobuildpacks/run-jammy-tiny:0.2.78
LABEL maintainer="Xun Jiang <jxun@vmware.com>"

View File

@@ -15,7 +15,7 @@
ARG OS_VERSION=1809
# Velero binary build section
FROM --platform=$BUILDPLATFORM golang:1.24-bookworm AS velero-builder
FROM --platform=$BUILDPLATFORM golang:1.24.9-bookworm AS velero-builder
ARG GOPROXY
ARG BIN

View File

@@ -42,7 +42,7 @@ The following is a list of the supported Kubernetes versions for each Velero ver
| Velero version | Expected Kubernetes version compatibility | Tested on Kubernetes version |
|----------------|-------------------------------------------|-------------------------------------|
| 1.17 | 1.18-latest | 1.31.7, 1.32.3, 1.33.1, and 1.34.0 |
| 1.17 | 1.18-latest | 1.31.7, 1.32.3, and 1.33.1 |
| 1.16 | 1.18-latest | 1.31.4, 1.32.3, and 1.33.0 |
| 1.15 | 1.18-latest | 1.28.8, 1.29.8, 1.30.4 and 1.31.1 |
| 1.14 | 1.18-latest | 1.27.9, 1.28.9, and 1.29.4 |

View File

@@ -52,7 +52,7 @@ git_sha = str(local("git rev-parse HEAD", quiet = True, echo_off = True)).strip(
tilt_helper_dockerfile_header = """
# Tilt image
FROM golang:1.24 as tilt-helper
FROM golang:1.24.9 as tilt-helper
# Support live reloading with Tilt
RUN wget --output-document /restart.sh --quiet https://raw.githubusercontent.com/windmilleng/rerun-process-wrapper/master/restart.sh && \

View File

@@ -0,0 +1 @@
Backport to 1.17 (PR#9244 Update AzureAD Microsoft Authentication Library to v1.5.0)

View File

@@ -0,0 +1 @@
Fix issue #9332, add bytesDone for cache files

View File

@@ -0,0 +1 @@
Fix issue #9365, prevent fake completion notification due to multiple update of single PVR

View File

@@ -1,257 +0,0 @@
# Concurrent Backup Processing
This enhancement will enable Velero to process multiple backups at the same time. This is largely a usability enhancement rather than a performance enhancement: overall backup throughput may not improve significantly over the current implementation, since individual backup items are already processed in parallel. It is a significant usability improvement, though. With the current design, a user who submits a small backup may have to wait significantly longer than expected if the backup is submitted immediately after a large backup.
## Background
With the current implementation, only one backup may be `InProgress` at a time. A second backup created will not start processing until the first backup moves on to `WaitingForPluginOperations` or `Finalizing`. This is a usability concern, especially in clusters where multiple users are initiating backups. With this enhancement, we intend to allow multiple backups to be processed concurrently. This will allow backups to start processing immediately, even if a large backup was just submitted by another user. This enhancement will build on top of the prior parallel item processing feature by creating a dedicated ItemBlock worker pool for each running backup. The pool will be created at the beginning of the backup reconcile, and the input channel will be passed to the Kubernetes backupper just like it is in the current release.
The primary challenge is to make sure that the same workload in multiple backups is not backed up concurrently. If that were to happen, we would risk data corruption, especially around the processing of pod hooks and volume backup. For this first release we will take a conservative, high-level approach to overlap detection. Two backups will not run concurrently if there is any overlap in included namespaces. For example, if a backup that includes `ns1` and `ns2` is running, then a second backup for `ns2` and `ns3` will not be started. If a backup which does not filter namespaces is running (either a whole cluster backup or a non-namespace-limited backup with a label selector) then no other backups will be started, since a backup across all namespaces overlaps with any other backup. Calculating item-level overlap for queued backups is problematic since we don't know which items are included in a backup until backup processing has begun. A future release may add ItemBlock overlap detection, where at the ItemBlock worker level, the same item will not be processed by two different workers at the same time. This works together with workload conflict detection to further detect conflicts at a more granular level for shared resources between backups. Eventually, with a more complete understanding of individual workloads (either via ItemBlocks or some higher level model), the namespace-level overlap detection may be relaxed in future versions.
## Goals
- Process multiple backups concurrently
- Detect namespace overlap to avoid conflicts
- For queued backups (not yet runnable due to concurrency limits or overlap), indicate the queue position in status
## Non Goals
- Handling NFS PVs when more than one PV points to the same underlying NFS share
- Handling VGDP cancellation for failed backups on restart
- Mounting a PVC for scenarios in which /tmp is too small for the number of concurrent backups
- Providing a mechanism to identify high priority backups which get preferential treatment in terms of ItemBlock worker availability
- Item-level overlap detection (future feature)
- Providing the ability to disable namespace-level overlap detection once Item-level overlap detection is in place (although this may be supported in a future version).
## High-Level Design
### Backup CRD changes
Two new backup phases will be added: `Queued` and `ReadyToStart`. In the Backup workflow, new backups will be moved to the Queued phase when they are added to the backup queue. When a backup is removed from the queue because it is now able to run, it will be moved to the `ReadyToStart` phase, which will allow the backup controller to start processing it.
In addition, a new Status field, `QueuePosition`, will be added to track the backup's current position in the queue.
### New Controller: `backupQueueReconciler`
A new reconciler, `backupQueueReconciler`, will be added. It will use the current `backupReconciler` logic for reconciling `New` backups, but instead of running the backup it will move the Backup to the `Queued` phase and set `QueuePosition`.
In addition, this reconciler will periodically reconcile all queued backups (on some configurable time interval) and if there is a runnable backup, remove it from the queue, update `QueuePosition` for any queued backups behind it, and update its phase to `ReadyToStart`.
Queued backups will be reconciled in order based on `QueuePosition`, so the first runnable backup found will be processed. A backup is runnable if both of the following conditions are true:
1) The total number of backups either `InProgress` or `ReadyToStart` is less than the configured number of concurrent backups.
2) The backup has no overlap with any backups currently `InProgress` or `ReadyToStart`, or with any `Queued` backups ahead of it in the queue (i.e. with a `QueuePosition` closer to 1 than this backup's).
### Updates to Backup controller
The current `backupReconciler` will change its reconciling rules. Instead of watching and reconciling New backups, it will reconcile `ReadyToStart` backups. In addition, it will be configured to run in parallel by setting `MaxConcurrentReconciles` based on the `concurrent-backups` server arg.
The startup (and shutdown) of the ItemBlock worker pool will be moved from reconciler startup to the backup reconcile, which will give each running backup its own dedicated worker pool. The per-backup worker pool will use the existing `--item-block-worker-count` installer/server arg. This means that the maximum number of ItemBlock workers for the entire Velero pod will be the ItemBlock worker count multiplied by concurrentBackups. For example, if concurrentBackups is 5, and itemBlockWorkerCount is 6, then there will be, at most, 30 worker threads active, 6 dedicated to each InProgress backup, but this maximum will only be reached when the maximum number of backups are InProgress. This also means that each InProgress backup will have a dedicated ItemBlock input channel with the same fixed buffer size.
## Detailed Design
### New Install/Server configuration args
A new install/server arg, `concurrent-backups` will be added. This will be an int-valued field specifying the number of backups which may be processed concurrently (with phase `InProgress`). If not specified, the default value of 1 will be used.
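As a rough illustration only (the `serverConfig` struct, field name, and flag wiring below are assumptions for this sketch, not the actual Velero server code), the new arg would follow the same pattern as other int-valued server flags:
```go
package main

import "github.com/spf13/pflag"

// serverConfig and its field name are illustrative assumptions for this sketch.
type serverConfig struct {
	concurrentBackups int // number of backups that may be InProgress at once
}

const defaultConcurrentBackups = 1

// bindConcurrentBackupsFlag registers the proposed --concurrent-backups server arg.
func bindConcurrentBackupsFlag(flags *pflag.FlagSet, config *serverConfig) {
	flags.IntVar(
		&config.concurrentBackups,
		"concurrent-backups",
		defaultConcurrentBackups,
		"Maximum number of backups that can be processed concurrently. Default is 1.",
	)
}

func main() {
	flags := pflag.NewFlagSet("velero-server-sketch", pflag.ExitOnError)
	config := &serverConfig{}
	bindConcurrentBackupsFlag(flags, config)
	_ = flags.Parse([]string{"--concurrent-backups=3"})
}
```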
### Consideration of backup overlap and concurrent backup processing
The primary consideration for running additional backups concurrently is the configured `concurrent-backups` parameter. If the total number of `InProgress` and `ReadyToStart` backups is equal to `concurrent-backups` then any `Queued` backups will remain in the queue.
The second consideration is backup overlap. In order to prevent interaction between running backups (particularly around volume backup and pod hooks), we cannot allow two overlapping backups to run at the same time. For now, we will define overlap broadly -- requiring that two concurrent backups don't include any of the same namespaces. A backup for `ns1` can run concurrently with a backup for `ns2`, but a backup for `[ns1,ns2]` cannot run concurrently with a backup for `ns1`. One consequence of this approach is that a backup which includes all namespaces (even if further filtered by resource or label) cannot run concurrently with *any other backup*.
When determining which queued backup to run next, Velero will look for the next queued backup which has no overlap with any InProgress backup or any Queued backup ahead of it. The reason we need to consider queued as well as running backups for overlap detection is as follows.
Consider the following scenario. These are the current not-completed backups (ordered from oldest to newest)
1. backup1, includedNamespaces: [ns1, ns2], phase: InProgress
2. backup2, includedNamespaces: [ns2, ns3, ns5], phase: Queued, QueuePosition: 1
3. backup3, includedNamespaces: [ns4, ns3], phase: Queued, QueuePosition: 2
4. backup4, includedNamespaces: [ns5, ns6], phase: Queued, QueuePosition: 3
5. backup5, includedNamespaces: [ns8, ns9], phase: Queued, QueuePosition: 4
Assuming `concurrent-backups` is 2, on the next reconcile, Velero will be able to start a second backup if there is one with no overlap. `backup2` cannot run, since `ns2` overlaps between it and the running `backup1`. If we only considered running overlap (and not queued overlap), then `backup3` could run now. It conflicts with the queued `backup2` on `ns3` but it does not conflict with the running backup. However, if it runs now, then when `backup1` completes, `backup2` still can't run (since it now overlaps with running `backup3` on `ns3`), so `backup4` starts instead. Now when `backup3` completes, `backup2` still can't run (since it now conflicts with `backup4` on `ns5`). This means that even though it was the second backup created, it's the fourth to run -- providing worse time to completion than without parallel backups. If a queued backup has a large number of namespaces (a full-cluster backup for example), it would never run as long as new single-namespace backups keep being added to the queue.
To resolve this problem we consider both running backups as well as backups ahead in the queue when resolving overlap conflicts. In the above scenario, `backup2` can't run yet since it overlaps with the running backup on `ns2`. In addition, `backup3` and `backup4` also can't run yet since they overlap with queued `backup2`. Therefore, `backup5` will run now. Once `backup1` completes, `backup2` will be free to run.
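To make the namespace-overlap rule concrete, here is a minimal, self-contained sketch; the function names, the `runnable` signature, and treating an empty namespace list as "all namespaces" are assumptions of this sketch rather than Velero code:
```go
package main

import "fmt"

// namespacesOverlap reports whether two backups may not run concurrently.
// For this sketch, an empty slice means the backup includes all namespaces,
// which overlaps with every other backup.
func namespacesOverlap(a, b []string) bool {
	if len(a) == 0 || len(b) == 0 {
		return true
	}
	seen := make(map[string]bool, len(a))
	for _, ns := range a {
		seen[ns] = true
	}
	for _, ns := range b {
		if seen[ns] {
			return true
		}
	}
	return false
}

// runnable applies both dequeue conditions: the concurrency limit, and the
// overlap check against InProgress/ReadyToStart backups and Queued backups
// ahead of the candidate in the queue.
func runnable(candidate []string, runningOrReady, queuedAhead [][]string, concurrentBackups int) bool {
	if len(runningOrReady) >= concurrentBackups {
		return false
	}
	for _, other := range runningOrReady {
		if namespacesOverlap(candidate, other) {
			return false
		}
	}
	for _, other := range queuedAhead {
		if namespacesOverlap(candidate, other) {
			return false
		}
	}
	return true
}

func main() {
	running := [][]string{{"ns1", "ns2"}}                                      // backup1
	ahead := [][]string{{"ns2", "ns3", "ns5"}, {"ns4", "ns3"}, {"ns5", "ns6"}} // backup2..backup4
	fmt.Println(runnable([]string{"ns2", "ns3", "ns5"}, running, nil, 2))      // backup2: false (ns2 overlap)
	fmt.Println(runnable([]string{"ns8", "ns9"}, running, ahead, 2))           // backup5: true
}
```
With the example queue above, `backup2` is rejected because of the `ns2` overlap with the running `backup1`, while `backup5` passes both checks, matching the walkthrough.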
### Backup CRD changes
New Backup phases:
```go
const (
// BackupPhaseQueued means the backup has been added to the
// queue by the BackupQueueReconciler.
BackupPhaseQueued BackupPhase = "Queued"
// BackupPhaseReadyToStart means the backup has been removed from the
// queue by the BackupQueueReconciler and is ready to start.
BackupPhaseReadyToStart BackupPhase = "ReadyToStart"
)
```
In addition, a new Status field, `queuePosition`, will be added to track the backup's current position in the queue.
```go
// QueuePosition is the position held by the backup in the queue.
// QueuePosition=1 means this backup is the next to be considered.
// Only relevant when Phase is "Queued"
// +optional
QueuePosition int `json:"queuePosition,omitempty"`
```
### New Controller: `backupQueueReconciler`
A new reconciler will be added, `backupQueueReconciler` which will reconcile backups under these conditions:
1) Watching Create/Update for backups in `New` (or empty) phase
2) Watching for Backup phase transition from `InProgress` to something else to reconcile all `Queued` backups
3) Watching for Backup phase transition from `New` (or empty) to `Queued` to reconcile all `Queued` backups
4) Periodic reconcile of `Queued` backups to handle backups queued at server startup, as well as to make sure we never have a situation where backups are queued indefinitely because of a race condition or because a backup was otherwise missed in the reconcile on prior backup completion.
The reconciler will be set up as follows -- note that New backups are reconciled on Create/Update, while Queued backups are reconciled when an InProgress backup moves on to another state or when a new backup moves to the Queued state. We also reconcile Queued backups periodically to handle the case of a Velero pod restart with Queued backups, as well as to handle possible edge cases where a queued backup doesn't get moved out of the queue at the point of backup completion or an error occurs during a prior Queued backup reconcile.
```go
func (c *backupQueueReconciler) SetupWithManager(mgr ctrl.Manager) error {
// only consider Queued backups, order by QueuePosition
gp := kube.NewGenericEventPredicate(func(object client.Object) bool {
backup := object.(*velerov1api.Backup)
return (backup.Status.Phase == velerov1api.BackupPhaseQueued)
})
s := kube.NewPeriodicalEnqueueSource(c.logger.WithField("controller", constant.ControllerBackupQueue), mgr.GetClient(), &velerov1api.BackupList{}, c.frequency, kube.PeriodicalEnqueueSourceOption{
Predicates: []predicate.Predicate{gp},
OrderFunc: queuePositionOrderFunc,
})
return ctrl.NewControllerManagedBy(mgr).
For(&velerov1api.Backup{}, builder.WithPredicates(predicate.Funcs{
UpdateFunc: func(ue event.UpdateEvent) bool {
backup := ue.ObjectNew.(*velerov1api.Backup)
return backup.Status.Phase == "" || backup.Status.Phase == velerov1api.BackupPhaseNew
},
CreateFunc: func(ce event.CreateEvent) bool {
backup := ce.Object.(*velerov1api.Backup)
return backup.Status.Phase == "" || backup.Status.Phase == velerov1api.BackupPhaseNew
},
DeleteFunc: func(de event.DeleteEvent) bool {
return false
},
GenericFunc: func(ge event.GenericEvent) bool {
return false
},
})).
Watch(
&source.Kind{Type: &velerov1api.Backup{}},
&handler.EnqueueRequestsFromMapFunc{
ToRequests: handler.ToRequestsFunc(func(a handler.MapObject) []reconcile.Request {
backupList := velerov1api.BackupList{}
if err := c.List(context.TODO(), &backupList); err != nil {
c.logger.WithError(err).Error("error listing backups")
return nil
}
requests := []reconcile.Request{}
// filter backup list by Phase=queued
// sort backup list by queuePosition
return requests
}),
},
builder.WithPredicates(predicate.Funcs{
UpdateFunc: func(ue event.UpdateEvent) bool {
oldBackup := ue.ObjectOld.(*velerov1api.Backup)
newBackup := ue.ObjectNew.(*velerov1api.Backup)
return oldBackup.Status.Phase == velerov1api.BackupPhaseInProgress &&
newBackup.Status.Phase != velerov1api.BackupPhaseInProgress ||
oldBackup.Status.Phase != velerov1api.BackupPhaseQueued &&
newBackup.Status.Phase == velerov1api.BackupPhaseQueued
},
CreateFunc: func(event.CreateEvent) bool {
return false
},
DeleteFunc: func(de event.DeleteEvent) bool {
return false
},
GenericFunc: func(ge event.GenericEvent) bool {
return false
},
}).
WatchesRawSource(s).
Named(constant.ControllerBackupQueue).
Complete(c)
}
```
New backups will be queued: Phase will be set to `Queued`, and `QueuePosition` will be set to an int value incremented from the highest current `QueuePosition` value among Queued backups.
Queued backups will be removed from the queue if runnable:
1) If the total number of backups either InProgress or ReadyToStart is greater than or equal to the concurrency limit, then exit without removing from the queue.
2) If the current backup overlaps with any InProgress, ReadyToStart, or Queued backup with `QueuePosition < currentBackup.QueuePosition` then exit without removing from the queue.
3) If we get here, the backup is runnable. To resolve a potential race condition where an InProgress backup completes between reconciling the backup with QueuePosition `n-1` and reconciling the current backup with QueuePosition `n`, we also check whether there are any runnable backups in the queue ahead of this one. The only time this will happen is if a backup completes immediately before reconcile starts, which either frees up a concurrency slot or removes a namespace conflict. In that case, we don't want to run the current backup, since the one ahead of it in the queue (which was recently passed over before the InProgress backup completed) must run first, so exit without removing from the queue.
4) If we get here, remove the backup from the queue by setting Phase to `ReadyToStart` and `QueuePosition` to zero. Decrement the `QueuePosition` of any other Queued backups with a `QueuePosition` higher than the current backup's queue position prior to dequeuing. At this point, the backup reconciler will start the backup.
```
switch original.Status.Phase {
case "", velerov1api.BackupPhaseNew:
// enqueue backup -- set phase=Queued, set queuePosition=maxCurrentQueuePosition+1
// We should only ever get these events when added in order by the periodical enqueue source,
// so as long as the current backup has no conflicts ahead of it or running, we should be good to
// dequeue.
case velerov1api.BackupPhaseQueued:
// list backups, filter on Queued, ReadyToStart, and InProgress
// if number of InProgress backups + number of ReadyToStart backups >= concurrency limit, exit
// (i.e. if len(inProgressBackups)+len(pendingStartBackups) >= concurrentBackups)
// generate list of all namespaces included in InProgress, ReadyToStart, and Queued backups with
// queuePosition < backup.Status.QueuePosition
// if overlap found, exit
// check backups ahead of this one in the queue for runnability. If any are runnable, exit
// dequeue backup: set Phase to ReadyToStart, QueuePosition to 0, and decrement QueuePosition
// for all Queued backups behind this one in the queue
}
```
The queue controller will run as a single reconciler thread, so we will not need to deal with concurrency issues when moving backups from New to Queued or from Queued to ReadyToStart, and all of the updates to QueuePosition will be from a single thread.
### Updates to Backup controller
The Reconcile logic will be updated to respond to ReadyToStart backups instead of New backups:
```
@@ -234,8 +234,8 @@ func (b *backupReconciler) Reconcile(ctx context.Context, req ctrl.Request) (ctr
// InProgress, we still need this check so we can return nil to indicate we've finished processing
// this key (even though it was a no-op).
switch original.Status.Phase {
- case "", velerov1api.BackupPhaseNew:
- // only process new backups
+ case velerov1api.BackupPhaseReadyToStart:
+ // only process ReadyToStart backups
default:
b.logger.WithFields(logrus.Fields{
"backup": kubeutil.NamespaceAndName(original),
```
In addition, it will be configured to run in parallel by setting `MaxConcurrentReconciles` based on the `concurrent-backups` server arg.
```
@@ -149,6 +149,9 @@ func NewBackupReconciler(
func (b *backupReconciler) SetupWithManager(mgr ctrl.Manager) error {
return ctrl.NewControllerManagedBy(mgr).
For(&velerov1api.Backup{}).
+ WithOptions(controller.Options{
+ MaxConcurrentReconciles: concurrentBackups,
+ }).
Named(constant.ControllerBackup).
Complete(b)
}
```
The controller-runtime core reconciler logic already prevents the same resource from being reconciled by two different reconciler threads, so we don't need to worry about concurrency issues at the controller level.
The workerPool reference will be moved from the backupReconciler to the backupRequest, since this will now be backup-specific, and the initialization code for the worker pool will be moved from the reconciler init into the backup reconcile. This worker pool will be shut down upon exiting the Reconcile method.
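As a rough sketch of the per-backup pool lifecycle described above (the pool shape, buffer size, and all names here are illustrative assumptions, not the actual ItemBlock worker pool implementation):
```go
package main

import (
	"fmt"
	"sync"
)

// itemBlock stands in for the real ItemBlock type; it is an assumption of this sketch.
type itemBlock struct{ name string }

type workerPool struct {
	input chan itemBlock
	wg    sync.WaitGroup
}

// startWorkerPool creates the dedicated pool for a single backup reconcile,
// with workerCount taken from --item-block-worker-count.
func startWorkerPool(workerCount, bufferSize int, process func(itemBlock)) *workerPool {
	p := &workerPool{input: make(chan itemBlock, bufferSize)}
	for i := 0; i < workerCount; i++ {
		p.wg.Add(1)
		go func() {
			defer p.wg.Done()
			for ib := range p.input {
				process(ib)
			}
		}()
	}
	return p
}

// stop closes the input channel and waits for in-flight ItemBlocks to finish;
// the backup reconciler would defer this before returning from Reconcile.
func (p *workerPool) stop() {
	close(p.input)
	p.wg.Wait()
}

func main() {
	pool := startWorkerPool(6, 10, func(ib itemBlock) { fmt.Println("backed up", ib.name) })
	defer pool.stop()
	pool.input <- itemBlock{name: "ns1/pod-a"}
}
```
Each Reconcile invocation would create its own pool along these lines and defer the shutdown, so the dedicated input channel and worker goroutines live only as long as that backup is InProgress.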
### Resilience to restart of velero pod
The new backup phases (`Queued` and `ReadyToStart`) will be resilient to velero pod restarts. If the velero pod crashes or is restarted, only backups in the `InProgress` phase will be failed, so there is no change to current behavior. Queued backups will retain their queue position on restart, and ReadyToStart backups will move to InProgress when reconciled.
### Observability
#### Logging
When a backup is dequeued, an info log message will also include the wait time, calculated as `now - creationTimestamp`. When a backup is passed over due to overlap, an info log message will indicate which namespaces were in conflict.
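A tiny illustrative sketch of the wait-time calculation (the timestamp and message wording are fabricated for the example):
```go
package main

import (
	"fmt"
	"time"
)

func main() {
	// creationTimestamp would come from the Backup object's metadata;
	// a fabricated value is used here for illustration.
	creationTimestamp := time.Now().Add(-95 * time.Second)
	waitTime := time.Since(creationTimestamp).Round(time.Second)
	fmt.Printf("backup dequeued after waiting %s in the queue\n", waitTime)
}
```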
#### Velero CLI
The `velero backup describe` output will include the current queue position for queued backups.

go.mod (2 changed lines)
View File

@@ -2,6 +2,8 @@ module github.com/vmware-tanzu/velero
go 1.24.0
toolchain go1.24.9
require (
cloud.google.com/go/storage v1.55.0
github.com/Azure/azure-sdk-for-go/sdk/azcore v1.18.1

View File

@@ -12,7 +12,7 @@
# See the License for the specific language governing permissions and
# limitations under the License.
FROM --platform=$TARGETPLATFORM golang:1.24-bookworm
FROM --platform=$TARGETPLATFORM golang:1.24.9-bookworm
ARG GOPROXY

View File

@@ -1,52 +0,0 @@
/*
Copyright 2019 the Velero contributors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package builder
import (
corev1api "k8s.io/api/core/v1"
schedulingv1api "k8s.io/api/scheduling/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
)
type PriorityClassBuilder struct {
object *schedulingv1api.PriorityClass
}
func ForPriorityClass(name string) *PriorityClassBuilder {
return &PriorityClassBuilder{
object: &schedulingv1api.PriorityClass{
ObjectMeta: metav1.ObjectMeta{
Name: name,
},
},
}
}
func (p *PriorityClassBuilder) Value(value int) *PriorityClassBuilder {
p.object.Value = int32(value)
return p
}
func (p *PriorityClassBuilder) PreemptionPolicy(policy string) *PriorityClassBuilder {
preemptionPolicy := corev1api.PreemptionPolicy(policy)
p.object.PreemptionPolicy = &preemptionPolicy
return p
}
func (p *PriorityClassBuilder) Result() *schedulingv1api.PriorityClass {
return p.object
}

View File

@@ -143,10 +143,6 @@ func GetConfigs(ctx context.Context, namespace string, kubeClient kubernetes.Int
return nil, errors.Errorf("data is not available in config map %s", configName)
}
if len(cm.Data) > 1 {
return nil, errors.Errorf("more than one keys are found in ConfigMap %s's data. only expect one", configName)
}
jsonString := ""
for _, v := range cm.Data {
jsonString = v

View File

@@ -249,7 +249,6 @@ func TestGetConfigs(t *testing.T) {
cmWithValidData := builder.ForConfigMap("fake-ns", "node-agent-config").Data("fake-key", "{\"loadConcurrency\":{\"globalConfig\": 5}}").Result()
cmWithPriorityClass := builder.ForConfigMap("fake-ns", "node-agent-config").Data("fake-key", "{\"priorityClassName\": \"high-priority\"}").Result()
cmWithPriorityClassAndOther := builder.ForConfigMap("fake-ns", "node-agent-config").Data("fake-key", "{\"priorityClassName\": \"low-priority\", \"loadConcurrency\":{\"globalConfig\": 3}}").Result()
cmWithMultipleKeysInData := builder.ForConfigMap("fake-ns", "node-agent-config").Data("fake-key-1", "{}", "fake-key-2", "{}").Result()
tests := []struct {
name string
@@ -332,14 +331,6 @@ func TestGetConfigs(t *testing.T) {
},
},
},
{
name: "ConfigMap's Data has more than one key",
namespace: "fake-ns",
kubeClientObj: []runtime.Object{
cmWithMultipleKeysInData,
},
expectErr: "more than one keys are found in ConfigMap node-agent-config's data. only expect one",
},
}
for _, test := range tests {

View File

@@ -92,12 +92,18 @@ func newRestorer(
_, _ = pvrInformer.AddEventHandler(
cache.ResourceEventHandlerFuncs{
UpdateFunc: func(_, obj any) {
pvr := obj.(*velerov1api.PodVolumeRestore)
UpdateFunc: func(oldObj, newObj any) {
pvr := newObj.(*velerov1api.PodVolumeRestore)
pvrOld := oldObj.(*velerov1api.PodVolumeRestore)
if pvr.GetLabels()[velerov1api.RestoreUIDLabel] != string(restore.UID) {
return
}
if pvr.Status.Phase == pvrOld.Status.Phase {
return
}
if pvr.Status.Phase == velerov1api.PodVolumeRestorePhaseCompleted || pvr.Status.Phase == velerov1api.PodVolumeRestorePhaseFailed || pvr.Status.Phase == velerov1api.PodVolumeRestorePhaseCanceled {
r.resultsLock.Lock()
defer r.resultsLock.Unlock()

View File

@@ -121,6 +121,7 @@ func (p *Progress) UploadStarted() {}
// CachedFile statistic the total bytes been cached currently
func (p *Progress) CachedFile(fname string, numBytes int64) {
atomic.AddInt64(&p.cachedBytes, numBytes)
atomic.AddInt64(&p.processedBytes, numBytes)
p.UpdateProgress()
}

View File

@@ -447,13 +447,8 @@ func GetVolumeSnapshotClassForStorageClass(
return &vsClass, nil
}
return nil, fmt.Errorf(
"failed to get VolumeSnapshotClass for provisioner %s: "+
"ensure that the desired VolumeSnapshotClass has the %s label or %s annotation, "+
"and that its driver matches the StorageClass provisioner",
provisioner,
velerov1api.VolumeSnapshotClassSelectorLabel,
velerov1api.VolumeSnapshotClassKubernetesAnnotation,
)
"failed to get VolumeSnapshotClass for provisioner %s, ensure that the desired VolumeSnapshot class has the %s label or %s annotation",
provisioner, velerov1api.VolumeSnapshotClassSelectorLabel, velerov1api.VolumeSnapshotClassKubernetesAnnotation)
}
// IsVolumeSnapshotClassHasListerSecret returns whether a volumesnapshotclass has a snapshotlister secret

View File

@@ -23,8 +23,6 @@ By default, `velero install` does not install Velero's [File System Backup][3].
If you've already run `velero install` without the `--use-node-agent` flag, you can run the same command again, including the `--use-node-agent` flag, to add the file system backup to your existing install.
Note that for some use cases (including installation on OpenShift clusters) the fs-backup pods must run in a Privileged security context. This is configured through the node-agent configmap (see below) by setting `privilegedFsBackup` to `true` in the configmap.
## CSI Snapshot Data Movement
Velero node-agent is required by [CSI Snapshot Data Movement][12] when Velero built-in data mover is used. By default, `velero install` does not install Velero's node-agent. To enable it, specify the `--use-node-agent` flag.

View File

@@ -23,6 +23,8 @@ By default, `velero install` does not install Velero's [File System Backup][3].
If you've already run `velero install` without the `--use-node-agent` flag, you can run the same command again, including the `--use-node-agent` flag, to add the file system backup to your existing install.
Note that for some use cases (including installation on OpenShift clusters) the fs-backup pods must run in a Privileged security context. This is configured through the node-agent configmap (see below) by setting `privilegedFsBackup` to `true` in the configmap.
## CSI Snapshot Data Movement
Velero node-agent is required by [CSI Snapshot Data Movement][12] when Velero built-in data mover is used. By default, `velero install` does not install Velero's node-agent. To enable it, specify the `--use-node-agent` flag.

View File

@@ -39,12 +39,10 @@ import (
. "github.com/vmware-tanzu/velero/test/e2e/basic/resources-check"
. "github.com/vmware-tanzu/velero/test/e2e/bsl-mgmt"
. "github.com/vmware-tanzu/velero/test/e2e/migration"
. "github.com/vmware-tanzu/velero/test/e2e/nodeagentconfig"
. "github.com/vmware-tanzu/velero/test/e2e/parallelfilesdownload"
. "github.com/vmware-tanzu/velero/test/e2e/parallelfilesupload"
. "github.com/vmware-tanzu/velero/test/e2e/privilegesmgmt"
. "github.com/vmware-tanzu/velero/test/e2e/pv-backup"
. "github.com/vmware-tanzu/velero/test/e2e/repomaintenance"
. "github.com/vmware-tanzu/velero/test/e2e/resource-filtering"
. "github.com/vmware-tanzu/velero/test/e2e/resourcemodifiers"
. "github.com/vmware-tanzu/velero/test/e2e/resourcepolicies"
@@ -662,24 +660,6 @@ var _ = Describe(
ParallelFilesDownloadTest,
)
var _ = Describe(
"Test Repository Maintenance Job Configuration's global part",
Label("RepoMaintenance", "LongTime"),
GlobalRepoMaintenanceTest,
)
var _ = Describe(
"Test Repository Maintenance Job Configuration's specific part",
Label("RepoMaintenance", "LongTime"),
SpecificRepoMaintenanceTest,
)
var _ = Describe(
"Test node agent config's LoadAffinity part",
Label("NodeAgentConfig", "LoadAffinity"),
LoadAffinities,
)
func GetKubeConfigContext() error {
var err error
var tcDefault, tcStandby k8s.TestClient
@@ -760,12 +740,6 @@ var _ = BeforeSuite(func() {
).To(Succeed())
}
By("Install PriorityClasses for E2E.")
Expect(veleroutil.CreatePriorityClasses(
context.Background(),
test.VeleroCfg.ClientToInstallVelero.Kubebuilder,
)).To(Succeed())
if test.InstallVelero {
By("Install test resources before testing")
Expect(
@@ -790,8 +764,6 @@ var _ = AfterSuite(func() {
test.StorageClassName,
),
).To(Succeed())
By("Delete PriorityClasses created by E2E")
Expect(
k8s.DeleteStorageClass(
ctx,
@@ -811,11 +783,6 @@ var _ = AfterSuite(func() {
).To(Succeed())
}
Expect(veleroutil.DeletePriorityClasses(
ctx,
test.VeleroCfg.ClientToInstallVelero.Kubebuilder,
)).To(Succeed())
// If the Velero is installed during test, and the FailFast is not enabled,
// uninstall Velero. If not, either Velero is not installed, or kept it for debug on failure.
if test.InstallVelero && (testSuitePassed || !test.VeleroCfg.FailFast) {

View File

@@ -342,12 +342,6 @@ func (m *migrationE2E) Restore() error {
Expect(veleroutil.InstallStorageClasses(
m.VeleroCfg.StandbyClusterCloudProvider)).To(Succeed())
By("Install PriorityClass for E2E.")
Expect(veleroutil.CreatePriorityClasses(
context.Background(),
test.VeleroCfg.StandbyClient.Kubebuilder,
)).To(Succeed())
if strings.EqualFold(m.VeleroCfg.Features, test.FeatureCSI) &&
m.VeleroCfg.UseVolumeSnapshots {
By("Install VolumeSnapshotClass for E2E.")
@@ -453,7 +447,6 @@ func (m *migrationE2E) Clean() error {
Expect(k8sutil.KubectlConfigUseContext(
m.Ctx, m.VeleroCfg.StandbyClusterContext)).To(Succeed())
m.VeleroCfg.ClientToInstallVelero = m.VeleroCfg.StandbyClient
m.VeleroCfg.ClusterToInstallVelero = m.VeleroCfg.StandbyClusterName
@@ -466,6 +459,7 @@ func (m *migrationE2E) Clean() error {
fmt.Println("Fail to delete StorageClass1: ", err)
return
}
if err := k8sutil.DeleteStorageClass(
m.Ctx,
*m.VeleroCfg.ClientToInstallVelero,
@@ -475,12 +469,6 @@ func (m *migrationE2E) Clean() error {
return
}
By("Delete PriorityClasses created by E2E")
Expect(veleroutil.DeletePriorityClasses(
m.Ctx,
m.VeleroCfg.ClientToInstallVelero.Kubebuilder,
)).To(Succeed())
if strings.EqualFold(m.VeleroCfg.Features, test.FeatureCSI) &&
m.VeleroCfg.UseVolumeSnapshots {
By("Delete VolumeSnapshotClass created by E2E")

View File

@@ -1,347 +0,0 @@
/*
Copyright the Velero contributors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package nodeagentconfig
import (
"context"
"encoding/json"
"fmt"
"strings"
"time"
. "github.com/onsi/gomega"
"github.com/pkg/errors"
corev1api "k8s.io/api/core/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/labels"
"k8s.io/apimachinery/pkg/util/wait"
"sigs.k8s.io/controller-runtime/pkg/client"
velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
velerov2alpha1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v2alpha1"
"github.com/vmware-tanzu/velero/pkg/builder"
velerotypes "github.com/vmware-tanzu/velero/pkg/types"
"github.com/vmware-tanzu/velero/pkg/util/kube"
velerokubeutil "github.com/vmware-tanzu/velero/pkg/util/kube"
"github.com/vmware-tanzu/velero/test"
. "github.com/vmware-tanzu/velero/test/e2e/test"
k8sutil "github.com/vmware-tanzu/velero/test/util/k8s"
veleroutil "github.com/vmware-tanzu/velero/test/util/velero"
)
type NodeAgentConfigTestCase struct {
TestCase
nodeAgentConfigs velerotypes.NodeAgentConfigs
nodeAgentConfigMapName string
}
var LoadAffinities func() = TestFunc(&NodeAgentConfigTestCase{
nodeAgentConfigs: velerotypes.NodeAgentConfigs{
LoadAffinity: []*kube.LoadAffinity{
{
NodeSelector: metav1.LabelSelector{
MatchLabels: map[string]string{
"beta.kubernetes.io/arch": "amd64",
},
},
StorageClass: test.StorageClassName,
},
{
NodeSelector: metav1.LabelSelector{
MatchLabels: map[string]string{
"kubernetes.io/arch": "amd64",
},
},
StorageClass: test.StorageClassName2,
},
},
BackupPVCConfig: map[string]velerotypes.BackupPVC{
test.StorageClassName: {
StorageClass: test.StorageClassName2,
},
},
RestorePVCConfig: &velerotypes.RestorePVC{
IgnoreDelayBinding: true,
},
PriorityClassName: test.PriorityClassNameForDataMover,
},
nodeAgentConfigMapName: "node-agent-config",
})
func (n *NodeAgentConfigTestCase) Init() error {
// generate random number as UUIDgen and set one default timeout duration
n.TestCase.Init()
// generate variable names based on CaseBaseName + UUIDgen
n.CaseBaseName = "node-agent-config-" + n.UUIDgen
n.BackupName = "backup-" + n.CaseBaseName
n.RestoreName = "restore-" + n.CaseBaseName
// generate namespaces by NamespacesTotal
n.NamespacesTotal = 1
n.NSIncluded = &[]string{}
for nsNum := 0; nsNum < n.NamespacesTotal; nsNum++ {
createNSName := fmt.Sprintf("%s-%00000d", n.CaseBaseName, nsNum)
*n.NSIncluded = append(*n.NSIncluded, createNSName)
}
// assign values to the inner variable for specific case
n.VeleroCfg.UseNodeAgent = true
n.VeleroCfg.UseNodeAgentWindows = true
// Need to verify the data mover pod content, so don't wait until backup completion.
n.BackupArgs = []string{
"create", "--namespace", n.VeleroCfg.VeleroNamespace, "backup", n.BackupName,
"--include-namespaces", strings.Join(*n.NSIncluded, ","),
"--snapshot-volumes=true", "--snapshot-move-data",
}
// Need to verify the data mover pod content, so don't wait until restore completion.
n.RestoreArgs = []string{
"create", "--namespace", n.VeleroCfg.VeleroNamespace, "restore", n.RestoreName,
"--from-backup", n.BackupName,
}
// Message output by ginkgo
n.TestMsg = &TestMSG{
Desc: "Validate Node Agent ConfigMap configuration",
FailedMSG: "Failed to apply and / or validate configuration in VGDP pod.",
Text: "Should be able to apply and validate configuration in VGDP pod.",
}
return nil
}
func (n *NodeAgentConfigTestCase) InstallVelero() error {
// Because this test needs to use customized Node Agent ConfigMap,
// need to uninstall and reinstall Velero.
fmt.Println("Start to uninstall Velero")
if err := veleroutil.VeleroUninstall(n.Ctx, n.VeleroCfg); err != nil {
fmt.Printf("Fail to uninstall Velero: %s\n", err.Error())
return err
}
result, err := json.Marshal(n.nodeAgentConfigs)
if err != nil {
return err
}
repoMaintenanceConfig := builder.ForConfigMap(n.VeleroCfg.VeleroNamespace, n.nodeAgentConfigMapName).
Data("node-agent-config", string(result)).Result()
n.VeleroCfg.NodeAgentConfigMap = n.nodeAgentConfigMapName
return veleroutil.PrepareVelero(
n.Ctx,
n.CaseBaseName,
n.VeleroCfg,
repoMaintenanceConfig,
)
}
func (n *NodeAgentConfigTestCase) CreateResources() error {
for _, ns := range *n.NSIncluded {
if err := k8sutil.CreateNamespace(n.Ctx, n.Client, ns); err != nil {
fmt.Printf("Fail to create ns %s: %s\n", ns, err.Error())
return err
}
pvc, err := k8sutil.CreatePVC(n.Client, ns, "volume-1", test.StorageClassName, nil)
if err != nil {
fmt.Printf("Fail to create PVC %s: %s\n", "volume-1", err.Error())
return err
}
vols := k8sutil.CreateVolumes(pvc.Name, []string{"volume-1"})
deployment := k8sutil.NewDeployment(
n.CaseBaseName,
(*n.NSIncluded)[0],
1,
map[string]string{"app": "test"},
n.VeleroCfg.ImageRegistryProxy,
n.VeleroCfg.WorkerOS,
).WithVolume(vols).Result()
deployment, err = k8sutil.CreateDeployment(n.Client.ClientGo, ns, deployment)
if err != nil {
fmt.Printf("Fail to create deployment %s: %s \n", deployment.Name, err.Error())
return errors.Wrap(err, fmt.Sprintf("failed to create deployment: %s", err.Error()))
}
if err := k8sutil.WaitForReadyDeployment(n.Client.ClientGo, deployment.Namespace, deployment.Name); err != nil {
fmt.Printf("Fail to create deployment %s: %s\n", n.CaseBaseName, err.Error())
return err
}
}
return nil
}
func (n *NodeAgentConfigTestCase) Backup() error {
if err := veleroutil.VeleroCmdExec(n.Ctx, n.VeleroCfg.VeleroCLI, n.BackupArgs); err != nil {
return err
}
backupPodList := new(corev1api.PodList)
wait.PollUntilContextTimeout(n.Ctx, 5*time.Second, 5*time.Minute, true, func(ctx context.Context) (bool, error) {
duList := new(velerov2alpha1api.DataUploadList)
if err := n.VeleroCfg.ClientToInstallVelero.Kubebuilder.List(
n.Ctx,
duList,
&client.ListOptions{Namespace: n.VeleroCfg.VeleroNamespace},
); err != nil {
fmt.Printf("Fail to list DataUpload: %s\n", err.Error())
return false, fmt.Errorf("Fail to list DataUpload: %w", err)
} else {
if len(duList.Items) <= 0 {
fmt.Println("No DataUpload found yet. Continue polling.")
return false, nil
}
}
if err := n.VeleroCfg.ClientToInstallVelero.Kubebuilder.List(
n.Ctx,
backupPodList,
&client.ListOptions{
LabelSelector: labels.SelectorFromSet(map[string]string{
velerov1api.DataUploadLabel: duList.Items[0].Name,
}),
}); err != nil {
fmt.Printf("Fail to list backupPod %s\n", err.Error())
return false, errors.Wrapf(err, "error to list backup pods")
} else {
if len(backupPodList.Items) <= 0 {
fmt.Println("No backupPod found yet. Continue polling.")
return false, nil
}
}
return true, nil
})
fmt.Println("Start to verify backupPod content.")
Expect(backupPodList.Items[0].Spec.PriorityClassName).To(Equal(n.nodeAgentConfigs.PriorityClassName))
// In backup, only the second element of LoadAffinity array should be used.
expectedAffinity := velerokubeutil.ToSystemAffinity(n.nodeAgentConfigs.LoadAffinity[1:])
Expect(backupPodList.Items[0].Spec.Affinity).To(Equal(expectedAffinity))
fmt.Println("backupPod content verification completed successfully.")
wait.PollUntilContextTimeout(n.Ctx, 5*time.Second, 5*time.Minute, true, func(ctx context.Context) (bool, error) {
backup := new(velerov1api.Backup)
if err := n.VeleroCfg.ClientToInstallVelero.Kubebuilder.Get(
n.Ctx,
client.ObjectKey{Namespace: n.VeleroCfg.VeleroNamespace, Name: n.BackupName},
backup,
); err != nil {
return false, err
}
if backup.Status.Phase != velerov1api.BackupPhaseCompleted &&
backup.Status.Phase != velerov1api.BackupPhaseFailed &&
backup.Status.Phase != velerov1api.BackupPhasePartiallyFailed {
fmt.Printf("backup status is %s. Continue polling until backup reach to a final state.\n", backup.Status.Phase)
return false, nil
}
return true, nil
})
return nil
}
func (n *NodeAgentConfigTestCase) Restore() error {
if err := veleroutil.VeleroCmdExec(n.Ctx, n.VeleroCfg.VeleroCLI, n.RestoreArgs); err != nil {
return err
}
restorePodList := new(corev1api.PodList)
wait.PollUntilContextTimeout(n.Ctx, 5*time.Second, 5*time.Minute, true, func(ctx context.Context) (bool, error) {
ddList := new(velerov2alpha1api.DataDownloadList)
if err := n.VeleroCfg.ClientToInstallVelero.Kubebuilder.List(
n.Ctx,
ddList,
&client.ListOptions{Namespace: n.VeleroCfg.VeleroNamespace},
); err != nil {
fmt.Printf("Fail to list DataDownload: %s\n", err.Error())
return false, fmt.Errorf("Fail to list DataDownload %w", err)
} else {
if len(ddList.Items) <= 0 {
fmt.Println("No DataDownload found yet. Continue polling.")
return false, nil
}
}
if err := n.VeleroCfg.ClientToInstallVelero.Kubebuilder.List(
n.Ctx,
restorePodList,
&client.ListOptions{
LabelSelector: labels.SelectorFromSet(map[string]string{
velerov1api.DataDownloadLabel: ddList.Items[0].Name,
}),
}); err != nil {
fmt.Printf("Fail to list restorePod %s\n", err.Error())
return false, errors.Wrapf(err, "error to list restore pods")
} else {
if len(restorePodList.Items) <= 0 {
fmt.Println("No restorePod found yet. Continue polling.")
return false, nil
}
}
return true, nil
})
fmt.Println("Start to verify restorePod content.")
Expect(restorePodList.Items[0].Spec.PriorityClassName).To(Equal(n.nodeAgentConfigs.PriorityClassName))
// In restore, only the first element of the LoadAffinity array should be used.
expectedAffinity := velerokubeutil.ToSystemAffinity(n.nodeAgentConfigs.LoadAffinity[:1])
Expect(restorePodList.Items[0].Spec.Affinity).To(Equal(expectedAffinity))
fmt.Println("restorePod content verification completed successfully.")
if err := wait.PollUntilContextTimeout(n.Ctx, 5*time.Second, 5*time.Minute, true, func(ctx context.Context) (bool, error) {
restore := new(velerov1api.Restore)
if err := n.VeleroCfg.ClientToInstallVelero.Kubebuilder.Get(
n.Ctx,
client.ObjectKey{Namespace: n.VeleroCfg.VeleroNamespace, Name: n.RestoreName},
restore,
); err != nil {
return false, err
}
if restore.Status.Phase != velerov1api.RestorePhaseCompleted &&
restore.Status.Phase != velerov1api.RestorePhaseFailed &&
restore.Status.Phase != velerov1api.RestorePhasePartiallyFailed {
fmt.Printf("restore status is %s. Continue polling until restore reach to a final state.\n", restore.Status.Phase)
return false, nil
}
return true, nil
}); err != nil {
return errors.Wrap(err, "failed to wait for the restore to reach a final state")
}
return nil
}
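// A minimal sketch (hypothetical helper, not part of the test, reusing this file's imports)
// of the polling pattern shared by Backup() and Restore() above: wait until a pod carrying
// the given label appears and return it, propagating the timeout error to the caller.
func waitForPodWithLabel(ctx context.Context, c client.Client, labelKey, labelValue string) (*corev1api.Pod, error) {
podList := new(corev1api.PodList)
err := wait.PollUntilContextTimeout(ctx, 5*time.Second, 5*time.Minute, true, func(ctx context.Context) (bool, error) {
if err := c.List(ctx, podList, &client.ListOptions{
LabelSelector: labels.SelectorFromSet(map[string]string{labelKey: labelValue}),
}); err != nil {
// Treat list errors as transient and keep polling until the timeout expires.
fmt.Printf("Fail to list pods: %s\n", err.Error())
return false, nil
}
return len(podList.Items) > 0, nil
})
if err != nil {
return nil, errors.Wrapf(err, "no pod with label %s=%s appeared in time", labelKey, labelValue)
}
return &podList.Items[0], nil
}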

View File

@@ -1,247 +0,0 @@
/*
Copyright 2021 the Velero contributors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package repomaintenance
import (
"encoding/json"
"fmt"
"strings"
"time"
. "github.com/onsi/gomega"
"github.com/pkg/errors"
batchv1api "k8s.io/api/batch/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/labels"
"sigs.k8s.io/controller-runtime/pkg/client"
velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
"github.com/vmware-tanzu/velero/pkg/builder"
velerotypes "github.com/vmware-tanzu/velero/pkg/types"
"github.com/vmware-tanzu/velero/pkg/util/kube"
velerokubeutil "github.com/vmware-tanzu/velero/pkg/util/kube"
"github.com/vmware-tanzu/velero/test"
. "github.com/vmware-tanzu/velero/test/e2e/test"
k8sutil "github.com/vmware-tanzu/velero/test/util/k8s"
veleroutil "github.com/vmware-tanzu/velero/test/util/velero"
)
type RepoMaintenanceTestCase struct {
TestCase
repoMaintenanceConfigMapName string
repoMaintenanceConfigKey string
jobConfigs velerotypes.JobConfigs
}
var keepJobNum = 1
var GlobalRepoMaintenanceTest func() = TestFunc(&RepoMaintenanceTestCase{
repoMaintenanceConfigKey: "global",
repoMaintenanceConfigMapName: "global",
jobConfigs: velerotypes.JobConfigs{
KeepLatestMaintenanceJobs: &keepJobNum,
PodResources: &velerokubeutil.PodResources{
CPURequest: "100m",
MemoryRequest: "100Mi",
CPULimit: "200m",
MemoryLimit: "200Mi",
},
PriorityClassName: test.PriorityClassNameForRepoMaintenance,
},
})
var SpecificRepoMaintenanceTest func() = TestFunc(&RepoMaintenanceTestCase{
repoMaintenanceConfigKey: "",
repoMaintenanceConfigMapName: "specific",
jobConfigs: velerotypes.JobConfigs{
KeepLatestMaintenanceJobs: &keepJobNum,
PodResources: &velerokubeutil.PodResources{
CPURequest: "100m",
MemoryRequest: "100Mi",
CPULimit: "200m",
MemoryLimit: "200Mi",
},
PriorityClassName: test.PriorityClassNameForRepoMaintenance,
},
})
func (r *RepoMaintenanceTestCase) Init() error {
// generate a random number as UUIDgen and set the default timeout duration
r.TestCase.Init()
// generate variable names based on CaseBaseName + UUIDgen
r.CaseBaseName = "repo-maintenance-" + r.UUIDgen
r.BackupName = "backup-" + r.CaseBaseName
r.RestoreName = "restore-" + r.CaseBaseName
// generate namespaces by NamespacesTotal
r.NamespacesTotal = 1
r.NSIncluded = &[]string{}
for nsNum := 0; nsNum < r.NamespacesTotal; nsNum++ {
createNSName := fmt.Sprintf("%s-%00000d", r.CaseBaseName, nsNum)
*r.NSIncluded = append(*r.NSIncluded, createNSName)
}
// If repoMaintenanceConfigKey is not set, this case tests a repository-specific configuration.
// The key must then be the BackupRepository name, in the format "volumeNamespace-bslName-uploaderName"
// (see the helper sketch after this function).
if r.repoMaintenanceConfigKey == "" {
r.repoMaintenanceConfigKey = (*r.NSIncluded)[0] + "-" + "default" + "-" + test.UploaderTypeKopia
}
// assign values to the case-specific fields
r.VeleroCfg.UseNodeAgent = true
r.VeleroCfg.UseNodeAgentWindows = true
r.BackupArgs = []string{
"create", "--namespace", r.VeleroCfg.VeleroNamespace, "backup", r.BackupName,
"--include-namespaces", strings.Join(*r.NSIncluded, ","),
"--snapshot-volumes=true", "--snapshot-move-data", "--wait",
}
// Message output by ginkgo
r.TestMsg = &TestMSG{
Desc: "Validate Repository Maintenance Job configuration",
FailedMSG: "Failed to apply and / or validate configuration in repository maintenance jobs.",
Text: "Should be able to apply and validate configuration in repository maintenance jobs.",
}
return nil
}
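// A minimal illustration (hypothetical helper, not part of the test) of the repository-specific
// config key format described in Init(): "volumeNamespace-bslName-uploaderName".
// For example, namespace "my-ns" with BSL "default" and the kopia uploader yields "my-ns-default-kopia".
func backupRepositoryConfigKey(volumeNamespace, bslName, uploaderType string) string {
return fmt.Sprintf("%s-%s-%s", volumeNamespace, bslName, uploaderType)
}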
func (r *RepoMaintenanceTestCase) InstallVelero() error {
// Because this test needs a customized repository maintenance ConfigMap,
// Velero has to be uninstalled and reinstalled with it.
fmt.Println("Start to uninstall Velero")
if err := veleroutil.VeleroUninstall(r.Ctx, r.VeleroCfg); err != nil {
fmt.Printf("Fail to uninstall Velero: %s\n", err.Error())
return err
}
result, err := json.Marshal(r.jobConfigs)
if err != nil {
return err
}
repoMaintenanceConfig := builder.ForConfigMap(r.VeleroCfg.VeleroNamespace, r.repoMaintenanceConfigMapName).
Data(r.repoMaintenanceConfigKey, string(result)).Result()
r.VeleroCfg.RepoMaintenanceJobConfigMap = r.repoMaintenanceConfigMapName
return veleroutil.PrepareVelero(
r.Ctx,
r.CaseBaseName,
r.VeleroCfg,
repoMaintenanceConfig,
)
}
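// Hypothetical debugging aid (not part of the test): print the JSON document that
// InstallVelero() stores under the config key, i.e. what the repository maintenance
// controller reads from the ConfigMap.
func printJobConfigs(cfg velerotypes.JobConfigs) {
raw, err := json.MarshalIndent(cfg, "", "  ")
if err != nil {
fmt.Println(err)
return
}
fmt.Println(string(raw))
}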
func (r *RepoMaintenanceTestCase) CreateResources() error {
for _, ns := range *r.NSIncluded {
if err := k8sutil.CreateNamespace(r.Ctx, r.Client, ns); err != nil {
fmt.Printf("Fail to create ns %s: %s\n", ns, err.Error())
return err
}
pvc, err := k8sutil.CreatePVC(r.Client, ns, "volume-1", test.StorageClassName, nil)
if err != nil {
fmt.Printf("Fail to create PVC %s: %s\n", "volume-1", err.Error())
return err
}
vols := k8sutil.CreateVolumes(pvc.Name, []string{"volume-1"})
deployment := k8sutil.NewDeployment(
r.CaseBaseName,
(*r.NSIncluded)[0],
1,
map[string]string{"app": "test"},
r.VeleroCfg.ImageRegistryProxy,
r.VeleroCfg.WorkerOS,
).WithVolume(vols).Result()
deployment, err = k8sutil.CreateDeployment(r.Client.ClientGo, ns, deployment)
if err != nil {
fmt.Printf("Fail to create deployment %s: %s \n", deployment.Name, err.Error())
return errors.Wrapf(err, "failed to create deployment %s", deployment.Name)
}
if err := k8sutil.WaitForReadyDeployment(r.Client.ClientGo, deployment.Namespace, deployment.Name); err != nil {
fmt.Printf("Fail to create deployment %s: %s\n", r.CaseBaseName, err.Error())
return err
}
}
return nil
}
func (r *RepoMaintenanceTestCase) Verify() error {
// Reduce the MaintenanceFrequency to 1 minute.
backupRepositoryList := new(velerov1api.BackupRepositoryList)
if err := r.VeleroCfg.ClientToInstallVelero.Kubebuilder.List(
r.Ctx,
backupRepositoryList,
&client.ListOptions{
Namespace: r.VeleroCfg.Namespace,
LabelSelector: labels.SelectorFromSet(map[string]string{velerov1api.VolumeNamespaceLabel: (*r.NSIncluded)[0]}),
},
); err != nil {
return err
}
if len(backupRepositoryList.Items) <= 0 {
return fmt.Errorf("fail list BackupRepository. no item is returned")
}
backupRepository := backupRepositoryList.Items[0]
updated := backupRepository.DeepCopy()
updated.Spec.MaintenanceFrequency = metav1.Duration{Duration: time.Minute}
if err := r.VeleroCfg.ClientToInstallVelero.Kubebuilder.Patch(r.Ctx, updated, client.MergeFrom(&backupRepository)); err != nil {
fmt.Printf("failed to patch BackupRepository %q: %s", backupRepository.GetName(), err.Error())
return err
}
// The minimal time unit of Repository Maintenance is 5 minutes.
// Wait for more than one cycle to make sure the result is valid.
time.Sleep(6 * time.Minute)
jobList := new(batchv1api.JobList)
if err := r.VeleroCfg.ClientToInstallVelero.Kubebuilder.List(r.Ctx, jobList, &client.ListOptions{
Namespace: r.VeleroCfg.Namespace,
LabelSelector: labels.SelectorFromSet(map[string]string{"velero.io/repo-name": backupRepository.Name}),
}); err != nil {
return err
}
resources, err := velerokubeutil.ParseResourceRequirements(
r.jobConfigs.PodResources.CPURequest,
r.jobConfigs.PodResources.MemoryRequest,
r.jobConfigs.PodResources.CPULimit,
r.jobConfigs.PodResources.MemoryLimit,
)
if err != nil {
return errors.Wrap(err, "failed to parse resource requirements for maintenance job")
}
// Check the job count first so an empty list fails the assertion instead of panicking on index 0.
Expect(jobList.Items).To(HaveLen(*r.jobConfigs.KeepLatestMaintenanceJobs))
Expect(jobList.Items[0].Spec.Template.Spec.Containers[0].Resources).To(Equal(resources))
Expect(jobList.Items[0].Spec.Template.Spec.PriorityClassName).To(Equal(r.jobConfigs.PriorityClassName))
return nil
}
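// A minimal sketch (assumption: plain Kubernetes API types, not a Velero helper) of the
// ResourceRequirements that Verify() expects velerokubeutil.ParseResourceRequirements to
// produce for the jobConfigs above; useful when hand-checking the maintenance job spec.
// It assumes two additional imports: corev1 "k8s.io/api/core/v1" and "k8s.io/apimachinery/pkg/api/resource".
func expectedMaintenanceJobResources() corev1.ResourceRequirements {
return corev1.ResourceRequirements{
Requests: corev1.ResourceList{
corev1.ResourceCPU:    resource.MustParse("100m"),
corev1.ResourceMemory: resource.MustParse("100Mi"),
},
Limits: corev1.ResourceList{
corev1.ResourceCPU:    resource.MustParse("200m"),
corev1.ResourceMemory: resource.MustParse("200Mi"),
},
}
}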

View File

@@ -41,7 +41,6 @@ depends on your test patterns.
*/
type VeleroBackupRestoreTest interface {
Init() error
InstallVelero() error
CreateResources() error
Backup() error
Destroy() error
@@ -110,10 +109,6 @@ func (t *TestCase) GenerateUUID() string {
return fmt.Sprintf("%08d", rand.IntN(100000000))
}
func (t *TestCase) InstallVelero() error {
return PrepareVelero(context.Background(), t.GetTestCase().CaseBaseName, t.GetTestCase().VeleroCfg)
}
func (t *TestCase) CreateResources() error {
return nil
}
@@ -226,8 +221,7 @@ func RunTestCase(test VeleroBackupRestoreTest) error {
fmt.Printf("Running test case %s %s\n", test.GetTestMsg().Desc, time.Now().Format("2006-01-02 15:04:05"))
if InstallVelero {
fmt.Printf("Install Velero for test case %s: %s", test.GetTestCase().CaseBaseName, time.Now().Format("2006-01-02 15:04:05"))
Expect(test.InstallVelero()).To(Succeed())
Expect(PrepareVelero(context.Background(), test.GetTestCase().CaseBaseName, test.GetTestCase().VeleroCfg)).To(Succeed())
}
defer test.Clean()

View File

@@ -57,11 +57,6 @@ const (
BackupRepositoryConfigName = "backup-repository-config"
)
const (
PriorityClassNameForDataMover = "data-mover"
PriorityClassNameForRepoMaintenance = "repo-maintenance"
)
var PublicCloudProviders = []string{AWS, Azure, GCP, Vsphere}
var LocalCloudProviders = []string{Kind, VanillaZFS}
var CloudProviders = append(PublicCloudProviders, LocalCloudProviders...)

View File

@@ -36,7 +36,6 @@ import (
"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
"k8s.io/apimachinery/pkg/util/wait"
clientset "k8s.io/client-go/kubernetes"
"sigs.k8s.io/controller-runtime/pkg/client"
velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
"github.com/vmware-tanzu/velero/pkg/cmd/cli/install"
@@ -57,17 +56,7 @@ type installOptions struct {
WorkerOS string
}
/*
VeleroInstall is used to install Velero for E2E tests
params:
ctx: The context
veleroCfg: Velero E2E case configuration
isStandbyCluster: Whether Velero is installed on the standby cluster
objects: Objects to be created in the Velero installation namespace, e.g. ConfigMaps.
*/
func VeleroInstall(ctx context.Context, veleroCfg *test.VeleroConfig, isStandbyCluster bool, objects ...client.Object) error {
func VeleroInstall(ctx context.Context, veleroCfg *test.VeleroConfig, isStandbyCluster bool) error {
fmt.Printf("Velero install %s\n", time.Now().Format("2006-01-02 15:04:05"))
// veleroCfg struct including a set of BSL params and a set of additional BSL params,
@@ -163,15 +152,6 @@ func VeleroInstall(ctx context.Context, veleroCfg *test.VeleroConfig, isStandbyC
veleroCfg.VeleroNamespace,
)
}
veleroCfg.BackupRepoConfigMap = test.BackupRepositoryConfigName
// Install the passed-in objects in Velero installed namespace
for _, obj := range objects {
if err := veleroCfg.ClientToInstallVelero.Kubebuilder.Create(ctx, obj); err != nil {
fmt.Printf("fail to create object %s in namespace %s: %s\n", obj.GetName(), obj.GetNamespace(), err.Error())
return fmt.Errorf("fail to create object %s in namespace %s: %w", obj.GetName(), obj.GetNamespace(), err)
}
}
// For AWS IRSA credential test, AWS IAM service account is required, so if ServiceAccountName and EKSPolicyARN
// are both provided, we assume IRSA test is running, otherwise skip this IAM service account creation part.
@@ -332,9 +312,9 @@ func installVeleroServer(
// TODO: need to consider align options.UseNodeAgentWindows usage
// with options.UseNodeAgent
// Only versions after v1.16.0 support the Windows node agent.
// Only versions after v1.16.0 or release-1.16-dev support the Windows node agent.
if options.WorkerOS == common.WorkerOSWindows &&
(semver.Compare(version, "v1.16") >= 0 || version == "main") {
(semver.Compare(version, "v1.16") >= 0 || version == "main" || strings.Compare(version, "release-1.16-dev") >= 0) {
fmt.Println("Install node-agent-windows. The Velero version is ", version)
args = append(args, "--use-node-agent-windows")
}
@@ -655,7 +635,7 @@ func patchResources(resources *unstructured.UnstructuredList, namespace string,
APIVersion: corev1api.SchemeGroupVersion.String(),
},
ObjectMeta: metav1.ObjectMeta{
Name: "fs-restore-action-config",
Name: "restic-restore-action-config",
Namespace: namespace,
Labels: map[string]string{
"velero.io/plugin-config": "",
@@ -672,7 +652,7 @@ func patchResources(resources *unstructured.UnstructuredList, namespace string,
return errors.Wrapf(err, "failed to convert restore action config to unstructure")
}
resources.Items = append(resources.Items, un)
fmt.Printf("the restic restore helper image is set by the configmap %q \n", "fs-restore-action-config")
fmt.Printf("the restic restore helper image is set by the configmap %q \n", "restic-restore-action-config")
}
return nil
@@ -810,7 +790,7 @@ func CheckBSL(ctx context.Context, ns string, bslName string) error {
return err
}
func PrepareVelero(ctx context.Context, caseName string, veleroCfg test.VeleroConfig, objects ...client.Object) error {
func PrepareVelero(ctx context.Context, caseName string, veleroCfg test.VeleroConfig) error {
ready, err := IsVeleroReady(context.Background(), &veleroCfg)
if err != nil {
fmt.Printf("error in checking velero status with %v", err)
@@ -824,7 +804,7 @@ func PrepareVelero(ctx context.Context, caseName string, veleroCfg test.VeleroCo
return nil
}
fmt.Printf("need to install velero for case %s \n", caseName)
return VeleroInstall(context.Background(), &veleroCfg, false, objects...)
return VeleroInstall(context.Background(), &veleroCfg, false)
}
func VeleroUninstall(ctx context.Context, veleroCfg test.VeleroConfig) error {

View File

@@ -38,8 +38,6 @@ import (
"github.com/pkg/errors"
"golang.org/x/mod/semver"
schedulingv1api "k8s.io/api/scheduling/v1"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/labels"
ver "k8s.io/apimachinery/pkg/util/version"
"k8s.io/apimachinery/pkg/util/wait"
@@ -47,11 +45,9 @@ import (
"github.com/vmware-tanzu/velero/internal/volume"
velerov1api "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
"github.com/vmware-tanzu/velero/pkg/builder"
cliinstall "github.com/vmware-tanzu/velero/pkg/cmd/cli/install"
"github.com/vmware-tanzu/velero/pkg/cmd/util/flag"
veleroexec "github.com/vmware-tanzu/velero/pkg/util/exec"
"github.com/vmware-tanzu/velero/test"
. "github.com/vmware-tanzu/velero/test"
common "github.com/vmware-tanzu/velero/test/util/common"
. "github.com/vmware-tanzu/velero/test/util/k8s"
@@ -278,9 +274,6 @@ func getProviderVeleroInstallOptions(veleroCfg *VeleroConfig,
io.ItemBlockWorkerCount = veleroCfg.ItemBlockWorkerCount
io.ServerPriorityClassName = veleroCfg.ServerPriorityClassName
io.NodeAgentPriorityClassName = veleroCfg.NodeAgentPriorityClassName
io.RepoMaintenanceJobConfigMap = veleroCfg.RepoMaintenanceJobConfigMap
io.BackupRepoConfigMap = veleroCfg.BackupRepoConfigMap
io.NodeAgentConfigMap = veleroCfg.NodeAgentConfigMap
return io, nil
}
@@ -1819,43 +1812,3 @@ func KubectlGetAllDeleteBackupRequest(ctx context.Context, backupName, veleroNam
return common.GetListByCmdPipes(ctx, cmds)
}
func CreatePriorityClasses(ctx context.Context, client kbclient.Client) error {
dataMoverPriorityClass := builder.ForPriorityClass(test.PriorityClassNameForDataMover).
Value(90000).PreemptionPolicy("Never").Result()
if err := client.Create(ctx, dataMoverPriorityClass); err != nil {
fmt.Printf("Fail to create PriorityClass %s: %s\n", test.PriorityClassNameForDataMover, err.Error())
return fmt.Errorf("fail to create PriorityClass %s: %w", test.PriorityClassNameForDataMover, err)
}
repoMaintenancePriorityClass := builder.ForPriorityClass(test.PriorityClassNameForRepoMaintenance).
Value(80000).PreemptionPolicy("Never").Result()
if err := client.Create(ctx, repoMaintenancePriorityClass); err != nil {
fmt.Printf("Fail to create PriorityClass %s: %s\n", test.PriorityClassNameForRepoMaintenance, err.Error())
return fmt.Errorf("fail to create PriorityClass %s: %w", test.PriorityClassNameForRepoMaintenance, err)
}
return nil
}
func DeletePriorityClasses(ctx context.Context, client kbclient.Client) error {
priorityClassDataMover := &schedulingv1api.PriorityClass{
ObjectMeta: metav1.ObjectMeta{
Name: test.PriorityClassNameForDataMover,
},
}
if err := client.Delete(ctx, priorityClassDataMover); err != nil {
return err
}
priorityClassRepoMaintenance := &schedulingv1api.PriorityClass{
ObjectMeta: metav1.ObjectMeta{
Name: test.PriorityClassNameForRepoMaintenance,
},
}
if err := client.Delete(ctx, priorityClassRepoMaintenance); err != nil {
return err
}
return nil
}