Compare commits

..

8 Commits

Author SHA1 Message Date
Xun Jiang/Bruce Jiang 80067a8863 Fix unknown containerd config version error in run-e2e-test action (#9883)
Run the E2E test on kind / get-go-version (push) Successful in 56s
Run the E2E test on kind / setup-test-matrix (push) Successful in 3s
Main CI / get-go-version (push) Successful in 13s
Run the E2E test on kind / build (push) Failing after 5m49s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / Build (push) Failing after 25s
Bump kind version to v0.32.0 to support both v2, v3, and v4 version of containerd config.

Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2026-06-04 10:06:09 -04:00
lyndon-li 81757c272b Merge pull request #9857 from blackpiglet/jxun/1.18_n-3_upgrade_test
Run the E2E test on kind / get-go-version (push) Successful in 51s
Run the E2E test on kind / setup-test-matrix (push) Successful in 3s
Main CI / get-go-version (push) Successful in 13s
Run the E2E test on kind / build (push) Failing after 5m46s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / Build (push) Failing after 35s
Modify the e2e upgrade test to support n-1 upgrade.
2026-05-28 16:57:01 +08:00
Xun Jiang ff20d670d1 Modify the e2e upgrade test to support n-1 upgrade.
Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
2026-05-28 09:29:34 +08:00
lyndon-li a43a13b6ec Merge pull request #9842 from Lyndon-Li/release-1.18
Run the E2E test on kind / get-go-version (push) Successful in 1m5s
Run the E2E test on kind / setup-test-matrix (push) Successful in 3s
Main CI / get-go-version (push) Successful in 14s
Run the E2E test on kind / build (push) Failing after 8m12s
Run the E2E test on kind / run-e2e-test (push) Has been skipped
Main CI / Build (push) Failing after 36s
[1.18] Fix DataUploadDeleteAction creating CMs for foreign DataUploads
2026-05-22 15:48:28 +08:00
Tiger Kaovilai 8f565d495e Add changelog for unreleased version 9791
Signed-off-by: Tiger Kaovilai <tkaovila@redhat.com>
2026-05-22 15:25:00 +08:00
Christian Schlichtherle 2309e98e86 Also skip snapshot-info CM when DataUpload has no owner label
Per review feedback on #9791, the previous revision still let a
DataUpload with an empty velero.io/backup-name label fall through to
genConfigmap, creating a ConfigMap that deleteMovedSnapshots can never
match back to a snapshot. The CM is useless and only adds etcd churn.

Treat the missing-label case the same way as the foreign-owner case:
warn and skip the ConfigMap creation. Use a distinct warn message so
operators can tell the two misconfiguration classes apart in logs
(missing-label vs. owner mismatch from a captured velero namespace).

Test for the missing-label case is updated to assert no ConfigMap is
created and a warn is emitted. The warn assertion is generalized to
match the per-case message substring instead of a fixed string.

Signed-off-by: Christian Schlichtherle <christian@schlichtherle.de>
2026-05-22 15:22:00 +08:00
Christian Schlichtherle 14733d8892 Warn instead of silently skipping foreign DataUploads
Velero does not support self-protection: the velero namespace must
never be captured in a backup tarball. When it is, the tarball can
contain DataUpload CRs belonging to other backups, and the previous
revision of this change silently swallowed that case in the
DataUploadDeleteAction.

Per maintainer feedback, the action should make the misconfiguration
detectable rather than silent. Emit a warn-level log naming the
DataUpload, its owning backup-name label, and the executing backup,
and call out that the velero namespace should be excluded from
schedules. Continue to skip the snapshot-info ConfigMap creation so
that a mislabeled CM does not mask the real owning backup's snapshot
on deletion.

The test for the foreign-backup case now also asserts the warn is
emitted via a logrus test hook.

Signed-off-by: Christian Schlichtherle <cs@bsure-analytics.de>
2026-05-22 15:21:48 +08:00
Christian Schlichtherle 246dbc3c33 Fix DataUploadDeleteAction creating CMs for foreign DataUploads
When a backup tarball incidentally contains DataUpload CRs that belong to
a different backup (common when a schedule includes the velero namespace
where DataUploads live), DataUploadDeleteAction.Execute used to create a
"<du-name>-info" ConfigMap labeled with the *executing* backup's name
instead of the DataUpload's true owning backup. The ConfigMap is
created with Create-only semantics, so the wrong label is never
corrected.

deleteMovedSnapshots in the backup-deletion controller looks up these
ConfigMaps by velero.io/backup-name to discover which Kopia snapshots
to delete. With the wrong label, the real owning backup's expiry pass
finds no ConfigMaps for its DataUploads and silently leaves their Kopia
snapshots in object storage, leaking data over time.

Fix: in DataUploadDeleteAction.Execute, compare the DataUpload's
velero.io/backup-name label against input.Backup.Name (using
label.GetValidName to handle DNS-1035 truncation for long backup names).
If the label is present and differs, skip the DataUpload entirely; this
prevents the over-eager creation of misnamed ConfigMaps without changing
behavior for DataUploads that legitimately belong to the executing
backup, or for legacy DataUploads with no backup-name label.

Refs: #9472

Signed-off-by: Christian Schlichtherle <cs@bsure-analytics.de>
2026-05-22 15:21:18 +08:00
5 changed files with 213 additions and 2 deletions
+1 -1
View File
@@ -136,7 +136,7 @@ jobs:
- uses: engineerd/setup-kind@v0.6.2
with:
skipClusterLogsExport: true
version: "v0.27.0"
version: "v0.32.0"
image: "kindest/node:v${{ matrix.k8s }}"
- name: Fetch built CLI
id: cli-cache
@@ -0,0 +1 @@
Fix DataUploadDeleteAction creating snapshot-info ConfigMaps labeled with the wrong backup name when a DataUpload CR from another backup is incidentally captured in the backup tarball, which caused Kopia snapshots to be leaked in object storage on expiry of the real owning backup.
+39
View File
@@ -14,6 +14,7 @@ import (
velerov1 "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
velerov2alpha1 "github.com/vmware-tanzu/velero/pkg/apis/velero/v2alpha1"
"github.com/vmware-tanzu/velero/pkg/label"
"github.com/vmware-tanzu/velero/pkg/plugin/velero"
repotypes "github.com/vmware-tanzu/velero/pkg/repository/types"
)
@@ -35,6 +36,44 @@ func (d *DataUploadDeleteAction) Execute(input *velero.DeleteItemActionExecuteIn
if err := runtime.DefaultUnstructuredConverter.FromUnstructured(input.Item.UnstructuredContent(), &du); err != nil {
return errors.WithStack(errors.Wrapf(err, "failed to convert input.Item from unstructured"))
}
// Only create a snapshot-info ConfigMap when the DataUpload's owning
// backup (its velero.io/backup-name label) matches the backup currently
// being deleted. Two other cases reach this code path and must be
// skipped, because the resulting CM would be unmatchable and only adds
// etcd churn:
//
// 1. The label is missing. We have no verifiable owner, so a CM created
// with the executing backup's label is a guess that deleteMovedSnapshots
// cannot rely on.
// 2. The label names a different backup. Velero does not support
// self-protection, so this almost always means the velero namespace
// was captured in a backup tarball and the DataUpload CR belongs to
// an unrelated backup. Creating a CM labeled with the executing
// backup mislabels the snapshot and causes the real owning backup's
// deleteMovedSnapshots query to miss it, leaking the Kopia snapshot
// in the object store.
//
// Both cases warn so misconfigured installs surface in logs.
owner := du.Labels[velerov1.BackupNameLabel]
switch {
case owner == "":
d.logger.Warnf(
"DataUpload %q has no %q label, so its owning backup cannot be verified; "+
"skipping snapshot-info ConfigMap creation because a CM without a verifiable owner "+
"cannot be matched back to its snapshot at backup deletion time.",
du.Name, velerov1.BackupNameLabel,
)
return nil
case owner != label.GetValidName(input.Backup.Name):
d.logger.Warnf(
"DataUpload %q belongs to backup %q but is being deleted under backup %q; "+
"this almost always means the velero namespace was included in a backup tarball. "+
"Velero does not support self-protection — exclude the velero namespace from your schedules. "+
"Skipping snapshot-info ConfigMap creation to avoid mislabeling.",
du.Name, owner, input.Backup.Name,
)
return nil
}
cm := genConfigmap(input.Backup, *du)
if cm == nil {
// will not fail the backup deletion
@@ -0,0 +1,171 @@
/*
Copyright the Velero contributors.
Licensed under the Apache License, Version 2.0 (the "License");
you may not use this file except in compliance with the License.
You may obtain a copy of the License at
http://www.apache.org/licenses/LICENSE-2.0
Unless required by applicable law or agreed to in writing, software
distributed under the License is distributed on an "AS IS" BASIS,
WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
See the License for the specific language governing permissions and
limitations under the License.
*/
package datamover
import (
"fmt"
"strings"
"testing"
"github.com/sirupsen/logrus"
logrustest "github.com/sirupsen/logrus/hooks/test"
"github.com/stretchr/testify/assert"
"github.com/stretchr/testify/require"
corev1api "k8s.io/api/core/v1"
apierrors "k8s.io/apimachinery/pkg/api/errors"
metav1 "k8s.io/apimachinery/pkg/apis/meta/v1"
"k8s.io/apimachinery/pkg/apis/meta/v1/unstructured"
"k8s.io/apimachinery/pkg/runtime"
crclient "sigs.k8s.io/controller-runtime/pkg/client"
velerov1 "github.com/vmware-tanzu/velero/pkg/apis/velero/v1"
velerov2alpha1 "github.com/vmware-tanzu/velero/pkg/apis/velero/v2alpha1"
"github.com/vmware-tanzu/velero/pkg/builder"
"github.com/vmware-tanzu/velero/pkg/plugin/velero"
velerotest "github.com/vmware-tanzu/velero/pkg/test"
)
func toUnstructured(t *testing.T, du *velerov2alpha1.DataUpload) runtime.Unstructured {
t.Helper()
m, err := runtime.DefaultUnstructuredConverter.ToUnstructured(du)
require.NoError(t, err)
return &unstructured.Unstructured{Object: m}
}
func newCompletedDataUpload(name, ownerBackup string) *velerov2alpha1.DataUpload {
du := &velerov2alpha1.DataUpload{
TypeMeta: metav1.TypeMeta{
APIVersion: velerov2alpha1.SchemeGroupVersion.String(),
Kind: "DataUpload",
},
ObjectMeta: metav1.ObjectMeta{
Namespace: "velero",
Name: name,
},
Spec: velerov2alpha1.DataUploadSpec{
SnapshotType: velerov2alpha1.SnapshotTypeCSI,
SourcePVC: "my-pvc",
SourceNamespace: "app",
BackupStorageLocation: "default",
DataMover: "velero",
},
Status: velerov2alpha1.DataUploadStatus{
Phase: velerov2alpha1.DataUploadPhaseCompleted,
SnapshotID: "kopia-snapshot-id",
},
}
if ownerBackup != "" {
du.Labels = map[string]string{velerov1.BackupNameLabel: ownerBackup}
}
return du
}
func TestDataUploadDeleteActionAppliesTo(t *testing.T) {
a := NewDataUploadDeleteAction(logrus.StandardLogger(), nil)
selector, err := a.AppliesTo()
require.NoError(t, err)
require.Equal(t, velero.ResourceSelector{IncludedResources: []string{"datauploads.velero.io"}}, selector)
}
func TestDataUploadDeleteActionExecute(t *testing.T) {
tests := []struct {
name string
duName string
duOwnerBackup string // value placed in velero.io/backup-name label on the DataUpload
executingBackup string // name of the Backup being deleted (input.Backup.Name)
wantConfigMap bool
wantWarnContains string // substring expected in a warn-level log entry; empty means no warn expected
}{
{
name: "DataUpload owned by the executing backup creates a snapshot-info ConfigMap",
duName: "daily-backup-abcde",
duOwnerBackup: "daily-backup",
executingBackup: "daily-backup",
wantConfigMap: true,
wantWarnContains: "",
},
{
name: "DataUpload owned by a different backup is skipped and a warning is logged",
duName: "daily-backup-abcde",
duOwnerBackup: "daily-backup",
executingBackup: "hourly-backup",
wantConfigMap: false,
wantWarnContains: "velero namespace",
},
{
name: "DataUpload with no backup-name label is skipped and a warning is logged",
duName: "unlabeled-du",
duOwnerBackup: "",
executingBackup: "some-backup",
wantConfigMap: false,
wantWarnContains: "cannot be verified",
},
}
for _, tc := range tests {
t.Run(tc.name, func(t *testing.T) {
crClient := velerotest.NewFakeControllerRuntimeClient(t)
logger, hook := logrustest.NewNullLogger()
logger.SetLevel(logrus.DebugLevel)
action := NewDataUploadDeleteAction(logger, crClient)
du := newCompletedDataUpload(tc.duName, tc.duOwnerBackup)
backup := builder.ForBackup("velero", tc.executingBackup).StorageLocation("default").Result()
err := action.Execute(&velero.DeleteItemActionExecuteInput{
Item: toUnstructured(t, du),
Backup: backup,
})
require.NoError(t, err)
cm := &corev1api.ConfigMap{}
getErr := crClient.Get(t.Context(), crclient.ObjectKey{
Namespace: backup.Namespace,
Name: fmt.Sprintf("%s-info", du.Name),
}, cm)
if tc.wantConfigMap {
require.NoError(t, getErr, "expected snapshot-info ConfigMap to be created")
assert.Equal(t, tc.executingBackup, cm.Labels[velerov1.BackupNameLabel])
assert.Equal(t, "true", cm.Labels[velerov1.DataUploadSnapshotInfoLabel])
} else {
require.Error(t, getErr)
assert.True(t, apierrors.IsNotFound(getErr),
"expected no ConfigMap to be created for foreign DataUpload, but got: %v", getErr)
}
// The action must surface DataUploads it cannot generate a useful
// snapshot-info ConfigMap for as warnings, so operators who
// accidentally included the velero namespace in a backup (or
// otherwise produced DataUploads without a verifiable owner) can
// detect the misconfiguration from logs instead of having the
// case silently swallowed.
var sawWarn bool
for _, entry := range hook.AllEntries() {
if entry.Level == logrus.WarnLevel &&
strings.Contains(entry.Message, tc.duName) &&
(tc.wantWarnContains == "" || strings.Contains(entry.Message, tc.wantWarnContains)) {
sawWarn = true
break
}
}
assert.Equal(t, tc.wantWarnContains != "", sawWarn,
"unexpected warn log presence (wantContains=%q, got=%v); entries=%v",
tc.wantWarnContains, sawWarn, hook.AllEntries())
})
}
}
+1 -1
View File
@@ -76,7 +76,7 @@ HAS_VSPHERE_PLUGIN ?= false
RESTORE_HELPER_IMAGE ?=
#Released version only
UPGRADE_FROM_VELERO_VERSION ?= v1.16.2,v1.17.2
UPGRADE_FROM_VELERO_VERSION ?= v1.15.2,v1.16.2,v1.17.2
# UPGRADE_FROM_VELERO_CLI can has the same format(a list divided by comma) with UPGRADE_FROM_VELERO_VERSION
# Upgrade tests will be executed sequently according to the list by UPGRADE_FROM_VELERO_VERSION