Compare commits

3 Commits

Author SHA1 Message Date
copilot-swe-agent[bot] 360e7ef707 Refine Kopia-only performance and maintenance docs 2026-06-04 16:30:38 +00:00
copilot-swe-agent[bot] 9331bffa7d Remove restic references from main docs 2026-06-04 16:28:39 +00:00
copilot-swe-agent[bot] c602efe054 Initial plan 2026-06-04 16:23:42 +00:00
5 changed files with 17 additions and 69 deletions
@@ -18,7 +18,7 @@ Conclusively, you have two ways to add/change/delete configurations of a backup
- If the BackupRepository CR for the backup repository is already there, you should modify the `repositoryConfig` field. The new changes will be applied to the backup repository in due time, without requiring a Velero server restart.
- Otherwise, you can create the backup repository configMap as a template for the BackupRepository CRs that are going to be created.
The backup repository configMap is repository type (i.e., kopia, restic) specific, so for one repository type, you only need to create one set of configurations, they will be applied to all BackupRepository CRs of the same type. Whereas, the changes of `repositoryConfig` field apply to the specific BackupRepository CR only, you may need to change every BackupRepository CR of the same type.
The backup repository configMap is repository type specific (for example, `kopia`), so for each repository type you only need to create one set of configurations, which is applied to all BackupRepository CRs of that type. In contrast, changes to the `repositoryConfig` field apply to the specific BackupRepository CR only, so you may need to change every BackupRepository CR of the same type.
Below is an example of the BackupRepository configMap with the configurations:
```yaml
+1 -37
@@ -5,7 +5,7 @@ layout: docs
Velero supports backing up and restoring Kubernetes volumes attached to pods from the file system of the volumes, called
File System Backup (FSB shortly) or Pod Volume Backup. The data movement is fulfilled by using modules from free open-source
backup tools [restic][1] and [kopia][2]. This support is considered beta quality. Please see the list of [limitations](#limitations)
backup tool [kopia][2]. This support is considered beta quality. Please see the list of [limitations](#limitations)
to understand if it fits your use case.
Velero allows you to take snapshots of persistent volumes as part of your backups if you're using one of
@@ -38,7 +38,6 @@ It's important to understand that File System Backup (FSB) and volume snapshots
This behavior is automatic and ensures optimal backup performance and storage usage.
**NOTE:** hostPath volumes are not supported, but the [local volume type][5] is supported.
**NOTE:** restic is under the deprecation process by following [Velero Deprecation Policy][17], for more details, see the Restic Deprecation section.
## Setup File System Backup
@@ -710,39 +709,6 @@ For Kopia repository, by default, the cache is stored in the data mover pod's ro
- configure a limit of the cache size per backup repository, for more details, check [Backup Repository Configuration][18].
- configure a dedicated volume for cache data, for more details, check [Data Movement Cache Volume][22].
## Restic Deprecation
According to the [Velero Deprecation Policy][17], restic path is being deprecated starting from v1.15, specifically:
- For 1.15 and 1.16, if restic path is used by a backup, the backup still creates and succeeds but you will see warnings
- For 1.17 and 1.18, backups with restic path are disabled, but you are still allowed to restore from your previous restic backups
- From 1.19, both backups and restores with restic path will be disabled, you are not able to use 1.19 or higher to restore your restic backup data
From 1.17, backup from restic path is not allowed, though you can still restore from the existing backups created by restic path.
Velero could automatically identify the legacy backups and switch to restic path without user intervention.
### How Velero integrates with Restic
Velero integrate Restic binary directly, so the operations are done by calling Restic commands:
- Run `restic init` command to initialize the [restic repository](https://restic.readthedocs.io/en/latest/100_references.html#terminology)
- Run `restic prune` command periodically to prune restic repository
- Run `restic restore` commands to restore pod volume data
For a restore from restic path, restic commands are called by the node-agent itself; whereas, for kopia path backup/restore, the data path runs in the data mover pods.
Restore from restic path is handled by the legacy `PodVolumeRestore` controller, so Resume and Cancellation are not supported:
- When Velero server is restarted, the legacy `PodVolumeRestore` is left as orphan and contineue running, though the restore has already marked as `Failed`
- When node-agent is restarted, the `PodVolumeRestore` is marked as `Failed` directly
### Restic Repository
To support restic repository, the BackupRepository CR should be specially configured:
- You need to set the `resticRepoPrefix` value in BackupStorageLocation. For example, on AWS, `resticRepoPrefix` is something like
`s3:s3-us-west-2.amazonaws.com/bucket` (note that `resticRepoPrefix` doesn't work for Kopia).
Velero still effectively manage restic repository, though you cannot write any new backup to it:
- When you delete a backup, the restic repository snapshots (if any) could be deleted from restic repository
- Velero backup repository controller periodically runs mainteance jobs for BackupRepository CRs representing restic repositories
[1]: https://github.com/restic/restic
[2]: https://github.com/kopia/kopia
[3]: customize-installation.md#enable-file-system-backup
[4]: https://github.com/velero-io/velero/releases/
@@ -750,7 +716,6 @@ Velero still effectively manage restic repository, though you cannot write any n
[6]: https://kubernetes.io/docs/concepts/storage/volumes/#mount-propagation
[7]: https://github.com/bitsbeats/velero-pvc-watcher
[8]: https://docs.microsoft.com/en-us/azure/aks/azure-files-dynamic-pv
[9]: https://github.com/restic/restic/issues/1800
[10]: customize-installation.md#default-pod-volume-backup-to-file-system-backup
[11]: https://www.vcluster.com/
[12]: csi.md
@@ -758,7 +723,6 @@ Velero still effectively manage restic repository, though you cannot write any n
[14]: https://kubernetes.io/docs/concepts/workloads/pods/pod-qos/
[15]: customize-installation.md#customize-resource-requests-and-limits
[16]: performance-guidance.md
[17]: https://github.com/velero-io/velero/blob/main/GOVERNANCE.md#deprecation-policy
[18]: backup-repository-configuration.md
[19]: node-agent-concurrency.md
[20]: node-agent-prepare-queue-length.md
+12 -27
@@ -3,9 +3,9 @@ title: "Velero File System Backup Performance Guide"
layout: docs
---
When using Velero to do file system backup & restore, Restic uploader or Kopia uploader are both supported now. But the resources used and time consumption are a big difference between them.
When using Velero to do file system backup & restore, Kopia uploader performance can vary based on data shape and resource settings.
We've done series rounds of tests against Restic uploader and Kopia uploader through Velero, which may give you some guidance. But the test results will vary from different infrastructures, and our tests are limited and couldn't cover a variety of data scenarios, **the test results and analysis are for reference only**.
We've done several rounds of tests against the Kopia uploader through Velero, which may give you some guidance. However, test results vary across infrastructures, and our tests are limited and could not cover every data scenario, so **the test results and analysis are for reference only**.
## Infrastructure
@@ -79,25 +79,21 @@ Server:
## Test
Below we've done 6 groups of tests, for each single group of test, we used limited resources (1 core CPU 2 GB memory or 4 cores CPU 4 GB memory) to do Velero file system backup under Restic path and Kopia path, and then compare the results.
Below are 6 groups of tests. For each group, we used limited resources (1 CPU core with 2 GB memory, or 4 CPU cores with 4 GB memory) to run Velero file system backup under the Kopia path.
Recorded the metrics of time consumption, maximum CPU usage, maximum memory usage, and minio storage usage for node-agent daemonset, and the metrics of Velero deployment are not included since the differences are not obvious by whether using Restic uploader or Kopia uploader.
We recorded the time consumption, maximum CPU usage, maximum memory usage, and minio storage usage for the node-agent daemonset. The metrics of the Velero deployment are not included.
Compression is either disabled or not unavailable for both uploader.
Compression is disabled for testing purposes.
### Case 1: 4194304(4M) files, 2396745(2M) directories, 0B per file total 0B content
#### result:
|Uploader| Resources|Times |Max CPU|Max Memory|Repo Usage|
|--------|----------|:----:|------:|:--------:|:--------:|
| Kopia | 1c2g |24m54s| 65% |1530 MB |80 MB |
| Restic | 1c2g |52m31s| 55% |1708 MB |3.3 GB |
| Kopia | 4c4g |24m52s| 63% |2216 MB |80 MB |
| Restic | 4c4g |52m28s| 54% |2329 MB |3.3 GB |
#### conclusion:
- The memory usage is larger than Velero's default memory limit (1GB) for both Kopia and Restic under massive empty files.
- For both using Kopia uploader and Restic uploader, there is no significant time reduction by increasing resources from 1c2g to 4c4g.
- Restic uploader is one more time slower than Kopia uploader under the same specification resources.
- Restic has an **irrational** repository size (3.3GB)
- The memory usage exceeds Velero's default memory limit (1GB) for Kopia when backing up a massive number of empty files.
- There is no significant time reduction by increasing resources from 1c2g to 4c4g.
### Case 2: Using the same size (100B) of file and default Velero's resource configuration, the testing quantity of files from 20 thousand to 2 million, these groups of cases mainly test the behavior with the increasing quantity of files.
@@ -106,58 +102,47 @@ Compression is either disabled or not unavailable for both uploader.
| Uploader | Resources|Times |Max CPU|Max Memory|Repo Usage|
|-------|----------|:----:|------:|:--------:|:--------:|
| Kopia | 1c1g |2m34s | 70% |692 MB |108 MB |
| Restic| 1c1g |3m9s | 54% |714 MB |275 MB |
### Case 2.2 470596(40k) files, 137257 (10k)directories, 100B per file total 44.880MB content
#### result:
| Uploader | Resources|Times |Max CPU|Max Memory|Repo Usage|
|-------|----------|:----:|------:|:--------:|:--------:|
| Kopia | 1c1g |3m45s | 68% |831 MB |108 MB |
| Restic| 1c1g |4m53s | 57% |788 MB |275 MB |
### Case 2.3 705894(70k) files, 137257(10k) directories, 100B per file total 67.319MB content
#### result:
|Uploader| Resources|Times |Max CPU|Max Memory|Repo Usage|
|--------|----------|:----:|------:|:--------:|:--------:|
| Kopia | 1c1g |5m06s | 71% |861 MB |108 MB |
| Restic | 1c1g |6m23s | 56% |810 MB |275 MB |
### Case 2.4 2097152(2M) files, 2396745(2M) directories, 100B per file total 200.000MB content
#### result:
|Uploader| Resources|Times |Max CPU|Max Memory|Repo Usage|
|--------|----------|:----:|------:|:--------:|:--------:|
| Kopia | 1c1g |OOM | 74% |N/A |N/A |
| Restic | 1c1g |41m47s| 52% |904 MB |3.2 GB |
#### conclusion:
- With the increasing number of files, there is no memory abnormal surge, the memory usage for both Kopia uploader and Restic uploader is linear increasing, until exceeds 1GB memory usage in Case 2.4 Kopia uploader OOM happened.
- As the number of files increases, there is no abnormal memory surge; memory usage increases linearly until it exceeds 1GB, at which point the Kopia uploader hits OOM in Case 2.4.
- Kopia uploader gets increasingly faster along with the increasing number of files.
- Restic uploader repository size is still much larger than Kopia uploader repository.
### Case 3: 10625(10k) files, 781 directories, 1.000MB per file total 10.376GB content
#### result:
|Uploader| Resources|Times |Max CPU|Max Memory|Repo Usage|
|--------|----------|:----:|------:|:--------:|:--------:|
| Kopia | 1c2g |1m37s | 75% |251 MB |10 GB |
| Restic | 1c2g |5m25s | 100% |153 MB |10 GB |
| Kopia | 4c4g |1m35s | 75% |248 MB |10 GB |
| Restic | 4c4g |3m17s | 171% |126 MB |10 GB |
#### conclusion:
- This case involves a relatively large backup size, there is no significant time reduction by increasing resources from 1c2g to 4c4g for Kopia uploader, but for Restic uploader when increasing CPU from 1 core to 4, backup time-consuming was shortened by one-third, which means in this scenario should allocate more CPU resources for Restic uploader.
- For the large backup size case, Restic uploader's repository size comes to normal
- This case involves a relatively large backup size, and there is no significant time reduction by increasing resources from 1c2g to 4c4g for Kopia uploader.
### Case 4: 900 files, 1 directory, 1.000GB per file total 900.000GB content
#### result:
|Uploader| Resources|Times |Max CPU|Max Memory|Repo Usage|
|--------|----------|:-----:|------:|:--------:|:--------:|
| Kopia | 1c2g |2h30m | 100% |714 MB |900 GB |
| Restic | 1c2g |Timeout| 100% |416 MB |N/A |
| Kopia | 4c4g |1h42m | 138% |786 MB |900 GB |
| Restic | 4c4g |2h15m | 351% |606 MB |900 GB |
#### conclusion:
- When the target backup data is relatively large, Restic uploader starts to Timeout under 1c2g. So it's better to allocate more memory for Restic uploader when backup large sizes of data.
- For backup large amounts of data, Kopia uploader is both less time-consuming and less resource usage.
- When backing up large amounts of data, allocating more resources can reduce backup time for the Kopia uploader.
## Summary
- With the same specification resources, Kopia uploader is less time-consuming when backup.
- Performance would be better if choosing Kopia uploader for the scenario in backup large mounts of data or massive small files.
- It's better to set one reasonable resource configuration instead of the default depending on your scenario. For default resource configuration, it's easy to be timeout with Restic uploader in backup large amounts of data, and it's easy to be OOM for both Kopia uploader and Restic uploader in backup of massive small files.
- Kopia uploader performs well when backing up large amounts of data or massive small files.
- It's better to set a reasonable resource configuration for your scenario instead of relying on the default. With the default configuration, it's easy to hit a timeout or OOM in large-scale backups.
@@ -23,7 +23,7 @@ If there is a key value as `global` in the map, the key's value is applied to al
The other keys in the map are combinations of three elements of a BackupRepository, because together those three elements identify a unique BackupRepository:
* The namespace in which BackupRepository backs up volume data.
* The BackupRepository referenced BackupStorageLocation's name.
* The BackupRepository's type. Possible values are `kopia` and `restic`.
* The BackupRepository's type. The only possible value is `kopia`.
If a key matches a BackupRepository, the key's value is applied to that BackupRepository's maintenance jobs.
This way, users can configure maintenance jobs before the BackupRepository is created.
@@ -45,7 +45,6 @@ For example, the following BackupRepository's key should be `test-default-kopia`
backupStorageLocation: default
maintenanceFrequency: 1h0m0s
repositoryType: kopia
resticIdentifier: gs:jxun:/restic/test
volumeNamespace: test
```
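As a rough illustration of how the key in the example above is composed, the sketch below joins the three identifying elements (volume namespace, BackupStorageLocation name, repository type) with hyphens. The function name `repo_maintenance_key` is hypothetical and not part of Velero; it only mirrors the naming rule described in this document.

```bash
#!/usr/bin/env bash
# Hypothetical helper (not a Velero command): builds the configMap key
# <volumeNamespace>-<backupStorageLocation>-<repositoryType>.
repo_maintenance_key() {
  local volume_namespace="$1"
  local bsl_name="$2"
  local repo_type="$3"
  printf '%s-%s-%s\n' "$volume_namespace" "$bsl_name" "$repo_type"
}

# Values taken from the BackupRepository spec shown above.
repo_maintenance_key "test" "default" "kopia"   # prints "test-default-kopia"
```

Because the key depends only on these three values, it can be worked out before the BackupRepository CR exists.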
@@ -135,7 +134,7 @@ The frequency of running maintenance jobs could be set by the below command when
```bash
velero install --default-repo-maintain-frequency <DURATION>
```
For Kopia the default maintenance frequency is 1 hour, and Restic is 7 * 24 hours.
For Kopia the default maintenance frequency is 1 hour.
### Full Maintenance Interval customization
See [backup repository configuration][3]
@@ -154,4 +154,4 @@ Velero provides a way for you to skip TLS verification on the object store when
If true, the object store's TLS certificate will not be checked for validity before Velero or backup repository connects to the object storage. You can permanently skip TLS verification for an object store by setting `Spec.Config.InsecureSkipTLSVerify` to true in the [BackupStorageLocation](api-types/backupstoragelocation.md) CRD.
Note that Velero's File System Backup uses Restic or Kopia to do data transfer between object store and Kubernetes cluster disks. This means that when you specify `--insecure-skip-tls-verify` in Velero operations that involve File System Backup, Velero will convey this information to Restic or Kopia. For example, for Restic, Velero will add the Restic global command parameter `--insecure-tls` to Restic commands.
Note that Velero's File System Backup uses Kopia to transfer data between the object store and Kubernetes cluster disks. This means that when you specify `--insecure-skip-tls-verify` in Velero operations that involve File System Backup, Velero conveys this setting to Kopia.