Add the ConfigMap-specifying parameters to the velero install CLI.

Rename backup-repository-config to backup-repository-configmap.
Rename repo-maintenance-job-config to repo-maintenance-job-configmap.
Rename node-agent-config to node-agent-configmap.
Add those three parameters to the `velero install` CLI.
Modify the design and the site documents.

Signed-off-by: Xun Jiang <xun.jiang@broadcom.com>
Xun Jiang
2024-09-10 16:31:48 +08:00
parent 46801a0828
commit 68f3545424
20 changed files with 202 additions and 87 deletions


@@ -8,7 +8,11 @@ Velero uses selectable backup repositories for various backup/restore methods, i
Velero uses a BackupRepository CR to represent the instance of the backup repository. Now, a new field `repositoryConfig` is added to support various configurations of the underlying backup repository.
Velero also allows you to specify configurations through a ConfigMap before the BackupRepository CR is created. The configurations in the ConfigMap are copied to the BackupRepository CR when it is created.
The ConfigMap should be in the same namespace where Velero is installed. If multiple Velero instances are installed in different namespaces, there should be one ConfigMap in each namespace, which applies only to the Velero instance in that namespace. The name of the ConfigMap should be specified in the Velero server parameter `--backup-repository-configmap`.
Users can specify the ConfigMap name during Velero installation via the CLI:
`velero install --backup-repository-configmap=<ConfigMap-Name>`
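For example, a minimal sketch of the flow, where the ConfigMap name `backup-repository-config` and the JSON file name are illustrative placeholders:
```bash
# Create the ConfigMap in the Velero namespace from a JSON file holding the
# backup repository configurations described above, then reference it at install time.
kubectl create cm backup-repository-config -n velero --from-file=backup-repository-config.json
velero install --backup-repository-configmap=backup-repository-config
```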
In summary, you have two ways to add, change, or delete configurations of a backup repository:
- If the BackupRepository CR for the backup repository already exists, you should modify the `repositoryConfig` field. The changes will be applied to the backup repository in due time; this doesn't require the Velero server to restart.


@@ -3,20 +3,23 @@ title: "Node Selection for Data Movement Backup"
layout: docs
---
Velero node-agent is a daemonset hosting the data movement modules to complete the concrete work of backups/restores.
Depending on the data size, data complexity, and resource availability, the data movement may take a long time and consume significant resources (CPU, memory, network bandwidth, etc.) during the backup and restore.
Velero data movement backup supports constraining the nodes where it runs. This is helpful in the scenarios below:
- Prevent the data movement backup from running on specific nodes because users have more critical workloads on those nodes
- Constrain the data movement backup to run on specific nodes because these nodes have more resources than others
- Constrain the data movement backup to run on specific nodes because the storage allows volume/snapshot provisioning on these nodes only
Velero introduces a new section in the node-agent ConfigMap, called ```loadAffinity```, through which you can specify the nodes on which data movement backups should or should not run, in affinity and anti-affinity flavors.
If the ConfigMap is not there, it should be created manually. The ConfigMap should be in the same namespace where Velero is installed. If multiple Velero instances are installed in different namespaces, there should be one ConfigMap in each namespace, which applies only to the node-agent in that namespace. The name of the ConfigMap should be specified in the node-agent server parameter ```--node-agent-configmap```.
The node-agent server checks these configurations at startup time. Therefore, you can edit this ConfigMap at any time, but to make the changes effective, the node-agent server needs to be restarted.
Users can specify the ConfigMap name during Velero installation via the CLI:
`velero install --node-agent-configmap=<ConfigMap-Name>`
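For example, assuming an illustrative ConfigMap named `node-agent-config`, created from a JSON file with the content described in the Sample section below:
```bash
# Create the node-agent ConfigMap first, then pass the same name at install time.
kubectl create cm node-agent-config -n velero --from-file=node-agent-config.json
velero install --node-agent-configmap=node-agent-config
```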
### Sample
Here is a sample of the ConfigMap with ```loadAffinity```:
```json
{
"loadAffinity": [
@@ -45,31 +48,31 @@ Here is a sample of the configMap with ```loadAffinity```:
]
}
```
To create the ConfigMap, save something like the above sample to a JSON file and then run the command below:
```
kubectl create cm <ConfigMap name> -n velero --from-file=<json file name>
```
To provide the ConfigMap to the node-agent, edit the node-agent daemonset and add the ```- --node-agent-configmap``` argument to the spec:
1. Open the node-agent daemonset spec:
```
kubectl edit ds node-agent -n velero
```
2. Add ```- --node-agent-configmap``` to ```spec.template.spec.containers```
```
spec:
  template:
    spec:
      containers:
      - args:
        - --node-agent-configmap=<ConfigMap name>
```
### Affinity
Affinity configuration means allowing the data movement backup to run on the specified nodes. There are two ways to define it:
- It could be defined by `MatchLabels`. The labels defined in `MatchLabels` imply a `LabelSelectorOpIn` operation by default, so in the current context they are treated as affinity rules. In the above sample, it defines to run data movement backups on nodes with the label `beta.kubernetes.io/instance-type` of value `Standard_B4ms` (run data movement backups on `Standard_B4ms` nodes only).
- It could be defined by `MatchExpressions`. The labels are defined in `Key` and `Values` of `MatchExpressions` and the `Operator` should be defined as `LabelSelectorOpIn` or `LabelSelectorOpExists`. In the above sample, it defines to run data movement backups on nodes with the label `kubernetes.io/hostname` of values `node-1`, `node-2` and `node-3` (run data movement backups on `node-1`, `node-2` and `node-3` only).
### Anti-affinity
Anti-affinity configuration means preventing the data movement backup from running on the specified nodes. Below is the way to define it:
- It could be defined by `MatchExpressions`. The labels are defined in `Key` and `Values` of `MatchExpressions` and the `Operator` should be defined as `LabelSelectorOpNotIn` or `LabelSelectorOpDoesNotExist`. In the above sample, it disallows data movement backups from running on nodes with the label `xxx/critial-workload`.
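As a concrete illustration, a single ```loadAffinity``` entry combining the affinity and anti-affinity rules described above might look like the sketch below. It assumes the entry uses a standard Kubernetes label selector under `nodeSelector`, so treat the exact field layout as illustrative rather than authoritative:
```json
{
    "loadAffinity": [
        {
            "nodeSelector": {
                "matchLabels": {
                    "beta.kubernetes.io/instance-type": "Standard_B4ms"
                },
                "matchExpressions": [
                    {
                        "key": "kubernetes.io/hostname",
                        "values": ["node-1", "node-2", "node-3"],
                        "operator": "In"
                    },
                    {
                        "key": "xxx/critial-workload",
                        "operator": "DoesNotExist"
                    }
                ]
            }
        }
    ]
}
```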


@@ -16,7 +16,7 @@ operation could perform better. Specifically:
However, it doesn't make any sense to keep replicas when an intermediate volume is used by the backup. Therefore, users should be allowed
to configure another storage class specifically used by the `backupPVC`.
Velero introduces a new section in the node agent configuration ConfigMap (the name of this ConfigMap is passed using the `--node-agent-configmap` velero server argument)
called `backupPVC`, through which you can specify the following
configurations:
@@ -26,7 +26,10 @@ default the source PVC's storage class will be used.
- `readOnly`: This is a boolean value. If set to `true`, then `ReadOnlyMany` will be the only value set in the backupPVC's access modes; otherwise,
`ReadWriteOnce` will be used.
Users can specify the ConfigMap name during Velero installation via the CLI:
`velero install --node-agent-configmap=<ConfigMap-Name>`
A sample of the `backupPVC` config as part of the ConfigMap would look like:
```json
{
"backupPVC": {
@@ -47,7 +50,7 @@ A sample of `backupPVC` config as part of the configMap would look like:
**Note:**
- Users should make sure that the storage class specified in the `backupPVC` config exists in the cluster and can be used by the
`backupPVC`; otherwise, the corresponding DataUpload CR will stay in the `Accepted` phase until timeout (the data movement prepare timeout value is 30m by default).
- If users set the `readOnly` value to `true` in the `backupPVC` config, then they must also make sure that the storage class being used for
`backupPVC` supports creation of a `ReadOnlyMany` PVC from a snapshot; otherwise, the corresponding DataUpload CR will stay in the `Accepted` phase until
timeout (the data movement prepare timeout value is 30m by default).
- If any of the above problems occur, then the DataUpload CR is `canceled` after timeout, and the backupPod and backupPVC will be deleted, and the backup


@@ -4,35 +4,38 @@ layout: docs
---
Velero node-agent is a daemonset hosting modules to complete the concrete tasks of backups/restores, i.e., file system backup/restore, CSI snapshot data movement.
Depending on the data size, data complexity, and resource availability, the tasks may take a long time and consume significant resources (CPU, memory, network bandwidth, etc.). These tasks make up the loads of the node-agent.
Node-agent concurrency configurations allow you to configure the number of concurrent node-agent loads per node. When the resources in the nodes are sufficient, you can set a large concurrent number to reduce the backup/restore time; otherwise, the concurrency should be reduced, or the backup/restore may encounter problems such as time lagging, hangs, or OOM kills.
To set node-agent concurrency configurations, a ConfigMap should be created manually. The ConfigMap should be in the same namespace where Velero is installed. If multiple Velero instances are installed in different namespaces, there should be one ConfigMap in each namespace, which applies only to the node-agent in that namespace. The name of the ConfigMap should be specified in the node-agent server parameter ```--node-agent-configmap```.
The node-agent server checks these configurations at startup time. Therefore, you can edit this ConfigMap at any time, but to make the changes effective, the node-agent server needs to be restarted.
Users can specify the ConfigMap name during Velero installation via the CLI:
`velero install --node-agent-configmap=<ConfigMap-Name>`
### Global concurrent number
You can specify a concurrent number that will be applied to all nodes if the per-node number is not specified. This number is set through the ```globalConfig``` field in ```loadConcurrency```.
The number starts from 1, which means there is no concurrency and only one load is allowed. There is no upper limit. If this number is not specified or not valid, a hard-coded default value of 1 will be used.
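For instance, a minimal ```loadConcurrency``` section that only sets the global number might look like the sketch below (assuming the JSON layout of the node-agent ConfigMap shown in the Sample section later on this page):
```json
{
    "loadConcurrency": {
        "globalConfig": 2
    }
}
```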
### Per-node concurrent number
You can specify a different concurrent number per node; for example, you can set 3 concurrent instances on Node-1, 2 instances on Node-2, and 1 instance on Node-3.
The range of the per-node concurrent number is the same as the global concurrent number. The per-node concurrent number takes precedence over the global concurrent number, so it overwrites the global concurrent number for that node.
The per-node concurrent number is implemented through the ```perNodeConfig``` field in ```loadConcurrency```.
```perNodeConfig``` is a list of ```RuledConfigs```, each item of which matches one or more nodes by label selectors and specifies the concurrent number for the matched nodes.
Here is an example of the ```perNodeConfig```:
```
"nodeSelector: kubernetes.io/hostname=node1; number: 3"
"nodeSelector: beta.kubernetes.io/instance-type=Standard_B4ms; number: 5"
```
The first element means the node with host name ```node1``` gets a per-node concurrent number of 3.
The second element means all the nodes with the label ```beta.kubernetes.io/instance-type``` of value ```Standard_B4ms``` get a per-node concurrent number of 5.
At least one node is expected to have a label matching the specified ```RuledConfigs``` element (rule). If no node has this label, the per-node rule has no effect.
If one node falls into more than one rule, e.g., if node1 also has the label ```beta.kubernetes.io/instance-type=Standard_B4ms```, the smallest number (3) will be used.
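Expressed in the ConfigMap's JSON form, the two rules above might look like the sketch below; the ```nodeSelector``` layout mirrors the label selectors used elsewhere on this page and should be treated as illustrative rather than authoritative:
```json
{
    "loadConcurrency": {
        "perNodeConfig": [
            {
                "nodeSelector": {
                    "matchLabels": {
                        "kubernetes.io/hostname": "node1"
                    }
                },
                "number": 3
            },
            {
                "nodeSelector": {
                    "matchLabels": {
                        "beta.kubernetes.io/instance-type": "Standard_B4ms"
                    }
                },
                "number": 5
            }
        ]
    }
}
```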
### Sample
A sample of the complete ConfigMap is shown below:
```json
{
"loadConcurrency": {
@@ -58,23 +61,21 @@ A sample of the complete configMap is as below:
}
}
```
To create the ConfigMap, save something like the above sample to a JSON file and then run the command below:
```
kubectl create cm <ConfigMap name> -n velero --from-file=<json file name>
```
To provide the ConfigMap to the node-agent, edit the node-agent daemonset and add the ```- --node-agent-configmap``` argument to the spec:
1. Open the node-agent daemonset spec:
```
kubectl edit ds node-agent -n velero
```
2. Add ```- --node-agent-configmap``` to ```spec.template.spec.containers```
```
spec:
  template:
    spec:
      containers:
      - args:
        - --node-agent-configmap=<ConfigMap name>
```


@@ -9,11 +9,14 @@ Before v1.14.0, Velero performs periodic maintenance on the repository within Ve
For repository maintenance jobs, there is no resource limit by default. You can configure the job resource limits based on the target data to be backed up.
From v1.15 and on, Velero introduces a new ConfigMap, specified by the `velero server --repo-maintenance-job-configmap` parameter, to set the repository maintenance job configuration, including node affinity and resources. The old `velero server` parameters (`--maintenance-job-cpu-request`, `--maintenance-job-mem-request`, `--maintenance-job-cpu-limit`, `--maintenance-job-mem-limit`, and `--keep-latest-maintenance-jobs`) introduced in v1.14 are deprecated and will be deleted in v1.17.
Users can specify the ConfigMap name during Velero installation via the CLI:
`velero install --repo-maintenance-job-configmap=<ConfigMap-Name>`
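For example (the ConfigMap name `repo-maintenance-job-config` is an illustrative choice; any name matching the flag value works):
```bash
# Create the maintenance job ConfigMap in the Velero namespace, then reference
# it during installation.
kubectl create cm repo-maintenance-job-config -n velero --from-file=repo-maintenance-job-config.json
velero install --repo-maintenance-job-configmap=repo-maintenance-job-config
```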
## Settings
### Resource Limitation and Node Affinity
These are specified in the ConfigMap named by the `velero server --repo-maintenance-job-configmap` parameter.
The ConfigMap's content is a map.
If there is a key named `global` in the map, its value is applied to all BackupRepository maintenance jobs that cannot find their own specific configuration in the ConfigMap.
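A minimal sketch of such a map, with only a `global` entry setting job resources, might look like the following; the field names under `podResources` are assumptions for illustration, not an authoritative schema:
```json
{
    "global": {
        "podResources": {
            "cpuRequest": "100m",
            "cpuLimit": "200m",
            "memoryRequest": "100Mi",
            "memoryLimit": "200Mi"
        }
    }
}
```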
@@ -55,7 +58,7 @@ It's possible that the users want to choose nodes that match condition A or cond
For example, the user wants to let nodes of a specified machine type, or nodes located in the us-central1-x zones, run the job.
This can be done by adding multiple entries in the `LoadAffinity` array.
A sample of the ```repo-maintenance-job-configmap``` ConfigMap for the above scenario is shown below:
``` bash
cat <<EOF > repo-maintenance-job-config.json
{