From 08d4fc8b88a3c731a1973e2dfba70b43e5d6d51d Mon Sep 17 00:00:00 2001
From: Andy Goldstein
Date: Tue, 20 Feb 2018 09:41:17 -0500
Subject: [PATCH 1/2] Move ark server & minio to heptio-ark-server ns

Move ark server deployment & minio deployment to a separate namespace
from the backups/schedules/restores/config because backups now have a
finalizer. If everything lives in one namespace, you have to delete all
the backups and wait for the GC controller to process them and remove
the finalizer from each before deleting the namespace. By moving the
server into a separate namespace, users can now delete the heptio-ark
namespace the normal way (kubectl delete), and once that namespace is
fully removed, they can delete the heptio-ark-server namespace.

Signed-off-by: Andy Goldstein
---
 README.md                               | 12 +++++++
 docs/aws-config.md                      |  2 +-
 docs/azure-config.md                    |  2 +-
 docs/gcp-config.md                      |  2 +-
 docs/namespace.md                       | 23 ++++++++++---
 examples/azure/00-ark-deployment.yaml   |  4 ++-
 examples/common/00-prereqs.yaml         | 43 +++++--------------------
 examples/common/10-deployment.yaml      |  4 ++-
 examples/minio/00-minio-deployment.yaml |  8 ++---
 examples/minio/10-ark-config.yaml       |  2 +-
 10 files changed, 53 insertions(+), 49 deletions(-)

diff --git a/README.md b/README.md
index 5ddede18c..c96e534af 100644
--- a/README.md
+++ b/README.md
@@ -133,6 +133,18 @@ For more information, see [the debugging information][18].

 ### Clean up

+Delete any backups you created:
+
+```
+kubectl delete -n heptio-ark backup --all
+```
+
+Before you continue, wait for the following to show no backups:
+
+```
+ark backup get
+```
+
 To remove the Kubernetes objects for this example from your cluster, run:

 ```
diff --git a/docs/aws-config.md b/docs/aws-config.md
index 0267f2aa2..968bf5891 100644
--- a/docs/aws-config.md
+++ b/docs/aws-config.md
@@ -90,7 +90,7 @@ Create a Secret. In the directory of the credentials file you just created, run:

 ```bash
 kubectl create secret generic cloud-credentials \
-    --namespace \
+    --namespace \
     --from-file cloud=credentials-ark
 ```

diff --git a/docs/azure-config.md b/docs/azure-config.md
index dd3cc733e..17c421807 100644
--- a/docs/azure-config.md
+++ b/docs/azure-config.md
@@ -115,7 +115,7 @@ Now you need to create a Secret that contains all the seven environment variable

 ```bash
 kubectl create secret generic cloud-credentials \
-    --namespace \
+    --namespace \
     --from-literal AZURE_SUBSCRIPTION_ID=${AZURE_SUBSCRIPTION_ID} \
     --from-literal AZURE_TENANT_ID=${AZURE_TENANT_ID} \
     --from-literal AZURE_RESOURCE_GROUP=${AZURE_RESOURCE_GROUP} \
diff --git a/docs/gcp-config.md b/docs/gcp-config.md
index 1556d85a7..44299480a 100644
--- a/docs/gcp-config.md
+++ b/docs/gcp-config.md
@@ -74,7 +74,7 @@ Create a Secret. In the directory of the credentials file you just created, run:

 ```bash
 kubectl create secret generic cloud-credentials \
-    --namespace \
+    --namespace \
     --from-file cloud=credentials-ark
 ```

diff --git a/docs/namespace.md b/docs/namespace.md
index d4ed2dfe4..e3f48e371 100644
--- a/docs/namespace.md
+++ b/docs/namespace.md
@@ -1,15 +1,30 @@
 # Run in custom namespace

-In Ark version 0.7.0 and later, you can run Ark in any namespace. To do so, you specify the namespace in the YAML files that configure the Ark server. You then also specify the namespace when you run Ark client commands.
+In Ark version 0.7.0 and later, you can run Ark in any namespace. To do so, you specify the
+namespace in the YAML files that configure the Ark server. You then also specify the namespace when
+you run Ark client commands.

 ## Edit the example files

-The Ark repository includes [a set of examples][0] that you can use to set up your Ark server. The examples specify only the default `heptio-ark` namespace. To run in another namespace, you edit the relevant files to specify your custom namespace.
+The Ark repository includes [a set of examples][0] that you can use to set up your Ark server. The
+examples place the server in the `heptio-ark-server` namespace, and backup/schedule/restore/config
+data in the `heptio-ark` namespace.
+
+To run the server in another namespace, you edit the relevant files, changing `heptio-ark-server` to
+your desired namespace.
+
+To store your backups, schedules, restores, and config in another namespace, you edit the relevant
+files, changing `heptio-ark` to your desired namespace.
+
+WARNING: It is recommended to run the Ark server in one namespace, and place your backups, schedules,
+restores, and config in a different namespace. You might encounter issues with deleting a single Ark
+namespace that contains everything.

 For all cloud providers, edit `https://github.com/heptio/ark/blob/master/examples/common/00-prereqs.yaml`. This file defines:

 * CustomResourceDefinitions for the Ark objects (backups, schedules, restores, configs, downloadrequests)
-* The Ark namespace
+* The namespace where the Ark server runs
+* The namespace where backups, schedules, restores, and the config are stored
 * The Ark service account
 * The RBAC rules to grant permissions to the Ark service account

@@ -48,4 +63,4 @@ ark client config set namespace=
-[0]: https://github.com/heptio/ark/tree/master/examples
\ No newline at end of file
+[0]: https://github.com/heptio/ark/tree/master/examples
diff --git a/examples/azure/00-ark-deployment.yaml b/examples/azure/00-ark-deployment.yaml
index 528bcd5b0..d97aceaad 100644
--- a/examples/azure/00-ark-deployment.yaml
+++ b/examples/azure/00-ark-deployment.yaml
@@ -16,7 +16,7 @@
 apiVersion: apps/v1beta1
 kind: Deployment
 metadata:
-  namespace: heptio-ark
+  namespace: heptio-ark-server
   name: ark
 spec:
   replicas: 1
@@ -32,6 +32,8 @@ spec:
           image: gcr.io/heptio-images/ark:latest
           command:
             - /ark
+            - --namespace
+            - heptio-ark
           args:
             - server
           envFrom:
diff --git a/examples/common/00-prereqs.yaml b/examples/common/00-prereqs.yaml
index 6fe12889e..843e043e6 100644
--- a/examples/common/00-prereqs.yaml
+++ b/examples/common/00-prereqs.yaml
@@ -93,12 +93,18 @@ kind: Namespace
 metadata:
   name: heptio-ark

+---
+apiVersion: v1
+kind: Namespace
+metadata:
+  name: heptio-ark-server
+
 ---
 apiVersion: v1
 kind: ServiceAccount
 metadata:
   name: ark
-  namespace: heptio-ark
+  namespace: heptio-ark-server
   labels:
     component: ark

@@ -111,42 +117,9 @@ metadata:
     component: ark
 subjects:
   - kind: ServiceAccount
-    namespace: heptio-ark
+    namespace: heptio-ark-server
     name: ark
 roleRef:
   kind: ClusterRole
   name: cluster-admin
   apiGroup: rbac.authorization.k8s.io
-
----
-apiVersion: rbac.authorization.k8s.io/v1beta1
-kind: Role
-metadata:
-  namespace: heptio-ark
-  name: ark
-  labels:
-    component: ark
-rules:
-  - apiGroups:
-      - ark.heptio.com
-    verbs:
-      - "*"
-    resources:
-      - "*"
-
----
-apiVersion: rbac.authorization.k8s.io/v1beta1
-kind: RoleBinding
-metadata:
-  namespace: heptio-ark
-  name: ark
-  labels:
-    component: ark
-subjects:
-  - kind: ServiceAccount
-    namespace: heptio-ark
-    name: ark
-roleRef:
-  kind: Role
-  name: ark
-  apiGroup: rbac.authorization.k8s.io
diff --git a/examples/common/10-deployment.yaml b/examples/common/10-deployment.yaml
index 70d28566f..633a189b1 100644
--- a/examples/common/10-deployment.yaml
+++ b/examples/common/10-deployment.yaml
@@ -16,7 +16,7 @@
 apiVersion: apps/v1beta1
 kind: Deployment
 metadata:
-  namespace: heptio-ark
+  namespace: heptio-ark-server
   name: ark
 spec:
   replicas: 1
@@ -34,6 +34,8 @@ spec:
             - /ark
           args:
             - server
+            - --namespace
+            - heptio-ark
           volumeMounts:
             - name: cloud-credentials
               mountPath: /credentials
diff --git a/examples/minio/00-minio-deployment.yaml b/examples/minio/00-minio-deployment.yaml
index dbcdbd528..fd62f93ae 100644
--- a/examples/minio/00-minio-deployment.yaml
+++ b/examples/minio/00-minio-deployment.yaml
@@ -16,7 +16,7 @@
 apiVersion: apps/v1beta1
 kind: Deployment
 metadata:
-  namespace: heptio-ark
+  namespace: heptio-ark-server
   name: minio
   labels:
     component: minio
@@ -54,7 +54,7 @@ spec:
 apiVersion: v1
 kind: Service
 metadata:
-  namespace: heptio-ark
+  namespace: heptio-ark-server
   name: minio
   labels:
     component: minio
@@ -71,7 +71,7 @@ spec:
 apiVersion: v1
 kind: Secret
 metadata:
-  namespace: heptio-ark
+  namespace: heptio-ark-server
   name: cloud-credentials
   labels:
     component: minio
@@ -85,7 +85,7 @@ stringData:
 apiVersion: batch/v1
 kind: Job
 metadata:
-  namespace: heptio-ark
+  namespace: heptio-ark-server
   name: minio-setup
   labels:
     component: minio
diff --git a/examples/minio/10-ark-config.yaml b/examples/minio/10-ark-config.yaml
index 76d0c6c57..f46ef83b6 100644
--- a/examples/minio/10-ark-config.yaml
+++ b/examples/minio/10-ark-config.yaml
@@ -24,7 +24,7 @@ backupStorageProvider:
   config:
     region: minio
     s3ForcePathStyle: "true"
-    s3Url: http://minio.heptio-ark.svc:9000
+    s3Url: http://minio.heptio-ark-server.svc:9000
 backupSyncPeriod: 1m
 gcSyncPeriod: 1m
 scheduleSyncPeriod: 1m
From a0111d875f115ecb09285c61263a7a322f07729e Mon Sep 17 00:00:00 2001
From: Andy Goldstein
Date: Wed, 21 Feb 2018 10:46:08 -0500
Subject: [PATCH 2/2] Add troubleshooting doc for backups stuck deleting

Signed-off-by: Andy Goldstein
---
 README.md               |  3 ++-
 docs/troubleshooting.md | 38 ++++++++++++++++++++++++++++++++++++++
 2 files changed, 40 insertions(+), 1 deletion(-)
 create mode 100644 docs/troubleshooting.md

diff --git a/README.md b/README.md
index c96e534af..d82bf91c7 100644
--- a/README.md
+++ b/README.md
@@ -159,7 +159,7 @@ kubectl delete -f examples/nginx-app/base.yaml

 ## Troubleshooting

-If you encounter any problems that the documentation does not address, [file an issue][4] or talk to us on the [Kubernetes Slack team][25] channel `#ark-dr`.
+If you encounter any problems that the documentation does not address, review the [troubleshooting][30] page, [file an issue][4], or talk to us on the [Kubernetes Slack team][25] channel `#ark-dr`.

 ## Contributing

@@ -209,3 +209,4 @@ See [the list of releases][6] to find out about feature changes.
 [27]: /docs/hooks.md
 [28]: /docs/plugins.md
 [29]: https://heptio.github.io/ark/
+[30]: /docs/troubleshooting.md
diff --git a/docs/troubleshooting.md b/docs/troubleshooting.md
new file mode 100644
index 000000000..e3a905109
--- /dev/null
+++ b/docs/troubleshooting.md
@@ -0,0 +1,38 @@
+# Troubleshooting
+
+## heptio-ark namespace stuck terminating / unable to delete backups
+
+Ark v0.7.0 added the ability to delete backups by adding what is called an Ark "finalizer" to each
+backup. When you request the deletion of an object that has at least one finalizer, Kubernetes sets
+the object's "deletion timestamp" (indicating the object has been marked for deletion), but it does
+not immediately delete the object. Instead, Kubernetes only deletes the object when it no longer has
+any finalizers. This means that something (Ark, in this case) must process the backup and then
+remove the Ark finalizer from it.
+
+Ark versions before v0.7.1 place the Ark server pod in the same namespace as backups, restores,
+schedules, and the Ark config. If you try to delete the namespace (`kubectl delete
+namespace/heptio-ark`), it's possible that the Ark server pod is deleted before the backups, because
+the order of deletions is arbitrary. If this happens, the remaining backups will be "stuck"
+deleting, because the Ark server pod no longer exists to remove their finalizers.
+
+With v0.7.1, we strongly encourage you to run the Ark server pod in a different namespace than the
+one used for backups, schedules, restores, and the Ark config. This is the default configuration as
+of v0.7.1.
+
+If you encounter this problem, here is how to fix it. First, make sure you have `jq` installed. Then
+run:
+
+```
+bash <(kubectl -n heptio-ark get backup -o json | jq -c -r $'.items[] | "kubectl -n heptio-ark patch backup/" + .metadata.name + " -p \'" + (({metadata: {finalizers: ( (.metadata.finalizers // []) - ["gc.ark.heptio.com"]), resourceVersion: .metadata.resourceVersion}}) | tostring) + "\' --type=merge"')
+```
+
+This retrieves a list of backups and uses it to generate and run a list of commands that look like:
+
+```
+kubectl -n heptio-ark patch backup/my-backup -p '{"metadata":{"finalizers":[],"resourceVersion":"461343"}}' --type=merge
+kubectl -n heptio-ark patch backup/some-other-backup -p '{"metadata":{"finalizers":[],"resourceVersion":"461718"}}' --type=merge
+```
+
+If you receive errors that patching backups is not allowed, it's possible that the Ark
+CustomResourceDefinitions (CRDs) were deleted. You'll need to recreate them (they're in
+`examples/common/00-prereqs.yaml`), then follow the steps above.
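
Reviewer note (not part of the patch): the `jq` one-liner in `docs/troubleshooting.md` is dense, so here is a minimal Python sketch of the same transformation for anyone who wants to preview the generated commands before running them. It is an illustration only, not something the patch ships. The helper name `build_patch_commands` and the inline `sample` data are hypothetical; the namespace `heptio-ark` and the finalizer name `gc.ark.heptio.com` are taken from the troubleshooting text. The sketch only builds and prints command strings and never calls `kubectl` itself.

```python
import json
import shlex

# Defaults copied from the troubleshooting text; adjust if you use a custom namespace.
NAMESPACE = "heptio-ark"
GC_FINALIZER = "gc.ark.heptio.com"


def build_patch_commands(backup_list, namespace=NAMESPACE, finalizer=GC_FINALIZER):
    """Return one `kubectl patch` command string per backup (hypothetical helper).

    `backup_list` is the parsed JSON of `kubectl -n <namespace> get backup -o json`.
    """
    commands = []
    for item in backup_list.get("items", []):
        meta = item["metadata"]
        # Keep every finalizer except the Ark GC one; a missing list counts as empty.
        remaining = [f for f in (meta.get("finalizers") or []) if f != finalizer]
        patch = {
            "metadata": {
                "finalizers": remaining,
                # Mirrors the one-liner, which sends the current resourceVersion back.
                "resourceVersion": meta["resourceVersion"],
            }
        }
        # Compact separators match the JSON shape shown in the documentation.
        body = json.dumps(patch, separators=(",", ":"))
        commands.append(
            "kubectl -n {ns} patch {target} -p {body} --type=merge".format(
                ns=shlex.quote(namespace),
                target=shlex.quote("backup/" + meta["name"]),
                body=shlex.quote(body),
            )
        )
    return commands


# Made-up sample standing in for real `kubectl get backup -o json` output.
sample = {
    "items": [
        {
            "metadata": {
                "name": "my-backup",
                "finalizers": ["gc.ark.heptio.com"],
                "resourceVersion": "461343",
            }
        }
    ]
}

for command in build_patch_commands(sample):
    print(command)
```

Printing the commands first, rather than piping them straight into `bash`, gives you a chance to confirm that only the Ark GC finalizer is being dropped from each backup.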