Original error was:
```
ark -n <redacted> restore logs <redacted>
time="2019-03-06T18:31:06Z" level=info msg="Not including resource" groupResource=nodes logSource="pkg/restore/restore.go:124"
time="2019-03-06T18:31:06Z" level=info msg="Not including resource" groupResource=events logSource="pkg/restore/restore.go:124"
time="2019-03-06T18:31:06Z" level=info msg="Not including resource" groupResource=events.events.k8s.io logSource="pkg/restore/restore.go:124"
time="2019-03-06T18:31:06Z" level=info msg="Not including resource" groupResource=backups.ark.heptio.com logSource="pkg/restore/restore.go:124"
time="2019-03-06T18:31:06Z" level=info msg="Not including resource" groupResource=restores.ark.heptio.com logSource="pkg/restore/restore.go:124"
time="2019-03-06T18:31:06Z" level=info msg="Starting restore of backup backup/<redacted>" logSource="pkg/restore/restore.go:342"
time="2019-03-06T18:31:06Z" level=info msg="error unzipping and extracting: open /tmp/604421455/resources/rolebindings.rbac.authorization.k8s.io/namespaces/<redacted>/<redacted>: too many open files" logSource="pkg/restore/restore.go:346"
```
Downloading the directory from s3 and untarring it I found 1036 files. The ulimit -n output says 1024. This is our team's best guess at a root cause. But the code fixed in the PR definitely is holding all the files open until the method closes: https://blog.learngoprogramming.com/gotchas-of-defer-in-go-1-8d070894cb01
Please note my go code abilities are not great and I did not test this. I just edited the file in github. All I did was remove the defer and put fileClose after the copy is done. Theoretically this should only hold one file open at a time now. Let me know if you want me to do any further steps.
Thank you,
-Asaf
Signed-off-by: Asaf Erlich <aerlich@groupon.com>
If a PV already exists, wait for it, it's associated PVC, and associated
namespace to be deleted before attempting to restore it.
If a namespace already exists, wait for it to be deleted before
attempting to restore it.
Signed-off-by: Nolan Brubaker <brubakern@vmware.com>
Instead of only removing the default token from a service account when
it already exists in the cluster, always remove it. If the service
account already exists, continue to do the merging logic.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>
We were checking for nil, but were getting back an empty
*unstructured.Unstructured{} instead, along with a NotFound error.
Change the logic to check for the NotFound error instead of a nil
object.
Signed-off-by: Andy Goldstein <andy.goldstein@gmail.com>