Backup
Ensuring business continuity in Kubernetes deployments requires a robust backup and disaster recovery strategy. This is vital for safeguarding critical data and configurations.
For cloud-managed Kubernetes services like Google Kubernetes Engine (GKE), providers offer integrated backup capabilities, leveraging their native snapshotting and storage solutions.
Alternatively, for both cloud-based and on-premises clusters, third-party tools such as Velero provide a flexible, platform-agnostic approach to the backup and restoration of Kubernetes resources and persistent volumes.
Google Kubernetes Engine
For Google Kubernetes Engine (GKE), Google Cloud offers a dedicated service called Backup for GKE. This managed service lets users back up and restore both Kubernetes resource manifests (cluster state) and persistent volume data.
Enable Backup for GKE for an existing cluster
Enable the Backup for GKE API (official doc):
gcloud services enable gkebackup.googleapis.com \
  --project <PROJECT_ID>
Enable the backup addon in the cluster (official doc):
gcloud container clusters update <CLUSTER_NAME> \
  --project=<PROJECT_ID> \
  --region=<REGION> \
  --update-addons=BackupRestore=ENABLED
Create a backup plan that backs up all namespaces every day at 3:00 AM UTC and retains each backup for 7 days (official doc):
gcloud beta container backup-restore backup-plans create <BACKUP_PLAN_NAME> \
  --project=<PROJECT_ID> \
  --location=<REGION> \
  --cluster="projects/<PROJECT_ID>/locations/<LOCATION>/clusters/<CLUSTER_ID>" \
  --cron-schedule="0 3 * * *" \
  --backup-retain-days=7 \
  --all-namespaces
You can also specify --include-volume-data or --include-secrets to include persistent volume data or secrets in the backup plan:
gcloud beta container backup-restore backup-plans create <BACKUP_PLAN_NAME> \
  --project=<PROJECT_ID> \
  --location=<REGION> \
  --cluster="projects/<PROJECT_ID>/locations/<LOCATION>/clusters/<CLUSTER_ID>" \
  --cron-schedule="0 3 * * *" \
  --backup-retain-days=7 \
  --all-namespaces \
  --include-secrets \
  --include-volume-data
Velero
Velero is an open-source tool for backing up and restoring Kubernetes cluster resources and persistent volumes. It is also used for disaster recovery and for migrating workloads between clusters. It is widely used in production and is part of the CNCF cloud native landscape.
Velero consists of:
- A server that runs on your cluster
- A command-line client that runs locally
The official documentation is available here.
Prerequisites
Before you begin, make sure you have:
- A running Kubernetes cluster (version 1.16 or later).
- kubectl installed and configured to communicate with your cluster.
- Access to an object storage bucket (e.g., GCP Cloud Storage, AWS S3, …) where your backups will be stored.
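The first two prerequisites can be checked from a terminal before going further. The following is a minimal sketch for a POSIX shell; the helper names `require_cmd` and `version_ok` are made up for illustration and are not part of kubectl or Velero.

```shell
#!/bin/sh
# Hypothetical helper: report whether a command is available on the PATH.
require_cmd() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "found: $1"
  else
    echo "missing: $1" >&2
    return 1
  fi
}

# Hypothetical helper: succeed when a version string such as "1.27",
# "v1.27.3" or "1.27+" is at least 1.16 (the minimum stated above).
version_ok() {
  v=${1#v}                 # drop a leading "v"
  major=${v%%.*}           # text before the first dot
  rest=${v#*.}             # text after the first dot
  minor=${rest%%[!0-9]*}   # leading digits of the minor field
  [ "$major" -gt 1 ] || { [ "$major" -eq 1 ] && [ "$minor" -ge 16 ]; }
}
```

For example, `require_cmd kubectl && kubectl cluster-info` confirms that kubectl is installed and can reach the cluster, and `version_ok` can be given the server version that `kubectl version` reports.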
Install the Velero CLI
The Velero command-line interface (CLI) is used to interact with the Velero server deployed in your cluster.
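The release file name used in the download step depends on your operating system and CPU architecture. As a rough sketch, the `<OS>` and `<ARCH>` parts can be derived from `uname`; the function names here are illustrative, and only the most common platforms are mapped.

```shell
#!/bin/sh
# Map `uname -s` output to the <OS> part of a release file name.
map_os() {
  case "$1" in
    Linux)  echo linux ;;
    Darwin) echo darwin ;;
    *)      echo "unsupported OS: $1" >&2; return 1 ;;
  esac
}

# Map `uname -m` output to the <ARCH> part of a release file name.
map_arch() {
  case "$1" in
    x86_64|amd64)  echo amd64 ;;
    aarch64|arm64) echo arm64 ;;
    *)             echo "unsupported architecture: $1" >&2; return 1 ;;
  esac
}

# Build the tarball name for a version given without the leading "v".
tarball_name() {
  os=$(map_os "$(uname -s)") || return 1
  arch=$(map_arch "$(uname -m)") || return 1
  echo "velero-v$1-$os-$arch.tar.gz"
}
```

Calling `tarball_name <VERSION>` prints the file name to substitute into the release URL. Check the result against the assets listed on the release page, since other platforms use names not covered here.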
Download the release tarball that matches your operating system, architecture, and desired Velero version:
wget https://github.com/vmware-tanzu/velero/releases/download/v<VERSION>/velero-v<VERSION>-<OS>-<ARCH>.tar.gz
Extract the tarball:
tar -xvf velero-v<VERSION>-<OS>-<ARCH>.tar.gz
Move the extracted velero binary to somewhere in your $PATH (/usr/local/bin for most users):
sudo mv velero-v<VERSION>-<OS>-<ARCH>/velero /usr/local/bin
Verify the installation. This should display the Velero client version:
velero version --client-only
Install and configure the server components
Velero uses storage provider plugins to integrate with a variety of storage systems to support backup and snapshot operations. The steps to install and configure the server components along with the appropriate plugins are specific to the chosen storage provider. Below is an example for AWS S3.
Create a file named credentials-velero with your object storage access keys.
cat > credentials-velero <<EOF
[default]
aws_access_key_id=<AWS_ACCESS_KEY_ID>
aws_secret_access_key=<AWS_SECRET_ACCESS_KEY>
EOF
Install Velero on Kubernetes:
velero install \
  --provider aws \
  --plugins velero/velero-plugin-for-aws:v1.10.0 \
  --bucket <BUCKET> \
  --secret-file ./credentials-velero \
  --backup-location-config region=<REGION> \
  --snapshot-location-config region=<REGION> \
  --kubeconfig ~/.kube/config
Verify the Velero installation in the cluster:
kubectl get pods -n velero
Backup
Back up the entire cluster (all namespaces and cluster-scoped resources):
velero backup create cluster-backup --include-cluster-resources=true
Back up only a specific namespace:
velero backup create app-backup --include-namespaces app-namespace
Velero allows you to schedule recurring backups using a cron expression:
velero schedule create daily-backup --schedule="0 3 * * *" --include-cluster-resources=true
This creates a schedule named daily-backup that will run every day at 3:00 AM UTC, backing up the entire cluster.
You can modify the --schedule flag using standard cron syntax.
See the backup reference for all options.
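Scheduled backups expire after a time-to-live, which can be set with the `--ttl` flag as a duration such as `168h0m0s`. The sketch below converts a number of days into that format and prints, rather than runs, a schedule command with a 7-day retention. The helper names are hypothetical, and the schedule name is reused from the example above.

```shell
#!/bin/sh
# Hypothetical helper: convert a number of days into a duration for --ttl.
days_to_ttl() {
  echo "$(( $1 * 24 ))h0m0s"
}

# Print (dry run) a daily 3:00 AM schedule whose backups expire after N days.
print_schedule_cmd() {
  echo "velero schedule create daily-backup --schedule=\"0 3 * * *\" --ttl $(days_to_ttl "$1")"
}

print_schedule_cmd 7
# prints: velero schedule create daily-backup --schedule="0 3 * * *" --ttl 168h0m0s
```

Printing the command first makes it easy to review the retention value before creating the schedule.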
Restore
Get a list of available backups:
velero backup get
The output should look like the following:
NAME                 STATUS     STARTED                        COMPLETED                      EXPIRES                        STORAGE LOCATION  SELECTOR
full-cluster-backup  Completed  2025-06-23 10:00:00 +0000 UTC  2025-06-23 10:05:00 +0000 UTC  2025-07-23 10:00:00 +0000 UTC  default           <none>
app-backup           Completed  2025-06-24 14:30:00 +0000 UTC  2025-06-24 14:31:00 +0000 UTC  2025-07-24 14:30:00 +0000 UTC  default           <none>
Create a restore operation:
velero restore create --from-backup <BACKUP_NAME>
If your backup contains multiple namespaces, you can choose to restore only a subset:
velero restore create --from-backup <BACKUP_NAME> --include-namespaces app1-namespace,app2-namespace
Monitor the restore process:
velero restore get
velero restore describe <RESTORE_NAME>
velero restore logs <RESTORE_NAME>
See the restore reference for detailed options.
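In a script, it can be handy to block until a restore finishes instead of checking by hand. The following is a sketch only: it assumes Velero is installed in the velero namespace (as in the install step above), that restores are readable as restores.velero.io custom resources, and that the phase names listed match your Velero version. The function name `wait_for_restore` is made up for this example.

```shell
#!/bin/sh
# Poll a Velero restore until it reaches a terminal phase.
# Arguments: restore name, number of polls (default 60), delay in seconds (default 10).
# Returns 0 on Completed, 1 on a failed phase, 2 on timeout.
wait_for_restore() {
  name=$1
  tries=${2:-60}
  delay=${3:-10}
  while [ "$tries" -gt 0 ]; do
    # Assumption: the restore object lives in the "velero" namespace.
    phase=$(kubectl -n velero get restores.velero.io "$name" -o jsonpath='{.status.phase}')
    case "$phase" in
      Completed)
        echo "restore $name completed"
        return 0 ;;
      PartiallyFailed|Failed|FailedValidation)
        echo "restore $name ended in phase $phase" >&2
        return 1 ;;
    esac
    tries=$((tries - 1))
    sleep "$delay"
  done
  echo "timed out waiting for restore $name" >&2
  return 2
}
```

A call such as `wait_for_restore <RESTORE_NAME> 30 20` polls up to 30 times at 20-second intervals. If it returns a non-zero status, `velero restore describe` and `velero restore logs` show what went wrong.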