Restoring a Cloud instance

SOC2/CI-110

Follow break glass process to ensure you have the proper access to perform this playbook.

Restoring Cloud SQL

Use cases:

  • Cloud SQL data is corrupted by a broken database migration
  • Cloud SQL data is deleted

Restore from automated backup

Below process is derived from GCP documentation

The restoration process will be performed with gcloud. Learn more about why not terraform?.

List all backups, note the id of the latest (or the one right before database state is corrupted) SUCCESSFUL backup as SQL_BACKUP_ID

mi2 instance sql-backup list --slug $SLUG -e $ENVIRONMENT

Restore the backup to the current instance.

mi2 instance sql-restore create --backup-id $SQL_BACKUP_ID --slug $SLUG -e $ENVIRONMENT

List operations to watch for progress

mi2 instance sql-restore list --slug $SLUG -e $ENVIRONMENT

Restore GKE cluster application(s)

Use cases (tested scenarios):

  • GKE cluster was deleted
  • application namespaces was deleted
  • single application was deleted
  • PV from single application was deleted

Backup and restore uses native GKE mechanism.

  1. List available backups
  2. Assess the damage
    1. GKE cluster is gone
    2. The namespace is gone
    3. Stateful services is corrupted or PV/PVC/disk is gone
    4. Missing stateless deployment
    5. Missing stateful services deploymnet

List backups

mi2 instance backup list --slug $SLUG -e $ENVIRONMENT

note the backup name, you will need it later.

Restore cluster and applications from backup

WARNING: this will spin up a new GKE cluster and replace it with the previous backup

cd sourcegraph/cloud
cd environments/$ENVIRONMENT/deployments/$INSTANCE_ID/terraform/infra
terraform init
terraform apply
mi2 instance workon -e $ENVIRONMENT --slug $SLUG
cd sourcegraph/cloud
mi2 instance restore create --backup-name $BACKUP_NAME --restore-type full-replace --slug $SLUG -e $ENVIRONMENT

Restore the full namespace

WARNING: this will restore both application statue and disk to the state of backup

cd sourcegraph/cloud
mi2 instance restore create --backup-name <BACKUP_NAME> --restore-type full-replace --slug $SLUG -e $ENVIRONMENT

Restore stateless application

e.g. sourcegraph-frontend

WARNING: make sure no disk or statefulset application is deleted, or we will risk data loss

cd environments/$ENVIRONMENT/deployments/$INSTANCE_ID/kubernetes
kustomize build --load-restrictor LoadRestrictionsNone --enable-helm . | kubectl apply -f -

Restore statefull application from disk backup

e.g. gitserver, zoekt

cd sourcegraph/cloud
mi2 instance restore create --backup-name $BACKUP_NAME --restore-type [gitserver|indexed-search] --slug $SLUG -e $ENVIRONMENT

Restore statefull application with empty disk

e.g. gitserver, zoekt

WARNING: this assumes deletion of StatefulSet and/or application PVC/PV. GCP will create new empty disk and attach it. You probally want to restore with disk backup

cd environments/$ENVIRONMENT/deployments/$INSTANCE_ID/kubernetes
kustomize build --load-restrictor LoadRestrictionsNone --enable-helm . | kubectl apply -f -