Playbooks for deployments

General

Debugging

See debugging.

Check what version of Sourcegraph is deployed

Install sg, the Sourcegraph developer tool, then use the sg live command to see the version currently deployed to a specific environment:

sg live <environment|url>

Sourcegraph.com

To learn more about this deployment, see instances.

Deploying to sourcegraph.com

Every commit to the release branch (the default branch) on deploy-sourcegraph-cloud deploys the Kubernetes YAML in this repository to our dot-com cluster in CI (i.e. if CI is green then the latest config in the release branch is deployed).

Deploys on sourcegraph.com are currently handled by Renovate. The Renovate dashboard shows logs for previous runs and allows you to predict when the next run will happen.

If you want to expedite a deploy, you can manually create and merge a PR that updates the Docker image tags in deploy-sourcegraph-cloud. You can find the desired Docker image tags by looking at the output of the Docker build step in CI on sourcegraph/sourcegraph main branch or by looking at Docker Hub.

Deploying to sourcegraph.com during code freeze

Note: sample for code freeze.

Image build

To ensure stability during a code freeze, a separate release/YYYY-MM-dd branch will be created from main, and only approved commits will be cherry-picked onto it for release. To ensure compatibility between the main and release/YYYY-MM-dd branches, ALL commits must first be merged to main and pass CI before being cherry-picked. All tests will be run on the release/YYYY-MM-dd branch and must pass before Docker images are published to Docker Hub. When creating a hotfix PR in sourcegraph/sourcegraph, it is important to create a branch with the prefix main-dry-run/ (this enables a CI pipeline similar to the one that runs against every commit on the main branch). This can be done with:

sg ci build main-dry-run

More about pipeline run types.

Deploy

During the code freeze, Renovate will be disabled on YYYY-MM-dd + and no automatic updates to Kubernetes manifests will be made. To deploy your changes, you can manually create and merge (requires approval from CloudDevops) a PR that updates the Docker image tags in deploy-sourcegraph-cloud. You can find the desired Docker image tags by looking at the output of the Docker build step in CI on sourcegraph/sourcegraph release/YYYY-MM-dd branch or by looking at Docker Hub.

Once your PR has been merged, you can follow the deployment via CI on the release branch.

Manually deploying a service to sourcegraph.com

Sometimes you need to manually deploy a service to sourcegraph.com instead of relying on our CD process, e.g. for a hotfix during a code freeze.

Usually you’ll know the build from which you’d like to deploy. We’ll use a specific build of gitserver as an example:

  1. Find the green build in Buildkite

  2. Find the step that built the Docker image for your service

  3. Find the image, which will have the format index.docker.io/sourcegraph/{SERVICE}:{TIMESTAMP}@sha256:{HASH}

  4. Pull the latest from deploy-sourcegraph-cloud

  5. Check out the release branch

  6. Create a new branch

  7. Run the update-images.py script using the image URL from step 3. For example:

    ./update-images.py index.docker.io/sourcegraph/gitserver:118059_2021-11-29_05fcc11@sha256:0c8a862e7977a830e2fa8a690ac243eea1255c150766a44b6c6c86df959d224f
    
  8. Commit and push your changes, then have them reviewed by CloudDevops.

  9. Once merged, the CD process will take over and deploy the image(s) you’ve updated
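Before running update-images.py in step 7, it can help to sanity-check that the string copied from Buildkite matches the format from step 3. The following is a minimal sketch, not part of deploy-sourcegraph-cloud: parse_image_ref is a hypothetical helper name, and it uses only POSIX shell parameter expansion.

```shell
# Hypothetical helper: split an image reference of the form
#   index.docker.io/sourcegraph/{SERVICE}:{TIMESTAMP}@sha256:{HASH}
# into its parts, or fail if the reference does not have that shape.
parse_image_ref() {
  ref="$1"
  case "$ref" in
    index.docker.io/sourcegraph/*:*@sha256:*) ;;
    *) echo "unexpected image reference: $ref" >&2; return 1 ;;
  esac
  name_and_tag="${ref%%@*}"        # everything before "@sha256:..."
  digest="${ref#*@}"               # "sha256:{HASH}"
  service="${name_and_tag##*/}"    # "{SERVICE}:{TIMESTAMP}"
  tag="${service#*:}"              # "{TIMESTAMP}"
  service="${service%%:*}"         # "{SERVICE}"
  printf 'service=%s\ntag=%s\ndigest=%s\n' "$service" "$tag" "$digest"
}

# Example usage with the gitserver image from step 7:
#   parse_image_ref "index.docker.io/sourcegraph/gitserver:118059_2021-11-29_05fcc11@sha256:0c8a862e7977a830e2fa8a690ac243eea1255c150766a44b6c6c86df959d224f"
```

A reference that lacks the registry prefix or the sha256 digest is rejected with a non-zero exit status, which makes a bad copy-paste visible before any manifests are changed.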

Rolling back sourcegraph.com

To roll back sourcegraph.com, push a new commit to the release branch in deploy-sourcegraph-cloud that reverts the image tags and configuration to the desired state.

# Ensure that you're up-to-date
git checkout release
git pull

# Roll back the release branch to $COMMIT
# See https://stackoverflow.com/a/21718540 if you want more context
git revert --no-commit $COMMIT..HEAD

# Push the revert commit back to the release branch
git commit
git push origin release

Buildkite will deploy the working commit to sourcegraph.com.
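To see why the revert range above restores the tree as it was at $COMMIT, the sketch below replays the same commands in a throwaway repository. It assumes git is installed; the file name gitserver.yaml, the tag names, and the demo_rollback helper are made up for illustration, and nothing here touches deploy-sourcegraph-cloud.

```shell
# Hypothetical demo: build three commits in a temporary repository, then roll
# back to the first one with the same revert range used above.
demo_rollback() {
  repo="$(mktemp -d)" || return 1
  (
    set -e
    cd "$repo"
    git init -q .
    git config user.email "demo@example.com"
    git config user.name "Demo"
    # One known-good deploy followed by two bad ones.
    for tag in good bad1 bad2; do
      echo "image: gitserver:$tag" > gitserver.yaml
      git add gitserver.yaml
      git commit -q -m "deploy $tag"
    done
    COMMIT="$(git rev-parse HEAD~2)"   # the known-good commit
    # Stage the inverse of every commit after $COMMIT without committing yet.
    git revert --no-commit "$COMMIT..HEAD" >/dev/null
    # Record the rollback as a single new commit; history is not rewritten.
    git commit -q -m "Roll back to $COMMIT"
    cat gitserver.yaml
  )
  rc=$?
  rm -rf "$repo"
  return $rc
}

# Example usage:
#   demo_rollback
```

Running demo_rollback should print image: gitserver:good, the content from the first commit. Because the rollback is an ordinary new commit on top of the branch, it can be pushed without a force-push.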

🚨 You also need to disable auto-deploys to prevent Renovate from automatically merging in image digest updates, so that the site doesn’t roll forward.

Disable Renovate

  1. Go to renovate.json and comment out the file.
  2. Ensure that no Renovate PRs are currently pending to update the images here
  3. After the incident, revert your commit and uncomment the file.

Backing up & restoring a Cloud SQL instance (production databases)

Before any potentially risky operation, ensure the databases have a recent (less than 1 hour old) backup. Only daily backups are currently enabled, so you may need to create a fresh backup manually.

You can create a backup of a Cloud SQL instance via gcloud sql backups create --instance=${instance_name} --project=sourcegraph-dev

To restore a Cloud SQL instance to a previous revision you can use gcloud sql backups restore $BACKUP_ID --restore-instance=${instance_name}

You can also perform these commands from the Google Cloud SQL UI
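The "less than 1 hour old" rule can be written down as a small check. This is a minimal sketch under stated assumptions: backup_is_recent is a hypothetical helper, not a gcloud feature, and it expects both timestamps as Unix epoch seconds. Converting the timestamps reported by gcloud into epoch seconds is left out.

```shell
# Hypothetical helper: succeed when a backup taken at BACKUP_EPOCH is less
# than MAX_AGE seconds old at NOW_EPOCH (default MAX_AGE: 3600, i.e. 1 hour).
backup_is_recent() {
  backup_epoch="$1"
  now_epoch="$2"
  max_age="${3:-3600}"
  age=$(( now_epoch - backup_epoch ))
  # Reject timestamps in the future as well as backups that are too old.
  [ "$age" -ge 0 ] && [ "$age" -lt "$max_age" ]
}

# Example usage (the backup timestamp is made up):
#   backup_is_recent 1700000000 "$(date +%s)" || echo "create a fresh backup first"
```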

🚨 You should notify the #dev-ops channel if a situation arises where a restore may be required. It should also be filed in our ops-incident log.

Invalidating all user sessions

If all user sessions need to be invalidated, you can run this on the frontend database to force all users to log in again.

UPDATE users SET invalidated_sessions_at=now(), updated_at=now();

Accessing sourcegraph.com database

Via the CLI

Sourcegraph.com uses an external HA database, so you will need to connect to it directly. The easiest way to do this is through the gcloud CLI.

To connect to the production database:

  gcloud beta sql connect sg-cloud-732a936743 --user=sg -d sg --project sourcegraph-dev

However, if you want to use any other SQL client, you’ll have to run the cloud_sql_proxy utility, which authenticates with your local gcloud credentials automatically.

  cloud_sql_proxy -instances=sourcegraph-dev:us-central1:sg-cloud-732a936743=tcp:5555

Once the proxy connects successfully, you can use any client to connect to the local 5555 port (you can choose any other port you want).
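As one example of "any client", the sketch below connects with psql through the local proxy port. It assumes psql is installed and that the proxy is already running; psql_via_proxy is a hypothetical wrapper name, while the sg user and sg database names come from the gcloud command above.

```shell
# Hypothetical wrapper: connect to the proxied database with psql.
# The first argument is the local proxy port (default 5555, as used above).
psql_via_proxy() {
  port="${1:-5555}"
  # psql prompts for the password of the sg user.
  psql --host=127.0.0.1 --port="$port" --username=sg --dbname=sg
}

# Example usage:
#   psql_via_proxy        # proxy listening on tcp:5555
#   psql_via_proxy 6000   # proxy started with =tcp:6000 instead
```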

The password for the sg user is in our shared 1Password under Google Cloud SQL.

Via BigQuery (for read-only operations)

You can also query the production database via BigQuery as an external data source.

See an example query to get started.

Note: This method only permits read-only access

Restarting docs.sourcegraph.com

To restart the services powering docs.sourcegraph.com:

Configure kubectl context to the dotcom cluster

gcloud container clusters get-credentials cloud --zone us-central1-f --project sourcegraph-dev

Roll out a restart

kubectl -n default rollout restart deploy docs-sourcegraph-com

Wait a moment, and check https://docs.sourcegraph.com/.
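Instead of waiting an unspecified moment, kubectl can block until the restarted pods are ready. This is a minimal sketch assuming the kubectl context is already set as described above; restart_docs is a hypothetical wrapper name and the 120s default timeout is an arbitrary choice.

```shell
# Hypothetical wrapper: restart the docs deployment, then wait for the
# rollout to finish. The first argument is the timeout (default 120s).
restart_docs() {
  kubectl -n default rollout restart deploy docs-sourcegraph-com &&
    kubectl -n default rollout status deploy docs-sourcegraph-com --timeout="${1:-120s}"
}

# Example usage:
#   restart_docs        # wait up to 120s
#   restart_docs 5m     # wait up to 5 minutes
```

If the rollout does not complete within the timeout, rollout status exits with a non-zero status, so the function fails too.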

Creating banners for maintenance tasks

For database upgrades or other tasks that might cause some aspects of Sourcegraph to be unavailable, we provide a banner across the top of the site. This can be done via editing Global Settings with the following snippet.

"notices": [
    {
        "dismissible": false,
        "location": "top",
        "message": "🚀 Sourcegraph is undergoing a scheduled upgrade. You may be unable to perform some write actions during this time, such as updating your user settings."
    }
]

k8s.sgdev.org

To learn more about this deployment, see instances.

Manage users in k8s.sgdev.org

To create an account on k8s.sgdev.org, log in with your Sourcegraph Google account via OpenID Connect.

To promote a user to site admin (required to make configuration changes), use the admin user credentials available in 1Password (titled k8s.sgdev.org admin user) to log in to k8s.sgdev.org, and go to the users page to promote the desired user.

PostgreSQL

See PostgreSQL

Cloudflare Configuration

For documentation on how to configure Cloudflare’s WAF and rate limiter, see the security documentation.