Run the Workbench CLI from a container

How to build a Workbench CLI container image and use it to run Workbench CLI commands

Introduction

At times you may find it useful to run the Workbench CLI from within a Docker container, for example to automate some aspect of workspace management using the CLI.

This tutorial walks through the process of creating and using such a container image. It is assumed that the container will run on a (Google Cloud) node that is configured to use a Verily Workbench workspace service account. The Workbench CLI will be authenticated in the running container using that service account.

We give an example of using the container image for a dsub task, which is configured to use a Workbench workspace service account, and takes as input a script of terra commands to run.

For details on how to build and push a container image, follow the instructions here, using either the Cloud Build or the Docker approach.

0. Create a Verily Workbench workspace

If you haven’t already done so, create a Workbench workspace to use with this tutorial. You can do this from the Workbench UI, or alternatively via the Workbench CLI.

1. Create a Dockerfile

We’ll first define a Dockerfile for the container image. Create a new subdirectory, and write the Dockerfile shown below to that directory.

Filename: Dockerfile
FROM ubuntu:20.04

ARG DEBIAN_FRONTEND=noninteractive
ENV TZ=Etc/UTC

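# Update and install in one layer so the package index is never stale.
# unzip is listed explicitly because the gcloud install below depends on it.
RUN apt-get update -y && \
    apt-get install -y wget zip unzip tzdata curl jq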

# install python
RUN apt-get update && apt-get install -y \
    python3 \
    python3-pip

RUN python3 -m pip --no-cache-dir install --upgrade \
    pip \
    setuptools

# install gcloud sdk
RUN wget -nv https://dl.google.com/dl/cloudsdk/release/google-cloud-sdk.zip && \
    unzip -qq google-cloud-sdk.zip -d tools && \
    rm google-cloud-sdk.zip && \
    tools/google-cloud-sdk/install.sh --usage-reporting=false \
        --path-update=false --bash-completion=false \
        --disable-installation-options && \
    tools/google-cloud-sdk/bin/gcloud -q components update \
        gcloud core gsutil && \
    tools/google-cloud-sdk/bin/gcloud config set component_manager/disable_update_check true && \
    touch /tools/google-cloud-sdk/lib/third_party/google.py

# Java install

RUN apt-get install -y openjdk-17-jre
RUN apt-get install -y openjdk-17-jdk

# terra-cli install

RUN curl -L https://storage.googleapis.com/workbench-public/workbench-cli/download-install.sh | bash
RUN mkdir -p /bin
RUN mv terra /bin

ENV PATH=$PATH:/bin
ENV PATH=$PATH:/tools/google-cloud-sdk/bin

# Set browser manual login
RUN terra config set browser MANUAL
# set to use the production server
RUN terra server set --name=verily

# A script that runs in this container must first auth with app-default-credentials before
# running any other terra cli commands:
# terra auth login --mode=APP_DEFAULT_CREDENTIALS

The definition is straightforward: starting from an Ubuntu base image, we’ll install some utilities, Python, the gcloud SDK, and Java, then the Workbench CLI. Then, we’ll run terra config set browser MANUAL, which sets up terra auth to work in the container context, and terra server set --name=verily, which points the CLI at the production server.

In this example, we’re not adding an ENTRYPOINT or CMD to the container definition, as we’ll instead pass dsub a script to run in the container. However, if you’re using the container image in a different context, you could add what fits your use case, as well as do any additional installations you need.
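If you do want the image to authenticate on startup, one option is a small wrapper script used as the ENTRYPOINT. The following is a minimal sketch and not part of the original tutorial: the file name entrypoint.sh and the function name run_with_terra_auth are hypothetical, and it assumes the container runs on a VM configured to use a workspace service account.

```shell
#!/bin/sh
# entrypoint.sh (hypothetical name): authenticate the Workbench CLI,
# then run whatever command was passed to the container.

run_with_terra_auth() {
  # Authenticate with application default credentials; stop if that fails.
  terra auth login --mode=APP_DEFAULT_CREDENTIALS || return 1
  # Run the requested command, if one was given.
  if [ "$#" -gt 0 ]; then
    "$@"
  fi
}

# A real entrypoint script would end with this call:
#   run_with_terra_auth "$@"
```

To use a script like this, you would copy it into the image and reference it from an ENTRYPOINT instruction in the Dockerfile. For the dsub example in this tutorial, no entrypoint is needed.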

2. Build and push the container image

Follow the instructions here to build and push your container image to a Workbench workspace Artifact Registry repository. You can do this from either your local machine or from a Workbench notebook environment.

Note the URI for the container image; you’ll use it later in the tutorial when configuring dsub.
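To avoid retyping the image URI, you could compose it once and keep it in a shell variable. This is a small sketch: the function name image_uri and the example values are hypothetical, and the URI layout follows the one used in the dsub commands later in this tutorial, with us-central1 assumed as the default region.

```shell
# Compose an Artifact Registry image URI.
# Arguments: project ID, repository name, image name, tag, [region].
# The region defaults to us-central1 if the fifth argument is omitted.
image_uri() {
  printf '%s-docker.pkg.dev/%s/%s/%s:%s\n' "${5:-us-central1}" "$1" "$2" "$3" "$4"
}

# Example with placeholder values (all hypothetical):
IMAGE_URI="$(image_uri my-project my-repo workbench-cli v1)"
echo "${IMAGE_URI}"
```

The same variable can then be passed to whichever build and push commands you use, and later to the dsub --image argument.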

3. Create a GCS bucket resource as necessary

Next, if you don’t already have a controlled resource bucket in your workspace to use for this tutorial, create one now.

This notebook, when run in a workspace notebook environment, sets up some workspace resources for you, including a “temporary” GCS resource that autodeletes its contents after two weeks. You can also create workspace buckets via the UI, as described here.

4. Create a script for dsub to run in the container

Next, we’ll create a small example script for dsub to run in the container.

The first line is always required: it authenticates the Workbench CLI. As noted above, this line assumes that the VM in which the container is running is configured to use a Workbench service account.

The rest of the script can be what you like. This example sets a workspace to use, and then lists the resources in that workspace. We’ll set the workspace ID as an environment variable in the dsub call, shown below.

(For this example script, the workspace must already exist. Alternatively, you could define a script that creates a new workspace and then uses it.)

Filename: test-script.sh
terra auth login --mode=APP_DEFAULT_CREDENTIALS
terra workspace set --id="${WORKSPACE_ID}"
terra resource list
# add additional commands... e.g., create a new resource

Save your script to a file (e.g. test-script.sh).
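If you expect to extend the script, a slightly more defensive variant may be useful. The sketch below wraps the same three terra commands in a function that stops at the first failure and reports a missing WORKSPACE_ID early. The function names check_workspace_id and run_workspace_commands are hypothetical.

```shell
# Fail with a clear message if WORKSPACE_ID was not passed in.
check_workspace_id() {
  if [ -z "${WORKSPACE_ID:-}" ]; then
    echo "error: WORKSPACE_ID is not set; pass it with dsub --env" >&2
    return 1
  fi
}

# Run the tutorial's commands, stopping at the first one that fails.
run_workspace_commands() {
  check_workspace_id || return 1
  terra auth login --mode=APP_DEFAULT_CREDENTIALS || return 1
  terra workspace set --id="${WORKSPACE_ID}" || return 1
  terra resource list
}

# The script itself would end with this call:
#   run_workspace_commands
```

The environment variable check runs before authentication because it does not invoke terra; every terra command still runs only after terra auth login has succeeded.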

5. Use dsub to run some terra commands

Now we are set up to run a dsub task. dsub is already installed in Workbench notebook environments, or you can install it locally if you like. The dsub command requires a logging bucket. Use a workspace-controlled GCS bucket as described in Step 3 above. (If you ran the workspace_setup notebook to create a bucket that autodeletes old content, you may want to use that for the logging bucket.)

In the dsub command-line arguments, you need to specify your workspace project ID, the service account associated with your workspace, the GCS bucket URI(s) to use for the logging directory and the input script, and the URI of the container image you pushed to the workspace’s Artifact Registry.

We’re also passing the workspace ID as an environment variable, to use it in the script.

Run the dsub command in a notebook environment

If you’re running dsub in a workspace notebook environment, environment variables will be set for your project ID and workspace service account. Edit the following command with the rest of your details:

dsub --project $GOOGLE_CLOUD_PROJECT --provider google-cls-v2 --logging gs://<YOUR_BUCKET>/logs \
    --env WORKSPACE_ID=<YOUR_WORKSPACE_ID> \
    --service-account $GOOGLE_SERVICE_ACCOUNT_EMAIL \
    --network network --subnetwork subnetwork \
    --image us-central1-docker.pkg.dev/${GOOGLE_CLOUD_PROJECT}/<YOUR_REPO_NAME>/<CONTAINER_NAME>:<TAG> \
    --script test-script.sh --wait

You can get the GCS URI of a GCS resource via terra resource resolve --name <resource_name>, or from the “details” panel for that resource in the Workbench UI.

You can find the workspace ID in the Overview panel in the UI, or via terra status or terra workspace describe.

Run the dsub command on your local machine

If you’re running the command on your local machine, you will also need to specify your workspace’s project ID and service account. You can find the project ID in the Workbench UI, in the “Overview” panel for your workspace.

You can find your workspace’s service account address by running the following command (ensure first that terra is set to the correct workspace via terra status):

terra app execute env | grep GOOGLE_SERVICE_ACCOUNT_EMAIL
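If you would rather capture the address in a variable than copy it by hand, a small filter can strip the variable name from the matching line. This is a sketch that assumes the env output consists of NAME=value lines; the function name env_value and the sample values are hypothetical.

```shell
# Read NAME=value lines on stdin and print the value for the given NAME.
env_value() {
  sed -n "s/^$1=//p" | head -n 1
}

# Demonstration on sample input (made-up values):
printf 'GOOGLE_CLOUD_PROJECT=my-project\nGOOGLE_SERVICE_ACCOUNT_EMAIL=sa@my-project.iam.gserviceaccount.com\n' \
  | env_value GOOGLE_SERVICE_ACCOUNT_EMAIL

# With the CLI, the same filter would be used like this:
#   SERVICE_ACCOUNT="$(terra app execute env | env_value GOOGLE_SERVICE_ACCOUNT_EMAIL)"
```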

Then edit the following with your details:

dsub --project <YOUR_PROJECT_ID> --provider google-cls-v2 --logging gs://<YOUR_BUCKET>/logs \
    --env WORKSPACE_ID=<YOUR_WORKSPACE_ID> \
    --service-account <YOUR_WORKSPACE_SERVICE_ACCOUNT> \
    --network network --subnetwork subnetwork \
    --image us-central1-docker.pkg.dev/<YOUR_PROJECT_ID>/<YOUR_REPO_NAME>/<CONTAINER_NAME>:<TAG> \
    --script test-script.sh --wait
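As an alternative to editing placeholders in place, the same command can be wrapped in a function that reads your details from shell variables and refuses to run if any are missing. This is a sketch under stated assumptions: the function name submit_terra_job and the variable names PROJECT_ID, SERVICE_ACCOUNT, LOG_BUCKET, and IMAGE_URI are hypothetical, while the dsub arguments are the ones shown above.

```shell
# Submit the tutorial's dsub job using values taken from shell variables.
submit_terra_job() {
  # ${VAR:?message} prints the message and exits a non-interactive shell
  # if VAR is unset or empty.
  : "${PROJECT_ID:?set PROJECT_ID to your workspace project ID}"
  : "${SERVICE_ACCOUNT:?set SERVICE_ACCOUNT to your workspace service account}"
  : "${LOG_BUCKET:?set LOG_BUCKET to your logging bucket name}"
  : "${WORKSPACE_ID:?set WORKSPACE_ID to your workspace ID}"
  : "${IMAGE_URI:?set IMAGE_URI to the pushed container image URI}"

  dsub --project "${PROJECT_ID}" --provider google-cls-v2 \
      --logging "gs://${LOG_BUCKET}/logs" \
      --env WORKSPACE_ID="${WORKSPACE_ID}" \
      --service-account "${SERVICE_ACCOUNT}" \
      --network network --subnetwork subnetwork \
      --image "${IMAGE_URI}" \
      --script test-script.sh --wait
}

# After setting the five variables, submit the job with:
#   submit_terra_job
```

Quoting each variable keeps values containing unusual characters intact, and the guards turn a forgotten value into an immediate error rather than a failed job submission.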

Summary

This tutorial showed how to build and use a container image that can run the Workbench CLI. We used dsub for this example since it provides a convenient way to set up a node using the workspace’s service account, and run a script in the container.

However, the Workbench CLI container could be used in many other contexts. The Dockerfile could be edited to include other installs or include an ENTRYPOINT specification.

Last Modified: 16 November 2023