Using the Workbench CLI

How to use the Workbench CLI tool.

This document gives an overview of the Workbench CLI tool’s commands and how to use them.

Command descriptions

Usage: terra [COMMAND]
Workbench CLI
Commands:
  app        Run applications in the workspace.
  auth       Retrieve and manage user credentials.
  bq         Call bq in the Terra workspace.
  config     Configure the CLI.
  cromwell   Generate a Cromwell configuration file.
  folder     Commands related to folder.
  gcloud     Call gcloud in the Terra workspace.
  git        Call git in the Terra workspace.
  group      Manage groups of users.
  gsutil     Call gsutil in the Terra workspace.
  nextflow   Call nextflow in the Terra workspace.
  notebook   Use GCP Notebooks in the workspace.
  resolve    Resolve a resource to its cloud id or path.
  resource   Manage resources in the workspace.
  server     Connect to a Terra server.
  spend      Manage spend profiles.
  status     Print details about the current workspace and server.
  user       Manage users.
  version    Get the installed version.
  workspace  Set up a Terra workspace.

The status command prints details about the current workspace and server.

The version command prints the installed version string.

The gcloud, git, gsutil, bq, and nextflow commands call third-party applications in the context of a Verily Workbench workspace.

The resolve command is an alias for the terra resource resolve command.

The other commands are groupings of sub-commands, described in the sections below.

Applications

Usage: terra app [COMMAND]
Run applications in the workspace.
Commands:
  execute  [FOR DEBUG] Execute a command in the application container for the
             Terra workspace, with no setup.
  list     List the supported applications.

The Workbench CLI allows running supported third-party tools within the context of a workspace. To see supported tools, run terra app list.

The app-launch configuration property controls how tools are run: in a Docker container, or a local child process.

If you pass the --workspace flag, it must come immediately after the tool name:

# Works
> terra bq --workspace=<workspace-id> ls

# Doesn't work, --workspace is passed to bq instead of Workbench
> terra bq ls --workspace=<workspace-id>
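
In scripts, one way to avoid misplacing the flag is a small wrapper that always inserts --workspace directly after the tool name. This is a minimal sketch: terra_in_workspace is a hypothetical helper name, not part of the CLI.

```shell
# Hypothetical helper (not part of the CLI): run a third-party tool through
# terra with --workspace placed immediately after the tool name.
# Usage: terra_in_workspace <tool> <workspace-id> [tool args...]
terra_in_workspace() {
  tool=$1
  workspace_id=$2
  shift 2
  # Everything after --workspace is passed through to the tool unchanged.
  terra "$tool" --workspace="$workspace_id" "$@"
}
```

For example, terra_in_workspace bq <workspace-id> ls runs the same command as the working example above.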

To create resources such as a BigQuery dataset or a GCS bucket, use Workbench rather than the GCP tools directly, because Workbench configures permissions for you.

# Works
> terra resource create gcs-bucket --name=<resource-name>

# Doesn't work
> terra gsutil mb gs://<bucket-name>

Authentication

Usage: terra auth [COMMAND]
Retrieve and manage user credentials.
Commands:
  login   Authorize the CLI to access Terra APIs and data with user credentials.
  revoke  Revoke credentials from an account.
  status  Print details about the currently authorized account.

Only one user can be logged in at a time. Call terra auth login to log in as a different user.

Login uses the Google OAuth 2.0 installed application flow.

You don’t need to log in again after switching workspaces. You will need to log in again after switching servers, because different Workbench deployments may have different OAuth flows.

By default, the CLI opens a browser window for the user to click through the OAuth flow. In some environments (e.g. Cloud Shell or a cloud environment) this is not practical, because the machine has no default browser, or no browser at all. The browser configuration property controls this behavior. After running terra config set browser MANUAL, the user can copy the login URL into a browser on a different machine (e.g. their laptop), complete the login prompt, and then copy and paste the response token back into the shell on the machine where they want to use the Workbench CLI. Example usage:

> terra config set browser MANUAL
Browser launch mode for login is MANUAL (CHANGED).

> terra auth login
Please open the following address in a browser on any machine:
  https://accounts.google.com/o/oauth2/auth?access_type=offline&approval_prompt=force&client_id=[...]
Please enter code: *****
Login successful: testuser@gmail.com

Config

Usage: terra config [COMMAND]
Configure the CLI.
Commands:
  get   Get a configuration property value.
  list  List all configuration properties and their values.
  set   Set a configuration property value.

These commands are property getters and setters for configuring the Workbench CLI. Currently the available configuration properties are:

OPTION                VALUE                                          DESCRIPTION
app-launch            DOCKER_CONTAINER                               app launch mode
browser               AUTO                                           browser launch for login
image                 gcr.io/terra-cli-dev/terra-cli/0.246.0:stable  docker image id
resource-limit        1000                                           max number of resources to allow per workspace
console-logging       OFF                                            logging level for printing directly to the terminal
file-logging          INFO                                           logging level for writing to files
server                broad-dev-cli-testing                          (unset)
workspace             (unset)                                        (unset)
format                TEXT                                           output format

Cromwell

Utility commands for using the Cromwell workflow engine with Workbench.

Usage: terra cromwell [COMMAND]
Commands related to Cromwell workflows.
Commands:
  generate-config  Generate a Cromwell configuration file (cromwell.conf) for use on a workspace cloud environment.

To use Cromwell to run a WDL workflow in a cloud environment:

  • Run:
    terra cromwell generate-config \
        (--workspace-bucket-name=bucket_name | --google-bucket-name=gs://my-bucket) \
        [--dir=my/path]
    
  • One of workspace-bucket-name or google-bucket-name is required to specify the bucket used by Cromwell for workflow orchestration.
    • workspace-bucket-name is a Workbench resource name.
    • google-bucket-name is a Google Cloud Storage bucket. If google-bucket-name does not begin with the gs:// prefix, it will be automatically added.
  • Run java -Dconfig.file=path/to/cromwell.conf -jar cromwell/cromwell-81.jar server. This starts the Cromwell server on localhost:8000.
  • In another terminal window, run cromshell. When prompted for the Cromwell server, enter localhost:8000.
  • Start a workflow through cromshell, e.g. cromshell submit workflow.wdl inputs.json [options.json] [dependencies.zip]

For more information, see https://github.com/broadinstitute/cromshell.
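
The prefix handling described for google-bucket-name can be mirrored in a script that accepts bucket names with or without gs://. This is a sketch under stated assumptions: normalize_bucket_name and generate_cromwell_config are hypothetical helper names, and only the flags shown above are used.

```shell
# Mirror the documented behavior: add the gs:// prefix only when it is missing.
normalize_bucket_name() {
  case "$1" in
    gs://*) printf '%s\n' "$1" ;;
    *)      printf 'gs://%s\n' "$1" ;;
  esac
}

# Hypothetical wrapper: generate cromwell.conf for a Google bucket.
# Usage: generate_cromwell_config <bucket> [dir]
generate_cromwell_config() {
  bucket=$(normalize_bucket_name "$1")
  if [ -n "${2:-}" ]; then
    terra cromwell generate-config --google-bucket-name="$bucket" --dir="$2"
  else
    terra cromwell generate-config --google-bucket-name="$bucket"
  fi
}
```

The CLI already adds the prefix itself, so the helper is only useful when the same script also passes the bucket URL to other tools.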

Git

Usage: terra git [COMMAND]
Call git in the Terra workspace. Besides running normal Git operations, this command allows cloning git-repo resources in the workspace.
Commands:
  all        Clone all the git-repo resources in the workspace. Usage: terra git clone --all
  resource   Clone specified git-repo resources in the workspace. Usage: terra git clone --resource=<repoResource1Name> --resource=<repoResource2Name>

To add a git repo:

> terra resource add-ref git-repo --name=<resource_name> --repo-url=<repo_url>

Groups

Usage: terra group [COMMAND]
Manage groups of users.
Commands:
  add-user     Add a user to a group with a given policy.
  create       Create a new Terra group.
  delete       Delete an existing Terra group.
  describe     Describe the group.
  list         List the groups to which the current user belongs.
  list-users   List the users in a group.
  remove-user  Remove a user from a group with a given policy.

Workbench Groups are managed by SAM. These commands are utility wrappers around the group endpoints.

Suppose a Workbench Group’s email is mygroup@mydomain.com. The value to pass as the name is mygroup, not mygroup@mydomain.com:

> terra group list-users --name=mygroup
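
If a script only has the group's email address, the name can be derived by dropping the domain part. A minimal sketch, in which both function names are hypothetical:

```shell
# Strip everything from the first "@" onward: mygroup@mydomain.com -> mygroup
group_name_from_email() {
  printf '%s\n' "${1%%@*}"
}

# Hypothetical wrapper: list a group's users given the group's email address.
list_group_users_by_email() {
  terra group list-users --name="$(group_name_from_email "$1")"
}
```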

Adding a member to a Workbench Group implicitly adds their pet service accounts. For example, say terra-user is added to mygroup@mydomain.com. When mygroup is granted access to a resource, terra-user is able to access that resource from any of their workspaces.

gsutil

You can run terra gsutil or terra gcloud alpha storage. gcloud alpha storage is a newer version of gsutil. It doesn’t support everything, but what it does support may be significantly faster.

Notebooks

Usage: terra notebook [COMMAND]
Use GCP Notebooks in the workspace.
Commands:
  start  Start a stopped GCP Notebook instance within your workspace.
  stop   Stop a running GCP Notebook instance within your workspace.

You can create a GCP Notebook controlled resource with terra resource create gcp-notebook --name=<resourcename> [--workspace=<id>]. The stop and start commands are provided for convenience. You can also stop and start the notebook using the gcloud notebooks instances start/stop commands.

Resources

Usage: terra resource [COMMAND]
Manage resources in the workspace.
Commands:
  add-ref, add-referenced    Add a new referenced resource.
  check-access               Check if you have access to a referenced resource.
  create, create-controlled  Add a new controlled resource.
  credentials                Retrieve temporary credentials to access a cloud resource.
  delete                     Delete a resource from the workspace.
  describe                   Describe a resource.
  list                       List all resources.
  list-tree                  List all resources and folders in tree view.
  mount                      Mounts all workspace bucket resources.
  resolve                    Resolve a resource to its cloud id or path.
  unmount                    Unmounts all workspace bucket resources.
  update                     Update the properties of a resource.

A controlled resource is a cloud resource that is managed by Workbench. It exists within the current workspace context; for example, a bucket within the workspace’s Google project. You can create these with the create command.

A referenced resource is a cloud resource that is NOT managed by Workbench. It exists outside the current workspace context; for example, a BigQuery dataset hosted outside of Workbench or in another workspace. You can add these with the add-ref command. The workspace currently supports the following referenced resource types:

  • gcs-bucket
  • gcs-object
  • bq-dataset
  • bq-table
  • git-repo

The check-access command lets you see whether you have access to a particular resource. This is useful when a different user created or added the resource and subsequently shared the workspace with you. check-access currently always returns true for the git-repo reference type, because workspaces don’t support authentication to external git services yet.

The list of resources in a workspace is maintained on the Workbench Workspace Manager server. The CLI caches this list of resources locally. Third-party tools can access resource details via environment variables (e.g. $TERRA_mybucket holds the gs:// URL of the workspace bucket resource named mybucket). The CLI updates the cache on every call to a terra resource command. So, if you are working in a shared workspace, you can run terra resource list (for example) to pick up any changes that your collaborators have made.
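
A script that runs where these variables are defined can look up a resource URL by name. This sketch assumes the TERRA_<resource-name> naming shown in the example above; require_resource_url is a hypothetical helper name.

```shell
# Print the URL stored in the environment variable for a resource, assuming
# the TERRA_<resource-name> naming pattern from the example above.
# Usage: require_resource_url <resource-name>
require_resource_url() {
  url=$(printenv "TERRA_$1") || {
    # The variable is missing; the local resource cache may be stale.
    echo "No variable TERRA_$1 found; try running: terra resource list" >&2
    return 1
  }
  printf '%s\n' "$url"
}
```

The error message points at terra resource list because any terra resource command refreshes the local cache.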

GCS bucket lifecycle rules

GCS bucket lifecycle rules are specified by passing a JSON-formatted file path to the terra resource create gcs-bucket command. The expected JSON structure matches the one used by the gsutil lifecycle command. This structure is a subset of the GCS resource specification. Below are some example file contents for specifying a lifecycle rule.

(1) Change the storage class to ARCHIVE after 10 days.

{
  "rule": [
    {
      "action": {
        "type": "SetStorageClass",
        "storageClass": "ARCHIVE"
      },
      "condition": {
        "age": 10
      }
    }
  ]
}

(2) Delete any objects with storage class STANDARD that were created before December 3, 2007.

{
  "rule": [
    {
      "action": {
        "type": "Delete"
      },
      "condition": {
        "createdBefore": "2007-12-03",
        "matchesStorageClass": [
          "STANDARD"
        ]
      }
    }
  ]
}

(3) Delete any objects that are more than 365 days old.

{
  "rule": [
    {
      "action": {
        "type": "Delete"
      },
      "condition": {
        "age": 365
      }
    }
  ]
}

There is also a command shortcut for specifying this type of lifecycle rule (3).

terra resource create gcs-bucket --name=mybucket --bucket-name=mybucket --auto-delete=365
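
For other retention periods, the same rule structure as example (3) can be generated rather than written by hand. A minimal sketch, where auto_delete_rule is a hypothetical helper name:

```shell
# Print a lifecycle rule that deletes objects older than the given number of
# days, using the same structure as example (3).
# Usage: auto_delete_rule <days>
auto_delete_rule() {
  days=$1
  # Reject anything that is not a non-negative integer.
  case "$days" in
    ''|*[!0-9]*) echo "days must be a non-negative integer" >&2; return 1 ;;
  esac
  cat <<EOF
{
  "rule": [
    {
      "action": {
        "type": "Delete"
      },
      "condition": {
        "age": $days
      }
    }
  ]
}
EOF
}
```

For example, auto_delete_rule 30 > lifecycle.json writes a file whose path can then be passed to the create command.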

GCS bucket object reference

A reference to a GCS bucket object can be created by calling

terra resource add-ref gcs-object --name=referencename --bucket-name=mybucket --object-name=myobject

Reference to a file or folder

Both files and folders are treated as objects in a GCS bucket. A folder object is created when a user creates a folder through the cloud console UI or copies an existing folder of files to the GCS bucket. A user can create a reference to a folder, or to a file, if they have at least READER access to the bucket or to the folder.

Reference to multiple objects under a folder

Unlike other referenced resource types, GCS object references can also point to multiple objects in a folder. For instance, a user may create a foo/ folder with bar.txt and secret.txt in it. If the user has at least READER access to the foo/ folder, they have access to anything in it. So they can add a reference to foo/bar.txt, foo/* or foo/*.txt.

NOTE: Be careful to provide the correct object name when creating a reference. Workbench only checks whether the user has READER access to the provided path; it does not check whether the object exists. This is helpful because a pattern such as foo/* is not a real object. As a consequence, a reference to fooo/ (where the object fooo does not exist) can be created if the user has READER access to the bucket, and a reference to foo/*.png (where foo/ contains no png files) can be created if the user has access to the foo/ folder.
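
When a wildcard pattern is passed on the command line, it should be quoted so that the local shell does not expand it against local files first. A minimal sketch using only the flags shown above; add_object_ref is a hypothetical helper name.

```shell
# Hypothetical wrapper around the add-ref command shown above.
# Usage: add_object_ref <reference-name> <bucket-name> <object-name-or-pattern>
add_object_ref() {
  # Each value is double-quoted so a pattern such as foo/*.txt reaches the CLI
  # literally instead of being expanded by the shell.
  terra resource add-ref gcs-object --name="$1" --bucket-name="$2" --object-name="$3"
}
```

For example, add_object_ref foo_txt mybucket 'foo/*.txt' creates a reference named foo_txt to the txt files under foo/.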

Update a Reference resource

Users can update the name and description of a reference resource. Users can also update a reference resource to point to another resource of the same type. For instance, if a user creates a reference to BigQuery dataset foo and later wants to point it to BigQuery dataset bar in the same project, they can run terra resource update --name=<fooReferenceName> --new-dataset-id=bar to update the reference. However, a reference cannot be updated to a different type (e.g. updating a dataset reference to a table reference is not allowed).

Mounting workspace Resources

Users can mount GCS buckets and referenced folder objects locally under $HOME/workspace/ in their home directory by running terra resource mount.

Users can specify the --name flag with the name of a GCS bucket or GCS object resource to only mount that individual resource. This flag is useful for remounting a resource that had failed to mount or has been moved to a different folder in the workspace.

By default, controlled GCS buckets and referenced folder objects created by the user are mounted with read-write permissions, while controlled buckets created by other users and referenced bucket folders are mounted with read-only permissions. Users can override this default with the --read-only flag. For example, terra resource mount --read-only makes all mounts read-only, and terra resource mount --name=mybucket --read-only=false mounts the mybucket resource read-write.
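
In scripts it can be clearer to state the intended mode explicitly for each resource rather than rely on the defaults. A minimal sketch combining the --name and --read-only flags described above; mount_resource is a hypothetical helper name.

```shell
# Hypothetical helper: mount one resource, choosing the mode explicitly.
# Usage: mount_resource <resource-name> <ro|rw>
mount_resource() {
  case "$2" in
    ro) terra resource mount --name="$1" --read-only ;;
    rw) terra resource mount --name="$1" --read-only=false ;;
    *)  echo "mode must be ro or rw" >&2; return 1 ;;
  esac
}
```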

Users can specify the --disable-cache flag. This will disable file metadata caching and file type caching for objects in the mounted buckets. List operations such as ls will be slower, but will reflect the most up-to-date state of the bucket. This is useful when working with collaborators in a shared workspace. See more details in the gcsfuse repository.

Mount Failures

If a mount fails, an empty directory is left at the mount point, named with the resource name plus a suffix indicating the error. After resolving the bucket access or bucket reference issue, users can remount the bucket with terra resource mount.

Unmounting a single resource can fail if the resource has been renamed or moved to a different workspace folder. In this case, users can run terra resource unmount to unmount all mounted resources in $HOME/workspace/. Alternatively, users can list all mounted filesystems with mount and then unmount the resource by its mount path, using fusermount -u (on Linux) or umount (on macOS).
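
The manual fallback can be sketched as a helper that picks the unmount command for the current operating system. This assumes uname -s reports Linux or Darwin (macOS); unmount_command and force_unmount are hypothetical helper names.

```shell
# Print the unmount command for the given OS name (defaults to `uname -s`).
unmount_command() {
  case "${1:-$(uname -s)}" in
    Linux)  echo "fusermount -u" ;;
    Darwin) echo "umount" ;;   # uname -s reports Darwin on macOS
    *)      echo "unsupported OS" >&2; return 1 ;;
  esac
}

# Hypothetical fallback: unmount one path directly.
# Usage: force_unmount <mount-path>
force_unmount() {
  cmd=$(unmount_command) || return 1
  # $cmd is left unquoted on purpose so "fusermount -u" splits into two words.
  $cmd "$1"
}
```

The mount path to pass to force_unmount can be found in the output of mount.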

Server

Usage: terra server [COMMAND]
Connect to a Terra server.
Commands:
  list    List all available Terra servers.
  set     Set the Terra server to connect to.
  status  Print status and details of the Terra server context.

A Workbench server or environment is a set of connected Workbench services (e.g. Workspace Manager, Data Repo, SAM).

Workspaces exist on a single server, so switching servers will change the list of workspaces available to you.

Spend

These commands are intended for admin users. Admins, see ADMIN.md for more details.

User

ssh-key

Ensure you have the latest CLI version. To install a new CLI version, first manually uninstall the existing CLI via rm -R ~/.terra, and then install the latest CLI.

terra user ssh-key helps Workbench support source control in a notebook environment. It manages the SSH key of the current user. There is one Workbench SSH key per user. With this key, you can use git for source control in a Workbench-managed notebook instance.

To set up an SSH key, run terra user ssh-key add to add the Workbench SSH key to your local machine. You should see in the output an SSH public key starting with ssh-rsa. Then copy the public key from the command output and add it to GitHub, as per these instructions.

If you think your key is compromised (e.g. the private key on your local machine is leaked to another user), you must delete the key from your GitHub account and run terra user ssh-key generate to generate a new Workbench SSH key. Once a new key is generated, you need to associate this new key with your GitHub account again.

Workspace

Usage: terra workspace [COMMAND]
Set up a Terra workspace.
Commands:
  add-user         Add a user or group to the workspace.
  break-glass      Grant break-glass access to a workspace user.
  clone            Clone an existing workspace.
  create           Create a new workspace.
  delete           Delete an existing workspace.
  delete-property  Delete the workspace properties.
  describe         Describe the workspace.
  list             List all workspaces the current user can access.
  list-users       List the users of the workspace.
  remove-user      Remove a user or group from the workspace.
  set              Set the workspace to an existing one.
  set-property     Set the workspace properties.
  update           Update an existing workspace.

A Workbench workspace is backed by a Google project. Creating/deleting a workspace also creates/deletes the project.

The break-glass command is intended for admin users. Admins, see ADMIN.md for more details.

Folder

Usage: terra folder [COMMAND]
Commands related to folder.
Commands:
  tree          show the folder hierarchy.
  set-property  Set the folder properties.
  create        Create a folder.
  delete        Delete a folder.
  update        Update a folder.

Example usage

The commands below walk through a brief demo of the existing commands.

Fetch the user’s credentials. Check the authentication status to confirm the login was successful.

terra auth login
terra auth status

Ping the Terra server.

terra server status

Create a new Terra workspace and backing Google project. Check the current context to confirm it was created successfully.

terra workspace create --id=<my-workspace-id>
terra status

List all workspaces the user has read or write access to.

terra workspace list

If you want to use an existing Terra workspace, use the set command instead of create.

terra workspace set --id=eb0753f9-5c45-46b3-b3b4-80b4c7bea248

Set the Gcloud user and application default credentials.

gcloud auth login
gcloud auth application-default login

Nextflow Examples

Run a Nextflow hello world example. This requires either the Docker image to be set and the container running, or Nextflow to be installed locally. For Docker support, export TERRA_CLI_DOCKER_MODE=DOCKER_AVAILABLE before installing terra.

terra nextflow run hello

Run an example Nextflow workflow in the context of a workspace (i.e. in the workspace’s backing Google project). This is the same example workflow used in the GCLS tutorial.

  • Download the workflow code from GitHub.

    git clone https://github.com/nextflow-io/rnaseq-nf.git
    cd rnaseq-nf
    git checkout v2.0
    cd ..
    
  • Create a bucket in the workspace for Nextflow to use.

    terra resource create gcs-bucket --name=mybucket --bucket-name=mybucket
    
  • Update the gls section of the rnaseq-nf/nextflow.config file to point to the workspace project and bucket we just created.

      gls {
          params.transcriptome = 'gs://rnaseq-nf/data/ggal/transcript.fa'
          params.reads = 'gs://rnaseq-nf/data/ggal/gut_{1,2}.fq'
          params.multiqc = 'gs://rnaseq-nf/multiqc'
          process.executor = 'google-lifesciences'
          process.container = 'nextflow/rnaseq-nf:latest'
          workDir = "$TERRA_mybucket/scratch"
    
          google.region  = 'us-east1'
          google.project = "$GOOGLE_CLOUD_PROJECT"
    
          google.lifeSciences.serviceAccountEmail = "$GOOGLE_SERVICE_ACCOUNT_EMAIL"
          google.lifeSciences.network = 'network'
          google.lifeSciences.subnetwork = 'subnetwork'
      }
    
  • Do a dry-run to confirm the config is set correctly.

    terra nextflow config rnaseq-nf/main.nf -profile gls
    
  • Kick off the workflow. (This takes about 10 minutes to complete.)

    terra nextflow run rnaseq-nf/main.nf -profile gls
    
  • To send metrics about the workflow run to a Nextflow Tower server, first define an environment variable with the Tower access token. Then specify the -with-tower flag when kicking off the workflow.

    export TOWER_ACCESS_TOKEN=*****
    terra nextflow run hello -with-tower
    terra nextflow run rnaseq-nf/main.nf -profile gls -with-tower
    
  • Call the Gcloud CLI tools in the current workspace context. This means that Gcloud is configured with the backing Google project and environment variables are defined that contain workspace and resource properties (e.g. bucket names, pet service account email).

    terra gcloud config get-value project
    terra gsutil ls
    terra bq version
    
  • See the list of supported third-party tools. The CLI runs these tools in a Docker image, if app-launch mode is DOCKER_CONTAINER. If the app-launch mode is LOCAL_PROCESS, the CLI will assume the tools are available in the current shell environment and launch them there.

    terra app list
    
  • Print the image tag that the CLI is currently using.

    terra config get image
    

Last Modified: 16 November 2023