Create batch jobs with the Workbench CLI
Prior reading: Use the Cromwell engine to run WDL workflows
Purpose: This document provides detailed instructions for creating, running, and monitoring WDL workflow batch jobs using the Workbench CLI.
Introduction
If you want to run a workflow against multiple sets of inputs, you'll need to use the Workbench CLI. The Workbench UI currently only supports creating single-job workflows.
Workbench currently supports WDL workflow batch runs on GCP workspaces.
Terminology
The following concepts are important to understand when it comes to running workflows:
A batch run refers to multiple workflows that are launched in parallel. These batch jobs are orchestrated through Workbench.
A job is a general term that describes a unit of work or execution in computing. Within Verily Workbench, a job refers to a running instance of a workflow.
A job takes in the workflow configuration (the WDL file(s)) and a set of key-value inputs. Optional settings include enabling call caching, deleting intermediate output files, and retrying with more memory. Jobs can also be referred to as **runs**.
A task refers to a stage or activity executed within a workflow. Multi-stage workflows are divided into a series of tasks, while a single-stage workflow is one task itself.
In Cromwell, tasks are referred to as [sub-workflows](https://cromwell.readthedocs.io/en/stable/SubWorkflows/).
Create a workflow and run a batch job
Prerequisites
Run wb version to confirm you're running at least version 0.422.99 of the Workbench CLI.
Set your workspace to the one you want to use for your workflow by running wb workspace set --id=<workspace-name>.
Create a workflow
To create a workflow, you can either use the Workbench UI or the Workbench CLI. See Add workflows for UI instructions.
To use the Workbench CLI, run the following command, replacing the values in <>. Note that Git repositories aren't currently supported as workflow sources:
wb workflow create \
--bucket-id=<bucket-id> \
--path=<path/to/your/main.wdl> \
--workflow=<workflow-name>
Note
The Workbench UI will only list workflows that were created in the UI. To list all workflows created via the UI and/or the CLI, run wb workflow list.
Run a batch job
Note
The following steps are related to batch jobs. Currently, all actions related to batch jobs must be done via the Workbench CLI.
The following example command runs six jobs. All options are required except for output-path. If output-path isn't defined, it defaults to the job display name + job + a timestamp:
wb workflow job run \
--workflow=cram-to-bam \
--output-bucket-id=example_wdl_output \
--output-path=example-workflow-execution \
--batch-input-bucket-id=workflows-testing-folder \
--batch-input-csv-path=1000genomes_6_high_cov_cram_to_bam.csv \
--column-mapping-uri=gs://workflows-testing-folder-vwb-xxx-xxxxxx-xxxxx-1234/cram-to-bam-columns.json
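Before submitting a batch, it can help to confirm that a local copy of the input CSV contains every column the workflow inputs are mapped to. The following is a minimal sketch, not part of the Workbench CLI: the function name check_csv_columns is hypothetical, and it assumes a plain comma-separated header row with no quoted fields. The column names in the usage comment are taken from the input mapping shown in the example job description later on this page.

```shell
# Hypothetical helper: verify that a CSV header contains the given columns.
# Usage: check_csv_columns <csv-file> <column> [<column> ...]
# Example:
#   check_csv_columns inputs.csv ref_dict ref_fasta input_cram sample_name ref_fasta_index
# Assumes an unquoted, comma-separated header on the first line.
check_csv_columns() (
  csv_file="$1"
  shift
  # Read the header row and drop a trailing carriage return, if present.
  header="$(head -n 1 "$csv_file" | tr -d '\r')"
  missing=0
  for col in "$@"; do
    # Wrap both sides in commas so "ref_fasta" does not match "ref_fasta_index".
    case ",${header}," in
      *",${col},"*) ;;
      *) echo "missing column: ${col}" >&2; missing=1 ;;
    esac
  done
  # The function body runs in a subshell, so exit sets the return status.
  exit "$missing"
)
```

The function returns a non-zero status and names each missing column on standard error, so it can gate the wb workflow job run command in a script.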
Monitor and debug batch jobs
Monitor your batch job
To list all batch jobs, run the following command:
wb workflow job batch list
You can also list batch jobs by user:
wb workflow job batch list --created-by=<user-email>
To see summaries of the runs in a specific batch job, run the following command, replacing <uuid> with your batch job ID:
wb workflow job list --batch-job-id=<uuid>
You can print the result of a single job using the wb workflow job describe command with a job-id specified:
wb workflow job describe --job-id=00ca8098-2275-413d-bea8-3b94afbbd4b0
You'll see output like this:
Job ID: 00ca8098-2275-413d-bea8-3b94afbbd4b0
Display name: cram-to-bam_job_2025-06-23T182802757928323Z
Description: null
Workflow id: cram-to-bam
Status: RUNNING
Status message: Batch run cram-to-bam_job_2025-06-23T182802757928323Z-00ca8098-2275-413d-bea8-3b94afbbd4b0 starting
Output bucket path: gs://workflows-testing-folder-vwb-xxx-xxxxxx-xxxxx-1234/pxx-workflow-executions/CramToBamFlow/00ca8098-2275-413d-bea8-3b94afbbd4b0
Created by: pxx@verily.health
Created date: Mon Jun 23 14:28:26 EDT 2025
Engine type: CROMWELL
Batch run attributes:
bucket name: workflows-testing-folder-vwb-xxx-xxxxxx-xxxxx-1234
table: yp_1000genomes_13highcov_crams_ref_fixed.csv
row selection: null
input mapping:
CramToBamFlow.ref_dict: ref_dict
CramToBamFlow.ref_fasta: ref_fasta
CramToBamFlow.input_cram: input_cram
CramToBamFlow.sample_name: sample_name
CramToBamFlow.ref_fasta_index: ref_fasta_index
Batch job summary:
Total batch runs: 1000
Success runs: 987
Failed runs: 8
Pending runs: 0
Starting runs: 0
Ongoing runs: 5
Stopping runs: 0
Cancelled runs: 0
Deleted runs: 0
In the example above, you can see the various attributes of the batch job, as well as the batch job summary listing successful, failed, and ongoing runs.
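To wait for a batch to finish from a script, one option is to read the summary counters from the describe output. The sketch below is an illustration rather than a Workbench feature: the function name active_runs is hypothetical, and it assumes the summary lines are printed exactly as in the example above (for instance "Ongoing runs: 5").

```shell
# Hypothetical helper: read `wb workflow job describe` output on stdin and
# print how many runs are still pending, starting, ongoing, or stopping.
# Assumes summary lines of the form "<State> runs: <count>".
active_runs() {
  awk -F': *' '
    /^[ \t]*(Pending|Starting|Ongoing|Stopping) runs:/ { total += $2 }
    END { print total + 0 }
  '
}

# Example polling loop (commented out; replace <uuid> with your batch job ID):
#   while [ "$(wb workflow job describe --job-id=<uuid> | active_runs)" -gt 0 ]; do
#     sleep 60
#   done
```

Applied to the example output above, the function prints 5, because only the five ongoing runs are still active.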
Debug failed runs
Get the status of your batch job runs with the wb workflow job list command:
wb workflow job list --batch-job-id=00ca8098-2275-413d-bea8-3b94afbbd4b0
You'll see output like this:
JOB_ID STATUS DISPLAY_NAME SUBMITTED_BY SUBMITTED_TIME
7b404ff3-1e48-40e6-8c7c-fcd5c1ea2b05 COMPLETED cram-to-bam_job_2025-06-23T1828027579... pxx@verily.health Mon Jun 23 14:31:44 EDT 2025
5bfa901c-d7b4-4ca3-9029-3713f134da9c COMPLETED cram-to-bam_job_2025-06-23T1828027579... pxx@verily.health Mon Jun 23 14:31:44 EDT 2025
62fdebf4-6b06-4605-b850-51300a6af0d5 FAILED cram-to-bam_job_2025-06-23T1828027579... pxx@verily.health Mon Jun 23 14:31:44 EDT 2025
df67f770-584e-434c-8fab-bab5a7028085 COMPLETED cram-to-bam_job_2025-06-23T1828027579... pxx@verily.health Mon Jun 23 14:31:44 EDT 2025
105cb63d-4233-4d26-9756-284a6b957647 COMPLETED cram-to-bam_job_2025-06-23T1828027579... pxx@verily.health Mon Jun 23 14:31:44 EDT 2025
451a3394-d74a-4b2a-9dfb-95889df66387 COMPLETED cram-to-bam_job_2025-06-23T1828027579... pxx@verily.health Mon Jun 23 14:31:44 EDT 2025
e0f3fd5c-42c7-497d-83c3-aae20b9d1226 COMPLETED cram-to-bam_job_2025-06-23T1828027579... pxx@verily.health Mon Jun 23 14:31:43 EDT 2025
6af2467a-815a-4088-b8d6-b3d6371bc169 COMPLETED cram-to-bam_job_2025-06-23T1828027579... pxx@verily.health Mon Jun 23 14:31:43 EDT 2025
3a445dcc-7ddb-453b-9ca7-3fec0ab1b4c9 COMPLETED cram-to-bam_job_2025-06-23T1828027579... pxx@verily.health Mon Jun 23 14:31:43 EDT 2025
1671204f-c2c1-435a-8999-df1663525a92 COMPLETED cram-to-bam_job_2025-06-23T1828027579... pxx@verily.health Mon Jun 23 14:31:43 EDT 2025
We can see the FAILED status for job ID 62fdebf4-6b06-4605-b850-51300a6af0d5. To debug further, you can run the wb workflow job describe command with that particular job ID. The status message provides more insight into the failed run:
wb workflow job describe --job-id=62fdebf4-6b06-4605-b850-51300a6af0d5
Job ID: 62fdebf4-6b06-4605-b850-51300a6af0d5
Display name: cram-to-bam_job_2025-06-23T182802757928323Z_1003
Description: null
Workflow id: cram-to-bam
Status: FAILED
Status message: CramToBamFlow.CramToBamTask: Task CramToBamFlow.CramToBamTask:NA:1 failed. The job was stopped before the command finished. GCP Batch task exited with VMRecreatedDuringExecution(50006).
Output bucket path: gs://workflows-testing-folder-vwb-xxx-xxxxxx-xxxxx-1234/pxx-workflow-executions/CramToBamFlow/62fdebf4-6b06-4605-b850-51300a6af0d5
Created by: pxx@verily.health
Created date: Mon Jun 23 14:31:44 EDT 2025
End time: Mon Jun 23 16:38:21 EDT 2025
Engine type: CROMWELL
Batch run id: 00ca8098-2275-413d-bea8-3b94afbbd4b0
You can also quickly view a list of all of the failed runs in a batch job by filtering on runs with FAILED status:
wb workflow job list --batch-job-id=00ca8098-2275-413d-bea8-3b94afbbd4b0 --status=FAILED
JOB_ID STATUS DISPLAY_NAME SUBMITTED_BY SUBMITTED_TIME
62fdebf4-6b06-4605-b850-51300a6af0d5 FAILED cram-to-bam_job_2025-06-23T1828027579... pxx@verily.health Mon Jun 23 14:31:44 EDT 2025
9df44098-904d-4f57-8c16-09a1ba88bf44 FAILED cram-to-bam_job_2025-06-23T1828027579... pxx@verily.health Mon Jun 23 14:30:49 EDT 2025
ba94048c-24e9-4d13-9f19-276679864bcc FAILED cram-to-bam_job_2025-06-23T1828027579... pxx@verily.health Mon Jun 23 14:30:48 EDT 2025
6201ca06-a2a8-482d-a2a8-c51436a31b9a FAILED cram-to-bam_job_2025-06-23T1828027579... pxx@verily.health Mon Jun 23 14:30:39 EDT 2025
54533078-1060-485c-bad1-a578bb52ba86 FAILED cram-to-bam_job_2025-06-23T1828027579... pxx@verily.health Mon Jun 23 14:30:09 EDT 2025
4c03906a-a74c-4f62-ae25-af172eccb11e FAILED cram-to-bam_job_2025-06-23T1828027579... pxx@verily.health Mon Jun 23 14:29:58 EDT 2025
369e353d-6369-4326-af00-7d851075da0e FAILED cram-to-bam_job_2025-06-23T1828027579... pxx@verily.health Mon Jun 23 14:29:27 EDT 2025
276b4208-c450-4b3a-830b-57b126442250 FAILED cram-to-bam_job_2025-06-23T1828027579... pxx@verily.health Mon Jun 23 14:29:10 EDT 2025
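When a batch has several failed runs, describing each one by hand gets tedious. The following sketch chains the two commands shown above. The function names failed_job_ids and describe_failed_jobs are hypothetical, and the parsing assumes the whitespace-separated table layout and the "Status message:" line shown in the example outputs; adjust it if your CLI version prints a different format.

```shell
# Hypothetical helper: read `wb workflow job list` output on stdin and print
# the job ID (first column) of every row whose STATUS column is FAILED.
failed_job_ids() {
  awk '$2 == "FAILED" { print $1 }'
}

# Hypothetical helper: print the status message of each failed run in a batch.
# Usage: describe_failed_jobs <batch-job-id>
describe_failed_jobs() {
  wb workflow job list --batch-job-id="$1" --status=FAILED \
    | failed_job_ids \
    | while read -r job_id; do
        # Redirect stdin so the command cannot consume the remaining job IDs.
        wb workflow job describe --job-id="$job_id" < /dev/null \
          | grep 'Status message:'
      done
}
```

For example, describe_failed_jobs 00ca8098-2275-413d-bea8-3b94afbbd4b0 would print one status message line per failed run in the batch job used throughout this page.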
Last Modified: 25 June 2025