Use the Cromwell engine to run WDL workflows

Using the Cromwell engine to run WDL workflows on Verily Workbench

Prior reading: Workflows in Verily Workbench: Cromwell, dsub, and Nextflow

Purpose: This document provides detailed instructions for running and monitoring Cromwell workflows in Verily Workbench.



Running and monitoring Cromwell workflows

Verily Workbench provides built-in support for running and monitoring WDL-based workflows via the Cromwell workflow engine. Directly in the UI, you can add workflows, run them with a set of inputs, and monitor their execution.

Adding workflows

Before you can run a workflow on Verily Workbench, you must add it to the workspace. The system currently supports WDL files stored in Google Cloud Storage; support for GitHub as a storage type is planned.

To add a workflow, you first need a Google Cloud Storage bucket to hold the source WDL file. If the workspace does not already have one, create it: navigate to the Resources tab in the workspace, click the + New resource button, and select New Cloud Storage bucket.

Screenshot of '+ New resource' button, highlighting 'New Cloud Storage bucket' option.

Next, upload the source WDL file to the bucket with gsutil:

gsutil cp path/to/workflow.wdl gs://some-bucket-name/

Alternative upload methods are also available.

Verily Workbench supports the execution of workflows with sub-workflows, so if your workflow is composed of multiple WDLs, add them all to the bucket.
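When a workflow is split across several WDL files, it helps to know which files the main WDL imports before uploading. The following is a minimal sketch, not part of Verily Workbench: the helper names `list_wdl_imports` and `print_upload_commands` are hypothetical, and it assumes each import is a double-quoted relative path on its own line. It does not follow nested imports, and keeping the same relative layout in the bucket is an assumption.

```shell
# List the paths named in `import "..."` statements of a WDL file.
# Assumption: one import per line, written with double quotes.
list_wdl_imports() {
  sed -n 's/^[[:space:]]*import[[:space:]]*"\([^"]*\)".*/\1/p' "$1"
}

# Print (but do not run) one gsutil command for the main WDL and one for
# each directly imported file. Review the output, then run the commands.
print_upload_commands() {
  local main_wdl="$1" bucket="$2" dir
  dir="$(dirname "$main_wdl")"
  echo "gsutil cp $main_wdl gs://$bucket/"
  list_wdl_imports "$main_wdl" | while read -r rel; do
    echo "gsutil cp $dir/$rel gs://$bucket/$rel"
  done
}
```

For example, `print_upload_commands path/to/workflow.wdl some-bucket-name` prints the copy commands so they can be checked before anything is uploaded.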

The final step is to create the workflow. First navigate to the Workflows tab. If this is your first workflow, click the Add your first workflow link. Otherwise, click the Add workflow button.

Screenshot of a Workflows overview page, with Workflows tab and 'Add your first workflow' link highlighted.

On the first page of the dialog, select the bucket that contains your WDL(s), then select the main WDL file. Click Next.

Screenshot of the Provide source dialog, the first step when adding a new workflow.

Add a display name for the workflow. Click Add to workspace.

Screenshot of the Review details dialog, the last step when adding a new workflow.

You'll now see the workflow listed in the Setup sub-tab.

Screenshot of the 'Setup' tab, showing the newly created cramToBam workflow.

Running workflow jobs

To run a workflow via the Verily Workbench UI, first navigate to the Workflows tab and select an existing workflow in the Setup sub-tab. If no workflow exists yet, follow the instructions in the Adding workflows section above.

Screenshot of the 'Setup' tab, showing the newly created cramToBam workflow.

Click the New job button in the workflow details tab. A Creating new job dialog will open.

Screenshot of the cramToBam workflow's details panel with '+ New job' button highlighted.

On the Enter job details step, the UI will provide a default display name for the job, which you may optionally adjust. Click Next.

Screenshot of Enter job details dialog, the first step when creating a new job.

On the Prepare inputs step, the UI will show two sections: Run options and Input form.

Screenshot of Prepare inputs dialog showing run options and input form, the second step when creating a new job.

The Run options section allows the user to configure a few options specific to running workflows in Cromwell:

  • Enable call caching - Caches job results to speed up future executions of the same job run with the same inputs. On by default.
  • Delete intermediate output files - Removes intermediate files created by Cromwell during workflow execution. On by default.
  • Retry with more memory - If selected, lets you specify a factor by which the VM's memory is multiplied if the workflow fails and Cromwell retries execution. For example, with a factor of 1.2 and an initial VM memory of 2 GB, the retried VM is upgraded to 2.4 GB of memory.
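The retry arithmetic and the three options can be sketched as follows. The JSON keys are Cromwell's documented workflow option names (`read_from_cache`, `write_to_cache`, `delete_intermediate_output_files`, `memory_retry_multiplier`); that the Verily Workbench UI sets exactly these keys is an assumption, and both helper names are hypothetical.

```shell
# Memory after one retry = initial memory x factor.
retry_memory_gb() {
  awk -v mem="$1" -v factor="$2" 'BEGIN { printf "%.1f\n", mem * factor }'
}

# Print a Cromwell workflow options document for the three run options.
# Assumption: the UI toggles correspond to these Cromwell option keys.
print_workflow_options() {
  local caching="$1" delete_intermediates="$2" multiplier="$3"
  cat <<EOF
{
  "read_from_cache": $caching,
  "write_to_cache": $caching,
  "delete_intermediate_output_files": $delete_intermediates,
  "memory_retry_multiplier": $multiplier
}
EOF
}

retry_memory_gb 2 1.2                 # prints 2.4
print_workflow_options true true 1.2  # the defaults plus a 1.2 retry factor
```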

The Input form section is where you enter inputs for the workflow. Whenever you open this dialog to run a job, Verily Workbench parses the WDL(s) to determine what inputs can be provided. It then displays these inputs as a dynamically generated input form, with validation specific to the WDL type of each field. Optional fields are hidden by default; to show them, uncheck the "Show required inputs only" checkbox.
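As background on which fields count as optional: in WDL, an input is optional when its type ends in `?` or when its declaration supplies a default value. The sketch below classifies a single declaration on that basis. It is an illustration only, not how Verily Workbench parses WDL, and the helper name `classify_wdl_input` is hypothetical.

```shell
# Classify one WDL input declaration as "required" or "optional".
# Assumption: the declaration is a single line such as `File? index`.
classify_wdl_input() {
  local decl="$1" type
  case "$decl" in
    *=*) echo "optional"; return ;;   # a default value makes it optional
  esac
  type="${decl% *}"                   # drop the trailing input name
  case "$type" in
    *\?) echo "optional" ;;           # optional type, e.g. File?
    *)   echo "required" ;;
  esac
}

classify_wdl_input 'File cram'        # prints required
classify_wdl_input 'File? index'      # prints optional
classify_wdl_input 'Int threads = 4'  # prints optional
```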

After adjusting options and inputs, click the Next button to continue.

Finally, the Set up outputs step lets you choose a destination bucket for the workflow outputs and intermediate files. This defaults to the bucket the WDL is located in, but you can choose any bucket in the workspace. The top-level folder name in the bucket defaults to the job's display name; this can also be adjusted. Click the Run button when ready and your workflow will begin execution.

Screenshot of Set up outputs dialog, the last step when creating a new job.
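Once a job has written files, they can also be browsed from the command line. The sketch below only builds the destination prefix from the bucket and top-level folder chosen in this step. The helper name `job_output_prefix` and the example values are placeholders, and the layout beneath the prefix is not covered here.

```shell
# Build the Cloud Storage prefix for a job's outputs.
# Hypothetical helper: it only formats a string and does not contact
# Cloud Storage.
job_output_prefix() {
  local bucket="$1" folder="$2"
  printf 'gs://%s/%s/\n' "$bucket" "$folder"
}

# Example values (placeholders, not real resources).
job_output_prefix some-bucket-name my-job-name

# To list what the job wrote, pass the prefix to gsutil:
#   gsutil ls -r "$(job_output_prefix some-bucket-name my-job-name)"
```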

Monitoring workflow jobs

To monitor a workflow that's previously been executed within a workspace, click the Job status sub-tab under the Workflows tab.

Screenshot of a workspace's Workflows tab, with 'Job status' sub-tab highlighted.

The Jobs table shows the name, status, and submission date of every executed workflow job in the workspace. To see more details for a workflow, select its row.

Screenshot of Job status page showing a table of executed workflow jobs.

The workflow details pane shows source information, as well as a table displaying the status of each individual task executed by the workflow. When a task is done, you can click View in GCP to get a direct link to the workflow's execution folder in Google Cloud Storage.

Screenshot of an executed job's details with two tasks in 'done' status.

If a task or workflow fails, you can hover over the status to view the error text.

Screenshot showing a 'workflow failed' error message.
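For a failure that needs more detail than the hover text, a task's standard error file in the execution folder is often the next place to look. Cromwell conventionally writes each call's files under `<workflow name>/<workflow ID>/call-<task name>/`, including a file named `stderr`. The sketch below assumes the execution folder follows that layout; the helper name `call_stderr_path` and all example values are hypothetical, and scattered tasks (which add `shard-N` directories) are not handled.

```shell
# Build the path to a task's stderr file under an execution root.
# Assumption: <root>/<workflow name>/<workflow id>/call-<task name>/stderr
call_stderr_path() {
  local exec_root="$1" workflow="$2" workflow_id="$3" task="$4"
  printf '%s/%s/%s/call-%s/stderr\n' \
    "${exec_root%/}" "$workflow" "$workflow_id" "$task"
}

# To read the file, pass the path to gsutil (all values are placeholders):
#   gsutil cat "$(call_stderr_path gs://some-bucket-name/my-job-name \
#     my_workflow some-workflow-id my_task)"
```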

Last Modified: 21 November 2024