Cloud Environments

What are cloud environments and how do you use them?

Goal: Gain a high-level understanding of the capabilities cloud environments provide, the key components involved, the built-in and customization options, and what is involved in using them.

What are cloud environments and what capabilities do they provide?

A cloud environment is a configurable pool of cloud computing resources. Cloud environments consist of a virtual machine and a persistent disk, with some useful libraries and tools preinstalled. They’re ideal for interactive analysis and data visualization, and can be finely tuned to suit analysis needs.

Cost is incurred while the cloud environment is running, based on your configuration. You can pause the environment when it’s not in use, but there’s still a charge for maintaining your disk.
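To make the running versus paused distinction concrete, here is a minimal sketch of the arithmetic. The hourly rates, the `EnvironmentRates` class, and the `estimate_cost` function are hypothetical and chosen only for illustration; they are not Workbench or cloud provider pricing, and this is not how Workbench computes its own estimates.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class EnvironmentRates:
    """Hypothetical hourly rates in USD; not actual Workbench or cloud pricing."""
    compute_per_hour: float  # charged only while the environment is running
    disk_per_hour: float     # charged whether the environment is running or paused


def estimate_cost(rates: EnvironmentRates, running_hours: float, paused_hours: float) -> float:
    """Estimate the cost of an environment over a period.

    Compute is billed for running hours only. The persistent disk is billed
    for the whole period, because it is kept while the environment is paused.
    """
    if running_hours < 0 or paused_hours < 0:
        raise ValueError("hours must be non-negative")
    compute = rates.compute_per_hour * running_hours
    disk = rates.disk_per_hour * (running_hours + paused_hours)
    return compute + disk


# Placeholder rates chosen only to make the arithmetic easy to follow.
rates = EnvironmentRates(compute_per_hour=0.20, disk_per_hour=0.01)

# A 30-day month (720 hours): left running the whole time, or run for
# 160 hours and paused for the remaining 560.
always_on = estimate_cost(rates, running_hours=720, paused_hours=0)
paused_when_idle = estimate_cost(rates, running_hours=160, paused_hours=560)

print(f"Always on:        ${always_on:.2f}")
print(f"Paused when idle: ${paused_when_idle:.2f}")
```

With these placeholder rates, the disk charge is the same in both cases, and the difference comes entirely from the compute hours saved by pausing.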

Interface: JupyterLab notebooks and terminal

Verily Workbench uses JupyterLab as the default interface for working in your cloud environments. JupyterLab is a third-party notebook authoring application and editing environment that provides powerful features for developing and executing code in a highly reproducible manner.

A notebook is a type of digital document that provides an interactive computational environment. Notebooks combine code inputs and outputs into a single file. One of the key advantages notebooks provide is the ability to display visualizations and modeling alongside your code. Notebooks support a diverse range of rich media and are a powerful tool for conducting interactive analysis.

You also have the option of using the JupyterLab Terminal to do work in your cloud environment.

Cloud environment operations

You can perform the following operations on cloud environments in a workspace:

  • List your cloud environments
  • Create a new cloud environment
  • Edit an environment's name and description
  • Start or stop a cloud environment
  • Duplicate a cloud environment
  • Delete a cloud environment

For instructions on performing these operations in the web UI, from the Environments tab of a workspace, see Cloud environment operations.

For information on performing these operations using the Workbench CLI, see the Command Line Interface documentation.

Preconfigured environments and customization options

Workbench provides a set of preconfigured cloud environments intended to serve as a base for your work. For a GCP-based workspace, the options are a Vertex AI notebook, a Dataproc cluster, and a Compute Engine instance for custom applications. For an AWS-based workspace, the option is a SageMaker notebook.

Vertex AI notebook

Vertex AI notebooks use images selected from the Google Cloud Deep Learning VM Images.

You can further customize these by installing additional libraries into your environment, or you can bring in your own custom cloud environment by specifying a Docker container image.

Custom applications

Workbench provides Visual Studio Code and RStudio as preconfigured apps. You can also create your own custom apps. For information on creating custom applications, see Create custom applications.

Compute profile options and cost

The compute profile of a cloud environment refers to the type and amount of computational resources allocated to that environment, including CPUs, GPUs and memory. This is analogous to the hardware configuration of a computer, and plays an important role in determining how fast your analyses will run within that cloud environment.

When you create a new cloud environment, Workbench proposes default compute profile settings calibrated for average usage. However, you can adjust the compute profile of your environment to suit the needs of your analyses.

Since the compute profile of a cloud environment determines its usage cost, we encourage you to evaluate your project needs carefully and adjust the compute profile of your environments in order to minimize your costs.
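One way to think about this trade-off is to pick the cheapest profile that still meets an analysis's requirements. The sketch below illustrates that idea only: the `ComputeProfile` class, the profile names, and the hourly prices are all made up for the example and do not correspond to actual Workbench options or pricing.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass(frozen=True)
class ComputeProfile:
    """Illustrative compute profile; names and prices are invented."""
    name: str
    cpus: int
    memory_gb: int
    gpus: int
    hourly_cost_usd: float


def cheapest_sufficient(
    profiles: Sequence[ComputeProfile],
    min_cpus: int,
    min_memory_gb: int,
    min_gpus: int = 0,
) -> Optional[ComputeProfile]:
    """Return the lowest-cost profile meeting every minimum, or None if none does."""
    candidates = [
        p for p in profiles
        if p.cpus >= min_cpus and p.memory_gb >= min_memory_gb and p.gpus >= min_gpus
    ]
    return min(candidates, key=lambda p: p.hourly_cost_usd, default=None)


# Hypothetical options, for illustration only.
profiles = [
    ComputeProfile("small", cpus=2, memory_gb=8, gpus=0, hourly_cost_usd=0.10),
    ComputeProfile("medium", cpus=4, memory_gb=16, gpus=0, hourly_cost_usd=0.20),
    ComputeProfile("large", cpus=8, memory_gb=32, gpus=0, hourly_cost_usd=0.40),
    ComputeProfile("large-gpu", cpus=8, memory_gb=32, gpus=1, hourly_cost_usd=0.90),
]

choice = cheapest_sufficient(profiles, min_cpus=4, min_memory_gb=12)
print(choice.name if choice else "No profile meets the requirements")
```

In practice the requirements themselves are the hard part to estimate, so it can help to start with a smaller profile and scale up only when an analysis proves too slow or runs out of memory.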

For a full list of compute profile options and instructions on how to adjust them, see Compute profile configuration options.

For more information on cost management best practices, see Cloud cost management.

For more information on how we calculate environment cost estimates, see Cloud environment cost estimates.

Accessing workspace files and folders

When you create a new cloud environment, any storage buckets and folders referenced as workspace resources are automatically mounted (connected) to the environment using Cloud Storage FUSE. This enables you to read any files or data stored in those resources directly from within your environment, as if they were local files.

For more information about how to access these resources and customize the relevant system settings, see Accessing workspace files and folders from your cloud environment.
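Because mounted resources appear as ordinary directories, standard file APIs are enough to browse and read them. The following is a minimal sketch using only the Python standard library. The mount location (`~/workspace`) and the resource folder name (`my-bucket`) are assumptions made for this example; check where resources are actually mounted in your environment. The helper functions are illustrative, not part of Workbench.

```python
from pathlib import Path


def list_mounted_files(mount_dir: Path, pattern: str = "*") -> list[Path]:
    """Return the files under a mounted resource folder, sorted by path."""
    if not mount_dir.is_dir():
        raise FileNotFoundError(f"{mount_dir} is not mounted or does not exist")
    return sorted(p for p in mount_dir.rglob(pattern) if p.is_file())


def read_text_preview(path: Path, max_lines: int = 5) -> list[str]:
    """Read the first few lines of a text file without loading the whole file."""
    lines: list[str] = []
    with path.open("r", encoding="utf-8") as handle:
        for line in handle:
            if len(lines) >= max_lines:
                break
            lines.append(line.rstrip("\n"))
    return lines


# Assumed location and hypothetical resource name; adjust both for your workspace.
mount_dir = Path.home() / "workspace" / "my-bucket"

if mount_dir.is_dir():
    for csv_path in list_mounted_files(mount_dir, "*.csv"):
        print(csv_path, read_text_preview(csv_path, max_lines=2))
else:
    print(f"{mount_dir} was not found; check the mount location for your environment.")
```

Reading a preview line by line, rather than loading whole files, is a deliberate choice here: files on a mounted bucket are fetched over the network, so large reads can be slower than on a local disk.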

Git integrations

You can add your Git repositories to a workspace to support source control and collaboration. Any Git repositories added to a workspace before a cloud environment is created are automatically cloned to that cloud environment. You can then use the Git CLI or the JupyterLab Git extension to commit and push changes.

For more information, see Git integrations with cloud environments.

Detailed documentation

Accessing workspace files and folders from your cloud environment

Accessing mounted workspace files and folders from your cloud environment

Cloud environment operations

Operations that can be performed on cloud environments through the Verily Workbench web UI

Compute profile configuration options

Options for configuring the compute profile of a cloud environment

Cloud environment cost estimates

How Workbench calculates cost estimates for cloud environments

Create custom applications

Creating custom applications in Verily Workbench

Git integrations with cloud environments

Working with Git and GitHub

Last Modified: 16 April 2024