Cloud environments overview
Purpose: This document provides a high-level overview of cloud environments in Verily Workbench: their capabilities, key components, and preconfigured versus custom options.
What are cloud environments and what capabilities do they provide?
A cloud environment is a configurable pool of cloud computing resources. Cloud environments consist of a virtual machine and a persistent disk, with some useful libraries and tools preinstalled. They’re ideal for interactive analysis and data visualization, and can be finely tuned to suit analysis needs.
Cost is incurred while the cloud environment is running, at a rate that depends on your configuration. You can pause the environment when it’s not in use, but you are still charged for maintaining its persistent disk.
Interface: JupyterLab notebooks and terminal
Verily Workbench uses JupyterLab as the default interface for working in your cloud environments. JupyterLab is a third-party notebook authoring application and editing environment that provides powerful features for developing and executing code in a highly reproducible manner.
A notebook is a type of digital document that provides an interactive computational environment. Notebooks combine code inputs and outputs into a single file. One of the key advantages notebooks provide is the ability to display visualizations and modeling alongside your code. Notebooks support a diverse range of rich media and are a powerful tool for conducting interactive analysis.
You also have the option of using the JupyterLab Terminal to do work in your cloud environment.
Cloud environment operations
You can perform the following operations on cloud environments:
- List your cloud environments
- Create a new cloud environment
- Edit an environment’s name and description
- Start or stop a cloud environment
- Delete a cloud environment
For instructions on performing these operations in the web UI, from the Environments tab of a workspace, see Cloud environment operations.
For information on performing these operations using the Workbench CLI, see the Command Line Interface documentation.
Preconfigured environments and customization options
Workbench provides a set of preconfigured cloud environments intended to serve as a base for your work. For a GCP-based workspace, the options are a Vertex AI notebook, a Dataproc cluster, and a Compute Engine instance for custom applications. For an AWS-based workspace, the option is a SageMaker notebook.
Vertex AI notebook
A Vertex AI notebook environment uses images selected from the Google Cloud Deep Learning VM Images.
You can further customize these by installing additional libraries into your environment, or you can bring in your own custom cloud environment by specifying a Docker container image.
Custom applications
Workbench provides Visual Studio Code and R Analysis Environment as preconfigured apps. You can also create your own custom apps. For information on creating custom applications, see Create custom applications.
Compute profile options and cost
The compute profile of a cloud environment refers to the type and amount of computational resources allocated to that environment, including CPUs, GPUs, and memory. This is analogous to the hardware configuration of a computer, and it plays an important role in determining how fast your analyses run within that cloud environment.
When you create a new cloud environment, Workbench proposes default compute profile settings calibrated for average usage. However, you can adjust the compute profile of your environment to suit the needs of your analyses.
Since the compute profile of a cloud environment determines its usage cost, we encourage you to evaluate your project needs carefully and adjust the compute profile of your environments in order to minimize your costs.
For a list of compute profile options and details on how to adjust them, see Compute profile configuration options.
For more information on cost management best practices, see Cloud cost management.
For more information on how we calculate environment cost estimates, see Cloud environment cost estimates.
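As a rough illustration of why the compute profile and running time matter, the sketch below compares monthly cost for an environment that runs continuously, one that runs only during working hours, and one that stays stopped. It is not how Workbench calculates its estimates: the `ComputeProfile` class, the `estimate_monthly_cost` function, and every rate shown are hypothetical values chosen for illustration. The only behavior taken from this document is that compute is billed while the environment runs and the disk is billed regardless.

```python
from dataclasses import dataclass

HOURS_PER_MONTH = 730  # approximate average number of hours in a month


@dataclass
class ComputeProfile:
    """Hypothetical pricing inputs; these rates are made up, not Workbench prices."""
    vm_hourly_rate: float        # USD per hour, charged only while running
    disk_gb: int                 # persistent disk size in GB
    disk_gb_monthly_rate: float  # USD per GB per month, charged even when stopped


def estimate_monthly_cost(profile: ComputeProfile, running_hours: float) -> float:
    """Return an approximate monthly cost in USD for the given running time."""
    if not 0 <= running_hours <= HOURS_PER_MONTH:
        raise ValueError("running_hours must be between 0 and HOURS_PER_MONTH")
    compute_cost = profile.vm_hourly_rate * running_hours
    disk_cost = profile.disk_gb * profile.disk_gb_monthly_rate
    return round(compute_cost + disk_cost, 2)


# Assumed example profile: 0.20 USD/hour VM with a 100 GB disk at 0.05 USD/GB/month.
profile = ComputeProfile(vm_hourly_rate=0.20, disk_gb=100, disk_gb_monthly_rate=0.05)

always_on = estimate_monthly_cost(profile, HOURS_PER_MONTH)
work_hours = estimate_monthly_cost(profile, 160)
stopped = estimate_monthly_cost(profile, 0)

print(f"Always on: {always_on} USD")
print(f"160 hours: {work_hours} USD")
print(f"Stopped:   {stopped} USD")
```

With these made-up rates, the always-on case comes to 151 USD per month, compared with 37 USD for 160 running hours and 5 USD when the environment stays stopped, which is the disk charge alone.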
Accessing workspace files and folders
When you create a new cloud environment, any storage buckets and folders referenced as workspace resources are automatically mounted (connected) to the environment using Cloud Storage FUSE. This enables you to read any files or data stored in those resources directly from within your environment, as if they were on the local file system.
For more information about how to access these resources and customize the relevant system settings, see Accessing workspace files and folders from your cloud environment.
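Because mounted resources appear as ordinary directories, standard file operations are enough to browse and read them from a notebook. The following is a minimal sketch under that assumption. The helper names `list_mounted_files` and `preview_text_file` are hypothetical, and so is the example path `~/workspace/my-bucket`; substitute the actual mount location shown in your environment.

```python
from pathlib import Path


def list_mounted_files(mount_root, pattern="*.csv"):
    """Return sorted paths under mount_root that match pattern, searching recursively."""
    root = Path(mount_root).expanduser()
    if not root.is_dir():
        # The resource is not mounted at this location (or the path is wrong).
        return []
    return sorted(p for p in root.rglob(pattern) if p.is_file())


def preview_text_file(path, max_lines=5):
    """Read up to max_lines lines from a text file, without trailing newlines."""
    lines = []
    with Path(path).open("r", encoding="utf-8") as handle:
        for index, line in enumerate(handle):
            if index >= max_lines:
                break
            lines.append(line.rstrip("\n"))
    return lines


# Hypothetical mount location; replace with the path used in your environment.
for csv_path in list_mounted_files("~/workspace/my-bucket"):
    print(csv_path, preview_text_file(csv_path, max_lines=2))
```

Reading only the first few lines keeps the preview cheap, which matters because each read from a mounted bucket is served from cloud storage rather than from the local disk.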
Git integrations
You can add your Git repositories to a workspace to support source control and collaboration. Any Git repositories added to a workspace before a cloud environment is created are automatically cloned into that environment. You can then use the git CLI or the JupyterLab Git extension to commit and push changes.
For more information, see Git integrations with cloud environments.
Last Modified: 11 September 2024