Run AlphaFold from a notebook
Prior reading: Use cloud environments for analysis
Purpose: This document provides detailed instructions for setting up and running a Verily Workbench notebook instance using the AlphaFold container image.
AlphaFold is a groundbreaking neural network-based model from DeepMind for predicting protein structure. The source code for AlphaFold v2.0 is here.
A simplified version of AlphaFold has been packaged as a container image for a Vertex AI Workbench notebook instance (which Verily Workbench uses under the hood), along with an example notebook. The prebuilt container image lets you get started without much additional installation. The notebook shows how to predict the structure of one or more proteins. For most targets, this method produces predictions that are nearly identical in accuracy to those of the full version.
This tutorial walks you through the process of setting up a Workbench notebook instance using the AlphaFold custom container image, and running the example notebook.
A blog post accompanies the example notebook. To learn more about how to correctly interpret these predictions, see the “Using the AlphaFold predictions” section of the post. The Supplementary Information article provides a more detailed description of the method.
Create a Workbench notebook instance
The AlphaFold custom container, created for the Vertex AI Workbench, is here:
us-west1-docker.pkg.dev/cloud-devrel-public-resources/alphafold/alphafold-on-gcp:latest
After you’ve installed and configured the Workbench command-line tool, run the following command to create a new notebook instance. The arguments select the AlphaFold container image and specify that the notebook instance should use 8 cores (machine type n1-standard-8) and one NVIDIA Tesla V100 GPU.
In the following command, af_test is the Workbench resource name. You may want to change the value of the instance-id argument, af-202203, or you can omit the argument and let the system generate an ID for you; this is the string you’ll see listed for the notebook in the Notebooks panel in the GCP Cloud Console.
wb resource create gcp-notebook --id af_test --instance-id af-202203 \
--accelerator-core-count=1 --accelerator-type=nvidia-tesla-v100 --machine-type=n1-standard-8 \
--install-gpu-driver=true --location=us-central1-c \
--container-repository=us-west1-docker.pkg.dev/cloud-devrel-public-resources/alphafold/alphafold-on-gcp \
--container-tag=latest
This command may take a while to run. You’ll get a confirmation when it’s finished. You can view the running notebook instance in the GCP Cloud Console if you like.
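If you want to create the instance from a script rather than by typing the command, the arguments above can be assembled in Python. This is only a sketch: the helper name build_create_command is hypothetical, and it uses no flags beyond those shown in the command above.

```python
import shlex
from typing import List, Optional

# Container repository taken from the command shown in this tutorial.
IMAGE_REPO = (
    "us-west1-docker.pkg.dev/cloud-devrel-public-resources/"
    "alphafold/alphafold-on-gcp"
)


def build_create_command(resource_id: str,
                         instance_id: Optional[str] = None) -> List[str]:
    """Assemble the `wb resource create gcp-notebook` arguments (hypothetical helper)."""
    cmd = ["wb", "resource", "create", "gcp-notebook", "--id", resource_id]
    if instance_id is not None:
        # Leave --instance-id out to let the system generate an ID.
        cmd += ["--instance-id", instance_id]
    cmd += [
        "--accelerator-core-count=1",
        "--accelerator-type=nvidia-tesla-v100",
        "--machine-type=n1-standard-8",
        "--install-gpu-driver=true",
        "--location=us-central1-c",
        f"--container-repository={IMAGE_REPO}",
        "--container-tag=latest",
    ]
    return cmd


# Print the command as a single shell-quoted line for inspection.
print(shlex.join(build_create_command("af_test", "af-202203")))
```

Once the Workbench CLI is installed and configured, the returned list could be passed to subprocess.run(cmd, check=True) to run the command.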
Upload and run the example notebook
Once your notebook instance is running, you can upload the AlphaFold example notebook to it. An easy way to do this is to bring up a browser window that is logged in with your Workbench email, and visit this URL:
https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://github.com/GoogleCloudPlatform/vertex-ai-samples/raw/main/community-content/alphafold_on_workbench/AlphaFold.ipynb
You’ll see a dialog that lets you select an existing notebook instance or create a new one. Click Select, then select your new notebook instance.
Click CONTINUE, then click Confirm in the next dialog.
The notebook will be automatically imported and ready for you to run. The notebook example walks you through the process of generating predictions for one or more protein sequences.
Tip
In the imported notebook, some code tries to copy a file called stereo_chemical_props.txt into a directory under /opt/conda/lib/…; if this copy fails because of a permissions issue, try setting run_relax = False at the top of the “Run AlphaFold” cell of the notebook, or alternatively edit the code to read the stereo_chemical_props.txt file from another location.
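The fallback described in this tip can be sketched in Python. This is a minimal illustration, not the notebook’s actual code: the helper name copy_or_skip_relax and both of its arguments are hypothetical, and the destination directory is whatever path the notebook cell uses.

```python
import shutil
from pathlib import Path


def copy_or_skip_relax(src: str, dest_dir: str) -> bool:
    """Try to copy src into dest_dir and return a value for run_relax.

    Hypothetical helper: returns True when the copy succeeds and False
    when it fails with a permissions error, so relaxation can be skipped.
    Other errors (for example, a missing source file) are not caught.
    """
    try:
        shutil.copy(src, Path(dest_dir) / Path(src).name)
        return True
    except PermissionError:
        return False
```

You could then write run_relax = copy_or_skip_relax(source_path, destination_dir) before the prediction step, where source_path and destination_dir stand in for the paths used in the notebook.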
Download the generated predictions
You can download the generated predictions via the prediction.zip file, which includes a .pdb file.
You should see this archive listed in the left sidebar; right-click the file to see the Download option.
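Before or after downloading, you may want to check what the archive contains. The following sketch uses only the Python standard library; the helper name list_pdb_files is hypothetical, and the entry names inside prediction.zip depend on your run, so none are assumed here.

```python
import zipfile
from typing import List


def list_pdb_files(zip_path: str = "prediction.zip") -> List[str]:
    """Return the names of the .pdb entries inside the archive."""
    with zipfile.ZipFile(zip_path) as archive:
        return [
            name for name in archive.namelist()
            if name.lower().endswith(".pdb")
        ]
```

In a notebook cell, calling list_pdb_files() from the directory that holds prediction.zip returns the .pdb entries, which you can then extract with zipfile.ZipFile.extract or open in a structure viewer after downloading.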
Shut down the notebook instance when you’re done
Because this notebook instance uses a powerful GPU, it is fairly expensive to run. Shut it down via the Workbench CLI when you’re not using it, as follows, where af_test is the resource name that you defined when you created the notebook instance.
wb notebook stop --id af_test
When you’re ready to use the notebook instance again, restart it via:
wb notebook start --id af_test
You can also delete the notebook resource if you’re entirely done with it.
Last Modified: 16 July 2024