Accessing workspace files and folders from your cloud environment
Prior reading: Overview of Cloud Environments
Purpose: This document provides detailed instructions for accessing mounted workspace files and folders from your cloud environment.
Workspace buckets and referenced folder resources automatically mount onto newly created Cloud Environments. The mounted resources include controlled Cloud Storage buckets, referenced Cloud Storage buckets where the user has at least READ permissions, and referenced Cloud Storage objects that reference a folder inside a Cloud Storage bucket. (You can read more about workspace resources here.) This feature uses Cloud Storage FUSE.
- Referenced folder: GCS object resource pointing to an existing bucket prefix
- Workspace resource tree: the tree of folders and resources in the “Resources” tab in the UI
- Mounted resource tree: the subset of directories mapped from the workspace folders and bucket resources, where each bucket resource directory is the mount point for the associated bucket.
Bucket resources mount into the $HOME/workspace/ directory in a cloud environment, along with any parent workspace folders of the resource in the workspace resource tree. Non-bucket resources (such as BigQuery datasets) and workspace folders without any bucket resources inside them are excluded.
By default, controlled Cloud Storage buckets created by the user are mounted with read-write permissions. Referenced resources and controlled resources created by other users are mounted with read-only permissions. To override this behavior, see Setting Mount Permissions in Mounting with the CLI.
For example, here is a workspace with the following resources:
In a newly created cloud environment, a workspace directory is created containing a subset of the resource tree, including any folders that have bucket resources in them, and the bucket directories themselves.
The “logs” controlled bucket and “platinum-genomes” referenced objects are both mounted along with their contents. Notice that the “BQ Datasets” folder and the “Dataset-1” BigQuery dataset resource are not mounted.
By default, only controlled buckets created by the user mount with read-write permissions. Referenced buckets and controlled buckets created by other users mount with read-only permissions.
If a bucket mount is in an unrecoverable state where unmounting fails, you can manually unmount your bucket with
fusermount -u /path/to/bucket.
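The manual unmount can be wrapped in a small guard. This is a minimal sketch, assuming the mountpoint utility (from util-linux) is installed in the cloud environment; the function name unmount_bucket is hypothetical and not part of the Workbench CLI.

```shell
#!/usr/bin/env bash
# Unmount a FUSE-mounted bucket directory only if it is currently a mount point.
# unmount_bucket is a hypothetical helper name, not a Workbench CLI command.
unmount_bucket() {
  local dir="$1"
  # mountpoint -q exits non-zero when the path is not a mount point.
  if ! mountpoint -q "$dir"; then
    echo "not a mount point: $dir" >&2
    return 1
  fi
  fusermount -u "$dir"
}
```

For example, unmount_bucket /path/to/bucket refuses to act on an ordinary directory and otherwise runs the same fusermount command shown above.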
If a resource fails to mount, the directory where it would have been mounted is left empty, and an error state is appended to the directory name.
The possible error states, using a “mybucket” resource as an example, are:
- mybucket_NO_ACCESS: The user does not have at least read permissions to the bucket.
- mybucket_NOT_FOUND: For referenced buckets, the bucket URL is invalid.
- mybucket_MOUNT_FAILED: Any other failure during mount. See the Workbench CLI output logs to troubleshoot further.
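Because failed mounts are visible only through the directory name, a quick scan can surface them. The sketch below is illustrative: find_failed_mounts is a hypothetical helper, and it assumes the three suffixes listed above are the only error states.

```shell
#!/usr/bin/env bash
# List directories under a workspace root whose names end in a mount error suffix.
# find_failed_mounts is a hypothetical helper; it defaults to $HOME/workspace.
find_failed_mounts() {
  local root="${1:-$HOME/workspace}"
  find "$root" -type d \
    \( -name '*_NO_ACCESS' -o -name '*_NOT_FOUND' -o -name '*_MOUNT_FAILED' \) \
    | sort
}
```

Note that find also descends into successfully mounted buckets, so on large buckets this scan may be slow, and a bucket folder that happens to end in one of these suffixes would be reported as well.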
Mounting files using the CLI
You can manage and further control your mounted buckets with the Workbench CLI. For GCP workspaces, buckets are mounted with Cloud Storage FUSE.
Inside a terminal in a cloud environment, users can run
terra resource mount
to mount their bucket resources. This command will unmount any existing buckets mounted inside the
$HOME/workspace directory before mounting.
Mounting Individual Resources
Users can specify the --name flag with the name of a resource to mount only that resource. This flag is useful for remounting a resource that failed to mount or has been moved to a different folder in the workspace.
For example, “terra resource mount --name=logs” mounts logs and creates a path of parent directories for the folders that logs is nested under. So if there are no other resources mounted, the “$HOME/workspace” directory will contain:
Setting Mount Permissions
To override the default permissions set on mounts, use the --read-only flag to control read/write access to your mounted buckets.
terra resource mount --read-only will mount all resources as read-only, regardless of their stewardship or creator.
If you want to mount buckets with read-write permissions, you can run
terra resource mount --read-only=false. If you don’t have write permissions to a bucket, it still mounts with read-only permissions.
When used together with the --name flag, --read-only sets the permission on that single mounted bucket only.
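The flag combinations described in this section can be summarized in a small helper that only prints the command it would run. This is a sketch: mount_command and its "ro", "rw", and "default" mode names are hypothetical conveniences, while the --name, --read-only, and --read-only=false flags are the ones described above.

```shell
#!/usr/bin/env bash
# Print the "terra resource mount" command for a resource and access mode.
# mount_command is a hypothetical helper; it prints the command and mounts nothing.
#   $1: resource name, or "" for all resources
#   $2: "ro", "rw", or "default" (keep the default permissions)
mount_command() {
  local name="$1" mode="${2:-default}"
  local cmd="terra resource mount"
  if [ -n "$name" ]; then
    cmd="$cmd --name=$name"
  fi
  case "$mode" in
    ro)      cmd="$cmd --read-only" ;;
    rw)      cmd="$cmd --read-only=false" ;;
    default) ;;
    *)       echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  printf '%s\n' "$cmd"
}
```

Calling mount_command "" ro prints the command that mounts every resource read-only, and mount_command logs rw prints the command that requests read-write access for the logs resource alone.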
Disable File Metadata Caching
By default, Cloud Storage FUSE caches file metadata for one minute to speed up listing objects (e.g. ls) and getting file metadata (e.g. stat). This can break read consistency if multiple users in a workspace are working in the same mounted bucket from their cloud environments.
If read consistency is a requirement, you can optionally remount your buckets with the
--disable-cache flag to disable file metadata caching.
If you want to unmount resources, run
terra resource unmount.
If you want to unmount a specific resource, run
terra resource unmount --name=<resource-name>.
These commands expect that no files exist outside of mounted buckets and that no process is currently using the mounted buckets. They will fail to preserve any local user files inside
Customizing Automount Behavior
You can disable automounting by removing or commenting out the /usr/bin/terra resource mount line in $HOME/.terra/instance-boot.sh, a script in the Workbench configuration folder in your cloud environment. You can also edit this file to customize the resource mounting behavior.
Here is an example:
# My custom automount script
terra resource mount --name=resource-a --read-only # a controlled Cloud Storage folder
terra resource mount --name=resource-b --read-only=false # a referenced Cloud Storage bucket
Any changes to this file are reflected in the mounted resource tree the next time the cloud environment is started.
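To confirm whether automounting is still active after editing the file, you can check for an uncommented mount line. The following is a minimal sketch; automount_enabled is a hypothetical helper, and it assumes a line counts as disabled only when its first non-blank character is #.

```shell
#!/usr/bin/env bash
# Succeed if the boot script still has an active "terra resource mount" line.
# automount_enabled is a hypothetical helper, not part of the Workbench CLI.
automount_enabled() {
  local script="${1:-$HOME/.terra/instance-boot.sh}"
  # Drop comment lines first, then look for the mount command in what remains.
  grep -v '^[[:space:]]*#' "$script" | grep -q 'terra resource mount'
}
```

A customized script that mounts individual resources with --name also counts as enabled under this check, since those lines still invoke terra resource mount.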
The mounted resource tree does not automatically sync changes made to the workspace resource tree or the underlying Cloud Storage buckets. This behavior is intentional so that users are guaranteed that their local resource tree does not change while accessing their bucket resources.
The following changes will not be automatically reflected in mounted resource trees:
- A new resource is created
- An existing resource is renamed, moved, or deleted
- A referenced resource bucket or object is updated
- The underlying bucket for a referenced resource is deleted
If you stop and then start your cloud environment, the latest workspace resource tree will be mounted with any updates since your last mount.
If you make a change to a resource and want to immediately sync that change, e.g. fixing an invalid Cloud Storage folder reference, you can individually mount that resource again. See Mounting Individual Resources.
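When several resources have changed, remounting them one by one can be scripted. This is a sketch under stated assumptions: remount_resources and the DRY_RUN variable are hypothetical, and the resource names passed in are whatever names appear in your workspace.

```shell
#!/usr/bin/env bash
# Remount one or more resources by name so recent changes are picked up.
# remount_resources is a hypothetical helper. With DRY_RUN=1 it prints the
# commands instead of running them.
remount_resources() {
  local name
  for name in "$@"; do
    if [ "${DRY_RUN:-0}" = "1" ]; then
      echo "terra resource mount --name=$name"
    else
      # Stop at the first resource that fails to mount.
      terra resource mount --name="$name" || return 1
    fi
  done
}
```

Running it first with DRY_RUN=1 shows the exact commands before anything is remounted.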
Last Modified: 16 November 2023