ct_gdrive is an open source copytool that enables a secure and transparent Google Drive cloud storage tier in Lustre using Robinhood.
- Lustre is a parallel distributed file system, generally used for high performance computing (HPC).
- Google Drive is a cloud-based object storage system.
- The Robinhood policy engine is used to set up and apply policies for backup and tiering.
ct_gdrive transfers files from Lustre to Google Drive, and vice versa. It can be used either for backup/disaster recovery or as an HSM tier to expand your Lustre filesystem. ct_gdrive directly uses the Google API Client Library for Python to minimize the number of remote cloud storage API requests needed to archive and restore files using Lustre/HSM.
Each Lustre file is stored in Google Drive using its Lustre FID as the name. When a file is archived for the first time, a file description containing information such as the original file path is added. ct_gdrive supports uploading multiple versions of the same file to Google Drive. However, managing these versions is not supported from Lustre.
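The naming scheme above can be illustrated with a minimal sketch. This is not ct_gdrive's actual code: the helper names `build_drive_metadata` and `original_path_from_metadata` are hypothetical, the `name` and `description` keys assume the Drive API v3 file resource, and storing the path as JSON inside the description is an assumption made only for this example.

```python
import json


def build_drive_metadata(fid, original_path):
    """Build a Drive-style metadata body for one archived Lustre file.

    Hypothetical helper for illustration only.
    """
    return {
        # The Drive file name is the Lustre FID, as described above.
        "name": fid,
        # Assumption: the description holds a small JSON document with
        # the original path. The real description layout may differ.
        "description": json.dumps({"path": original_path}),
    }


def original_path_from_metadata(metadata):
    """Recover the original Lustre path from the metadata built above."""
    return json.loads(metadata["description"])["path"]


if __name__ == "__main__":
    # Example FID and path, made up for this sketch.
    meta = build_drive_metadata("0x200000401:0x1:0x0", "/lustre/project/data.bin")
    print(meta["name"])
    print(original_path_from_metadata(meta))
```

Because the FID does not change when a file is renamed or moved within Lustre, using it as the object name keeps the archived copy addressable, while the description preserves a human-readable hint of where the file lived at first archive time.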
ct_gdrive should be used with lhsmtool_cmd, an open source and generic Lustre/HSM agent that launches a command when an action (archive, restore) is requested by the Lustre/HSM coordinator.
Please see the HOWTO.
ct_gdrive is published under the GPLv3 license.
As of today, ct_gdrive works with a single Google Drive account, but a version supporting multiple accounts is planned.
ct_gdrive is maintained and used in production by Stanford Research Computing, where we successfully pushed several million files and more than 1 PB to Google Drive in about a month.
Have questions? Feel free to contact us at srcc-support@stanford.edu