Caper (Cromwell Assisted Pipeline ExecutoR) is a wrapper Python package for Cromwell. Caper wraps Cromwell to run pipelines on multiple platforms such as GCP (Google Cloud Platform), AWS (Amazon Web Services) and HPCs like SLURM, SGE, PBS/Torque and LSF. It provides an easier way of running Cromwell in server/run mode by automatically composing the necessary input files for Cromwell. Caper can run each task in a specified environment (Docker, Singularity or Conda). Caper also automatically localizes all files (keeping their directory structure) defined in your input JSON and command line according to the specified backend. For example, if your chosen backend is GCP and files in your input JSON are on S3 buckets (or even URLs), then Caper automatically transfers `s3://` and `http(s)://` files to a specified `gs://` bucket directory. Supported URIs are `s3://`, `gs://`, `http(s)://` and local absolute paths. You can use such URIs both in the CLI and in an input JSON. Private URIs are also accessible if you authenticate using cloud platform CLIs like `gcloud auth` and `aws configure`, or with `~/.netrc` for URLs. See this for details.
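For illustration, a run like the following (the WDL name, JSON keys and URIs are all placeholders) would localize the `s3://` and `https://` inputs to the `gs://` localization directory configured for the `gcp` backend in `~/.caper/default.conf` before the workflow starts:

```bash
# A hedged sketch of cross-cloud localization; all names below are placeholders.
# input.json mixes URI schemes, e.g.
#   "my_pipeline.fastq": "s3://my-bucket/sample.fastq.gz"
#   "my_pipeline.blacklist": "https://example.org/blacklist.bed.gz"
# With the gcp backend, Caper copies these files (keeping their directory structure)
# to the configured gs:// bucket directory before handing the workflow to Cromwell.
$ caper run my_pipeline.wdl -i input.json --backend gcp
```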
- Make sure that you have Java (>= 11) and Python >= 3.6 installed on your system, then use `pip` to install Caper.

  ```bash
  $ pip install caper
  ```

- If you see an error message like `caper: command not found` after installing, then add the following line to the bottom of `~/.bashrc` and re-login.

  ```bash
  export PATH=$PATH:~/.local/bin
  ```
- Choose a backend from the following table and initialize Caper. This will create a default Caper configuration file `~/.caper/default.conf`, which has only the required parameters for each backend. `caper init` will also install Cromwell/Womtool JARs under `~/.caper/`. Downloading those files can take up to 10 minutes. Once they are installed, Caper can work completely offline with local data files.

  | Backend | Description |
  |---------|-------------|
  | local   | local computer without a cluster engine |
  | slurm   | SLURM (e.g. Stanford Sherlock and SCG) |
  | sge     | Sun GridEngine |
  | pbs     | PBS cluster |
  | lsf     | LSF cluster |

  IMPORTANT: `sherlock` and `scg` backends have been deprecated. Use the `slurm` backend instead and follow the instruction comments in the configuration file.

  ```bash
  $ caper init [BACKEND]
  ```
- Edit `~/.caper/default.conf` and follow the instructions in there. CAREFULLY READ THE INSTRUCTIONS AND DO NOT LEAVE IMPORTANT PARAMETERS UNDEFINED OR CAPER WILL NOT WORK CORRECTLY. A short end-to-end sketch of this setup is shown right after this list.
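The following is a hedged sketch of the one-time setup on a SLURM cluster; `slurm` is just one of the backends from the table above, and the editor choice is arbitrary. The authoritative list of parameters and their exact names is in the file that `caper init` writes.

```bash
# A minimal setup sketch, assuming a SLURM cluster.
$ caper init slurm

# Cromwell/Womtool JARs and the default configuration file land under ~/.caper/.
$ ls ~/.caper/

# Fill in the required backend-specific parameters (e.g. your SLURM partition
# or account) exactly as instructed by the comments inside the file.
$ nano ~/.caper/default.conf
```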
For local backends (`local`, `slurm`, `sge`, `pbs` and `lsf`), you can use `--docker`, `--singularity` or `--conda` to run WDL tasks in a pipeline within one of these environments. For example, `caper run ... --singularity docker://ubuntu:latest` will run each task within a Singularity image built from the Docker image `ubuntu:latest`. These parameters can also be used as flags. If used as a flag, Caper will try to find a default docker/singularity/conda in the WDL, e.g. all ENCODE pipelines have default docker/singularity images defined within the WDL's meta section (under the key `caper_docker` or `default_docker`).
IMPORTANT: Docker/singularity/conda environments defined in Caper's configuration file or in the CLI (`--docker`, `--singularity` and `--conda`) will be overridden by those defined in a WDL task's `runtime`. We provide these parameters to define a default/base environment for a pipeline, not to override a WDL task's `runtime`.
For Conda users, make sure that you have installed the pipeline's Conda environments before running pipelines. Caper only knows the Conda environment's name. You don't need to activate any Conda environment before running a pipeline since Caper will internally run `conda run -n ENV_NAME TASK_SHELL_SCRIPT` for each task.
Take a look at the following examples:

```bash
$ caper run test.wdl --docker # can be used as a flag too; Caper will find a default docker image in the WDL if defined
$ caper run test.wdl --singularity docker://ubuntu:latest # define a default singularity image in the command line
$ caper hpc submit test.wdl --singularity --leader-job-name test1 # submit to the job engine and use the singularity image defined in the WDL
$ caper submit test.wdl --conda your_conda_env_name # a running caper server is required
```
An environment defined here will be overridden by one defined in a WDL task's `runtime`. Therefore, think of this as a base/default environment for your pipeline. You can define per-task docker/singularity images to override those defined in Caper's command line. For example:
```wdl
task my_task {
    ...
    runtime {
        docker: "ubuntu:latest"
        singularity: "docker://ubuntu:latest"
    }
}
```
For cloud backends (`gcp` and `aws`), Caper will automatically try to find a base docker image defined in your WDL (e.g. in its meta section, as described above). For WDLs that do not define one, specify a base docker image in Caper's CLI or directly in each WDL task's `runtime`.
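For example, a cloud run with a base docker image given on the command line could look like the following (the WDL, input JSON and image tag are placeholders; the `gcp` backend must already be set up in `~/.caper/default.conf`):

```bash
# A hedged sketch; names below are placeholders.
# Tasks that do not set their own docker image in runtime fall back to this base image.
$ caper run my_pipeline.wdl -i input.json --backend gcp --docker ubuntu:20.04
```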
Use `--singularity` or `--conda` in the CLI to run a pipeline inside a Singularity image or a Conda environment; most HPCs do not allow docker. For example, `caper hpc submit ... --singularity` will submit the Caper process to the job engine as a leader job. Caper's leader job will then submit its child jobs to the job engine so that both leader and child jobs can be found with `squeue` or `qstat`.
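For instance, on a SLURM cluster both the leader job and its child jobs should be visible in the queue after submission (the WDL, input JSON and job name below are placeholders):

```bash
# Submit the leader job (names are placeholders).
$ caper hpc submit my_pipeline.wdl -i input.json --singularity --leader-job-name my_run
# The leader job and the child jobs it submits both appear in the SLURM queue.
$ squeue -u $USER
```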
Use `caper hpc list` to list all leader jobs. Use `caper hpc abort JOB_ID` to abort a running leader job. DO NOT CANCEL A JOB DIRECTLY WITH A CLUSTER COMMAND LIKE `scancel` OR `qdel`; that cancels only your leader job, not the child jobs.
Here are some example command lines for submitting Caper as a leader job. Make sure that you have correctly configured Caper with `caper init` and filled in all parameters in the configuration file `~/.caper/default.conf`.
There is an extra parameter, `--file-db [METADATA_DB_PATH_FOR_CALL_CACHING]`, to use call-caching (restarting workflows by re-using previous outputs). If you want to restart a failed workflow, use the same metadata DB path and the pipeline will start from where it left off. It will actually start over but will re-use (soft-link) previous outputs.
```bash
# Make a new output directory for a workflow.
$ cd [OUTPUT_DIR]

# Example with Singularity without using call-caching.
$ caper hpc submit [WDL] -i [INPUT_JSON] --singularity --leader-job-name GOOD_NAME1

# Example with Conda and call-caching (restarting a workflow from where it left off).
# Use the same --file-db PATH for the next re-run and Caper will collect and soft-link previous outputs.
# If you see any DB connection error, replace it with "--db in-memory"; call-caching will then be disabled.
$ caper hpc submit [WDL] -i [INPUT_JSON] --conda --leader-job-name GOOD_NAME2 --file-db [METADATA_DB_PATH]

# List all leader jobs.
$ caper hpc list

# Check the leader job's STDOUT file to monitor the workflow's status.
# Example for SLURM:
$ tail -f slurm-[JOB_ID].out

# Cromwell's log will be written to cromwell.out* in the same directory.
# It is helpful for monitoring your workflow in detail.
$ ls -l cromwell.out*

# Abort a leader job (this will cascade-kill all of its child jobs).
# If you directly use a job engine's command like scancel or qdel, then the child jobs will keep running.
$ caper hpc abort [JOB_ID]
```
Caper uses Cromwell's call-caching to restart a pipeline from where it left off. The call-caching database is automatically generated in `local_out_dir`, which is defined in the configuration file `~/.caper/default.conf`. The DB file name simply consists of the WDL's basename and the input JSON file's basename, so you can run the same `caper run` command line in the same working directory to restart a workflow.
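In practice, restarting therefore amounts to re-running the same command in the same working directory (the WDL and input JSON names below are placeholders):

```bash
# First attempt; the call-caching DB is created automatically.
$ caper run my_pipeline.wdl -i input.json
# If the workflow fails, re-run the exact same command in the same directory.
# Tasks that already succeeded are call-cached and their outputs are soft-linked
# instead of being recomputed.
$ caper run my_pipeline.wdl -i input.json
```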
To disable Cromwell's metadata DB (no call-caching), use `--db in-memory`:

```bash
# for standalone/client
$ caper run ... --db in-memory
# for server
$ caper server ... --db in-memory
```
If you see a DB connection timeout error, it means that multiple Caper/Cromwell processes are trying to connect to the same file DB. Check for running Cromwell processes with `ps aux | grep cromwell` and close them with `kill PID`. If that does not fix the problem, then use `caper run ... --db in-memory` to disable Cromwell's metadata DB. You will not be able to use call-caching.
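Put together, those troubleshooting steps look roughly like this (the PID and file names are placeholders taken from your own `ps` output and workflow):

```bash
# Find Cromwell processes that may be holding the file DB.
$ ps aux | grep cromwell
# Stop the stale process (replace 12345 with the PID shown above).
$ kill 12345
# As a last resort, disable the metadata DB; call-caching will not be available.
$ caper run my_pipeline.wdl -i input.json --db in-memory
```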
If Caper's default settings do not work with your HPC, then see this document to manually customize the resource command line (e.g. `sbatch ... [YOUR_CUSTOM_PARAMETER]`) for your chosen backend.
See details.