To improve the efficiency and reproducibility of this project's analyses, the long-running computations are performed in pipelines. Thus, the most computationally intensive parts of the project are parallelizable and scalable. The Snakemake workflow management system ensures that only the required computations are performed when a pipeline is run, and it can monitor the individual jobs on an HPC cluster (such as O2).
The main pipeline in this project fits the models for each cell line lineage. Summary diagnostics from the model-fitting pipeline are saved to the "reports/" directory.
See the primary README for how to set up the development environment.
This pipeline fits the Bayesian models according to the specifications in "models/model-configs.yaml". The results are stored in the same "models/" directory for later analysis.
Below are the descriptions of the relevant files:
010_010_model-fitting-pipeline.smk
: Snakemake pipeline

010_011_smk-config.json
: SLURM configuration

010_012_run-model-fitting-pipeline.sh
: Bash script to run the pipeline
The pipeline can be run using the following make command.
make fit
On O2, I linked the "temp/" directory to Scratch.
ln -s $JHC_SCRATCH/speclet-temp temp
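The link above assumes that the target directory already exists and that nothing named "temp" is present yet. As a minimal sketch of a more defensive version, the function below creates the target first and refuses to overwrite a real directory. The name `link_temp_dir` is hypothetical and not part of this project; the only things taken from the command above are the `speclet-temp` directory name and the `temp` link name.

```shell
# Hypothetical helper (not part of the project): link ./temp to a
# "speclet-temp" directory under the given scratch root.
link_temp_dir() {
    scratch_root="$1"                       # e.g. "$JHC_SCRATCH" on O2
    target="${scratch_root}/speclet-temp"

    # Create the link target so the symlink is not left dangling.
    mkdir -p "$target" || return 1

    # Refuse to touch a real file or directory named "temp".
    if [ -e temp ] && [ ! -L temp ]; then
        echo "temp exists and is not a symlink; leaving it alone" >&2
        return 1
    fi

    # -f replaces an existing link; -n avoids descending into a link
    # that already points at a directory.
    ln -sfn "$target" temp
}

# Usage, run from the project root:
#   link_temp_dir "$JHC_SCRATCH"
```

Running it a second time simply re-creates the same link, so it is safe to call from a setup script.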