
1. Installation


⚠️ This wiki is currently being updated together with the dev branch, so it might not reflect usage for previous versions!

Installing (Sanger farm22)

Add the following to your .bashrc

# Set default LSF group, important for Nextflow to work; change this to your group
export LSB_DEFAULTGROUP=teamtrynka

# Make conda available
module load ISG/conda

# Add the blipper executable to the path
export PATH="$PATH:/software/teamtrynka/installs/sc-blipper/"

Then

source ~/.bashrc

And that's it, you are good to go!

Installing (other configurations)

In short, these are the steps for installing the pipeline on an HPC cluster. I strongly recommend running on an HPC, but if you have a single good machine you can use the 'local' profile to run the pipeline locally.
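For example, a local run could look something like this (the path to main.nf is a placeholder for wherever you cloned the repo):

# Hypothetical local invocation; replace the path with your clone location
nextflow run /path/to/sc-blipper/main.nf -profile local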

Brief overview of the install process

  1. Make sure you have nextflow (>=25.04.6) and conda available on your PATH
  2. Clone the repo (steps 1-2 are sketched after this list)
  3. Create the conda envs following the instructions below (some packages need manual installation; Singularity containers will be added in the future)
  4. Update nextflow.config, or override it with your own config, to point to the conda env (params.rn_conda="/path/to/env")
  5. Add a profile to work with your cluster configuration (can be put in the './conf' folder). Also check whether any environment variables need to be set for your scheduler, and whether you need to update the process labels (particularly for the GPU process).
  6. Add the new profile to the nextflow.config profiles{} block
  7. (optional) Update the runner script sc-blipper as the primary entry point (by default it works with LSF; easy to update to SLURM)
  8. (optional) Add the runner script sc-blipper to your PATH
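As a condensed sketch of steps 1 and 2 (the repository URL is a placeholder, substitute the actual repo location):

# Check the prerequisites are available
nextflow -version   # should report >=25.04.6
conda --version

# Clone the repository (URL is a placeholder)
git clone https://github.com/<org>/sc-blipper.git
cd sc-blipper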

Creating the base conda environment (required)

The pipeline runs with two conda environments. The first is used for most processes; the second is used only for scVI. This split was made because GPU installations can be finicky with regards to versions, and it means you can skip the second environment entirely if you don't want scVI.

# Create and activate the conda environment
conda create -n sc-blipper python==3.8.19
conda activate sc-blipper

# Python packages (conda unless marked pip; you may need the conda-forge
# and bioconda channels configured)
conda install pandas scanpy
pip install h5ad cnmf starcatpy

# R packages
conda install r-seurat==5.0.0 r-anndata r-reticulate r-optparse r-remotes \
    r-ggplotify r-pheatmap r-ggraph r-igraph

Launch R (inside the activated environment) and install the following packages from CRAN, Bioconductor, or GitHub:

# biomaRt (was not in conda, installed manually)
install.packages("BiocManager")
BiocManager::install("biomaRt")

# R.utils allows fread to read gzipped files directly (was not in conda, installed manually)
install.packages("R.utils")

# fgsea and the saezlab packages are installed from GitHub via remotes:
remotes::install_github("ctlab/fgsea")
remotes::install_github("saezlab/decoupleR")
remotes::install_github("saezlab/progeny")
remotes::install_github("saezlab/OmnipathR")

Note down the install path of the conda env; we will need it later when configuring the pipeline.

echo $CONDA_PREFIX  # On Unix/Linux/macOS

# Exit conda env to get ready to create the next one
conda deactivate

Creating the scVI conda environment (optional)

To create the scVI conda environment:

#-----------------------------------------------
# Install scVI (optional)
#-----------------------------------------------
# We make a separate environment for scVI to avoid conflicts and to keep scVI optional, so the pipeline stays lighter
# You will ideally need a GPU for scVI to run
# Follow instructions here: https://docs.scvi-tools.org/en/1.0.0/installation.html
conda create -n sc-blipper-scvi python=3.10
conda activate sc-blipper-scvi 

# The conda instructions below are from the scvi docs; they didn't work for me,
# and I have more luck with pip for installing PyTorch in general.
# https://docs.scvi-tools.org/en/1.0.0/installation.html
#conda install pytorch torchvision torchaudio pytorch-cuda=11.7 -c pytorch -c nvidia
#conda install jax jaxlib -c conda-forge
#conda install scvi-tools -c conda-forge

# Install PyTorch and JAX; the commands may need tweaking depending on your system and GPU
# Pytorch: https://pytorch.org/get-started/locally/
# JAX: https://docs.jax.dev/en/latest/installation.html#installation

# CUDA 12 on Linux; make sure the CUDA version matches your GPU drivers
pip3 install torch torchvision
pip3 install -U "jax[cuda12]"

# scvi-tools
pip install scvi-tools scikit-misc

# (optional) Install ipython; not required, but nice to have for interactive work
conda install ipython

# Note down the install path for the conda env, we will need it later for configuring the pipeline
echo $CONDA_PREFIX  # On Unix/Linux/macOS

# Exit conda env
conda deactivate

At this point I strongly recommend testing that the GPU is working. If on an HPC, make sure to request a GPU node first. You can test by launching python and typing:

import torch
print(torch.cuda.is_available()) # Should return True
import jax
print(jax.devices()) # Should show GPU devices
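If your cluster runs LSF, an interactive GPU session for this test might be requested like so (the queue name and GPU options are assumptions; check your site's documentation):

# Request an interactive shell on a GPU node (illustrative queue and options)
bsub -Is -q gpu-normal -gpu "num=1" /bin/bash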

Updating the conda path

Next, take the two conda paths you noted down and set params.rn_conda="</path/to/first/env>" and params.preprocess.scvi.conda="</path/to/scvi-env>". You can do this either in the main nextflow.config (so nobody who uses this install needs to override it) or in your run config file.
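For example, in a run config this could look like the following (the paths are placeholders for the ones you noted down):

// Conda env paths for the pipeline (replace with your own paths)
params.rn_conda              = "/path/to/conda/envs/sc-blipper"
params.preprocess.scvi.conda = "/path/to/conda/envs/sc-blipper-scvi"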

Updating profile & process definitions

Not all HPC configurations are the same, so to be able to use them you may need to update the process definitions. Process requirements are handled in the pipeline through labels; a full list can be found in conf/processes.config. It's likely you will need to update the 'queue' arguments to match your HPC configuration, and the 'clusterOptions' for the GPU tags. Updating the GPU process is only needed if you are using scVI.

There are three ways you can do this:

  1. Update conf/processes.config directly
  2. Override 'conf/processes.config' by adding a 'process {}' block to your run config file (see the sketch after this list)
  3. Override 'conf/processes.config' by creating a new config file and including (sourcing) it in your run config file
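As a sketch of option 2, the override might look like this (the label names, queue names, and GPU options are illustrative; check conf/processes.config for the real labels):

// Override process requirements by label in your run config
process {
    withLabel: 'medium' {
        queue = 'long'
    }
    // Only needed if you are running scVI
    withLabel: 'gpu' {
        queue          = 'gpu-normal'
        clusterOptions = '-gpu "num=1"'
    }
}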

You can find Nextflow profile configurations for most major research institutes here: https://nf-co.re/configs/. See more details on configuring Nextflow here: https://www.nextflow.io/docs/latest/config.html

To add a profile to the pipeline, save the file in the conf folder, for instance as 'conf/my_profile.config'. You can then add a profiles{} block to your run config (or add my_profile to the main nextflow.config):

profiles {
    my_profile { includeConfig 'conf/my_profile.config' }
}
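The referenced conf/my_profile.config might then contain something like this (the executor and queue values are assumptions for a generic LSF setup; adapt them to your scheduler):

// Minimal cluster profile sketch
process {
    executor = 'lsf'
    queue    = 'normal'
}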

Updating the sc-blipper script (optional)

The script sc-blipper is a utility script that handles sanity checking, log management, and job submission. It is fully optional, and you can also run the pipeline through Nextflow directly. However, if you configure it and put it on your PATH, it makes future job submissions a breeze.

The bits you need to configure are at the start and end of the script, marked by the banner below and annotated with inline comments.

#-----------------------------------------------------------------------
#                 Update these for your installation
#-----------------------------------------------------------------------

To use the script, you will need to:

  1. Update the sourcing of nextflow ("module load HGI/common/nextflow/25.04.6")
  2. Point to the path of the main.nf file in this repo
  3. Update any environment variables like NXF_SINGULARITY_CACHEDIR
  4. Update the submit command: if you don't have the LSF scheduler, replace the submit command at 'CMD="bsub -n 1 ' with the equivalent for your scheduler (see the sketch after this list)
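As an illustration of step 4, the LSF-to-SLURM swap might look like this (the SLURM flags are assumptions and the rest of the command is elided; keep whatever follows the submit prefix in the actual script intact):

# LSF, as shipped in sc-blipper:
#   CMD="bsub -n 1 ... "
# A possible SLURM equivalent (flags are illustrative):
#   CMD="sbatch --ntasks=1 ... "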