Juke34/AliNe

AliNe (Alignment in Nextflow)

AliNe is a pipeline written in nextflow that aims to efficiently align reads against a reference genome using the tools of your choice.

Genome + Reads => FastQC -> Alignment -> Sort -> MultiQC

Foreword

AliNe is a pipeline written in Nextflow that aims to efficiently align reads against a reference genome.

If the corresponding option is activated, a quality check is run with FastQC at each step. Reads can also be trimmed before alignment if that option is activated. The pipeline handles all quality encodings ('sanger', 'solexa', 'illumina-1.3+', 'illumina-1.5+', 'illumina-1.8+'). All FASTQ files are standardised to Phred+33 by seqkit for the downstream alignments. You can choose to run one or several aligners in parallel.
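To illustrate what standardising to Phred+33 means, here is a minimal sketch for the simplest case only: a Phred+64 file with scores 0 to 40, as in the 'illumina-1.3+' encoding. It is not how AliNe or seqkit performs the conversion; the function names are hypothetical, and the 'solexa' encoding (which uses a different score scale) is not handled.

```shell
# Hypothetical helper, not part of AliNe: shift one Phred+64 quality string to Phred+33.
phred64_to_33() {
    # Phred+64 scores 0..40 are the ASCII characters '@'..'h';
    # the same scores in Phred+33 are '!'..'I' (31 code points lower).
    printf '%s\n' "$1" | LC_ALL=C tr '@-h' '!-I'
}

# Convert only the quality line (every 4th line) of a FASTQ stream
# that uses exactly 4 lines per record. Reads stdin, writes stdout.
fastq_phred64_to_33() {
    n=0
    while IFS= read -r line || [ -n "$line" ]; do
        n=$((n + 1))
        if [ $((n % 4)) -eq 0 ]; then
            phred64_to_33 "$line"
        else
            printf '%s\n' "$line"
        fi
    done
}
```

For example, `fastq_phred64_to_33 < reads_phred64.fastq > reads_phred33.fastq` would rewrite a file, where both file names are placeholders.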

Here is the list of implemented aligners:

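| Tool | Single end (short reads) | Paired end (short reads) | PacBio | ONT |
| --- | --- | --- | --- | --- |
| bbmap | x | x | x | x |
| bowtie2 | x | x | | |
| bwaaln | x | x R1 and R2 independently aligned then merged with bwa sampe | | |
| bwamem | x | x | | |
| bwasw | x | x | | |
| graphmap2 | x | x R1 and R2 independently aligned then merged with cat | | |
| hisat2 | x | x | | |
| minimap2 | x | x | | |
| nucmer | x | x R1 and R2 are concatenated then aligned | | |
| star | x | x | | |
| star 2pass mode | x | x | | |
| subread | x | x | | |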
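Aligners are selected with the `--aligner` option as a comma-separated list (see the test command further down). As a small sketch, a typo in that list could be caught before launching the pipeline with a check like the one below. The helper name `unknown_aligners` is hypothetical and not part of AliNe, and the name list is simply copied from the table above; "star 2pass mode" is left out because this README does not show its `--aligner` value.

```shell
# Aligner names copied from the table of implemented aligners.
# "star 2pass mode" is omitted: its --aligner value is not given in this README.
KNOWN_ALIGNERS="bbmap bowtie2 bwaaln bwamem bwasw graphmap2 hisat2 minimap2 nucmer star subread"

# Hypothetical helper: print every name of a comma-separated list that is not
# in KNOWN_ALIGNERS. Returns 0 only if all names are known.
unknown_aligners() {
    status=0
    old_ifs=$IFS
    IFS=,
    # $1 is left unquoted on purpose so that it is split on commas.
    for name in $1; do
        case " $KNOWN_ALIGNERS " in
            *" $name "*) ;;                          # known aligner
            *) printf '%s\n' "$name"; status=1 ;;    # unknown: report it
        esac
    done
    IFS=$old_ifs
    return $status
}
```

A call such as `unknown_aligners "hisat2,graphmap2,bwamem,nucmer"` prints nothing and succeeds, so it can guard the `nextflow run` command in a wrapper script.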

Installation

The prerequisites to run the pipeline are the AliNe repository, Nextflow, and a container platform (Docker or Singularity):

AliNe

# clone the workflow repository
git clone https://github.com/Juke34/AliNe.git

# Move in it
cd AliNe

Nextflow

  • Via conda

    See here
    ```
    conda create -n nextflow
    conda activate nextflow
    conda install nextflow
    ```
  • Manually

    See here
    Nextflow runs on most POSIX systems (Linux, macOS, etc.) and can typically be installed by running these commands:
    # Make sure Java 11 or later is installed on your computer by using the command:
    java -version

    # Install Nextflow by entering this command in your terminal (it creates a file named nextflow in the current directory):
    curl -s https://get.nextflow.io | bash
    
    # Add Nextflow binary to your user's PATH:
    mv nextflow ~/bin/
    # OR system-wide installation:
    # sudo mv nextflow /usr/local/bin
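The Java version check can also be scripted. The sketch below assumes the usual `java -version` output, where the version is the first double-quoted field printed on stderr; both function names are hypothetical and the parsing may need adjusting for unusual Java distributions.

```shell
# Hypothetical helper: reduce a Java version string to its major version.
java_major() {
    v=$1
    case $v in
        1.*) v=${v#1.} ;;   # legacy numbering: "1.8.0_281" is Java 8
    esac
    # Keep only the leading digits ("17.0.2" becomes "17").
    printf '%s\n' "${v%%[!0-9]*}"
}

# Hypothetical helper: extract the version string of the installed Java.
# `java -version` writes to stderr, hence the 2>&1 redirection.
installed_java_version() {
    java -version 2>&1 | awk -F'"' '/version/ { print $2; exit }'
}
```

With these in place, `[ "$(java_major "$(installed_java_version)")" -ge 11 ]` succeeds when the installed Java meets the requirement.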
    

Container platform

To run the workflow you will need a container platform: docker or singularity.

Docker

Please follow the instructions at the Docker website

Singularity

Please follow the instructions at the Singularity website

Usage

You can first check the available options and parameters by running:

nextflow run aline.nf --help

To run the workflow you must select a profile according to the container platform you want to use:

  • singularity, a profile using Singularity to run the containers
  • docker, a profile using Docker to run the containers

The command will look like this:

nextflow run aline.nf -profile docker <rest of parameters>
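If the same script has to run on machines with different container platforms, the profile can be chosen automatically. This is only a sketch: `pick_profile` is a hypothetical helper, not part of AliNe, and it assumes the platforms are installed under the executable names `docker` and `singularity`.

```shell
# Hypothetical helper: print the AliNe profile matching the first container
# platform found on PATH (Docker is tried first), or fail if neither is found.
pick_profile() {
    if command -v docker >/dev/null 2>&1; then
        printf 'docker\n'
    elif command -v singularity >/dev/null 2>&1; then
        printf 'singularity\n'
    else
        printf 'no container platform found (docker or singularity)\n' >&2
        return 1
    fi
}
```

It could then be used as `nextflow run aline.nf -profile "$(pick_profile)" <rest of parameters>`.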

Another profile is planned (/!\ not yet implemented):

  • slurm, to add if your system has a Slurm executor (the executor is local by default)

Using the slurm profile will give a command like this one:

nextflow run aline.nf -profile docker,slurm <rest of parameters>

Test the workflow

Test data are included in the AliNe repository in the test folder.

A typical command to run a test on single-end data looks like this:

nextflow run -profile docker aline.nf --aligner hisat2,graphmap2,bwamem,nucmer --genome test/hpv16.fa --reads test/U2OS_A1_R1_sub100000.fastq --single_end true --reads_extension .fastq

On success you should get a message looking like this:

  AliNe Pipeline execution summary
    --------------------------------------
    Completed at : 2024-03-07T21:40:23.180547+01:00
    UUID         : e2a131e3-3652-4c90-b3ad-78f758c06070
    Duration     : 8.4s
    Success      : true
    Exit Status  : 0
    Error report : -

Parameters

Uninstall

You can simply remove the AliNe directory from your computer and, if you installed Nextflow via conda, remove the nextflow conda environment:

conda remove -n nextflow --all