
Commit

Merge pull request #41 from 3d-omics/ci
chore: update README
jlanga authored Aug 6, 2024
2 parents 4903dcd + b873146 commit a5cfab5
Showing 1 changed file with 23 additions and 19 deletions.
42 changes: 23 additions & 19 deletions README.md
@@ -1,40 +1,44 @@
# Snakemake workflow: `Bioinfo_Macro_Host_Genomics`

[![Snakemake](https://img.shields.io/badge/snakemake-≥6.3.0-brightgreen.svg)](https://snakemake.github.io)
[![GitHub actions status](https://github.com/3d-omics/Bioinfo_Macro_Host_Genomics/workflows/Tests/badge.svg?branch=main)](https://github.com/3d-omics/Bioinfo_Macro_Host_Genomics/actions?query=branch%3Amain+workflow%3ATests)
# Snakemake workflow: `hg_genotype`

[![Snakemake](https://img.shields.io/badge/snakemake-≥8-brightgreen.svg)](https://snakemake.github.io)
[![Tests](https://github.com/3d-omics/hg_genotype/actions/workflows/main.yml/badge.svg)](https://github.com/3d-omics/hg_genotype/actions/workflows/main.yml)

A Snakemake workflow for Short Variant Discovery in Host Genomes


## Usage

- Test that it works:
- Make sure you have installed snakemake, samtools and bcftools. Either
- install them with conda/mamba (`conda install -c bioconda samtools bcftools`),
- or create an environment (`conda create -n 3dohg -c bioconda snakemake samtools bcftools`), and activate it (`conda activate 3dohg`)
- Generate mock data with `bash workflow/scripts/generate_mock_data.sh`
- Run the pipeline: `snakemake --use-conda --jobs 8 all`. It will download all the necessary software through conda. It should take less than 5 minutes.
- Make sure you have installed `snakemake>=8`
- Run the pipeline: `snakemake --use-conda --profile profile/default --jobs 100`. It will download all the necessary software through conda. It should take less than 5 minutes.

- Run it with your own data:
- Edit `config/samples.tsv` and add your samples and where they are located.
- Edit `config/features.tsv` with information regarding the reference you are using.
- Run the pipeline: `snakemake --use-conda --jobs 8 all`.
- (slurm users): `./run_slurm`
- Run the pipeline: `snakemake --use-conda --profile profile/default --jobs 8 all`.
- If you are in a cluster with slurm, add `--executor slurm`.
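
The run commands above can be sketched as a small shell helper. This is only an illustration: `build_run_cmd` is a hypothetical function that is not part of the workflow, and it prints the command rather than running it. The flags are the ones quoted in the README; with Snakemake 8, `--executor slurm` is assumed to require the separately installed Slurm executor plugin.

```shell
#!/usr/bin/env bash
# Sketch: assemble the run command quoted in the README.
# `build_run_cmd` is a hypothetical helper, not part of the workflow.
build_run_cmd() {
    jobs="$1"            # number of parallel jobs, e.g. 8
    executor="${2:-}"    # optional: "slurm" on a Slurm cluster, empty otherwise
    cmd="snakemake --use-conda --profile profile/default --jobs ${jobs} all"
    if [ -n "${executor}" ]; then
        cmd="${cmd} --executor ${executor}"
    fi
    printf '%s\n' "${cmd}"
}

# Print the commands instead of running them, so they can be inspected first.
build_run_cmd 8          # local run
build_run_cmd 8 slurm    # Slurm cluster run
```

Printing the command first makes it easy to check the flags before launching a long run; to execute it, paste the printed line into the shell from the repository root.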

## Features

- FASTQ processing with [`fastp`](https://github.com/OpenGene/fastp)
- Mapping with [`bowtie2`](https://github.com/BenLangmead/bowtie2)
- SAM/BAM/CRAM processing with [`samtools`](https://github.com/samtools/samtools) and [`picard`](https://github.com/broadinstitute/picard)
- Sample swap detection with [`gtcheck`](https://github.com/samtools/bcftools)
- SNP calling with [`GATK4`](https://github.com/broadinstitute/gatk)
- SNP annotation with [`SNPEff`](https://github.com/pcingola/SnpEff)
- FASTQ processing with `fastp`.
- Mapping with `bwa-mem2`
- SAM/BAM/CRAM processing with `samtools` and `GATK`.
- SNP calling with `GATK4`.
- SNP annotation with `SNPEff` and `VEP`
- Sample swap detection with `somalier`.
- Reporting with `MultiQC`.

## DAG

![host_genomics_pipeline](./rulegraph.svg?raw=true)
![host_genomics_pipeline](./schema.svg?raw=true)

## References

TBA
- [`fastp`](https://github.com/OpenGene/fastp)
- [`bwa-mem2`](https://github.com/bwa-mem2/bwa-mem2)
- [`samtools`](https://github.com/samtools/samtools)
- [`GATK`](https://github.com/broadinstitute/gatk)
- [`SNPEff`](https://github.com/pcingola/SnpEff)
- [`VEP`](https://github.com/Ensembl/ensembl-vep)
- [`somalier`](https://github.com/brentp/somalier)
- [`MultiQC`](https://github.com/MultiQC/MultiQC)
