Overview

Data requirements

ARPEGGIO is a Snakemake workflow for analyzing Whole Genome Bisulfite Sequencing (WGBS) data from allopolyploid species. To use this workflow you will need the following:

1. WGBS data (paired-end or single-end) from:
a. An allopolyploid species and its two parental species OR
b. An allopolyploid species in two different conditions/treatments OR
c. A diploid species in two different conditions/treatments

Note: you will always need at least two samples per condition/species in order to obtain differentially methylated regions (DMRs).

2. The assembled genomes from:
For 1a. and 1b.: the allopolyploid's two parental species
For 1c.: the diploid species

And that's it!
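
The replicate requirement in the note above (at least two samples per condition/species) is easy to check before starting a run. The following is a minimal sketch, not part of ARPEGGIO: the function name, the `(sample_name, condition)` layout, and the sample names are all hypothetical and do not reflect the workflow's actual metadata format.

```python
from collections import Counter


def conditions_lacking_replicates(samples, min_replicates=2):
    """Return the conditions/species that have fewer than min_replicates samples.

    `samples` is a list of (sample_name, condition) pairs. This layout is
    hypothetical and is NOT ARPEGGIO's metadata format.
    """
    counts = Counter(condition for _, condition in samples)
    return sorted(c for c, n in counts.items() if n < min_replicates)


# Hypothetical sample sheet for case 1a: an allopolyploid and its two parents.
samples = [
    ("allo_1", "allopolyploid"),
    ("allo_2", "allopolyploid"),
    ("parentA_1", "parent_A"),
    ("parentA_2", "parent_A"),
    ("parentB_1", "parent_B"),  # only one sample for this species
]

print(conditions_lacking_replicates(samples))  # ['parent_B']
```

An empty list means every condition/species has enough samples for DMR analysis.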

System requirements

ARPEGGIO was developed for Linux systems and was tested on Debian and Ubuntu. Windows and macOS are not supported.

Skill requirements

Command line basics (cd, mv, cp, vim or nano, etc.) and patience :)

ARPEGGIO overview and aim

ARPEGGIO includes seven main steps for analyzing WGBS data, and it's up to the user to select which of them to run:

1. Conversion check
2. Quality check
3. Trimming
4. Alignment
5. Read sorting
6. DMR analysis
7. Downstream analyses
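
To illustrate the idea of selecting steps, here is a small sketch that keeps a user's selection in the order listed above and rejects unknown names. It is only an illustration: the step identifiers and the `plan_steps` function are hypothetical, and ARPEGGIO's real configuration mechanism is described in the rest of the wiki, not here.

```python
# The seven steps in the order listed above. These identifiers are
# hypothetical labels, not ARPEGGIO's actual configuration keys.
STEPS = [
    "conversion_check",
    "quality_check",
    "trimming",
    "alignment",
    "read_sorting",
    "dmr_analysis",
    "downstream_analyses",
]


def plan_steps(selected):
    """Return the selected steps in workflow order, rejecting unknown names."""
    unknown = sorted(set(selected) - set(STEPS))
    if unknown:
        raise ValueError(f"Unknown step(s): {', '.join(unknown)}")
    return [step for step in STEPS if step in selected]


# The selection may be given in any order; the plan follows the list order.
print(plan_steps(["alignment", "trimming", "dmr_analysis"]))
# ['trimming', 'alignment', 'dmr_analysis']
```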

Next section

Input files
