Method description

Sep 27, 2019

This page outlines the steps included in the provided workflow. If you use the workflow for your analyses, please cite the original publications, and report the version numbers of the software you are using.

FastQC is used to perform quality control of the raw reads. Reads are then trimmed with TrimGalore!, with a quality cutoff of 20 and a minimal length of 20 bp. A quasi-mapping transcriptome index is generated using Salmon (Patro et al., 2017), which is also used to estimate transcript abundances, incorporating sequence and GC content bias correction. Estimated abundances and feature annotation information are imported into R using the tximeta package (Love et al., 2019), which provides a wrapper around tximport (Soneson et al., 2016). In parallel, reads are mapped to the genome using STAR (Dobin et al., 2013), and bigWig files are created for visualization in genome browsers. The quasi-likelihood framework of edgeR (Robinson et al., 2010; Lun et al., 2016) is used to perform differential gene expression analysis, accounting for differences in the average length of expressed transcripts between samples (Soneson et al., 2016). Gene set analysis is performed using the camera function (Wu and Smyth, 2012) from the limma package (Ritchie et al., 2015), using gene sets from MSigDB, accessed via the msigdbr package. Differential transcript usage analysis is performed using DRIMSeq (Nowicka and Robinson, 2016). Finally, MultiQC (Ewels et al., 2016) is used to summarize the output of FastQC, TrimGalore!, Salmon and STAR. A SummarizedExperiment object with gene-level quantifications, sample and feature annotations, and differential expression results is exported; it can be used for further downstream analysis or explored visually with packages such as iSEE (Rue-Albrecht et al., 2018).
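
To make the trimming parameters above concrete, here is a minimal Python sketch of what a quality cutoff of 20 and a minimal length of 20 bp mean for a single read. It is an illustration only, not the workflow's code: it assumes Phred scores have already been decoded to integers, it omits adapter removal entirely, and the function names (quality_trim_3prime, trim_and_filter) are hypothetical. The 3'-end rule follows the partial-sum approach described in the documentation of Cutadapt, which TrimGalore! wraps.

```python
def quality_trim_3prime(seq, quals, cutoff=20):
    """Trim low-quality bases from the 3' end of one read.

    Partial-sum rule: walk from the 3' end accumulating (quality - cutoff)
    and cut where that running sum is lowest. Illustrative sketch only.
    """
    running = 0
    lowest = 0
    cut = len(seq)  # default: keep the whole read
    for i in range(len(seq) - 1, -1, -1):
        running += quals[i] - cutoff
        if running > 0:
            break  # good bases now outweigh the bad ones seen so far
        if running < lowest:
            lowest = running
            cut = i
    return seq[:cut], quals[:cut]


def trim_and_filter(reads, cutoff=20, min_length=20):
    """Quality-trim each (name, seq, quals) read and drop reads shorter than min_length."""
    kept = []
    for name, seq, quals in reads:
        trimmed_seq, trimmed_quals = quality_trim_3prime(seq, quals, cutoff)
        if len(trimmed_seq) >= min_length:
            kept.append((name, trimmed_seq, trimmed_quals))
    return kept


# Hypothetical example reads: high-quality start, low-quality 3' tail.
example_reads = [
    ("read_long", "A" * 25 + "C" * 5, [30] * 25 + [10] * 5),
    ("read_short", "A" * 15 + "C" * 15, [30] * 15 + [10] * 15),
]
kept_reads = trim_and_filter(example_reads)
```

With these made-up reads, read_long loses its five low-quality bases and is kept at 25 bp, while read_short is trimmed to 15 bp and therefore discarded by the length filter.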


