R package for bcbio RNA-seq analysis.
Bioconductor method (recommended)
## try http:// if https:// URLs are not supported
source("https://bioconductor.org/biocLite.R")
biocLite("devtools")
biocLite("remotes")
biocLite("GenomeInfoDbData")
biocLite(
    "hbc/bcbioRNASeq",
    dependencies = c("Depends", "Imports", "Suggests")
)
F1000 paper version
# v0.2.4
biocLite("hbc/bcbioRNASeq", ref = "v0.2.4")
conda method
conda install -c bioconda r-bcbiornaseq
To avoid version issues, your .condarc file should contain only the following channels, in this order:
channels:
- bioconda
- conda-forge
- defaults
We recommend installing into a clean conda environment:
conda create --name bcbiornaseq_env
conda activate bcbiornaseq_env
Note that there is currently a bug with conda and libgfortran. You may need to install conda's libgfortran-ng to get the bcbioRNASeq package to load in R:
conda install libgfortran-ng
Load bcbio run
library(bcbioRNASeq)
bcb <- bcbioRNASeq(
    uploadDir = "bcbio_rnaseq_run/final",
    interestingGroups = c("genotype", "treatment"),
    organism = "Homo sapiens"
)
# Back up all data inside bcbioRNASeq object
flat <- flatFiles(bcb)
saveData(bcb, flat)
This will return a bcbioRNASeq object, which is an extension of the Bioconductor RangedSummarizedExperiment container class.
Parameters:
- uploadDir: Path to the bcbio final upload directory.
- interestingGroups: Character vector of the column names of interest in the sample metadata, which is stored in the colData() accessor slot of the bcbioRNASeq object. These values should be formatted in camelCase, and can be reassigned in the object after creation (e.g. interestingGroups(bcb) <- c("batch", "age")). They are used for data visualization in the quality control utility functions.
- organism: Organism name. Use the full Latin name (e.g. "Homo sapiens").
Consult help("bcbioRNASeq", "bcbioRNASeq") for additional documentation.
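Because interestingGroups values are expected in camelCase, it can help to preview how your metadata column names look in that form before passing them. The following is a minimal base R sketch. toCamelCase is a hypothetical helper written for illustration, not a function from bcbioRNASeq or basejump, and the package's own conversion may treat edge cases (such as acronyms) differently.

```r
# Hypothetical helper (not part of bcbioRNASeq): convert labels to camelCase.
toCamelCase <- function(x) {
    # Trim surrounding whitespace, then uppercase the character that follows
    # each run of separators (space, underscore, dot, or hyphen).
    x <- trimws(x)
    x <- gsub("[ _.-]+([[:alnum:]])", "\\U\\1", x, perl = TRUE)
    # Lowercase the first character.
    paste0(tolower(substring(x, 1, 1)), substring(x, 2))
}

groups <- toCamelCase(c("genotype", "treatment_group", "Sample Batch"))
print(groups)
```

The resulting character vector could then be supplied as the interestingGroups argument, provided the metadata columns carry the same names.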
When loading a bcbio RNA-seq run, the sample metadata will be imported automatically from the project-summary.yaml file in the final upload directory. If you notice any typos in your metadata after completing the run, these can be corrected by editing the YAML file. Alternatively, you can pass a sample metadata file to bcbioRNASeq() using the sampleMetadataFile argument.
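As a sketch of the sampleMetadataFile route, the snippet below writes a small metadata table to CSV with base R and then passes its path to bcbioRNASeq(). The CSV format, the file name, and the choice of interestingGroups are assumptions for illustration; consult the package help for the file formats actually accepted. The loading call is guarded so the snippet still runs when the package or the example upload directory is absent.

```r
# Example sample metadata; the description values must match the bcbio run.
metadata <- data.frame(
    description = c("sample1", "sample2", "sample3", "sample4"),
    genotype = c("wildtype", "knockout", "wildtype", "knockout"),
    stringsAsFactors = FALSE
)

# Write to a temporary CSV file (the file format is an assumption).
metadataFile <- file.path(tempdir(), "sample_metadata.csv")
write.csv(metadata, file = metadataFile, row.names = FALSE)

# Only attempt the load when the package and the upload directory exist.
uploadDir <- "bcbio_rnaseq_run/final"
if (requireNamespace("bcbioRNASeq", quietly = TRUE) && dir.exists(uploadDir)) {
    bcb <- bcbioRNASeq::bcbioRNASeq(
        uploadDir = uploadDir,
        sampleMetadataFile = metadataFile,
        interestingGroups = "genotype",
        organism = "Homo sapiens"
    )
}
```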
The samples in the bcbio run must map to the description column. The values provided in description must be unique. These values will be sanitized into syntactically valid names (see help("makeNames", "basejump")), and assigned as the column names of the bcbioRNASeq object. The original values are stored as the sampleName column in colData(), and are used for all plotting functions.
description | genotype |
---|---|
sample1 | wildtype |
sample2 | knockout |
sample3 | wildtype |
sample4 | knockout |
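The two requirements on the description column (unique values, sanitized into syntactically valid names) can be checked ahead of time. This sketch uses base R's make.names() as a stand-in for basejump's makeNames(), so the exact sanitized names produced by the package may differ. The sample values containing a space and a leading digit are made up for illustration.

```r
# Illustrative description values, including ones that are not valid R names.
description <- c("sample1", "sample 2", "3-knockout", "sample4")

# The description values must be unique.
stopifnot(anyDuplicated(description) == 0L)

# Base R stand-in for the package's sanitization step.
validNames <- make.names(description, unique = TRUE)

# Keep the original values alongside the sanitized names, similar in spirit
# to the sampleName column stored in colData().
lookup <- data.frame(
    sampleName = description,
    validName = validNames,
    stringsAsFactors = FALSE
)
print(lookup)
```

Running a check like this before loading makes it easier to spot description values that would be renamed or that collide with one another.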
R Markdown templates
This package provides multiple R Markdown templates, including quality control, differential expression using DESeq2, and functional enrichment analysis.
These are available in RStudio at File -> New File -> R Markdown... -> From Template.
citation("bcbioRNASeq")
Steinbaugh MJ, Pantano L, Kirchner RD, Barrera V, Chapman BA, Piper ME, Mistry M, Khetani RS, Rutherford KD, Hoffman O, Hutchinson JN, Ho Sui SJ. (2017). bcbioRNASeq: R package for bcbio RNA-seq analysis. F1000Research 6:1976.
The papers and software cited in our workflows are available as a shared library on Paperpile.