iscream aims to efficiently read data from any BED file into formats usable by other packages. Using htslib, iscream can query genomic regions like tabix, summarize the queried data and make matrices, with specific support for WGBS BED files produced by the BISCUIT, Bismark and BSBolt aligners.
Analysis and visualization of Whole Genome Bisulfite Sequencing (WGBS) [1] data requires reading aligned sequencing data into formats that existing packages like BSseq and scMET can analyze. Getting the data from on-disk BED files to a matrix of methylation values can be difficult because, with nearly 30 million CpGs, WGBS data can be quite large. iscream makes importing WGBS data for targeted exploration and analysis faster and more memory efficient.
iscream depends on the htslib header files. These may be installed with your package manager:
- ubuntu/debian: libhts-dev
- fedora/RHEL: htslib-devel
- brew: htslib
- nixpkgs: htslib
- conda: bioconda::htslib
htslib can also be built manually: https://www.htslib.org/download/. iscream can use Rhtslib as the htslib source, but we recommend installing htslib with libdeflate support for optimal performance. See vignette("htslib") for more information.
The header files may also be found among your HPC modules. Make sure the PKG_CONFIG_PATH environment variable includes the pkgconfig location for your installation of htslib. You can verify that the htslib development libraries are installed with pkg-config:
# set path if necessary
export PKG_CONFIG_PATH=[path to htslib installation]
# verify that htslib can be found
pkg-config --cflags --libs htslib
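The check above can also be wrapped in a small shell function that reports whether pkg-config can see htslib and, if so, which version it found. This is only a sketch: the helper name check_pkg_module is made up for illustration and is not part of iscream or htslib.

```shell
# check_pkg_module is a hypothetical helper, not part of iscream or htslib.
# Returns 0 if pkg-config can find the module, 1 if it cannot,
# and 2 if pkg-config itself is not installed.
check_pkg_module() {
    module="$1"
    if ! command -v pkg-config >/dev/null 2>&1; then
        echo "pkg-config is not installed" >&2
        return 2
    fi
    if pkg-config --exists "$module"; then
        echo "$module found, version $(pkg-config --modversion "$module")"
        return 0
    fi
    echo "$module not found; check PKG_CONFIG_PATH" >&2
    return 1
}

if check_pkg_module htslib; then
    # Show the flags a compiler would receive for htslib
    pkg-config --cflags --libs htslib
fi
```

If the function reports that htslib was not found even though it is installed, the most likely cause is that PKG_CONFIG_PATH does not include the directory holding htslib.pc.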
Some htslib installations do not include the tabix executable (on Ubuntu you need to install both libhts-dev and tabix). iscream will work without tabix, but the tabix() function will be faster if the executable is installed.
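One way to see whether the tabix executable is available is to look for it on the PATH. The sketch below assumes the executable is looked up on the PATH, so a copy installed elsewhere would not be detected; have_cmd is an illustrative helper name, not an iscream or htslib command.

```shell
# have_cmd is a hypothetical helper: succeeds if the named command is on PATH.
have_cmd() {
    command -v "$1" >/dev/null 2>&1
}

if have_cmd tabix; then
    echo "tabix executable found at: $(command -v tabix)"
else
    echo "tabix executable not found on PATH"
fi
```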
GCC must be installed for OpenMP support. It is usually installed by default on Linux systems, but may need to be installed manually on macOS to use iscream with multiple threads [2].
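A quick way to check whether a compiler accepts OpenMP is to ask it for the _OPENMP macro, which compilers define when -fopenmp is enabled. The following is a minimal sketch: openmp_version is a hypothetical helper name, and it tests the gcc found on the PATH, which may differ from the compiler R is configured to use when building packages.

```shell
# openmp_version is a hypothetical helper, not part of iscream.
# Prints the value of the _OPENMP macro (a yyyymm date identifying the
# supported OpenMP specification) and returns 0 if the compiler accepts
# -fopenmp; returns non-zero otherwise.
openmp_version() {
    cc="$1"
    # Preprocess empty input and dump the predefined macros
    "$cc" -fopenmp -x c -dM -E - </dev/null 2>/dev/null |
        awk '$2 == "_OPENMP" { print $3; found = 1 } END { exit !found }'
}

if version=$(openmp_version gcc); then
    echo "gcc supports OpenMP (_OPENMP=$version)"
else
    echo "gcc is missing or does not accept -fopenmp"
fi
```

On macOS the gcc command is often an alias for Apple's Clang, in which case this check is expected to fail until GCC or an OpenMP runtime for Clang is installed.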
Install iscream with BiocManager:
if (!requireNamespace("BiocManager", quietly = TRUE))
    install.packages("BiocManager")
BiocManager::install("iscream")
You can install the development version from GitHub by cloning the repository and running R CMD INSTALL:
git clone https://github.com/huishenlab/iscream
R CMD INSTALL iscream
You can also use the R devtools package:
devtools::install_github("huishenlab/iscream")
or pak:
pak::pkg_install("huishenlab/iscream")
See the quick start guide for an overview of iscream's functionality and the function reference for all available functions. Bug reports may be submitted through GitHub issues.
Footnotes
[1] The name iscream comes from "Integrating Single-Cell Results for Exploring and Analyzing Methylation" as it was originally developed to read BED files from WGBS. It was then generalized to work with any BED file.
[2] Using OpenMP is also possible with Clang on macOS (https://mac.r-project.org/openmp/) but installing GCC with Homebrew may be easier (https://formulae.brew.sh/formula/gcc).