EpiCHAOS is a metric for calculating cell-to-cell heterogeneity in single-cell epigenomics data. EpiCHAOS can be applied to any single cell epigenomics dataset (e.g. scATAC-seq, scChIP-seq, scNMT-seq, scTAM-seq) in binarised matrix form.
EpiCHAOS scores are assigned per cluster (or otherwise defined group of single cells, e.g. cell type, treatment condition). Scores will range from 0-1 where higher scores indicate higher cell-to-cell heterogeneity.
For details and examples of applications please refer to the epiCHAOS preprint: https://www.biorxiv.org/content/10.1101/2024.04.24.590899v1.
- Single-cell matrix is binarised if not already
- Pairwise chance-corrected Jaccard indices are calculated between all cells in each group/cluster
- The mean of all pairwise indices is computed per cluster as a raw heterogeneity score
- Raw heterogeneity scores are adjusted for differences in sparsity between groups by fitting a linear regression model
- Scores are normalised to a range of 0-1 and negated so that a higher score indicates higher cell-to-cell heterogeneity
library(devtools)
install_github("CompEpigen/epiCHAOS")
Calculation of epiCHAOS scores requires (i) a single cell epigenomics dataset in binarised matrix form, e.g. a peaks-by-cells or tiles-by-cells matrix in the case of scATAC-seq data, (ii) cell metadata annotation matching column names to clusters, cell types or other groups on which heterogenetiy scores should be computed.
The main function for computing epiCHAOS scores is "epiCHAOS()" which takes a counts matrix and cell annotation metadata as input and returns a dataframe of epiCHAOS scores for each group/cluster.
# compute epiCHAOS scores
heterogeneity <- epiCHAOS(counts = counts, meta = metadata, colname = "seurat_clusters", n = 50, index = NULL, plot = F, cancer = F, subsample = 1)