The GeneSpectra module performs gene classification using scRNA-seq data.
Read our preprint here: Context-aware comparison of cell type gene expression across species
Steps:
- Reduce sparsity by creating metacells or pseudobulking
- Normalize data and filter low count genes
- Classify genes by specificity and distribution, in parallel across genes
- Compare ortholog classes between species
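As a rough illustration of the normalization and filtering step, the sketch below drops genes with a low total count and scales each cell pool to counts per million. It is not GeneSpectra's implementation: the function name `filter_and_normalize`, the `min_total_count` threshold, and the input layout (a dict of per-pool count vectors) are all assumptions made for this example.

```python
def filter_and_normalize(pools, gene_names, min_total_count=10, scale=1_000_000):
    """Drop low-count genes, then scale each pool to counts per `scale`.

    pools: dict mapping pool label -> list of raw counts (one per gene)
    gene_names: list of gene identifiers, same order as the count vectors
    min_total_count: hypothetical threshold on a gene's summed count
    """
    n_genes = len(gene_names)
    # Total count of each gene across all pools.
    totals = [sum(vec[i] for vec in pools.values()) for i in range(n_genes)]
    keep = [i for i, total in enumerate(totals) if total >= min_total_count]
    kept_genes = [gene_names[i] for i in keep]

    normalized = {}
    for label, vec in pools.items():
        depth = sum(vec)  # library size, computed before filtering
        normalized[label] = [vec[i] * scale / depth if depth else 0.0 for i in keep]
    return kept_genes, normalized


# Toy input with made-up gene names and counts.
example_pools = {"T cell": [50, 0, 50], "B cell": [30, 2, 68]}
kept_genes, cpm = filter_and_normalize(example_pools, ["g1", "g2", "g3"])
print(kept_genes)
print(cpm)
```

Here the library size is taken before filtering, so removing a gene does not change the scaling of the remaining ones. That is a design choice of this sketch, not a statement about the module.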
Note that the gene classes are adapted from the Human Protein Atlas classifications by Karlsson, M. et al.
First pull source code from the repository:
git clone https://github.com/Papatheodorou-Group/GeneSpectra.git
cd GeneSpectra
Pixi is used for dependency management.
First install pixi. Then, run this command in the GeneSpectra/
directory to install project dependencies:
pixi install -a
Wrapper and helper functions that use the metacells package to create metacells from scRNA-seq data. We also recommend following the official metacells workflow to create a tailored metacells AnnData object (use the iterative vignette for brand-new data), since it gives you more freedom to adjust parameters. Alternatively, when a dataset is unsuitable for metacell calculation, merge cells with the same annotation label to create cell pools.
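The cell-pool alternative amounts to summing raw counts over cells that share a label. A minimal sketch, assuming counts are held as one list per cell; `pool_cells_by_label` is a hypothetical name and not a function of this module:

```python
def pool_cells_by_label(counts, labels):
    """Sum raw counts of all cells sharing an annotation label.

    counts: list of per-cell count vectors (one list of gene counts per cell)
    labels: list of annotation labels, one per cell
    Returns a dict mapping label -> summed count vector.
    """
    if len(counts) != len(labels):
        raise ValueError("counts and labels must have the same length")
    pools = {}
    for cell_counts, label in zip(counts, labels):
        if label not in pools:
            pools[label] = [0] * len(cell_counts)
        pool = pools[label]
        for i, count in enumerate(cell_counts):
            pool[i] += count
    return pools


# Three toy cells, three genes; labels are made up for illustration.
example_counts = [[1, 0, 3], [0, 2, 1], [4, 0, 0]]
example_labels = ["T cell", "B cell", "T cell"]
pools = pool_cells_by_label(example_counts, example_labels)
print(pools)
```

With real data the same idea would normally be applied to the count matrix of an AnnData object, grouped by the annotation column.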
Core module for gene filtering, normalization, and classification of genes by specificity and distribution. Uses multiprocessing to parallelize the processing of genes.
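The sketch below shows how per-gene classification can be spread over worker processes. The rules and thresholds (`DETECTION_CUTOFF`, `FOLD`) are simplified assumptions in the fold-change style of Human Protein Atlas categories, and the class names are illustrative. The module's actual classes and cutoffs may differ.

```python
from concurrent.futures import ProcessPoolExecutor

DETECTION_CUTOFF = 1.0  # assumed minimum expression to call a gene detected
FOLD = 4.0              # assumed fold-change threshold


def classify_gene(item):
    """Classify one gene from its normalized expression across cell pools."""
    gene, values = item
    if not values:
        return gene, "not detected"
    ranked = sorted(values, reverse=True)
    top = ranked[0]
    if top < DETECTION_CUTOFF:
        return gene, "not detected"
    second = ranked[1] if len(ranked) > 1 else 0.0
    if top >= FOLD * second:
        return gene, "enriched"       # one pool far above every other pool
    mean = sum(values) / len(values)
    if top >= FOLD * mean:
        return gene, "enhanced"       # one pool far above the average
    return gene, "low specificity"


def classify_genes(expression, n_workers=1):
    """Classify all genes; genes are independent, so they can run in parallel."""
    items = list(expression.items())
    if n_workers <= 1:
        return dict(map(classify_gene, items))
    with ProcessPoolExecutor(max_workers=n_workers) as pool:
        return dict(pool.map(classify_gene, items, chunksize=64))


# Toy expression values with made-up gene names, run serially here.
example_expression = {
    "g1": [0.2, 0.5],
    "g2": [20.0, 1.0, 2.0],
    "g3": [8.0, 3.0] + [0.0] * 8,
    "g4": [5.0, 4.0, 6.0],
}
gene_classes = classify_genes(example_expression, n_workers=1)
print(gene_classes)
```

On platforms that start worker processes by spawning rather than forking, calls with `n_workers` greater than 1 should sit under an `if __name__ == "__main__":` guard.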
Cross-species comparison of gene classes, with plotting. Orthology is taken from Ensembl or eggNOG homology.
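To make the comparison concrete, the sketch below tabulates class pairs over one-to-one orthologs and reports the fraction that keep the same class. The function name, gene identifiers, and class labels are hypothetical. Real orthology tables from Ensembl or eggNOG also contain one-to-many and many-to-many relationships, which this example ignores.

```python
from collections import Counter


def compare_ortholog_classes(orthologs, classes_a, classes_b):
    """Count (class in species A, class in species B) over ortholog pairs.

    orthologs: iterable of (gene_a, gene_b) one-to-one ortholog pairs
    classes_a, classes_b: dicts mapping gene -> class label per species
    Returns the pair counts and the fraction of pairs with identical class.
    """
    pair_counts = Counter()
    for gene_a, gene_b in orthologs:
        # Skip pairs where either gene was filtered out before classification.
        if gene_a in classes_a and gene_b in classes_b:
            pair_counts[(classes_a[gene_a], classes_b[gene_b])] += 1
    total = sum(pair_counts.values())
    same = sum(n for (ca, cb), n in pair_counts.items() if ca == cb)
    conserved = same / total if total else 0.0
    return pair_counts, conserved


# Made-up identifiers and classes for two species.
example_orthologs = [("hsA", "mmA"), ("hsB", "mmB"), ("hsC", "mmC"), ("hsD", "mmD")]
human_classes = {"hsA": "enriched", "hsB": "enhanced", "hsC": "low specificity"}
mouse_classes = {
    "mmA": "enriched",
    "mmB": "low specificity",
    "mmC": "low specificity",
    "mmD": "enriched",
}
pair_counts, conserved = compare_ortholog_classes(
    example_orthologs, human_classes, mouse_classes
)
print(pair_counts)
print(conserved)
```

The pair counts can be read as a contingency table of classes between the two species, which is a natural input for a heatmap or Sankey-style plot.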
Developer / maintainer: Yuyao Song, ysong@ebi.ac.uk