MSigDB gene sets for multiple organisms in a tidy data format

igordot/msigdbr

msigdbr: MSigDB Gene Sets for Multiple Organisms in a Tidy Data Format

Overview

The msigdbr R package provides Molecular Signatures Database (MSigDB) gene sets typically used with the Gene Set Enrichment Analysis (GSEA) software:

  • in an R-friendly "tidy" format with one gene per row
  • for multiple frequently studied model organisms, such as mouse, rat, pig, zebrafish, fly, and yeast, in addition to the original human genes
  • as gene symbols as well as NCBI Entrez and Ensembl IDs
  • without accessing external resources or requiring an active internet connection

Installation

The package can be installed from CRAN.

install.packages("msigdbr")

Releases that are not available on CRAN can be installed from GitHub. A specific release or version can be selected with the ref argument:

remotes::install_github("igordot/msigdbr", ref = "v2022.1.1")

Usage

The package data can be accessed using the msigdbr() function, which returns a data frame of gene sets and their member genes. For example, you can retrieve mouse genes from the C2 (curated) CGP (chemical and genetic perturbations) gene sets.

library(msigdbr)
genesets <- msigdbr(species = "mouse", category = "C2", subcategory = "CGP")
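
Because the result is an ordinary data frame with one gene per row, it can be reshaped with base R. The following is a minimal sketch of turning such a table into a named list with one vector of gene symbols per gene set, a shape that some enrichment tools expect. To keep the example runnable without the package, it uses a small stand-in data frame; the set names and genes in it are made up for illustration, and it assumes the real table carries gs_name and gene_symbol columns.

```r
# Hypothetical stand-in for the data frame returned by msigdbr().
# Only the two columns used below are included; the values are illustrative.
genesets <- data.frame(
  gs_name = c("SET_A", "SET_A", "SET_B"),
  gene_symbol = c("Trp53", "Myc", "Myc"),
  stringsAsFactors = FALSE
)

# With the package installed, the real table would come from:
# genesets <- msigdbr::msigdbr(species = "mouse", category = "C2", subcategory = "CGP")

# Group gene symbols by gene set name: one character vector per set.
geneset_list <- split(x = genesets$gene_symbol, f = genesets$gs_name)

# Number of genes in each set.
set_sizes <- lengths(geneset_list)
```

The same split() call should work unchanged on the real table as long as those two columns are present.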

Check the documentation website for more information.