African Americans and European Americans exhibit distinct gene expression patterns across tissues and tumors that are associated with immunologic and infectious functions and environmental exposures
This file provides instructions on how to reproduce the results described in the paper "African Americans and European Americans exhibit distinct gene expression patterns across tissues and tumors that are associated with immunologic and infectious functions and environmental exposures". All analyses were performed using MetaOmGraph version 1.8.1. Violin plots were made using ggplot2 in R.
- Download MetaOmGraph from here. This will download a zip file. Unzip the file to get the mog1.8.1.jar file.
- Download the Human cancer RNA-Seq mog project from here. This will download a zip file. Unzip the file to get a folder containing three files: the data file, the metadata file and the MOG project file (.mog).
- The MetaOmGraph user guide is available here.
- Double click on the .jar file to start MetaOmGraph.
- Click on "open new project" and locate the .mog file for the Human cancer RNA-Seq mog project.
- MetaOmGraph will open and display the project.
For a more detailed explanation, see section 8 of the MetaOmGraph user manual.
- In the top menu bar, go to Tools --> Differential Expression Analysis.
- In the Differential Expression Analysis window, search for the sample groups to compare and run the analysis.
- Select the features (genes) from MetaOmGraph's Feature Metadata tab. Additionally, select samples in the Sample Metadata tab to filter out samples from particular tissues.
- Go to Plot --> Selected Rows --> Using R.
- Browse to the R script in rscripts/violinPlots.R.
- Enter the output directory name. NOTE: the output directory is relative to the project directory.
NOTE: Please have the following packages installed in R (see the full sessionInfo):
- readr
- dplyr
- diptest
- plyr
- scales
- data.table
- ggplot2
- ggthemes
- ggpubr
- Apply the log_2 transformation: in the main menu bar, go to Edit --> Transform data --> Log_2.
- Select the required row in MetaOmGraph's Feature Metadata tab.
- Go to Statistical Analysis --> Correlation, or choose another appropriate option.
- Click the green button next to Statistical Analysis to save the Feature Metadata table containing the correlation values.
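To make the two steps above concrete, here is a minimal, stdlib-only Python sketch of a log_2 transformation followed by a Pearson correlation between two expression rows. It is an illustration only and is not MetaOmGraph's code: the pseudocount of 1, the function names `log2_transform` and `pearson`, and the expression values are all assumptions made for this example, and MetaOmGraph may handle zeros or offer other correlation measures differently.

```python
import math


def log2_transform(values, pseudocount=1.0):
    """Return log2(x + pseudocount) for each value.

    The pseudocount is an assumption of this sketch; it keeps zero
    expression values finite after the transformation.
    """
    return [math.log2(v + pseudocount) for v in values]


def pearson(x, y):
    """Pearson correlation coefficient between two equal-length rows."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("rows must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant row")
    return sxy / math.sqrt(sxx * syy)


# Hypothetical expression values for two genes across five samples.
gene_a = [0.0, 1.0, 3.0, 7.0, 15.0]
gene_b = [1.0, 3.0, 7.0, 15.0, 31.0]

r = pearson(log2_transform(gene_a), log2_transform(gene_b))
print(f"Pearson r on log2-transformed values: {r:.3f}")
```

Transforming before correlating matters because raw expression values are typically right-skewed, so a few highly expressed samples can dominate the coefficient.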
Here is a list of all the analyses performed, along with the code to reproduce the results.
- Perform dip-test for all genes: rscripts/diptest_allgenes.R
- Perform GSEA for DE lists from pooled GTEx and pooled TCGA samples and generate cnet/ridge plots: rscripts/gsea_gtex_tcga.R
- Perform limma DE analysis in a tissue/tumor-wise manner: rscripts/limma_tissuewise.R
- Perform limma DE analysis for BRCA samples, including molecular subtypes in the model (uses TCGAbiolinks): rscripts/limma_tissuewise.R
- Compare Mann-Whitney and limma DE results; perform GSEA and compare enriched terms: rscripts/limma_MW_compare_gsea.R
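For readers unfamiliar with the Mann-Whitney test mentioned in the list above, the following stdlib-only Python sketch shows how the U statistic and a two-sided p-value can be computed for one gene across two sample groups. It is not taken from the repository's R scripts or from MetaOmGraph: the function names and the expression values are hypothetical, and the p-value uses a normal approximation with tie correction and no continuity correction, which may differ from what either tool reports (especially for small groups).

```python
import math


def rank_with_ties(values):
    """Return 1-based ranks; tied values share their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        average_rank = (i + j) / 2 + 1
        for k in range(i, j + 1):
            ranks[order[k]] = average_rank
        i = j + 1
    return ranks


def mann_whitney_u(group_a, group_b):
    """Return (U for group_a, two-sided p-value via normal approximation)."""
    n1, n2 = len(group_a), len(group_b)
    if n1 == 0 or n2 == 0:
        raise ValueError("both groups must be non-empty")
    combined = list(group_a) + list(group_b)
    ranks = rank_with_ties(combined)
    u1 = sum(ranks[:n1]) - n1 * (n1 + 1) / 2

    # Variance of U under the null hypothesis, corrected for ties.
    n = n1 + n2
    tie_counts = {}
    for v in combined:
        tie_counts[v] = tie_counts.get(v, 0) + 1
    tie_term = sum(t ** 3 - t for t in tie_counts.values())
    variance = n1 * n2 / 12 * ((n + 1) - tie_term / (n * (n - 1)))
    if variance == 0:
        return u1, 1.0  # all values identical: no evidence of a difference

    z = (u1 - n1 * n2 / 2) / math.sqrt(variance)
    p_value = math.erfc(abs(z) / math.sqrt(2))  # two-sided tail probability
    return u1, p_value


# Hypothetical expression values of one gene in two sample groups.
group_a = [1.2, 2.3, 3.1, 4.8]
group_b = [5.5, 6.1, 7.4, 8.0, 9.2]

u_stat, p_value = mann_whitney_u(group_a, group_b)
print(f"U = {u_stat}, p = {p_value:.4f}")
```

In a genome-wide analysis this test would be run once per gene, and the resulting p-values would still need a multiple-testing adjustment, which this sketch leaves out.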
- Singh, Urminder, Manhoi Hur, Karin Dorman, and Eve Syrkin Wurtele. "MetaOmGraph: a workbench for interactive exploratory data analysis of large expression datasets." Nucleic acids research 48.4 (2020): e23-e23.
- Ritchie, Matthew E., et al. "limma powers differential expression analyses for RNA-sequencing and microarray studies." Nucleic acids research 43.7 (2015): e47-e47.
- Yu, Guangchuang, et al. "clusterProfiler: an R package for comparing biological themes among gene clusters." Omics: a journal of integrative biology 16.5 (2012): 284-287.
- Colaprico, Antonio, et al. "TCGAbiolinks: an R/Bioconductor package for integrative analysis of TCGA data." Nucleic acids research 44.8 (2016): e71-e71.
Please see sessionInfo