TwoSampleMR and MultivariableMR: perform all steps with simple command(s), without prior knowledge or having to work through lengthy protocols.
A: exposure data: (i) read exposure data, (ii) perform SNP clumping and (iii) store data.
B: outcome data: (i) read outcome data, (ii) get proxy SNP(s)
C: harmonise
D: MR
E: sensitivity tests (heterogeneity, pleiotropy, singlesnp, leaveoneout, MR-PRESSO)
F: visualization (scatter plot, forest plot, leave-one-out plot and funnel plot)
G: compile all results into a file.
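For orientation, steps A to F map roughly onto the following TwoSampleMR and ieugwasr calls. This is a minimal sketch, not the contents of MRsimplify.r: the function name `run_mr_sketch`, its arguments, and the tab-separated input format are assumptions made for illustration, and the proxy-SNP lookup and MR-PRESSO steps are left out.

```r
# Hypothetical wrapper illustrating steps A-F; NOT the code in MRsimplify.r.
# Assumes tab-separated files with the column names listed in this README.
run_mr_sketch <- function(exposure_file, outcome_file, plink_bin, ld_bfile) {
  # A: read the exposure data, then clump it against a local LD reference panel
  exp_dat <- TwoSampleMR::read_exposure_data(
    filename = exposure_file, sep = "\t",
    snp_col = "SNP", chr_col = "CHR", pos_col = "POS",
    effect_allele_col = "A1", other_allele_col = "A2",
    beta_col = "BETA", se_col = "SE", pval_col = "Pval",
    eaf_col = "EAF", samplesize_col = "samplesize",
    phenotype_col = "Phenotype"
  )
  clumped <- ieugwasr::ld_clump(
    dat = dplyr::tibble(
      rsid = exp_dat$SNP,
      pval = exp_dat$pval.exposure,
      id = exp_dat$id.exposure
    ),
    plink_bin = plink_bin, # path to the plink executable
    bfile = ld_bfile       # path prefix of the local LD reference panel
  )
  exp_dat <- exp_dat[exp_dat$SNP %in% clumped$rsid, ]

  # B: read the outcome data, restricted to the clumped instruments
  out_dat <- TwoSampleMR::read_outcome_data(
    filename = outcome_file, snps = exp_dat$SNP, sep = "\t",
    snp_col = "SNP", chr_col = "CHR", pos_col = "POS",
    effect_allele_col = "A1", other_allele_col = "A2",
    beta_col = "BETA", se_col = "SE", pval_col = "Pval",
    eaf_col = "EAF", samplesize_col = "samplesize",
    phenotype_col = "Phenotype"
  )

  # C: harmonise effect alleles between exposure and outcome
  dat <- TwoSampleMR::harmonise_data(exp_dat, out_dat)

  # D: run the default set of MR methods
  res <- TwoSampleMR::mr(dat)

  # E: sensitivity tests
  single <- TwoSampleMR::mr_singlesnp(dat)
  loo <- TwoSampleMR::mr_leaveoneout(dat)
  sensitivity <- list(
    heterogeneity = TwoSampleMR::mr_heterogeneity(dat),
    pleiotropy = TwoSampleMR::mr_pleiotropy_test(dat),
    singlesnp = single,
    leaveoneout = loo
  )

  # F: plots (each call returns a list of ggplot objects)
  plots <- list(
    scatter = TwoSampleMR::mr_scatter_plot(res, dat),
    forest = TwoSampleMR::mr_forest_plot(single),
    leaveoneout = TwoSampleMR::mr_leaveoneout_plot(loo),
    funnel = TwoSampleMR::mr_funnel_plot(single)
  )

  list(mr = res, sensitivity = sensitivity, plots = plots)
}
```

Defining the function does not load any package; the calls only run when it is invoked with real file paths.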
Install the required R libraries: TwoSampleMR, stringr, tidyverse, LDlinkR, ggplot2, ieugwasr, dplyr, gwasvcf.
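A quick way to see which of these libraries are still missing is sketched below. It only reports what is not installed; some of the packages (for example TwoSampleMR) may not be available from CRAN, so follow each package's own installation instructions rather than assuming `install.packages()` will work for all of them.

```r
# Packages named in the install step above.
required_pkgs <- c("TwoSampleMR", "stringr", "tidyverse", "LDlinkR",
                   "ggplot2", "ieugwasr", "dplyr", "gwasvcf")

# TRUE for every package that can be found in the current library paths.
is_installed <- vapply(required_pkgs, requireNamespace, logical(1),
                       quietly = TRUE)

missing_pkgs <- required_pkgs[!is_installed]
if (length(missing_pkgs) > 0) {
  message("Missing packages: ", paste(missing_pkgs, collapse = ", "))
} else {
  message("All required packages are installed.")
}
```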
Download and install:
- R code (MRsimplify.r). Before running, add the path to (i) the plink executable (line 9) and (ii) the local LD reference panel (lines 20 and 38).
- The LD reference panel can be downloaded from here (currently supports the GRCh37/hg19 genome build).
- The LD reference panel contains data for 5 super-populations (EUR = European; EAS = East Asian; AMR = Admixed American; SAS = South Asian; AFR = African).
Download full GWAS summary statistics:
- From either the GWAS Catalog or individual publications, with the necessary columns: SNP, CHR, POS, A1 (effect_allele), A2 (other_allele), BETA, SE, Phenotype, Pval, EAF (effect_allele frequency), samplesize. (NOTE: all data must use GRCh37 coordinates for smooth processing and reliable results.)
- If the genomic coordinates need to be converted to another build, MungeSumstats can be used.
- Filter the exposure data (with the columns listed above) by p-value (recommended: p < 5e-08). The outcome data should be the full-length summary statistics file, without a p-value threshold.
- Note: to save time, it is recommended to combine the data for different exposures into one file. In all subsequent TwoSampleMR steps, each exposure-outcome MR is still computed separately.
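The exposure filtering described above can be sketched in base R as follows. The helper name `filter_exposure`, the toy values, and the commented-out output file name are assumptions for illustration; only the column names and the 5e-08 threshold come from this README.

```r
# Column names as listed in this README.
required_cols <- c("SNP", "CHR", "POS", "A1", "A2", "BETA", "SE",
                   "Phenotype", "Pval", "EAF", "samplesize")

# Hypothetical helper: check the columns, then keep genome-wide significant rows.
filter_exposure <- function(sumstats, p_threshold = 5e-08) {
  missing_cols <- setdiff(required_cols, names(sumstats))
  if (length(missing_cols) > 0) {
    stop("Missing columns: ", paste(missing_cols, collapse = ", "))
  }
  keep <- !is.na(sumstats$Pval) & sumstats$Pval < p_threshold
  sumstats[keep, required_cols, drop = FALSE]
}

# Toy data: two exposures combined in one table (all values are made up).
toy <- data.frame(
  SNP = c("rs1", "rs2", "rs3", "rs4"),
  CHR = c(1, 1, 2, 3),
  POS = c(1000, 2000, 3000, 4000),
  A1 = c("A", "C", "G", "T"),
  A2 = c("G", "T", "A", "C"),
  BETA = c(0.12, -0.05, 0.30, 0.01),
  SE = c(0.02, 0.03, 0.04, 0.02),
  Phenotype = c("trait1", "trait1", "trait2", "trait2"),
  Pval = c(1e-09, 0.04, 3e-14, 0.6),
  EAF = c(0.21, 0.45, 0.10, 0.33),
  samplesize = c(50000, 50000, 42000, 42000),
  stringsAsFactors = FALSE
)

exposure <- filter_exposure(toy)
# Assumed output format (tab-separated); adjust to what the script expects:
# write.table(exposure, "exposure", sep = "\t", quote = FALSE, row.names = FALSE)
```

The outcome file is deliberately not passed through this filter, since it should keep every variant.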
Rscript --vanilla MRsimplify.r exposure outcome
A folder named after the outcome will be generated, containing all the results, including the sensitivity tests and all the visualizations.
Requirements:
- Exposure and outcome variants must have the same (i) genomic positions and (ii) A1 and A2 alleles.
- The LD reference panel must (i) be up to date (1k_v3) and (ii) come from the same population as the one used to generate the exposure and outcome data.
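Whether exposure and outcome variants agree on position and alleles can be checked before running the pipeline. The sketch below uses base R only; the helper name `check_variant_match` and the toy data are assumptions for illustration. It flags swapped A1/A2 pairs separately and does not attempt to detect strand flips.

```r
# Hypothetical helper: compare positions and alleles for variants shared by
# the exposure and outcome tables (column names as listed in this README).
check_variant_match <- function(exposure, outcome) {
  merged <- merge(exposure, outcome, by = "SNP", suffixes = c(".exp", ".out"))
  data.frame(
    SNP = merged$SNP,
    same_pos = merged$CHR.exp == merged$CHR.out &
      merged$POS.exp == merged$POS.out,
    same_alleles = merged$A1.exp == merged$A1.out &
      merged$A2.exp == merged$A2.out,
    swapped_alleles = merged$A1.exp == merged$A2.out &
      merged$A2.exp == merged$A1.out,
    stringsAsFactors = FALSE
  )
}

# Toy data (made up): rs2 has swapped alleles, rs3 has a different position.
exposure_toy <- data.frame(
  SNP = c("rs1", "rs2", "rs3"), CHR = c(1, 1, 2), POS = c(1000, 2000, 3000),
  A1 = c("A", "C", "G"), A2 = c("G", "T", "A"), stringsAsFactors = FALSE
)
outcome_toy <- data.frame(
  SNP = c("rs1", "rs2", "rs3"), CHR = c(1, 1, 2), POS = c(1000, 2000, 3050),
  A1 = c("A", "T", "G"), A2 = c("G", "C", "A"), stringsAsFactors = FALSE
)

report <- check_variant_match(exposure_toy, outcome_toy)
print(report)
```

A position mismatch usually points to different genome builds, which is the case the GRCh37 requirement is meant to prevent.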
Additional reading: https://mrcieu.github.io/TwoSampleMR/index.html
Citation: if you find this repo useful, please cite its link while the manuscript is in preparation.
Contact: ahmed.arslan@ulb.be or leave a comment on the issues page.