imputeqc is an R package with accompanying scripts for estimating the quality of genotype imputation performed with fastPHASE and BEAGLE. Other tools that support fastPHASE *.inp files or VCF files can also be used. The package is based on masked data analysis.
- Estimation of the error of genotype imputation.
- Optimization of imputation model parameters, e.g., the number of haplotype clusters. This parameter can then be used when searching for signatures of selection with the hapFLK test.
- Testing different reference panels for imputation.
- Benchmarking different imputation software and strategies.
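The idea behind masked data analysis can be sketched in base R: hide a portion of known genotypes, impute them, and compare the imputed values with the hidden originals. The example below is a toy illustration under stated assumptions, not the package's implementation. The genotype matrix is simulated, the 10% masking proportion is arbitrary, and `impute_mode` is a hypothetical stand-in for an external tool such as fastPHASE or BEAGLE. None of the names come from the imputeqc API.

```r
# Toy illustration only; none of these names come from the imputeqc API.
set.seed(1)

# Hypothetical genotype matrix: 20 individuals x 50 SNPs, coded 0/1/2.
geno <- matrix(sample(0:2, 20 * 50, replace = TRUE), nrow = 20)

# Hide an arbitrary 10% of the genotypes.
mask <- matrix(FALSE, nrow = nrow(geno), ncol = ncol(geno))
mask[sample(length(geno), size = round(0.1 * length(geno)))] <- TRUE
masked <- geno
masked[mask] <- NA

# Stand-in for an external imputation tool: fill each gap with the
# most common genotype observed at that SNP.
impute_mode <- function(x) {
  counts <- table(x)  # NA values are excluded by default
  x[is.na(x)] <- as.integer(names(counts)[which.max(counts)])
  x
}
imputed <- apply(masked, 2, impute_mode)

# Discordance: fraction of masked genotypes that were imputed incorrectly.
discordance <- mean(imputed[mask] != geno[mask])
print(discordance)
```

Repeating such a comparison for several values of a model parameter, for example the number of haplotype clusters, gives one error estimate per setting, which is one way to compare settings.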
Run the following commands from R.
- Make sure the devtools package is installed.
install.packages("devtools")
- Install dependencies
install.packages("BiocManager")
BiocManager::install("VariantAnnotation")
- Install imputeqc
devtools::install_github("inzilico/imputeqc", build_vignettes = TRUE)
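When some of these packages are already present, the steps above can be made conditional so that nothing is reinstalled unnecessarily. The following is a minimal sketch; `missing_packages` and `install_imputeqc` are hypothetical helper names introduced here for illustration and are not part of imputeqc.

```r
# Hypothetical helper (not part of imputeqc):
# returns the names of packages that are not installed yet.
missing_packages <- function(pkgs) {
  installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!installed]
}

# Hypothetical wrapper around the installation steps listed above.
install_imputeqc <- function() {
  # CRAN packages needed by the installation itself.
  cran <- missing_packages(c("devtools", "BiocManager"))
  if (length(cran) > 0) install.packages(cran)

  # Bioconductor dependency.
  if (length(missing_packages("VariantAnnotation")) > 0) {
    BiocManager::install("VariantAnnotation")
  }

  # imputeqc itself, with vignettes.
  devtools::install_github("inzilico/imputeqc", build_vignettes = TRUE)
}
```

Calling `install_imputeqc()` then performs the installation; it requires network access.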
Read the vignette "How to Select the Number of Clusters for fastPHASE".
- On a local machine, the vignette can be accessed as follows:
browseVignettes("imputeqc")
- On a remote machine, the vignette can be opened in the "Help" tab of RStudio:
vignette("k_selection")
Khvorykh GV, Khrunin AV. imputeqc: an R package for assessing imputation quality of genotypes and optimizing imputation parameters. BMC Bioinformatics. 2020;21(Suppl 12):304. Published 2020 Jul 24. doi:10.1186/s12859-020-03589-0.
Gennady Khvorykh, a bioinformatician, inzilico.com
Interested in contributing to the project? Suggestions, questions, and comments are welcome! Feel free to drop me a message.