Tip: run R with R -q --vanilla --args <arguments> < script.R
instead of using Rscript
, so R will print out a full log of all
commands and messages in your script.
Required R packages: argparser, SeqArray, SeqVarTools, SNPRelate, GENESIS
From beluga1, use runRscript.sh. Use the -v
flag to
pass the name of the R script (after R=
) and any arguments to the
script (after args=
).
qsub -N jobname -v R="myscript.R",args="--arg1 arg1 --arg2 arg2" runRscript.sh
To run a script by chromosome, use the -t
flag to submit each
chromosome as a separate task in an array job.
qsub -N ld_pruning -v R="/acct/sdmorris/code/assoc_pipeline/ld_pruning.R",args="myfile.gds --maf 0.01 --missing 0.01" -t 1-22 runRscript.sh
To run a script in parallel, use the -l
flag to request multiple
processors on a node, in addition to the --num_cores
argument to the
R script.
qsub -N sample_qc -v R="/acct/sdmorris/code/assoc_pipeline/sample_qc.R",args="myfile.gds --num_cores 12" -l nodes=1:ppn=12 runRscript.sh
Use the SeqArray package to convert files.
If you have plink files (BED/BIM/FAM), use seqBED2GDS
. If you have VCF, use seqVCF2GDS
.
Example for VCF: http://bioconductor.org/packages/release/bioc/vignettes/GENESIS/inst/doc/assoc_test_seq.html#convert-vcf-to-gds
Script for BED: plink_to_gds.R
Outline of recommended QC steps for GWAS data: http://bioconductor.org/packages/release/bioc/vignettes/GWASTools/inst/doc/DataCleaning.pdf
The above document uses GWASTools, but SeqVarTools can be used instead.
Ideally one would use X and Y intensity values (array data) or X and Y read depth (sequencing data) to infer genetic sex. If these are not available, can use X chromosome heterozygosity and Y chromosome missingness to check sex.
When a dataset contains samples from multiple populations with different allele frequencies, it is recommended to run HWE in each population separately.
As an additional QC step, compare pairwise relatedness estimates from PC-Relate with expected values from the pedigree. Example code: https://github.com/UW-GAC/analysis_pipeline/blob/master/R/pedigree_check.R