Hapmap creation workflow
Vladimir Gritsenko edited this page Apr 26, 2016 · 2 revisions
- hapmap.install_3.php creates the hapmap folders and the associated configuration and log files. The important parameter is referencePloidy: 2 for "one diploid", 1 for "two haploids".
- hapmap.install_4.sh:
  - For two haploids:
    - Copy SNP_CNV_v1.zip of both haploids into the hapmap folder, and unzip them as SNPdata_parent (they are indexed as 1 and 2).
    - Run hapmap.preprocess_haploid_parents.py in hapmap mode. This script goes over both of the above datasets and adds to the hapmap only those coordinates that are homozygous in both datasets (defined as allelic ratio >= 0.5) and differ between them. Output is saved as SNPdata_parent.txt. Note: 0 in the phasing info column means that no correction is needed (per the script).
    - The original files from the copy step are deleted.
  - For one diploid:
    - Copy the parent diploid's putative_SNPs_v4.zip file, and unzip it as SNPdata_parent.
    - Run hapmap.preprocess_parent.py in hapmap mode on the above file, which is modified in place.
  - Run hapmap.expand_definitions.py.
  - Copy the child's SNP_CNV_v1.zip and unzip it as SNPdata_child (TODO: why is this indexed?).
  - Run hapmap.process_child.py on SNPdata_child. This changes SNPdata_parent.txt. The child dataset is removed.
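
The filtering rule used for two haploid parents (keep a coordinate only if both parents are homozygous there and their alleles differ) can be sketched in Python as below. This is an illustrative sketch only, not the code of hapmap.preprocess_haploid_parents.py: the in-memory layout (a dict from `(chromosome, position)` to per-base read counts), the function names `dominant_allele` and `build_haploid_hapmap`, and the output tuple shape are all assumptions made for the example. The 0.5 threshold and the phasing value 0 are taken from the description above.

```python
from typing import Dict, List, Optional, Tuple

Coord = Tuple[str, int]      # (chromosome, position); hypothetical layout
Counts = Dict[str, int]      # base -> read count; hypothetical layout


def dominant_allele(counts: Counts, min_ratio: float = 0.5) -> Optional[str]:
    """Return the most frequent base if its allelic ratio is >= min_ratio, else None."""
    total = sum(counts.values())
    if total == 0:
        return None
    base, n = max(counts.items(), key=lambda kv: kv[1])
    return base if n / total >= min_ratio else None


def build_haploid_hapmap(
    parent1: Dict[Coord, Counts],
    parent2: Dict[Coord, Counts],
    min_ratio: float = 0.5,
) -> List[Tuple[str, int, str, str, int]]:
    """Keep coordinates where both parents are homozygous and their alleles differ."""
    entries = []
    # Only coordinates present in both datasets can be compared.
    for coord in sorted(parent1.keys() & parent2.keys()):
        a1 = dominant_allele(parent1[coord], min_ratio)
        a2 = dominant_allele(parent2[coord], min_ratio)
        if a1 is None or a2 is None or a1 == a2:
            continue
        chrom, pos = coord
        # Last field is the phasing info: 0 means no correction is needed.
        entries.append((chrom, pos, a1, a2, 0))
    return entries


parent1 = {
    ("chr1", 100): {"A": 20, "T": 0},
    ("chr1", 200): {"C": 18, "G": 2},
    ("chr1", 300): {"A": 4, "C": 3, "G": 3},
}
parent2 = {
    ("chr1", 100): {"T": 19, "A": 1},
    ("chr1", 200): {"C": 20},
    ("chr1", 300): {"T": 10},
}
hapmap = build_haploid_hapmap(parent1, parent2)
print(hapmap)
```

In this toy input only chr1:100 survives: chr1:200 carries the same allele in both parents, and chr1:300 is not homozygous in the first parent under the 0.5 threshold.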