File Format
CYZ_Torry edited this page Sep 16, 2021 · 14 revisions
A 3dg file stores the spatial position of each genomic bin. Each row has the following tab-separated (\t) columns:
chr start end x y z is_valid
chr is the chromosome. start and end give the genomic position of the bin. x, y, z give its spatial position. is_valid denotes whether the bin is valid (1 for valid, 0 for invalid). Usually, the bins with the fewest contacts in the Hi-C map are regarded as invalid. If you cannot tell the validity, add 1 at the end of each row.
The index file is used throughout the workflow. It indexes the genomic bins for faster searching. Each row has the following tab-separated (\t) columns:
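As a minimal sketch of "add 1 at the end of each row", the snippet below appends is_valid = 1 to rows that have only the six columns chr, start, end, x, y, z. The helper name add_valid_flag and the sample coordinates are hypothetical and used only for illustration. Rows that already have seven columns are passed through unchanged.

```shell
# Hypothetical helper: reads a tab-separated 3dg-like file on stdin and
# appends is_valid=1 to every row that has exactly six columns.
add_valid_flag() {
    awk 'BEGIN { FS = OFS = "\t" } NF == 6 { print $0, 1; next } { print }'
}

# Example with made-up coordinates: two bins without a validity column.
printf 'chr1\t0\t100000\t0.1\t0.2\t0.3\nchr1\t100000\t200000\t0.4\t0.5\t0.6\n' \
    | add_valid_flag
```

In practice you would redirect a real file through the helper, for example `add_valid_flag < IN_FILE > OUT_FILE`, where IN_FILE and OUT_FILE are placeholders.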
chr start end index
chr is the chromosome. start and end give the genomic position of the bin. index is the index of the genomic bin. This file can be created using bedtools, as follows:
bedtools makewindows -g REF/GENOME/PATH -w RESOLUTION | awk 'BEGIN{OFS="\t"; i=0}{print $1, $2, $3, i; i+=1}' > OUT_FILE
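To illustrate what the resulting index file looks like, here is a small awk-only sketch that needs no bedtools. It assumes the input is a tab-separated list of chromosome names and sizes, and that windows are fixed-size with the last window of each chromosome truncated at the chromosome end. The helper name make_index and the toy chromosome sizes are hypothetical; this is not a replacement for the bedtools command above.

```shell
# Hypothetical helper: stdin is "chrom<TAB>size" lines, $1 is the bin size in bp.
# Emits "chr<TAB>start<TAB>end<TAB>index" with a running 0-based index.
make_index() {
    awk -v w="$1" 'BEGIN { FS = OFS = "\t"; i = 0 }
    {
        for (s = 0; s < $2; s += w) {
            e = (s + w < $2) ? s + w : $2   # truncate the last bin
            print $1, s, e, i++
        }
    }'
}

# Toy genome with two tiny chromosomes, binned at 100 bp.
printf 'chr1\t250\nchr2\t100\n' | make_index 100
# chr1	0	100	0
# chr1	100	200	1
# chr1	200	250	2
# chr2	0	100	3
```

The index keeps counting across chromosomes, so every bin in the genome gets a unique number.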