Fidel Ramirez edited this page Feb 17, 2015 · 3 revisions

Tutorial

First, the two mates of each read pair are mapped individually. This avoids a possible bias from the alignment software, which may try to position the mates of a pair close to each other.

Because a fraction of Hi-C reads contain the ligation site, they will not align end-to-end to the reference genome. For this reason it is advisable to use local alignment instead. In bowtie2 this is achieved by adding the --local parameter. Also, because the two FASTQ files containing the mates are going to be integrated afterwards, it is important to keep the same read ordering. In bowtie2 the --reorder parameter has to be given so that the reads are output in exactly the same order as in the FASTQ files.

$ bowtie2  --local --reorder -x genome_index -U R1.fastq.gz 2>> R1.log | samtools view -Shb - > R1.bam
$ bowtie2  --local --reorder -x genome_index -U R2.fastq.gz 2>> R2.log | samtools view -Shb - > R2.bam
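The ordering requirement described above can be sanity-checked before building the matrix. The following is a minimal sketch, not part of HiCExplorer: it assumes the read names of both files have already been extracted into two sequences, and the helper names `strip_mate_suffix` and `first_order_mismatch` are hypothetical.

```python
from itertools import zip_longest


def strip_mate_suffix(name):
    """Drop a trailing /1 or /2 mate marker, if present (assumed naming scheme)."""
    if name.endswith("/1") or name.endswith("/2"):
        return name[:-2]
    return name


def first_order_mismatch(names_r1, names_r2):
    """Return the index of the first position where the two name lists
    disagree (or where one list ends early), or None if they match."""
    for index, (a, b) in enumerate(zip_longest(names_r1, names_r2)):
        if a is None or b is None:
            return index  # one file has fewer reads than the other
        if strip_mate_suffix(a) != strip_mate_suffix(b):
            return index  # mates are out of order at this position
    return None


if __name__ == "__main__":
    r1 = ["read1/1", "read2/1", "read3/1"]
    r2 = ["read1/2", "read2/2", "read3/2"]
    print(first_order_mismatch(r1, r2))
```

The read names could be obtained, for example, with `samtools view R1.bam | cut -f 1` and the equivalent command for R2.bam.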

To produce Hi-C matrices at restriction fragment resolution, a BED file with the coordinates of all positions containing the restriction enzyme motif is required. HiCExplorer comes with a tool for this called findRestSite. For this example the HindIII restriction site AAGCTT is used. The FASTA sequence of the genome is needed as input.

$ findRestSite --fasta genome.fa --searchPattern AAGCTT --outFile hindIII.bed 
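To illustrate what such a motif search involves, here is a minimal sketch that scans a single sequence for the restriction motif and reports 0-based, half-open intervals in BED style. It is not the findRestSite implementation; the function names are hypothetical, and the exact columns that findRestSite writes are not assumed here.

```python
import re


def find_rest_sites(chrom, sequence, pattern="AAGCTT"):
    """Return (chrom, start, end) tuples for every occurrence of the motif.

    A lookahead is used so that overlapping matches are also reported.
    Matching is case-insensitive so soft-masked (lowercase) bases are found.
    """
    lookahead = re.compile("(?=(" + pattern + "))", re.IGNORECASE)
    return [
        (chrom, m.start(), m.start() + len(m.group(1)))
        for m in lookahead.finditer(sequence)
    ]


def to_bed_lines(sites):
    """Format the intervals as tab-separated BED lines (three columns only)."""
    return ["\t".join([chrom, str(start), str(end)]) for chrom, start, end in sites]


if __name__ == "__main__":
    toy_sequence = "GGAAGCTTCCAAGCTT"  # toy example, not real genome data
    for line in to_bed_lines(find_rest_sites("chr1", toy_sequence)):
        print(line)
```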

Now, using the two BAM files and the hindIII.bed file, a Hi-C matrix is created:

$ hicBuildMatrix -s R1.bam R2.bam -b R12.bam \
	--restrictionSequence AAGCTT \
	--minDistance 400 \
	--maxDistance 800 \
	--restrictionCutFile hindIII.bed \
	-o hic.npz > hic.log
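The idea of restriction fragment resolution can be sketched as follows: the restriction sites split a chromosome into consecutive fragments, each mate is assigned to the fragment containing its position, and each read pair then adds one count to the matrix cell for that pair of fragments. This is a simplified illustration under those assumptions, not a description of how hicBuildMatrix works internally; the function names and the toy coordinates are hypothetical.

```python
from bisect import bisect_right
from collections import Counter


def fragment_index(cut_positions, position):
    """Return the index of the fragment containing `position`.

    `cut_positions` must be sorted. Fragment 0 covers [0, cut_positions[0]),
    fragment 1 covers [cut_positions[0], cut_positions[1]), and so on.
    """
    return bisect_right(cut_positions, position)


def count_contacts(cut_positions, mate_pairs):
    """Count read pairs per (fragment_i, fragment_j) cell, with i <= j."""
    counts = Counter()
    for pos1, pos2 in mate_pairs:
        i = fragment_index(cut_positions, pos1)
        j = fragment_index(cut_positions, pos2)
        counts[(min(i, j), max(i, j))] += 1  # keep the matrix symmetric
    return counts


if __name__ == "__main__":
    cuts = [100, 500, 900]                       # toy restriction site positions
    pairs = [(50, 950), (120, 480), (960, 10)]   # toy mate positions
    print(count_contacts(cuts, pairs))
```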