Calculate the profile of ChIP peaks binding to specific TSS regions
We want to analyse the profile of ChIP peak binding sites of the CREB (cAMP response element-binding protein) transcription factor for all known TSS regions in mouse. Our protocol includes one ChIP sample, enriched for CREB, and a control sample without enrichment.
The trimming and alignment steps have already been run, so we have access to the aligned BAM file.
In R:
-
First, we load the metagene package:
library(metagene)
-
We then create a vector containing the alignment file (1 BAM file) used in the analysis:
bamFileCREB <- system.file("CREB.bam")
bamFile <- c(bamFileCREB)
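Before parsing, it can be worth confirming that every path in the vector points to an existing file, because system.file() returns an empty string when it cannot find the requested file. The following is a minimal sketch in base R; checkBamFiles is a hypothetical helper written for this example, not a metagene function, and the temporary file only stands in for a real BAM file.

```r
# Hypothetical helper (not part of metagene): stop early if a path is
# empty or does not point to an existing file.
checkBamFiles <- function(paths) {
  notFound <- paths == "" | !file.exists(paths)
  if (any(notFound)) {
    stop("BAM file(s) not found: ",
         paste(sQuote(paths[notFound]), collapse = ", "))
  }
  invisible(paths)
}

# Demonstration with an empty temporary file standing in for a real BAM file.
demoBam <- tempfile(fileext = ".bam")
file.create(demoBam)
bamFileDemo <- checkBamFiles(c(demoBam))
```

In the tutorial itself, the same check would be applied to bamFile before calling parseFeatures.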
-
All TSS features have to be retrieved (this step might take a while) and associated with the read density from the alignment file. The distance around each TSS to include in the plot is set to 10,000 with the maxDistance parameter.
groupsFeatures <- parseFeatures(bamFiles=bamFile, features, specie="mouse", maxDistance=10000)
The function prints the progress of each step while it runs:
Step 1: Prepare bam files... Done!
Step 2: Prepare regions... Done!
Step 3: Parse bam files...
[1] "allTSS"
[1] "Current bam: /home/CREB.bam"
Step 3: Parse bam files... Done!
Step 4: Merge matrix... Done!
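To make the role of maxDistance more concrete, the sketch below computes the region around a few TSS positions in base R. It assumes that the window is symmetric (maxDistance on each side of the TSS) and clipped at position 1; the coordinates are made up, and this is an illustration rather than metagene's internal code.

```r
# Hypothetical TSS coordinates on one chromosome (illustrative values only).
tss <- c(5000, 250000, 1200000)
maxDistance <- 10000

# Assumption: a symmetric window of maxDistance on each side of the TSS,
# clipped so that it never starts before position 1.
windows <- data.frame(
  tss   = tss,
  start = pmax(1, tss - maxDistance),
  end   = tss + maxDistance
)
windows$width <- windows$end - windows$start + 1
print(windows)
```

Under this assumption, a TSS far from the chromosome start gets a window of 2 * maxDistance + 1 positions, while a TSS closer than maxDistance to the start gets a shorter one.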
-
To generate a plot, we first have to create a list containing the elements we wish to plot. The groupsFeatures object holds the names of the elements which can be used. Since we only have one ChIP sample for the CREB transcription factor, groupsFeatures contains only one element, "allTSS". We create an entry named CREB which contains the element "allTSS"; the name of the entry will be used as the plot title. Finally, this entry has to be embedded in a generic list, as that is the format expected by the plotMatrices function.
names(groupsFeatures$matrix)
[1] "allTSS"
groupToPlot <- list(CREB=c("allTSS"))
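A typo in an element name is easy to make at this point, so one option is to compare the requested elements with the available names before plotting. The sketch below uses base R only; groupsFeaturesDemo is a mock object that merely mimics the shape suggested by names(groupsFeatures$matrix), which is an assumption about the real object's structure.

```r
# Mock object (assumption: the real object stores one entry per element
# under $matrix, as suggested by names(groupsFeatures$matrix)).
groupsFeaturesDemo <- list(
  matrix = list(allTSS = matrix(0, nrow = 2, ncol = 3))
)

groupToPlotDemo <- list(CREB = c("allTSS"))

# Every requested element should be one of the available names.
available <- names(groupsFeaturesDemo$matrix)
requested <- unlist(groupToPlotDemo, use.names = FALSE)
unknown <- setdiff(requested, available)
if (length(unknown) > 0) {
  stop("Unknown element(s): ", paste(unknown, collapse = ", "))
}

# The names of the outer list are the plot titles.
names(groupToPlotDemo)
```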
-
The plotMatrices function is used to generate the plot. The list containing the elements to plot is passed to the matricesGroups parameter, while the groupsFeatures object created earlier is passed to the data parameter. The binSize parameter sets the number of nucleotides included in each bin for the bootstrap step. The bootstrap step uses the data from all TSS to generate a confidence interval around the final profile. By default, a confidence interval of 95% is used. The smaller the binSize parameter, the finer the final plot will be and the more time-consuming the bootstrap step will be.
DF <- plotMatrices(matricesGroups = groupToPlot, data = groupsFeatures, binSize = 50)
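The interplay between binning and bootstrapping can be illustrated with a small base R simulation. This is a sketch of a generic percentile bootstrap on made-up data, not metagene's actual implementation: the simulated density matrix, the number of replicates (nBoot) and the percentile method are all assumptions chosen for the example.

```r
set.seed(42)

# Simulated read density: 200 regions x 1000 positions with a bump in the
# middle (made-up data; real values would come from the parsed BAM file).
nRegions <- 200
nPositions <- 1000
signal <- dnorm(seq_len(nPositions), mean = nPositions / 2, sd = 100) * 500
density <- matrix(
  rpois(nRegions * nPositions, lambda = rep(signal + 1, each = nRegions)),
  nrow = nRegions
)

# Average the positions of each region into bins of binSize nucleotides.
binSize <- 50
binIndex <- ceiling(seq_len(nPositions) / binSize)
binned <- t(rowsum(t(density), group = binIndex)) / binSize

# Bootstrap: resample regions with replacement, recompute the mean profile.
nBoot <- 500
bootMeans <- replicate(nBoot, {
  idx <- sample(nRegions, replace = TRUE)
  colMeans(binned[idx, , drop = FALSE])
})

# bootMeans has one row per bin and one column per replicate; take the
# 2.5% and 97.5% quantiles per bin for a 95% percentile interval.
ci <- apply(bootMeans, 1, quantile, probs = c(0.025, 0.975))
profile <- colMeans(binned)
```

With binSize = 50 the 1000 positions collapse into 20 bins. A smaller binSize gives more bins and a finer profile, but each bootstrap replicate then has more values to compute, which is the trade-off described above.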
The generated graph shows the profile of ChIP peak binding sites of the CREB (cAMP response element-binding protein) transcription factor, with a confidence interval of 95%, for all known TSS regions in mouse.