sonali-bioc edited this page Feb 10, 2014 · 6 revisions

CNAnorm: A normalization method for Copy Number Aberration in cancer samples

CNAnorm performs ratio, GC content correction and normalization of data obtained using low coverage (one read every 100-10,000 bp) high throughput sequencing. It performs a "discrete" normalization looking for the ploidy of the genome. It will also provide tumour content if at least two ploidy states can be found.

Authors: Stefano Berri, Henry M. Wood, Arief Gusnanto

We analysed our sample datasets with CNAnorm.

  • The first step was to use a Perl script to convert our BAM file into a format suitable for CNAnorm.
  • In an R session, we then followed these steps:
    a) Select the subset of the script's output that falls on "chr4".
    b) Check that the raw counts are similar to the counts reported by samtools.
    Using samtools, we get the following counts for 5 pre-chosen regions (the count is shown after each command):
    samtools view tumorA.chr4.bam chr4:8000001-8010000 | wc -l # 67
    samtools view tumorA.chr4.bam chr4:8010001-8020000 | wc -l # 62
    samtools view tumorA.chr4.bam chr4:10000001-10010000 | wc -l # 67
    samtools view tumorA.chr4.bam chr4:10010001-10020000 | wc -l # 74
    samtools view tumorA.chr4.bam chr4:1000000-1010000 | wc -l # 47
    The corresponding read counts from CNAnorm are [1] 64 54 66 74 42 (see gist below for details).
    c) Create a CNAnorm object.
    d) Smooth the signal to decrease noise.
    e) Estimate peaks and ploidy.
    f) Visualize the reads and ploidy using CNAnorm's plotPeaks.
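
The count check in step b can be tabulated with a few lines of shell. This is only a minimal sketch: the helper name `compare_counts` is hypothetical, and it assumes the two lists of counts quoted above are given in the same window order, which the page does not state explicitly.

```shell
#!/bin/sh
# compare_counts is a hypothetical helper, not part of samtools or CNAnorm.
# $1: space-separated samtools counts, one per window
# $2: space-separated CNAnorm counts, assumed to be in the same window order
# Prints one tab-separated line per window: samtools count, CNAnorm count, difference.
compare_counts() {
    awk -v a="$1" -v b="$2" 'BEGIN {
        n = split(a, s, " ")
        split(b, c, " ")
        for (i = 1; i <= n; i++)
            printf "%d\t%d\t%d\n", s[i], c[i], s[i] - c[i]
    }'
}

# Counts copied from the samtools commands and the CNAnorm output quoted above.
compare_counts "67 62 67 74 47" "64 54 66 74 42"
```

With the counts above, the per-window differences come out as 3, 8, 1, 0 and 5 reads, so the CNAnorm counts are equal to or slightly below the samtools counts in every window. The page does not say what causes the gap; one possibility, stated here only as a guess, is that the conversion script filters some reads that `samtools view` still reports.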

![](https://raw.github.com/Bioconductor/copy-number-analysis/master/image/cna-norm.png)

An R script implementing the above steps can be found at: cnv-CNAnorm.R
