Showing 26 changed files with 2,621 additions and 2,166 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,30 +1,68 @@ | ||
svtyper | ||
SVTyper | ||
======= | ||
[![GitHub license](https://img.shields.io/badge/license-MIT-blue.svg)](https://raw.githubusercontent.com/hall-lab/svtyper/master/LICENSE) | ||
[![Build Status](https://travis-ci.org/hall-lab/svtyper.svg?branch=master)](https://travis-ci.org/hall-lab/svtyper) | ||
|
||
Bayesian genotyper for structural variants | ||
|
||
### Example workflow | ||
## Example | ||
|
||
#### Data | ||
``` | ||
wget http://colbychiang.com/hall/cshl_sv_2014/data/NA12878.20.bam | ||
wget http://colbychiang.com/hall/cshl_sv_2014/data/NA12878.20.bam.bai | ||
wget http://colbychiang.com/hall/cshl_sv_2014/data/NA12878.20.splitters.bam | ||
wget http://colbychiang.com/hall/cshl_sv_2014/data/NA12878.20.splitters.bam.bai | ||
wget http://colbychiang.com/hall/cshl_sv_2014/data/NA12878.20.vcf.gz | ||
svtyper \ | ||
-i sv.vcf \ | ||
-B sample.bam \ | ||
-l sample.bam.json \ | ||
> sv.gt.vcf | ||
``` | ||
|
||
#### Genotype with SVTyper | ||
## Overview | ||
|
||
SVTyper performs breakpoint genotyping of structural variants (SVs) using whole genome sequencing data. Users must supply a VCF file of sites to genotype (which may be generated by [LUMPY](https://github.com/arq5x/lumpy-sv)) as well as a BAM/CRAM file of Illumina paired-end reads aligned with [BWA-MEM](https://github.com/lh3/bwa). SVTyper assesses discordant and concordant reads from paired-end and split-read alignments to infer genotypes at each site. Algorithm details and benchmarking are described in [Chiang et al., 2015](http://www.nature.com/nmeth/journal/vaop/ncurrent/full/nmeth.3505.html). | ||
|
||
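As a rough illustration of how read counts can be turned into a genotype call, the sketch below applies a simplified binomial model with a flat prior. It is not SVTyper's actual implementation: the function name `genotype_posteriors` and the expected alternate-allele fractions (0.1, 0.4, 0.8) are assumptions chosen for the example, and the published method in Chiang et al., 2015 should be consulted for the real algorithm. The binomial coefficient is omitted because it is identical for all three genotypes and cancels during normalization.

```python
import math

def genotype_posteriors(ref_count, alt_count, alt_fractions=(0.1, 0.4, 0.8)):
    """Return posteriors for (hom-ref, het, hom-alt) under a flat prior.

    ref_count / alt_count: numbers of reads supporting the reference
    and the variant allele at one breakpoint (illustrative inputs).
    alt_fractions: assumed expected fraction of variant-supporting
    reads under each genotype (hypothetical values).
    """
    # Log-likelihood of the observed counts under each genotype.
    log_liks = [
        alt_count * math.log(p) + ref_count * math.log(1.0 - p)
        for p in alt_fractions
    ]
    # Subtract the maximum before exponentiating to avoid underflow.
    max_ll = max(log_liks)
    weights = [math.exp(ll - max_ll) for ll in log_liks]
    total = sum(weights)
    return [w / total for w in weights]

# Ten reference-supporting and ten variant-supporting reads.
posteriors = genotype_posteriors(10, 10)
print(posteriors)
```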
![NA12878 heterozygous deletion](etc/het.png?raw=true "NA12878 heterozygous deletion") | ||
|
||
## Installation | ||
|
||
Requirements | ||
- Python 2.7 or newer | ||
- Pysam 0.8.1 or newer | ||
|
||
Clone the repository | ||
``` | ||
git clone git@github.com:hall-lab/svtyper.git | ||
``` | ||
|
||
Test the installation | ||
``` | ||
cd svtyper/test | ||
../svtyper \ | ||
-i example.vcf \ | ||
-B NA12878.target_loci.sorted.bam \ | ||
-l NA12878.bam.json \ | ||
> test.vcf | ||
``` | ||
zcat NA12878.20.vcf.gz \ | ||
| ./svtyper \ | ||
-B NA12878.20.bam \ | ||
-S NA12878.20.splitters.bam \ | ||
> NA12878.20.gt.vcf | ||
|
||
## Troubleshooting | ||
|
||
Many common issues are related to abnormal insert size distributions in the BAM file. SVTyper provides methods to assess and visualize the characteristics of sequencing libraries. | ||
|
||
Running SVTyper with the `-l` flag creates a JSON file with essential metrics on a BAM file. SVTyper samples the first N reads from the file (1 million by default) to parse the libraries, read groups, and insert size histograms. This can be done in the absence of a VCF file. | ||
``` | ||
svtyper \ | ||
-B my.bam \ | ||
-l my.bam.json | ||
``` | ||
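To check a library by hand, it can help to reduce an insert size histogram to a mean and standard deviation. The Python sketch below does this for a histogram given as a mapping from insert size to read-pair count. That layout and the function name `summarize_histogram` are assumptions for illustration only; the structure of the JSON file written by `-l` is not described here, so the loading step must be adapted to the actual file.

```python
import math

def summarize_histogram(hist):
    """Return (mean, standard deviation) of an insert size histogram.

    hist maps insert size -> number of read pairs (hypothetical layout).
    Keys may be ints or strings, since JSON object keys are strings.
    """
    total = sum(hist.values())
    if total == 0:
        raise ValueError("histogram is empty")
    # Count-weighted mean of the insert sizes.
    mean = sum(int(size) * count for size, count in hist.items()) / float(total)
    # Count-weighted population variance around that mean.
    variance = sum(
        count * (int(size) - mean) ** 2 for size, count in hist.items()
    ) / float(total)
    return mean, math.sqrt(variance)

# Toy histogram: two pairs at 300 bp and two pairs at 400 bp.
mean, sd = summarize_histogram({"300": 2, "400": 2})
print(mean, sd)
```

A mean or spread far from what the library preparation should produce suggests an abnormal insert size distribution of the kind this section describes.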
|
||
The [lib_stats.R](scripts/lib_stats.R) script produces insert size histograms from the JSON file | ||
``` | ||
scripts/lib_stats.R my.bam.json my.bam.json.pdf | ||
``` | ||
#### Warning | ||
2015-10-05 | ||
![Insert size histogram](etc/my.bam.json.png?raw=true "Insert size histogram") | ||
|
||
|
||
## Citation | ||
|
||
C Chiang, R M Layer, G G Faust, M R Lindberg, D B Rose, E P Garrison, G T Marth, A R Quinlan, and I M Hall. SpeedSeq: ultra-fast personal genome analysis and interpretation. Nat Meth 12, 966–968 (2015). doi:10.1038/nmeth.3505. | ||
|
||
As of commit [2c2ef7f91698a6d2929430f0865402ad421a8e3d](https://github.com/hall-lab/svtyper/commit/2c2ef7f91698a6d2929430f0865402ad421a8e3d), SVTyper assumes that BAMs were aligned with BWA MEM **without** the "-M" flag. If you used the "-M" flag in your alignment, then you should also use the "-M" flag when running SVTyper. | ||
http://www.nature.com/nmeth/journal/vaop/ncurrent/full/nmeth.3505.html |