Commit
Merge pull request #272 from hall-lab/develop
Pull in changes for svtools 0.4.0
Showing 102 changed files with 67,026 additions and 8,725 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -2,4 +2,5 @@ | |
*~ | ||
.travis.yml.old | ||
.travis.yml.new | ||
*.old | ||
*.old | ||
geno_refine_scripts/* |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,6 +1,5 @@ | ||
#include external scripts | ||
include svtools/bin/bedpesort | ||
include svtools/bin/vcfsort | ||
include svtools/bin/svtyper/svtyper | ||
include versioneer.py | ||
include svtools/_version.py |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,17 @@ | ||
# Support Guidelines | ||
To request support, submit an issue to our GitHub repository. | ||
|
||
## Before you submit | ||
Please ensure you've read the existing [README.md](README.md) and | ||
[Tutorial.md](Tutorial.md), as these are our best documentation of | ||
how to use the software. | ||
|
||
## Information to include in your issue | ||
The following information is extremely helpful: | ||
* The version of svtools you're using. | ||
* The precise command line you used when you encountered the issue. | ||
* Any error output from svtools. | ||
|
||
**NOTE:** While we understand that this is not feasible in all | ||
cases, please provide a small file that reproduces the error. This gives us | ||
a quick way to reproduce the error and start debugging. |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,31 @@ | ||
# Tiered merging for large cohorts | ||
For very large SV callsets, we recommend a tiered approach to merging the individual Lumpy VCFs. This keeps compute requirements modest and helps smooth out batch effects between cohorts. For our own large cohorts, we merge groups of up to 1000 samples per cohort (larger cohorts are split into multiple batches of 1000 or fewer), then sort and merge the resulting merged files again. | ||
|
||
## Initial per-batch sorting and merging | ||
|
||
### Construct files containing the paths to each input VCF in a batch | ||
`svtools lsort` can accept a file where each line is a path to an input VCF. For example, | ||
|
||
``` | ||
/path/to/sample1.vcf | ||
/path/to/sample2.vcf | ||
``` | ||
Because each batch can contain up to 1000 samples, listing the paths in a file keeps your command line short. | ||
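
One way to produce these list files is sketched below in Python. It assumes you already have the VCF paths in a list. The function name `write_batch_lists`, the `batch_N.txt` naming scheme, and the default batch size of 1000 are illustrative choices and not part of svtools.

```python
from pathlib import Path


def write_batch_lists(vcf_paths, out_dir, batch_size=1000):
    """Split vcf_paths into chunks of at most batch_size and write one list file per chunk.

    Hypothetical helper: the batch_N.txt naming is an assumption for illustration.
    Returns the paths of the list files written, in batch order.
    """
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)
    list_files = []
    for start in range(0, len(vcf_paths), batch_size):
        batch = vcf_paths[start:start + batch_size]
        list_file = out_dir / f"batch_{len(list_files) + 1}.txt"
        # One VCF path per line, the layout that `svtools lsort -f` accepts.
        list_file.write_text("".join(f"{path}\n" for path in batch))
        list_files.append(list_file)
    return list_files
```

With 2500 input paths and the default batch size, this would write three list files holding 1000, 1000, and 500 paths.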
|
||
### Sort and merge each batch | ||
For each input file you constructed in the previous step, sort and merge the SV VCFs as in the Tutorial. | ||
|
||
``` | ||
svtools lsort -f batch_of_lumpy_vcfs.txt \ | ||
| svtools lmerge -i /dev/stdin -f 20 \ | ||
| bgzip -c > batch.merged.vcf.gz | ||
``` | ||
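
When there are many batches, it can help to generate the pipeline above once per list file. The Python sketch below only builds the command string; it uses the same `svtools lsort`, `svtools lmerge`, and `bgzip` options shown above. The function name `batch_merge_command` and the file names in the usage note are hypothetical.

```python
import shlex


def batch_merge_command(list_file, output_vcf_gz):
    """Return the per-batch sort-and-merge pipeline as a single shell string.

    Hypothetical helper: paths are shell-quoted so that names containing
    spaces do not break the command.
    """
    stages = [
        f"svtools lsort -f {shlex.quote(str(list_file))}",
        "svtools lmerge -i /dev/stdin -f 20",
        f"bgzip -c > {shlex.quote(str(output_vcf_gz))}",
    ]
    return " | ".join(stages)
```

For example, `batch_merge_command("batch_1.txt", "batch_1.merged.vcf.gz")` yields a string that could be passed to `subprocess.run(..., shell=True, check=True)`. Note that a plain `sh` pipeline reports only the exit status of its last stage, so a failure in `lsort` or `lmerge` may need to be checked separately.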
|
||
## Final sorting and merging | ||
After this step you will have one output file per batch. However, these files will _not_ contain genotypes, so you'll need to specify additional options to ensure that they are combined properly. In the example below, we assume the input is a file containing the path to each merged batch. **NOTE:** This step _requires_ that the SNAME field be present in your input files so that the merge is weighted correctly. | ||
|
||
``` | ||
svtools lsort -r -f file_of_merged_batches --batch-size 1 \ | ||
| svtools lmerge -i /dev/stdin -f 20 -w carrier_wt \ | ||
| bgzip -c > final_output.merged.vcf.gz | ||
``` |
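
Because the final merge depends on SNAME, it may be worth checking each merged batch before running it. The Python sketch below counts records that lack an SNAME entry. It assumes SNAME is stored as a key in the INFO column (column 8) of a tab-delimited VCF, optionally gzip or bgzip compressed; the function name `records_missing_sname` is hypothetical and not part of svtools.

```python
import gzip


def records_missing_sname(vcf_path):
    """Count VCF records whose INFO column has no SNAME key.

    Assumption: SNAME appears as an INFO key (SNAME=...), and INFO is the
    eighth tab-separated column, as in standard VCF.
    """
    opener = gzip.open if str(vcf_path).endswith(".gz") else open
    missing = 0
    with opener(vcf_path, "rt") as handle:
        for line in handle:
            # Skip header lines and blank lines.
            if line.startswith("#") or not line.strip():
                continue
            fields = line.rstrip("\n").split("\t")
            info = fields[7] if len(fields) > 7 else ""
            info_keys = {entry.split("=", 1)[0] for entry in info.split(";")}
            if "SNAME" not in info_keys:
                missing += 1
    return missing
```

A result of 0 for every file listed in `file_of_merged_batches` suggests the inputs carry the field this step needs.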