diff --git a/commonoptions.md b/commonoptions.md index 8b22adef0..5095e36e6 100644 --- a/commonoptions.md +++ b/commonoptions.md @@ -31,6 +31,7 @@ harpy align bwa --threads 20 --directory samples/trimmedreads --quiet harpy align bwa -t 20 -d samples/trimmedreads -q ``` +--- ## The `workflow` folder When you run one of the main Harpy modules, the output directory will contain a `workflow` folder. This folder is @@ -45,6 +46,8 @@ and the contents therein also allow you to rerun the workflow manually. The `wor | `report/*.Rmd` | RMarkdown files used to generate the fancy reports | useful to understand math behind plots/tables or borrow code from | | `*.workflow.summary` | Plain-text overview of the important parts of the workflow | useful for bookkeeping and writing Methods | +--- + ## The `Genome` folder You will notice that many of the workflows will create a `Genome` folder in the working diff --git a/haplotagdata.md b/haplotagdata.md index ec224189e..10ae0d423 100644 --- a/haplotagdata.md +++ b/haplotagdata.md @@ -83,4 +83,11 @@ same barcode (on the same contig), then we'll consider them as originating from you are being more strict and indicating that alignments sharing barcodes must be closer together to be considered originating from the same DNA molecule. Conversely, a higher threshold indicates you are being more lax and indicating barcodes can be further away from each other and still be considered originating from the same DNA molecule. A threshold of 50kb-150kb is considered a decent balance, but you should choose -larger/smaller values if you have evidence to support them. \ No newline at end of file +larger/smaller values if you have evidence to support them. 
+
+![Molecule origin is determined by the distance between alignments with the same barcode relative to the specified threshold](/static/bc_threshold.png)
+
+| Alignment distance     | Inferred origin     |
+|:-----------------------|:--------------------|
+| less than threshold    | same molecule       |
+| greater than threshold | different molecules |
\ No newline at end of file
diff --git a/static/bc_threshold.png b/static/bc_threshold.png
new file mode 100644
index 000000000..f8f47ac68
Binary files /dev/null and b/static/bc_threshold.png differ
diff --git a/static/bc_threshold.svg b/static/bc_threshold.svg
new file mode 100644
index 000000000..b88ee0c14
--- /dev/null
+++ b/static/bc_threshold.svg
@@ -0,0 +1,759 @@
[SVG markup not shown. Text labels in the figure: "distance between alignments (with same barcode on same contig)"; "if distance < threshold: same origin molecule"; "if distance ≥ threshold: different origin molecules"; "contig"; "alignment 1, BX: A11C22B33D44"; "alignment 2, BX: A11C22B33D44"; "distance threshold"]
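
The threshold rule described in `haplotagdata.md` can be sketched in a few lines of Python. This is an illustration only, not Harpy's implementation: the function names `same_molecule` and `split_into_molecules` are hypothetical, and distance is simplified to the gap between alignment start positions, which is an assumption. The table does not say what happens when the distance equals the threshold; the sketch follows the figure labels and treats that case as different molecules.

```python
from typing import List


def same_molecule(distance_bp: int, threshold_bp: int) -> bool:
    """Return True when two alignments with the same barcode on the same
    contig are close enough to count as one molecule (distance < threshold)."""
    return distance_bp < threshold_bp


def split_into_molecules(starts: List[int], threshold_bp: int) -> List[List[int]]:
    """Group alignment start positions (same barcode, same contig) into
    inferred molecules. Hypothetical helper; uses start-to-start distance
    as a simplifying assumption."""
    molecules: List[List[int]] = []
    for pos in sorted(starts):
        # Compare against the most recent alignment in the current molecule.
        if molecules and same_molecule(pos - molecules[-1][-1], threshold_bp):
            molecules[-1].append(pos)
        else:
            # Too far from the previous alignment: start a new molecule.
            molecules.append([pos])
    return molecules


if __name__ == "__main__":
    # Four alignments sharing one barcode, with a 100 kb threshold.
    print(split_into_molecules([1_000, 40_000, 250_000, 260_000], 100_000))
```

With a 100 kb threshold, the first two alignments fall into one molecule and the last two into another, because the 210 kb gap between them exceeds the threshold. Lowering the threshold splits barcodes into more, shorter molecules; raising it merges them.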