From cdcc0f02d39fed0e3707ecdc8df8be22ab643a4e Mon Sep 17 00:00:00 2001
From: pdimens
Date: Wed, 6 Nov 2024 14:02:16 -0500
Subject: [PATCH] fix assembly

---
 Workflows/SV/leviathan.md                  |   6 +-
 Workflows/SV/naibr.md                      |   8 +-
 Workflows/Simulate/simulate-linkedreads.md |  24 ++---
 Workflows/Simulate/simulate-variants.md    | 108 ++++++++++---------
 Workflows/assembly.md                      |  33 ++++---
 Workflows/deconvolve.md                    |   8 +-
 Workflows/phase.md                         |   4 +-
 Workflows/qc.md                            |   6 +-
 Workflows/snp.md                           |   4 +-
 9 files changed, 106 insertions(+), 95 deletions(-)

diff --git a/Workflows/SV/leviathan.md b/Workflows/SV/leviathan.md
index c849a99f9..5bdb08c19 100644
--- a/Workflows/SV/leviathan.md
+++ b/Workflows/SV/leviathan.md
@@ -69,9 +69,9 @@ In addition to the [!badge variant="info" corners="pill" text="common runtime op
 | `--contigs` | | file path or list | | | [Contigs to plot](/commonoptions.md#--contigs) in the report |
 | `--extra-params` | `-x` | string | | | Additional naibr arguments, in quotes |
 | `--genome` | `-g` | file path | | ‼️ | Genome assembly that was used to create alignments |
-| `--iterations` | `-i` | integer | 50 | | Number of iterations to perform through index (reduces memory) |
-| `--min-barcodes` | `-b` | integer | 2 | | Minimum number of barcode overlaps supporting candidate SV |
-| `--min-sv` | `-m` | integer | 1000 | | Minimum size of SV to detect |
+| `--iterations` | `-i` | integer | `50` | | Number of iterations to perform through index (reduces memory) |
+| `--min-barcodes` | `-b` | integer | `2` | | Minimum number of barcode overlaps supporting candidate SV |
+| `--min-sv` | `-m` | integer | `1000` | | Minimum size of SV to detect |
 | `--populations` | `-p` | file path | | | Tab-delimited file of sample\<*tab*\>group |

 ### Single-sample variant calling
diff --git a/Workflows/SV/naibr.md b/Workflows/SV/naibr.md
index 21d5866cc..f0d4f6a34 100644
--- a/Workflows/SV/naibr.md
+++ b/Workflows/SV/naibr.md
@@ -69,10 +69,10 @@ In addition to the [!badge variant="info" corners="pill" text="common runtime op
 | `--contigs` | | file path or list | | | [Contigs to plot](/commonoptions.md#--contigs) in the report |
 | `--extra-params` | `-x` | string | | | Additional naibr arguments, in quotes |
 | `--genome` | `-g` | file path | | ‼️ | Genome assembly for phasing bam files |
-| `--min-barcodes` | `-b` | integer | 2 | | Minimum number of barcode overlaps supporting candidate SV |
-| `--min-quality` | `-q` | integer (0-40) | 30 | | Minimum `MQ` (SAM mapping quality) to pass filtering |
-| `--min-sv` | `-n` | integer | 1000 | | Minimum size of SV to detect |
-| `--molecule-distance` | `-m` | integer | 100000 | | Base-pair distance threshold to separate molecules |
+| `--min-barcodes` | `-b` | integer | `2` | | Minimum number of barcode overlaps supporting candidate SV |
+| `--min-quality` | `-q` | integer (0-40) | `30` | | Minimum `MQ` (SAM mapping quality) to pass filtering |
+| `--min-sv` | `-n` | integer | `1000` | | Minimum size of SV to detect |
+| `--molecule-distance` | `-m` | integer | `100000` | | Base-pair distance threshold to separate molecules |
 | `--populations` | `-p` | file path | | | Tab-delimited file of sample\<*tab*\>group |
 | `--vcf` | `-v` | file path | | ❗ | Phased vcf file for phasing bam files ([see below](#optional-vcf-file)) |

diff --git a/Workflows/Simulate/simulate-linkedreads.md b/Workflows/Simulate/simulate-linkedreads.md
index 5f846c44f..d89069fd4 100644
--- a/Workflows/Simulate/simulate-linkedreads.md
+++ b/Workflows/Simulate/simulate-linkedreads.md
@@ -45,18 +45,18 @@ harpy simulate linkedreads -t 4 -n 2 -l 100 -p 50 data/genome.hap1.fasta data/
 In addition to the [!badge variant="info" corners="pill" text="common runtime options"](/commonoptions.md), the [!badge corners="pill" text="simulate linkedreads"] module is configured using these command-line arguments:

 {.compact}
-| argument | short name | type | default | required | description |
-|:---------------|:----------:|:------------|:-------------:|:--------:|:------------------------------------------------------------------------------------------------|
-| `HAP1_GENOME` | | file path | | ‼️ | Haplotype 1 of the diploid genome to simulate reads |
-| `HAP2_GENOME` | | file path | | ‼️ | Haplotype 1 of the diploid genome to simulate reads |
-| `--barcodes` | `-b` | file path | [10X barcodes](https://github.com/aquaskyline/LRSIM/blob/master/4M-with-alts-february-2016.txt) | | File of linked-read barcodes to add to reads |
-| `--distance-sd` | `-s` | integer | 15 | | Standard deviation of read-pair distance |
-| `--molecule-length` | `-l` | integer | 100 | | Mean molecule length (kbp) |
-| `--molecules-per` | `-m` | integer | 10 | | Average number of molecules per partition |
-| `--mutation-rate` | `-r` | number | 0.001 | | Random mutation rate for simulating reads (0 - 1.0) |
-| `--outer-distance` | `-d` | integer | 350 | | Outer distance between paired-end reads (bp) |
-| `--patitions` | `-p` | integer | 1500 | | Number (in thousands) of partitions/beads to generate |
-| `--read-pairs` | `-n` | number | 600 | | Number (in millions) of read pairs to simulate |
+| argument | short name | default | required | description |
+| :------------------ | :--------: | :---------------------------------------------------------------------------------------------: | :------: | :---------------------------------------------------- |
+| `HAP1_GENOME` | | | ‼️ | Haplotype 1 of the diploid genome to simulate reads |
+| `HAP2_GENOME` | | | ‼️ | Haplotype 1 of the diploid genome to simulate reads |
+| `--barcodes` | `-b` | [10X barcodes](https://github.com/aquaskyline/LRSIM/blob/master/4M-with-alts-february-2016.txt) | | File of linked-read barcodes to add to reads |
+| `--distance-sd` | `-s` | `15` | | Standard deviation of read-pair distance |
+| `--molecule-length` | `-l` | `100` | | Mean molecule length (kbp) |
+| `--molecules-per` | `-m` | `10` | | Average number of molecules per partition |
+| `--mutation-rate` | `-r` | `0.001` | | Random mutation rate for simulating reads (0 - 1.0) |
+| `--outer-distance` | `-d` | `350` | | Outer distance between paired-end reads (bp) |
+| `--patitions` | `-p` | `1500` | | Number (in thousands) of partitions/beads to generate |
+| `--read-pairs` | `-n` | `600` | | Number (in millions) of read pairs to simulate |

 ## Mutation Rate
 The read simulation is two-part: first `dwgsim` generates forward and reverse FASTQ files from the provided genome haplotypes
diff --git a/Workflows/Simulate/simulate-variants.md b/Workflows/Simulate/simulate-variants.md
index 2f2a5680d..c23e2943c 100644
--- a/Workflows/Simulate/simulate-variants.md
+++ b/Workflows/Simulate/simulate-variants.md
@@ -32,12 +32,12 @@ harpy simulate inversion -n 10 --min-size 1000 --max-size 50000 path/to/genome.
 There are 4 submodules with very obvious names:

 {.compact}
-| submodule | what it does |
-|:----------|:-------------|
-| [!badge corners="pill" text="snpindel"](#snpindel) | simulates single nucleotide polymorphisms (snps) and insertion-deletions (indels) |
-| [!badge corners="pill" text="inversion"](#inversion) | simulates inversions |
-| [!badge corners="pill" text="cnv"](#cnv) | simulates copy number variants |
-| [!badge corners="pill" text="translocation"](#translocation) | simulates translocations |
+| submodule | what it does |
+| :----------------------------------------------------------- | :-------------------------------------------------------------------------------- |
+| [!badge corners="pill" text="snpindel"](#snpindel) | simulates single nucleotide polymorphisms (snps) and insertion-deletions (indels) |
+| [!badge corners="pill" text="inversion"](#inversion) | simulates inversions |
+| [!badge corners="pill" text="cnv"](#cnv) | simulates copy number variants |
+| [!badge corners="pill" text="translocation"](#translocation) | simulates translocations |

 ## :icon-terminal: Running Options
 While there are serveral differences between individual workflow options, each has available all the
@@ -46,16 +46,16 @@ Each requires and input genome at the end of the command line, and each requires
 to randomly simulate, or a `--vcf` of specific variants to simulate. There are also these unifying options among the different variant types:

 {.compact}
-| argument | short name | type | required | description |
-| :-----|:-----|:-----|:---:|:-----|
-| `INPUT_GENOME` | | file path | ‼️ | The haploid genome to simulate variants onto|
-| `--centromeres` | `-c` | file path | | GFF3 file of centromeres to avoid |
-| `--exclude-chr` | `-e` | file path | | Text file of chromosomes to avoid, one per line |
-| `--genes` | `-g` | file path | | GFF3 file of genes to avoid simulating over (see `snpindel` for caveat) |
-| `--heterozygosity` | `-z` | float between [0,1] | | [proportion of simulated variants to make heterozygous ](#heterozygosity) (default: `0`) |
-| `--only-vcf` | | toggle | | When used with `--heterozygosity`, will create the diploid VCFs but will not simulate a diploid genome |
-| `--prefix` | | string | | Naming prefix for output files (default: `sim.{module_name}`)|
-| `--randomseed` | | integer | | Random seed for simulation |
+| argument | short name | required | description |
+| :----------------- | :--------- | :------: | :----------------------------------------------------------------------------------------------------- |
+| `INPUT_GENOME` | | ‼️ | The haploid genome to simulate variants onto |
+| `--centromeres` | `-c` | | GFF3 file of centromeres to avoid |
+| `--exclude-chr` | `-e` | | Text file of chromosomes to avoid, one per line |
+| `--genes` | `-g` | | GFF3 file of genes to avoid simulating over (see `snpindel` for caveat) |
+| `--heterozygosity` | `-z` | | [proportion of simulated variants to make heterozygous ](#heterozygosity) (default: `0`) |
+| `--only-vcf` | | | When used with `--heterozygosity`, will create the diploid VCFs but will not simulate a diploid genome |
+| `--prefix` | | | Naming prefix for output files (default: `sim.{module_name}`) |
+| `--randomseed` | | | Random seed for simulation |

 !!!warning simulations can be slow
 Given software limitations, simulating many variants **relative to the size of the input genome** will be noticeably slow.
@@ -69,38 +69,38 @@ An indel, is a type of mutation that involves the addition/deletion of one or mo
 The snp and indel variants are combined in this module because `simuG` allows simulating them together.

 {.compact}
-| argument | short name | type | default | description |
-|:------------------|:----------:|:-----------|:-------:|:-------------------------------------------------------------|
-| `--indel-count` | `-m` | integer | 0 | Number of random indels to simluate |
-| `--indel-vcf` | `-i` | file path | | VCF file of known indels to simulate |
-| `--indel-ratio` | `-d` | float | 1 | Insertion/Deletion ratio for indels |
-| `--indel-size-alpha` | `-a` | float | 2.0 | Exponent Alpha for power-law-fitted indel size distribution|
-| `--indel-size-constant` | `-l` | float | 0.5 | Exponent constant for power-law-fitted indel size distribution |
-| `--snp-count` | `-n` | integer | 0 | Number of random snps to simluate |
-| `--snp-gene-constraints` | `-y` | string | | How to constrain randomly simulated SNPs {`noncoding`,`coding`,`2d`,`4d`} when using `--genes`|
-| `--snp-vcf`| `-s` | file path | | VCF file of known snps to simulate |
-| `--titv-ratio` | `-r` | float | 0.5 | Transition/Transversion ratio for snps |
+| argument | short name | default | description |
+| :----------------------- | :--------: | :-----: | :--------------------------------------------------------------------------------------------- |
+| `--indel-count` | `-m` | `0` | Number of random indels to simluate |
+| `--indel-vcf` | `-i` | | VCF file of known indels to simulate |
+| `--indel-ratio` | `-d` | `1` | Insertion/Deletion ratio for indels |
+| `--indel-size-alpha` | `-a` | `2.0` | Exponent Alpha for power-law-fitted indel size distribution |
+| `--indel-size-constant` | `-l` | `0.5` | Exponent constant for power-law-fitted indel size distribution |
+| `--snp-count` | `-n` | `0` | Number of random snps to simluate |
+| `--snp-gene-constraints` | `-y` | | How to constrain randomly simulated SNPs {`noncoding`,`coding`,`2d`,`4d`} when using `--genes` |
+| `--snp-vcf` | `-s` | | VCF file of known snps to simulate |
+| `--titv-ratio` | `-r` | `0.5` | Transition/Transversion ratio for snps |

 The ratio parameters for snp and indel variants and have special meanings when setting the value to either `0` or `9999` :

 {.compact}
-| ratio | `0` meaning | `9999` meaning |
-|:---- |:---|:---|
-| `--indel-ratio` | deletions only | insertions only |
-| `--titv-ratio` | transversions only | transitions only |
+| ratio | `0` meaning | `9999` meaning |
+| :-------------- | :----------------- | :---------------- |
+| `--indel-ratio` | deletions only | insertions only |
+| `--titv-ratio` | transversions only | transitions only |

 +++ 🔵 inversions
 ### inversion
 Inversions are when a section of a chromosome appears in the reverse orientation
 ([source](https://www.genome.gov/genetics-glossary/Inversion)).

 {.compact}
-| argument | short name | type | default | description |
-|:------------------|:----------:|:-----------|:-------:|:----------------|
-| `--count`| `-n` | integer | 0 | Number of random inversions to simluate |
-| `--max-size` | `-x` | integer | 100000 | Maximum inversion size (bp) |
-| `--min-size` | `-m` | integer | 1000 | Minimum inversion size (bp) |
-| `--vcf` | `-v` | file path | | VCF file of known inversions to simulate |
+| argument | short name | default | description |
+| :----------- | :--------: | :------: | :--------------------------------------- |
+| `--count` | `-n` | `0` | Number of random inversions to simluate |
+| `--max-size` | `-x` | `100000` | Maximum inversion size (bp) |
+| `--min-size` | `-m` | `1000` | Minimum inversion size (bp) |
+| `--vcf` | `-v` | | VCF file of known inversions to simulate |

 +++ 🟢 copy number variants
 ### cnv
@@ -108,33 +108,33 @@ A copy number variation (CNV) is when the number of copies of a particular gene
 between individuals ([source](https://www.genome.gov/genetics-glossary/Copy-Number-Variation)).

 {.compact}
-| argument | short name | type | default | description |
-|:------------------|:----------:|:-----------|:-------:|:----------------|
-| `--vcf` | `-v` | file path | | VCF file of known copy number variants to simulate |
-| `--count` | `-n` | integer | 0 | Number of random cnv to simluate |
-| `--dup-ratio` | `-d` | float | 1 | Tandem/Dispersed duplication ratio |
-| `--gain-ratio` |`-l` | float | 1 | Relative ratio of DNA gain over DNA loss |
-| `--max-size`| `-x` | integer |100000 | Maximum cnv size (bp) |
-| `--max-copy` | `-y` | integer | 10 | Maximum number of copies |
-| `--min-size` | `-m` | integer | 1000 | Minimum cnv size (bp) |
+| argument | short name | default | description |
+| :------------- | :--------: | :------: | :------------------------------------------------- |
+| `--vcf` | `-v` | | VCF file of known copy number variants to simulate |
+| `--count` | `-n` | `0` | Number of random cnv to simluate |
+| `--dup-ratio` | `-d` | `1` | Tandem/Dispersed duplication ratio |
+| `--gain-ratio` | `-l` | `1` | Relative ratio of DNA gain over DNA loss |
+| `--max-size` | `-x` | `100000` | Maximum cnv size (bp) |
+| `--max-copy` | `-y` | `10` | Maximum number of copies |
+| `--min-size` | `-m` | `1000` | Minimum cnv size (bp) |

 The ratio parameters have special meanings when setting the value to either `0` or `9999` :

 {.compact}
-| ratio | `0` meaning | `9999` meaning |
-|:---- |:---|:---|
-| `--dup-ratio` | dispersed duplications only | tandem duplications only |
-| `--gain-ratio` | loss only | gain only |
+| ratio | `0` meaning | `9999` meaning |
+| :------------- | :-------------------------- | :----------------------- |
+| `--dup-ratio` | dispersed duplications only | tandem duplications only |
+| `--gain-ratio` | loss only | gain only |

 +++ 🟡 translocations
 ### translocation
 A translocation occurs when a chromosome breaks and the fragmented pieces re-attach to different chromosomes
 ([source](https://www.genome.gov/genetics-glossary/Translocation)).

 {.compact}
-| argument | short name | type | default | description |
-|:------------------|:----------:|:-----------|:-------:|:----------------|
-| `--count`| `-n` | integer | 0 | Number of random inversions to simluate |
-| `--vcf` | `-v` | file path | | VCF file of known inversions to simulate |
+| argument | short name | default | description |
+| :-------- | :--------: | :-----: | :--------------------------------------- |
+| `--count` | `-n` | `0` | Number of random inversions to simluate |
+| `--vcf` | `-v` | | VCF file of known inversions to simulate |

 +++
diff --git a/Workflows/assembly.md b/Workflows/assembly.md
index ab4ee5b26..ae2198e2c 100644
--- a/Workflows/assembly.md
+++ b/Workflows/assembly.md
@@ -26,19 +26,30 @@ harpy metassembly --threads 20 -u prokaryote -k 13,51,75,83 FASTQ_R1 FASTQ_R2
 ```

 ## :icon-terminal: Running Options
-In addition to the [!badge variant="info" corners="pill" text="common runtime options"](/commonoptions.md), the [!badge corners="pill" text="assembly"] module is configured using these command-line arguments:
+In addition to the [!badge variant="info" corners="pill" text="common runtime options"](/commonoptions.md), the [!badge corners="pill" text="assembly"]
+module is configured using the command-line arguments below. Since the assembly process consists of several distinct phases,
+the options are shown with an extra column to reflect which part of the assembly process they correspond to.

 {.compact}
-| argument | short name | type | default | required | description |
-|:---------------|:----------:|:------------|:-------------:|:--------:|:------------------------------------------------------------------------------------------------|
-| `FASTQ_R1` | | FASTQ file | | ‼️ | FASTQ file of forward reads |
-| `FASTQ_R2` | | FASTQ file | | ‼️ | FASTQ file of reverse reads |
-| `--bx-tag` | `-b` | string | `BX` | ‼️ | Which sequence header tag encodes the linked-read barcode (`BX` for `BX:Z` or `BC` for `BC:Z`) |
-| `--extra-params` | `-x` | string | | | Additional spades parameters, in quotes |
-| `--ignore-bx` | | toggle | | | Ignore linked-read info for initial spades assembly | |
-| `--kmer-length` | `-k` | list of int | `auto` | | Kmer lengths to use for initial spades assembly. They must be **odd** and **<128**, separated by commas, and without spaces. (e.g. `13,23,51`) |
-| `--max-memory` | `-r` | int > 1000 | `10000` | | Maximum memory for spades to use, given in megabytes |
-| `--organism-type`| `-u` | string | `eukaryote` | | Organism type for assembly report. Options: `eukaryote`,`prokaryote`,`fungus` | |
+| argument | short name | process | type | default | required | description |
+|:----------------------|:----:|:-------------- :| :------------|:-------------:|:--------:|:-------------------------------------|
+| `FASTQ_R1` | | | FASTQ file | | ‼️ | FASTQ file of forward reads |
+| `FASTQ_R2` | | | FASTQ file | | ‼️ | FASTQ file of reverse reads |
+| `--extra-params` | `-x` | [!badge variant="secondary" text="spades assembly"] | string | | | Additional spades parameters, in quotes |
+| `--kmer-length` | `-k` | [!badge variant="secondary" text="spades assembly"] | list of int | `auto` | | Kmer lengths to use for initial spades assembly. They must be **odd** and **<128**, separated by commas, and without spaces. (e.g. `13,23,51`) |
+| `--max-memory` | `-r` | [!badge variant="secondary" text="spades assembly"] | integer > 1000 | `10000` | | Maximum memory for spades to use, given in megabytes |
+| `--arcs-extra` | `-y` | [!badge variant="secondary" text="arcs scaffold"] | string | | | Additional ARCS parameters, in quotes and `option=arg` format |
+| `--contig-length` | `-c` | [!badge variant="secondary" text="arcs scaffold"] | integer | `500` | | Minimum contig length |
+| `--links` | `-n` | [!badge variant="secondary" text="arcs scaffold"] | integer | `5` | | Minimum number of links to compute scaffold |
+| `--min-aligned` | `-a` | [!badge variant="secondary" text="arcs scaffold"] | integer | `5` | | Minimum aligned read pairs per barcode |
+| `--min-quality` | `-q` | [!badge variant="secondary" text="arcs scaffold"] | integer 0-40 | `0` | | Minimum mapping quality |
+| `--mismatch` | `-m` | [!badge variant="secondary" text="arcs scaffold"] | integer | `5` | | Maximum number of mismatches |
+| `--molecule-distance` | `-d` | [!badge variant="secondary" text="arcs scaffold"] | integer | `50000` | | Distance cutoff to split molecules (bp) |
+| `--molecule-length` | `-l` | [!badge variant="secondary" text="arcs scaffold"] | integer | `2000` | | Minimum molecule length (bp) |
+| `--seq-identity` | `-i` | [!badge variant="secondary" text="arcs scaffold"] | integer 0-100 | `98` | | Minimum sequence identity |
+| `--span` | `-s` | [!badge variant="secondary" text="arcs scaffold"] | integer | `20` | | Minimum number of spanning molecules to be considered assembled |
+| `--organism-type` | `-u` | [!badge variant="secondary" text="report"] | string | `eukaryote` | | Organism type for assembly report: `eukaryote`,`prokaryote`, or `fungus` |
+

 ## :icon-tag: Deconvolved Inputs
 For linked-read assemblies, the barcodes need to be deconvolved in the sequence data, meaning that
diff --git a/Workflows/deconvolve.md b/Workflows/deconvolve.md
index 6d32ca5ac..346f22714 100644
--- a/Workflows/deconvolve.md
+++ b/Workflows/deconvolve.md
@@ -48,10 +48,10 @@ harpy deconvolve OPTIONS... INPUTS...
 | argument | short name | type | default | required | description |
 |:----------------------|:----------:|:----------------|:-------:|:--------:|:---------------------------------------------------------------------|
 | `INPUTS` | | file/directory paths | | ‼️ | Files or directories containing [input FASTQ files](/commonoptions.md#input-arguments) |
-| `--density` | `-d` | integer | 3 | | On average, $\frac{1}{2^d}$ kmers are indexed |
-| `--dropout` | `-a` | integer | 0 | | Minimum cloud size to deconvolve |
-| `--kmer-length` | `-k` | integer | 21 | | Size of k-mers to search for similarities |
-| `--window-size` | `-w` | integer | 40 | | Size of window guaranteed to contain at least one kmer |
+| `--density` | `-d` | integer | `3` | | On average, $\frac{1}{2^d}$ kmers are indexed |
+| `--dropout` | `-a` | integer | `0` | | Minimum cloud size to deconvolve |
+| `--kmer-length` | `-k` | integer | `21` | | Size of k-mers to search for similarities |
+| `--window-size` | `-w` | integer | `40` | | Size of window guaranteed to contain at least one kmer |

 ## Resulting Barcodes
 After deconvolution, some barcodes may have a hyphenated suffix like `-1` or `-2` (e.g. `A01C33B41D93-1`).
diff --git a/Workflows/phase.md b/Workflows/phase.md
index 6fba104c1..15bc3a2aa 100644
--- a/Workflows/phase.md
+++ b/Workflows/phase.md
@@ -43,8 +43,8 @@ In addition to the [!badge variant="info" corners="pill" text="common runtime op
 | `--extra-params` | `-x` | string | | | Additional Hapcut2 arguments, in quotes |
 | `--genome ` | `-g` | file path | | | Path to genome if wanting to also use reads spanning indels |
 | `--ignore-bx` | `-b` | toggle | | | Ignore haplotag barcodes for phasing |
-| `--molecule-distance` | `-d` | integer | 100000 | | Base-pair distance threshold to separate molecules |
-| `--prune-threshold` | `-p` | integer (0-100) | 7 | | PHRED-scale (%) threshold for pruning low-confidence SNPs |
+| `--molecule-distance` | `-d` | integer | `100000` | | Base-pair distance threshold to separate molecules |
+| `--prune-threshold` | `-p` | integer (0-100) | `7` | | PHRED-scale (%) threshold for pruning low-confidence SNPs |
 | `--vcf` | `-v` | file path | | ‼️ | Path to BCF/VCF file |
 | `--vcf-samples` | | toggle | | | [Use samples present in vcf file](#prioritize-the-vcf-file) for imputation rather than those found the directory |

diff --git a/Workflows/qc.md b/Workflows/qc.md
index 72c0fc19e..ce32b49db 100644
--- a/Workflows/qc.md
+++ b/Workflows/qc.md
@@ -36,11 +36,11 @@ In addition to the [!badge variant="info" corners="pill" text="common runtime op
 |:-----------------|:----------:|:------------|:-------:|:-------:|:--------------------------------------------------------------------------------------------------|
 | `INPUTS` | | file/directory paths | | ‼️ | Files or directories containing [input FASTQ files](/commonoptions.md#input-arguments) |
 | `--deconvolve` | `-c` | toggle | | | Resolve barcode clashes between reads from different molecules |
-| `--deconvolve-params` | `-p` | (int,int,int,int) | (21,40,3,0) | | Accepts the [QuickDeconvolution parameters](/Workflows/deconvolve.md/#running-options) for `k`,`w`,`d`,`a`, in that order |
+| `--deconvolve-params` | `-p` | int,int,int,int | `21,40,3,0` | | Accepts the [QuickDeconvolution parameters](/Workflows/deconvolve.md/#running-options) for `k`,`w`,`d`,`a`, in that order |
 | `--deduplicate` | `-d` | toggle | | | Identify and remove PCR duplicates [!badge variant="secondary" text="recommended"] |
 | `--extra-params` | `-x` | string | | | Additional fastp arguments, in quotes |
-| `--min-length` | `-n` | integer | 30 | | Discard reads shorter than this length |
-| `--max-length` | `-m` | integer | 150 | | Maximum length to trim sequences down to |
+| `--min-length` | `-n` | integer | `30` | | Discard reads shorter than this length |
+| `--max-length` | `-m` | integer | `150` | | Maximum length to trim sequences down to |
 | `--trim-adapters` | `-a` | toggle | | | Detect and remove adapter sequences [!badge variant="secondary" text="recommended"] |

 By default, this workflow will only quality-trim the sequences. You can also opt-in to:
diff --git a/Workflows/snp.md b/Workflows/snp.md
index aec55cc82..182b428de 100644
--- a/Workflows/snp.md
+++ b/Workflows/snp.md
@@ -64,9 +64,9 @@ In addition to the [!badge variant="info" corners="pill" text="common runtime op
 | `INPUTS` | | file/directory paths | | ‼️ | Files or directories containing [input BAM files](/commonoptions.md#input-arguments) |
 | `--extra-params` | `-x` | string | | | Additional mpileup/freebayes arguments, in quotes |
 | `--genome` | `-g` | file path | | ‼️ | Genome assembly for variant calling |
-| `--ploidy` | `-n` | integer | 2 | | Ploidy of samples |
+| `--ploidy` | `-n` | integer | `2` | | Ploidy of samples |
 | `--populations` | `-p` | file path | | | Tab-delimited file of sample\<*tab*\>group |
-| `--regions` | `-r` | integer/file path/string | 50000 | | Regions to call variants on ([see below](#regions)) |
+| `--regions` | `-r` | integer/file path/string | `50000` | | Regions to call variants on ([see below](#regions)) |

 ### ploidy