This pipeline is configured to run on the IKMB medCluster. More general support for other compute systems may follow later.
nextflow run ikmb/bacterial-annotation --samples samples.csv --tools 'prokka,dfast'
A samplesheet that provides the relevant data to the pipeline. This is the preferred option where possible; otherwise, see --assemblies below.
nextflow run ikmb/bacterial-annotation --samples samples.csv --email 'me@somewhere.org' --tools prokka
The samplesheet format looks as follows:
sample,genus,species,strain,fasta,busco
MySample,Escherichia,coli,K12,/path/to/assembly.fa,bacteria_odb10
Note that the last column, busco, refers to the BUSCO reference database against which completeness checks are performed. You can find a full list here. Note that you must specify only the first part of the name, up to and including "odb10".
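To catch formatting mistakes before launching a run, a samplesheet can be checked with a few lines of Python. The following is a minimal sketch and not part of the pipeline: the function name `check_samplesheet` is hypothetical, and the checks (exact header match, no empty fields, busco value ending in "odb10") are assumptions derived from the format described above.

```python
import csv
import io

# Column names taken from the samplesheet header shown above.
EXPECTED_COLUMNS = ["sample", "genus", "species", "strain", "fasta", "busco"]


def check_samplesheet(text):
    """Return a list of human-readable problems found in samplesheet text.

    Hypothetical helper for illustration; the pipeline may validate differently.
    """
    reader = csv.DictReader(io.StringIO(text))
    if reader.fieldnames != EXPECTED_COLUMNS:
        return [f"unexpected header: {reader.fieldnames}"]

    problems = []
    # Data rows start on line 2, directly after the header.
    for line_no, row in enumerate(reader, start=2):
        for column in EXPECTED_COLUMNS:
            # DictReader stores None for fields missing at the end of a row.
            if not (row[column] or "").strip():
                problems.append(f"line {line_no}: empty field '{column}'")
        busco = (row["busco"] or "").strip()
        # Only the first part of the database name is allowed, up to "odb10".
        if busco and not busco.endswith("odb10"):
            problems.append(
                f"line {line_no}: busco value '{busco}' does not end in 'odb10'"
            )
    return problems


sheet = (
    "sample,genus,species,strain,fasta,busco\n"
    "MySample,Escherichia,coli,K12,/path/to/assembly.fa,bacteria_odb10\n"
)
print(check_samplesheet(sheet))  # prints []
```

An empty list means no problems were found. The sketch does not check whether the FASTA file exists or whether the busco name is a valid database.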
A wildcard (glob) pattern pointing to a list of bacterial assemblies in FASTA format. This option is discouraged because it cannot provide metadata such as genus and species; these fields will all be filled with the name of the assembly file.
nextflow run ikmb/bacterial-annotation --assemblies '/path/to/assemblies/*.fasta' --email 'me@somewhere.org' --tools prokka
This pipeline currently supports two annotation tools:
- DFast_core [dfast]
- Prokka [prokka]
Specify one or both tools, separated by a comma and enclosed in single quotes; results will be stored in the respective subfolder. DFast tends to produce better results and is the recommended choice.
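If you build the command line from a script, the comma-separated tool list can be checked before calling Nextflow. This is a minimal sketch under stated assumptions: `parse_tools` is a hypothetical helper, not something the pipeline provides, and it assumes the tool names must match the bracketed identifiers above exactly.

```python
# Tool identifiers as listed in brackets above.
SUPPORTED_TOOLS = ("dfast", "prokka")


def parse_tools(value):
    """Split a comma-separated tool string and reject unknown names.

    Hypothetical helper for illustration; the pipeline's own parsing may differ.
    """
    tools = [name.strip() for name in value.split(",") if name.strip()]
    if not tools:
        raise ValueError("no annotation tool specified")
    unknown = [name for name in tools if name not in SUPPORTED_TOOLS]
    if unknown:
        raise ValueError(f"unsupported tool(s): {', '.join(unknown)}")
    # Drop duplicates while keeping the order given by the user.
    return list(dict.fromkeys(tools))


print(parse_tools("prokka,dfast"))  # prints ['prokka', 'dfast']
```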
The directory in which to store the results.
Set any compatible command-line options for DFast.
Set any compatible command-line options for Prokka.
Send the pipeline report to this email address. Leave empty if you do not wish to receive this email.