Task: fixstart
This task changes the start position of each contig to be at a dnaA gene (if one is found), otherwise at the gene predicted by prodigal that is nearest the middle of the contig. Matches to dnaA genes are found by running promer. Circlator comes with a default set of dnaA genes (made with get_dnaa), but the user can specify an alternative FASTA file of genes instead.
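The rearrangement described above can be pictured as rotating a circular sequence so that it begins at the chosen gene. The following is a minimal sketch of that idea, not Circlator's actual implementation. The function names and the 0-based inclusive gene coordinates are assumptions made for illustration.

```python
def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    complement = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(complement)[::-1]


def rotate_to_start(seq, gene_start, gene_end, gene_reversed=False):
    """Rotate a circular contig so that it begins with the given gene.

    gene_start and gene_end are hypothetical 0-based inclusive coordinates
    of the gene on the forward strand of seq.
    """
    if not gene_reversed:
        # Break the circle just before the gene and rejoin the two pieces.
        return seq[gene_start:] + seq[:gene_start]
    # Gene is on the reverse strand: break just after the gene, then
    # reverse-complement so the gene leads the contig on the forward strand.
    rotated = seq[gene_end + 1:] + seq[:gene_end + 1]
    return reverse_complement(rotated)


contig = "TTGACCAGG"
forward = rotate_to_start(contig, 2, 4)
reverse = rotate_to_start(contig, 2, 4, gene_reversed=True)
```

In both cases the output has the same length as the input; only the break point and, for a reverse-strand gene, the orientation change.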
The general usage is:
circlator fixstart [options] <assembly.fasta> <outprefix>
There are the following options:
- --genes_fa FILENAME: FASTA file of nucleotide sequences of genes to search for to use as the start point.
- --ignore FILENAME: absolute path to a file of contig names to not change. One contig name per line. By default, the start position of every input contig will be changed.
- --min_id FLOAT: minimum percent identity of promer match to dnaA gene. Default: 70
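As an illustration of how these options combine with the positional arguments, here is a small sketch that assembles the command line as a list suitable for `subprocess.run`. The helper `build_fixstart_command` is hypothetical and not part of Circlator; only the option names and argument order come from the usage shown above.

```python
import os


def build_fixstart_command(assembly, outprefix, genes_fa=None, ignore=None, min_id=None):
    """Build the argument list for a circlator fixstart run (hypothetical helper)."""
    cmd = ["circlator", "fixstart"]
    if genes_fa is not None:
        cmd += ["--genes_fa", genes_fa]
    if ignore is not None:
        # The option expects an absolute path to the file of contig names.
        cmd += ["--ignore", os.path.abspath(ignore)]
    if min_id is not None:
        cmd += ["--min_id", str(min_id)]
    # Positional arguments come last: the input assembly, then the output prefix.
    cmd += [assembly, outprefix]
    return cmd


cmd = build_fixstart_command("assembly.fasta", "outprefix", min_id=80)
# To actually run it (requires Circlator to be installed):
#   import subprocess; subprocess.run(cmd, check=True)
```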
The output file of rearranged contigs is called outprefix.fasta and logging information is written to outprefix.log. An example log file is:
[fixstart] id break_point gene_name gene_reversed new_name skipped
[fixstart] contig1 - - - - skipped
[fixstart] contig2 1234567 dnaa_1 no - -
[fixstart] contig3 1000 prodigal yes - -
Contig1 was skipped because it was named in the file given to the --ignore option. Contig2 had a match to the gene dnaa_1 starting at position 1234567, and so was rearranged to start at that position. Contig3 had no match to a dnaA gene, so a gene predicted by prodigal was used. That gene was on the reverse strand, so the contig was rearranged to start with the gene on its forward strand. The new start point of the contig is at position 1000. The column new_name is not currently used by Circlator and can be ignored.
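For downstream scripting, a log in the format shown above could be read into one dictionary per contig. This is a minimal sketch that assumes every relevant line starts with `[fixstart]`, that the first such line is the header, and that fields are whitespace-separated. The function name `parse_fixstart_log` is hypothetical.

```python
def parse_fixstart_log(lines):
    """Parse fixstart log lines into a list of dicts keyed by the header columns."""
    prefix = "[fixstart]"
    header = None
    records = []
    for line in lines:
        if not line.startswith(prefix):
            continue  # ignore anything that is not a fixstart log line
        fields = line[len(prefix):].split()
        if not fields:
            continue
        if header is None:
            header = fields  # first matching line holds the column names
            continue
        records.append(dict(zip(header, fields)))
    return records


example_log = """\
[fixstart] id break_point gene_name gene_reversed new_name skipped
[fixstart] contig1 - - - - skipped
[fixstart] contig2 1234567 dnaa_1 no - -
[fixstart] contig3 1000 prodigal yes - -
"""

records = parse_fixstart_log(example_log.splitlines())
skipped = [r["id"] for r in records if r["skipped"] == "skipped"]
```

Values are kept as strings, with `-` meaning that the column does not apply to that contig.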