Hi there,
Hope you're well.
I noticed a bug in ariba that occurs when the length of an insertion or deletion is a multiple of three and the indel does not start at the first position of a codon. The bug seems to originate from the way insertions and deletions are handled in the source file assembly_variants.py. An indel of this kind affects the reference codon it starts in and also the following reference codon.
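To make the codon arithmetic concrete, here is a minimal sketch for the deletion case. It is not taken from assembly_variants.py; the function name `codons_touched_by_deletion` and the 0-based coordinate convention are assumptions made only for illustration.

```python
def codons_touched_by_deletion(start, length):
    """Return the 0-based indices of the reference codons a deletion overlaps.

    start  -- 0-based position of the first deleted base (assumed convention)
    length -- number of deleted bases
    """
    first_codon = start // 3
    last_codon = (start + length - 1) // 3
    return list(range(first_codon, last_codon + 1))


# A 3-base deletion starting at codon position 1, 2 and 3 of codon index 4.
for start in (12, 13, 14):
    codon_position = start % 3 + 1
    print(codon_position, codons_touched_by_deletion(start, 3))
# prints:
# 1 [4]
# 2 [4, 5]
# 3 [4, 5]
```

Only a deletion that starts at codon position 1 stays within a single codon; the other two cases straddle a codon boundary even though the reading frame is preserved.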
I noticed the bug with a single amino-acid deletion that ariba incorrectly marked as a truncation. You can reproduce it by downloading the read set SRR850776 from NCBI SRA (corresponding assembly NZ_KI973283.1) and searching for the gene in the FASTA below.
>Enterobacter-NL68__wzy
atgaatgataagagtttaaaaaataaccacttcaaaataagtgcgcatttagcgtttata
tatttcctgcttacttcttctttattattgatttttttaacggaatcggcaagtgctaca
ctttatggaactgtagaggatatttttgcggttttttgtgccattatattgtttggtgag
atgatttacttctatatgcatagagtgaagtttatctcgttgcaattaatgtttgctttt
gtattttctttaattataggtattccttctttttatttgtatttctttaaaaaagcttct
gatggctttgaattgacttgtatatggggtatgttaataaatatcatactctatcttaca
gctatcaaaaatgttcataggcaacaagcaaaaagtataaataatctatttaagattata
ttttccattgttggtgtttgtcagttaattaaaattgttttttatctgaaatttatttta
tcatcaggcttagggcatttagctatttatactgatagtgaagaattactttcaagtatt
ccttttgctgtccgtgctattagtggcttttcttctataatggctttggcagtcttttat
tataaatcatcgaaaaaatataagatgctagcatttattttgctcgcatctgaccttgtt
attgggataagaaataaattcttttttgcttttatatgcattattattctctcgttatat
tcaaatagaaagaaaataatagcaatattcgctagaatatccaaagtacactatttatta
attggcttcgtcggtttttcaatgatttcatatcttcgtgaaggatatgaaatcaatttt
attaattatcttggcgttgtacttgactctctgtcgtctacgcttgcaggtttacaagat
ttatactatttgcccgatgaaaatggttgggcgttactaaacccccttacgatattatcg
caagtgttgccgctcagtggttttggcttcataagcgatgcacaaattgctcatgaatat
tcaacaattgtgcttggcagcgtgtctaatgggatagcgttgtcatcttctggtcttctt
gaagcaagtataataagtttgcatttcaatttatttatttatcttgcctatctgttaatt
atgatctcgataattcaaaaaggtttgaatagtaattatgttatttttaacttttttgcc
ctggctatgatgactggtttcttctattctgttcgtggagaattaattttgccatttgct
tatgttttaaaatcgtttccaataataataattgcaaatctattgactcaacaaaaaagt
agaaattga
The mutation TAA1289. starts at the 3rd position of a codon and results in the deletion of a single amino acid; in this specific case the rest of the amino-acid sequence is unaffected. Ariba instead erroneously translates TAA as a stop codon.
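The difference between the two interpretations can be shown with a small sketch. It uses a toy sequence that imitates an ATA repeat, not the wzy gene itself, and the helpers `translate` and `apply_deletion` are hypothetical names, not ariba functions. The sketch only assumes the standard codon assignments. Translating the deleted bases on their own gives a stop codon, whereas translating the sequence after the deletion shows a single amino acid removed.

```python
from itertools import product

# Standard codon assignments, built in the conventional t, c, a, g order.
BASES = "tcag"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def translate(seq):
    """Translate a nucleotide sequence in frame 0, ignoring trailing bases."""
    seq = seq.lower()
    usable = len(seq) - len(seq) % 3
    return "".join(CODON_TABLE[seq[i:i + 3]] for i in range(0, usable, 3))


def apply_deletion(seq, start, length):
    """Remove `length` bases beginning at 0-based position `start`."""
    return seq[:start] + seq[start + length:]


# Toy reference: atg cca ata ata ata att gca tga
ref = "atgccaataataataattgcatga"
start, length = 13, 3          # deletion begins inside a codon, not at position 1

deleted = ref[start:start + length]
mutated = apply_deletion(ref, start, length)

print(deleted, translate(deleted))  # taa *   (deleted bases translated alone)
print(translate(ref))               # MPIIIIA*
print(translate(mutated))           # MPIIIA*  (one I removed, no truncation)
```

Comparing the translations of the full reference and the full mutated sequence, rather than translating the inserted or deleted bases in isolation, is one way to avoid calling an in-frame indel a truncation.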
Thank you for providing the software. Happy to answer any questions that you may have.