Hi, I'm looking for a good sequence comparison tool.
To test kraken2 with its default parameters, I built a test database from the sequences of more than 17,000 viruses (complete RefSeq genomes from the NCBI Virus library) and more than 100 sequences of Buchnera aphidicola (complete genomes under NCBI taxid:9).
Next, I randomly selected 100 genomes from the 100+ Buchnera aphidicola genomes and took random 50 bp sequences from them as query data.
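For anyone who wants to reproduce this kind of test, here is a minimal sketch of the sampling step, assuming the genomes are in one plain FASTA file. The helper names `read_fasta` and `sample_queries` are hypothetical, and this is an illustration rather than the exact script used for the numbers in this post. It joins wrapped sequence lines before slicing, so a query never includes a line break.

```python
import random


def read_fasta(path):
    """Parse a FASTA file into {header: sequence}, joining wrapped lines."""
    records, name, chunks = {}, None, []
    with open(path) as handle:
        for line in handle:
            line = line.strip()
            if line.startswith(">"):
                if name is not None:
                    records[name] = "".join(chunks)
                name, chunks = line[1:], []
            elif line:
                chunks.append(line)
    if name is not None:
        records[name] = "".join(chunks)
    return records


def sample_queries(genomes, n_genomes=100, length=50, seed=0):
    """Pick up to n_genomes records and cut one random window from each."""
    rng = random.Random(seed)  # fixed seed so the test set is reproducible
    names = rng.sample(sorted(genomes), min(n_genomes, len(genomes)))
    queries = {}
    for name in names:
        seq = genomes[name]
        if len(seq) < length:
            continue  # record too short to yield a full-length query
        start = rng.randrange(len(seq) - length + 1)
        query_id = f"{name.split()[0]}_{start}"
        queries[query_id] = seq[start:start + length]
    return queries
```

Because each query ID records the source record and the start offset, any query that ends up unclassified can be traced back to its exact position in the genome.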
The problem is that kraken2 classified only 44 sequences (44.00%). In the output file, 40 sequences have a k-mer field of 0:16, and there are also sequences whose 16 k-mers appear to be classified already, yet the final classification result is 0 (unclassified).
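As a note on reading that field: with kraken2's default nucleotide k-mer length of 35, a 50 bp query contains 50 - 35 + 1 = 16 k-mers, so `0:16` means all 16 k-mers were assigned to taxid 0 (no hit). Below is a minimal sketch for tallying the k-mer field, assuming the standard five-column tab-separated kraken2 output without `--use-names`. The function names are hypothetical and the sample line is made up for illustration, not copied from my output.

```python
from collections import Counter


def expected_kmers(read_length, k=35):
    """Number of k-mers in a read; k=35 is kraken2's nucleotide default."""
    return max(read_length - k + 1, 0)


def parse_kraken2_line(line):
    """Split one kraken2 output line and tally k-mers per taxon label."""
    status, seq_id, taxid, length, kmer_field = line.rstrip("\n").split("\t")[:5]
    counts = Counter()
    for token in kmer_field.split():
        if token == "|:|":
            continue  # separator between mates in paired-end output
        taxon, n = token.rsplit(":", 1)
        counts[taxon] += int(n)  # "0" = no hit, "A" = ambiguous nucleotide
    return {
        "status": status,        # "C" classified, "U" unclassified
        "seq_id": seq_id,
        "taxid": taxid,
        "length": length,        # kept as text; paired reads use "len1|len2"
        "kmers": dict(counts),
    }


# Illustrative line only (hypothetical read ID).
record = parse_kraken2_line("U\tquery_1\t0\t50\t0:16")
print(record["kmers"], expected_kmers(50))
```

Counting how many queries have every k-mer under `"0"` versus under a real taxid makes it easier to separate "nothing in the database matched" from "k-mers matched but the read was still reported as unclassified".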
By the way, I searched for the unclassified sequences in the combined FASTA file using less and could see exact matches (perhaps apart from some wrongly extracted sequences). I also tried blast, which did slightly better and matched 80 sequences (some with multiple hits, all of them Buchnera aphidicola), but that is still lower than expected.
I am confused and not sure which step I got wrong. Thank you for any help or reply!