Test results did not meet expectations #849

Open
hedy-ella opened this issue Jul 3, 2024 · 0 comments

hedy-ella commented Jul 3, 2024

Hi, I'm looking for good sequence comparison software.
To test kraken2 with its default parameters, I built a test database from more than 17,000 virus sequences (complete RefSeq genomes from the NCBI virus library) and more than 100 Buchnera aphidicola sequences (complete genomes under NCBI taxid:9).
Next, I randomly selected 100 of the 100+ Buchnera aphidicola genomes and then took random 50 bp sequences from those genomes as query data.
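For reference, the sampling step described above can be sketched roughly as follows. This is only an illustration, not the exact script used: the function name `sample_fragments`, the choice of one fragment per selected genome, forward-strand-only sampling, and the fixed seed are all assumptions.

```python
import random


def sample_fragments(genomes, n_genomes=100, length=50, seed=0):
    """Pick random genomes and cut one random fixed-length fragment from each.

    genomes: dict mapping a sequence ID to its sequence string.
    Returns a list of (genome_id, start, fragment) tuples.
    Assumption: one fragment per genome, forward strand only.
    """
    rng = random.Random(seed)
    # Only genomes long enough to contain a full fragment are eligible.
    eligible = sorted(gid for gid, seq in genomes.items() if len(seq) >= length)
    chosen = rng.sample(eligible, min(n_genomes, len(eligible)))

    fragments = []
    for gid in chosen:
        seq = genomes[gid]
        start = rng.randrange(len(seq) - length + 1)  # 0-based start position
        fragments.append((gid, start, seq[start:start + length]))
    return fragments
```

Recording the genome ID and start position with each fragment makes it possible to check later that an unclassified query really is an exact substring of a database sequence.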
The problem is that kraken2 classified only 44 sequences (44.00%). In the output file, 40 sequences have a k-mer mapping of 0:16, and there are also sequences that appear to have 16 k-mers classified, yet their final classification is still 0 (unclassified).
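To make the `0:16` values easier to read: assuming the database was built with the default nucleotide k-mer length of 35, a 50 bp query contains 50 - 35 + 1 = 16 k-mers, so `0:16` would mean that all 16 k-mers were assigned taxid 0, i.e. none of them was found in the database. The sketch below computes that count and parses the space-separated `taxid:count` pairs from the last column of the standard kraken2 output. The helper names `kmer_count` and `parse_lca_field` are made up for illustration.

```python
def kmer_count(read_length, k=35):
    """Number of k-mers in a read of the given length (k=35 assumed as default)."""
    return max(read_length - k + 1, 0)


def parse_lca_field(field):
    """Parse a kraken2 LCA mapping string such as '9:10 0:6'.

    Returns a dict mapping taxid (kept as a string, so the ambiguous-base
    marker 'A' also works) to the total number of k-mers assigned to it.
    The '|:|' token that separates paired-end mates is skipped.
    """
    hits = {}
    for pair in field.split():
        if pair == "|:|":
            continue
        taxid, count = pair.rsplit(":", 1)
        hits[taxid] = hits.get(taxid, 0) + int(count)
    return hits


print(kmer_count(50))            # 16
print(parse_lca_field("0:16"))   # {'0': 16}
```

Comparing the parsed counts for each query against `kmer_count(50)` shows at a glance how many of its k-mers hit Buchnera aphidicola (taxid 9) and how many hit nothing.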
By the way, I searched for the unclassified sequences in the combined FASTA file with less and could see exact matches (perhaps excluding wrongly extracted sequences). I also tried blast, which did slightly better and matched 80 sequences (some with multiple hits, all to Buchnera aphidicola), but that is still lower than expected.
I am confused and not sure at which step I went wrong. Thank you for any help and reply!

