deletion discrepancy #96

Open
alantsangmb opened this issue Jan 24, 2025 · 2 comments

Comments

@alantsangmb

I have two datasets generated from two different influenza virus libraries prepared from the same sample. I used MIRA to assemble the genomes, and all segments are identical between the two libraries except for one difference in the PB1 gene: a run of 4 As at position 29 that is present in only one library. Upon reviewing the deletion and insertion tables, I noticed that the 4 As were classified as a deletion in Library A and as an insertion in Library B. However, the 4 As are present in the final consensus sequence of Library A but absent from the final consensus sequence of Library B.

Could you please help me understand why this discrepancy occurred? Thank you very much for your assistance.

I have attached the tables and the consensus sequences for your reference.

PB1_deletion.zip

@kristinelacek
Collaborator

Hi there, thanks for opening an issue!

If the datasets are from two different viral libraries, it is not surprising to see the results differ slightly. You were sequencing completely different molecules. One library had a minor deletion while the other had a minor insertion. What do the coverage diagrams look like for these segments? It is possible that one library had a higher proportion of defective interfering (DI) particles than the other, resulting in a deletion in the consensus sequence.

@alantsangmb
Author

Hi,

In library A, MIRA reported a deletion of 4 As at position 29 in the PB1 gene at a frequency of 0.67. In library B, it reported an insertion of 4 As at the same position of PB1 at a frequency of 0.29.

I am wondering why one instance is treated as a deletion and the other as an insertion. Is this due to the default Illumina QC thresholds for Flu, which are set at 25% for insertion frequency and 60% for deletion frequency?

Additionally, why were the 4 As retained in the consensus sequence if the deletion frequency exceeded 60%? Conversely, why were the 4 As removed from the consensus sequence if the insertion frequency exceeded 25%?
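
The comparison being asked about can be sketched as follows. This is only an illustration of the arithmetic, under the assumption that an indel is applied to the consensus when its frequency meets or exceeds the relevant threshold. The function name `passes_threshold` and the `>=` rule are hypothetical and are not taken from MIRA or IRMA.

```python
# Thresholds as quoted in this thread (insertion 25%, deletion 60%).
INSERTION_THRESHOLD = 0.25
DELETION_THRESHOLD = 0.60


def passes_threshold(kind: str, frequency: float) -> bool:
    """Hypothetical rule: an indel is applied when frequency >= threshold.

    This is an assumption for illustration, not the pipeline's actual logic.
    """
    if kind == "insertion":
        threshold = INSERTION_THRESHOLD
    elif kind == "deletion":
        threshold = DELETION_THRESHOLD
    else:
        raise ValueError(f"unknown indel kind: {kind!r}")
    return frequency >= threshold


# Frequencies reported for the 4 As at PB1 position 29.
calls = {
    "Library A": ("deletion", 0.67),
    "Library B": ("insertion", 0.29),
}

results = {lib: passes_threshold(kind, freq) for lib, (kind, freq) in calls.items()}

for lib, (kind, freq) in calls.items():
    print(f"{lib}: {kind} at {freq:.2f} -> meets threshold: {results[lib]}")
```

Under that assumed rule both calls meet their thresholds, so the 4 As would be expected to be removed from the Library A consensus and added to the Library B consensus, which is the opposite of what the consensus sequences show.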

Thank you for your assistance.
