
Signal/Noise (highQualityToLowQualityRatio) calculation misleading when no low/bad qual reads are present #381

Open
rollf opened this issue Feb 28, 2023 · 0 comments


rollf commented Feb 28, 2023

Hi,

the docs say

[image: formula from the docs]

I don't understand the 0.5 in the formula and it seems to me this is not consistent with the actual implementation:

tvref.highQualityToLowQualityRatio = hicnt / (locnt != 0 ? locnt : 0.5d);

With this, 73 high/good-quality reads and 104 low/bad-quality reads give a ratio of 0.702 (73 / 104 = 0.7019...). If, on the other hand, there are no low/bad-quality reads (i.e. locnt == 0, which can be achieved with -q 0), the formula becomes hicnt / 0.5, which is effectively hicnt * 2. With the example numbers above, all 73 + 104 = 177 reads now count as high/good quality (the 104 became 'good'), so the ratio is 177 * 2 = 354.0! In that case the ratio is simply twice the read count, so it differs for all variants that have different numbers of reads at that position. Conceptually, the formula could be

tvref.highQualityToLowQualityRatio = locnt == 0 ? Double.POSITIVE_INFINITY : hicnt / locnt;

I'm not saying this is how it should be, but it is how I would interpret the result. In any case, I guess the docs could be improved.
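
To make the numbers above easy to check, here is a minimal sketch that evaluates the expression from the implementation in isolation. The class name `RatioDemo` and the method `currentRatio` are hypothetical, and it assumes `hicnt` and `locnt` are `int` read counts (their declared types are not shown in this issue).

```java
// Hypothetical demo class; only the ratio expression itself is taken from the issue.
class RatioDemo {
    // Assumption: hicnt and locnt are int read counts.
    // The 0.5d operand promotes the ternary, and therefore the division, to double.
    static double currentRatio(int hicnt, int locnt) {
        return hicnt / (locnt != 0 ? locnt : 0.5d);
    }

    public static void main(String[] args) {
        // 73 high-quality and 104 low-quality reads: plain ratio 73 / 104.
        System.out.println(currentRatio(73, 104));
        // Same position with -q 0: all 177 reads are high quality, locnt == 0,
        // so the result is 177 / 0.5, i.e. twice the read count.
        System.out.println(currentRatio(177, 0));
        // A shallower position with no low-quality reads gets a much smaller value,
        // although it has just as few low-quality reads (none).
        System.out.println(currentRatio(20, 0));
    }
}
```

With locnt == 0 the value depends only on the read depth, which is the behaviour described above.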

(Note that the implementation is present in createInsertion() as well as createVariant())
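
Since the same expression appears in two places, one option would be a single shared helper that both call sites use. The following is only a sketch of the conceptual formula from above, not a proposed patch: the class `RatioSketch` and the method `highToLowRatio` are hypothetical names, and the `int` parameter types are an assumption.

```java
// Hypothetical helper; names and parameter types are assumptions for illustration.
class RatioSketch {
    // Returns hicnt / locnt, or positive infinity when there are no low-quality reads.
    // Caveat: this also returns infinity when both counts are 0.
    static double highToLowRatio(int hicnt, int locnt) {
        if (locnt == 0) {
            return Double.POSITIVE_INFINITY;
        }
        // Cast first so that two int counts are not divided as integers.
        return (double) hicnt / locnt;
    }

    public static void main(String[] args) {
        System.out.println(highToLowRatio(73, 104)); // plain ratio
        System.out.println(highToLowRatio(177, 0));  // no low-quality reads
        System.out.println(highToLowRatio(20, 0));   // same result regardless of depth
    }
}
```

If the counts are ints, note that writing `hicnt / locnt` directly inside the ternary would perform integer division before the result is widened to double, hence the explicit cast in the sketch.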
