some queries about SViper #22

prasundutta87 · 2022-04-19T19:17:06Z

Hi,

I am working with ONT data from trios, for which I also have Illumina short-read data. I am using SViper 2.0.0 to polish my SV breakpoints. I have a few queries and would be grateful for help with them.

When I was testing the software with one set of samples, I found that most of the FAILs were FAIL5 ("The variant was polished away."). What is the reason behind this fail?
What is meant by FAIL3 (The long read regions do not fit)? Can this please be elaborated?
I am aware that no tags should be present. SViper skips the variants. With bcftools, I am getting the error that SKIP is not defined. If you are still developing the tool, can that please be added? Although I have changed the SVTYPE to INS, SViper checks the variant type by tags rather than SVTYPE.
I observed that the SViper score is put in the QUAL field. How is the score calculated and does it have any biological significance? Should I filter my SVs based on the SViper score again? Is there a threshold based on which I should remove SVs? I actually filter my SVs using the QUAL value of the variant caller (cuteSV). So, replacing this value with the score affects my pipeline.

Regards,
Prasun

smehringer · 2022-04-22T06:50:25Z

Hi @prasundutta87,

thanks for your interest in SViper!
I try to maintain SViper, but I often have trouble finding the time.

When I was testing the software with one set of samples, I found that most of the FAILs were FAIL5 ("The variant was polished away."). What is the reason behind this fail?

This means that after polishing, the variant is no longer visible in the data (in the corrected long reads).

What is meant by FAIL3 (The long read regions do not fit)? Can this please be elaborated?

This means that for the given variant, no proper region could be extracted from the long reads. For example, a long read may contain the expected deletion of length 200, but the flanking regions of this deletion are mapped very poorly, or the mapping indicates a complex variant. SViper can only polish simple deletions and insertions.

I am aware that no tags should be present. SViper skips the variants. With bcftools, I am getting the error that SKIP is not defined. If you are still developing the tool, can that please be added? Although I have changed the SVTYPE to INS, SViper checks the variant type by tags rather than SVTYPE.

I'll try to change this! But I can't promise that it will happen in the next few days.

I observed that the SViper score is put in the QUAL field. How is the score calculated and does it have any biological significance?

The score is computed here:

SViper/include/sviper/evaluate_final_mapping.h

Lines 21 to 27 in 3b57a9c

    
    // Score computation
    // -----------------
    double error_rate = ((double)length(record.cigar) - 1.0)/ (config.flanking_region * 2.0);
    double fuzzyness = (1.0 - error_rate/0.15) * 100.0;
    variant.quality = std::max(fuzzyness, 0.0);
    record.mapQ = variant.quality;

It does not have any biological significance! As far as I remember my own code, the formula was derived experimentally and proved to work well on manual inspection.
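To make the arithmetic of the quoted snippet concrete, here is a minimal standalone sketch of the same formula. The function name `polishing_score` and its two parameters are hypothetical stand-ins for `length(record.cigar)` and `config.flanking_region`; this is not part of SViper's API.

```cpp
#include <algorithm>

// Hypothetical helper that mirrors the arithmetic of the quoted snippet.
// cigar_operations stands in for length(record.cigar),
// flanking_region for config.flanking_region.
double polishing_score(int cigar_operations, int flanking_region)
{
    // Every CIGAR operation beyond the first counts towards the error rate,
    // normalised by the combined length of both flanking regions.
    double error_rate = (static_cast<double>(cigar_operations) - 1.0) / (flanking_region * 2.0);

    // The score falls linearly from 100 (error rate 0) to 0 (error rate 0.15).
    double fuzzyness = (1.0 - error_rate / 0.15) * 100.0;

    // Error rates above 0.15 are clamped to a score of 0.
    return std::max(fuzzyness, 0.0);
}
```

Assuming a flanking region of 400 bases (an illustrative value, not necessarily SViper's default), an alignment with a single CIGAR operation scores 100, one with 31 operations scores about 75, and one with 201 operations is clamped to 0.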

Should I filter my SVs based on SViper score again? Is there a threshold based on which I should remove SVs?

Unfortunately this is very hard to answer and heavily depends on your use case. In general I can say that you should filter out variants with a FAIL tag. Those are very unlikely to be true. But a low score might just mean that the polishing didn't work well. I would not filter by the score but only treat it as a confidence score.

I actually filter my SVs using the QUAL value of the variant caller (cuteSV). So, replacing this value with the score affects my pipeline.
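A minimal sketch of the "drop FAIL, keep everything else" rule described above, assuming the FAIL tags (for example FAIL3 or FAIL5) end up in the FILTER column (column 7) of the output VCF. The helper names `split_tabs` and `keep_line` are hypothetical and not part of SViper.

```cpp
#include <sstream>
#include <string>
#include <vector>

// Splits one tab-separated VCF line into its columns.
std::vector<std::string> split_tabs(const std::string & line)
{
    std::vector<std::string> columns;
    std::stringstream stream(line);
    std::string field;
    while (std::getline(stream, field, '\t'))
        columns.push_back(field);
    return columns;
}

// Returns false only for data lines whose FILTER column contains "FAIL".
// Assumption: the FAIL tags are written to the FILTER column.
bool keep_line(const std::string & line)
{
    // Header lines (starting with '#') and empty lines pass through unchanged.
    if (line.empty() || line[0] == '#')
        return true;

    std::vector<std::string> columns = split_tabs(line);

    // Lines with fewer than 7 columns have no FILTER field to judge.
    if (columns.size() < 7)
        return true;

    // Column 7 (index 6) is FILTER in the VCF column layout.
    return columns[6].find("FAIL") == std::string::npos;
}
```

Applied line by line to the polished VCF (for instance in a `std::getline` loop over `std::cin`), this keeps headers and all records without a FAIL tag, regardless of their score.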

Can you filter the SVs before polishing them with SViper? Otherwise I might need to see if I can add an option that does not overwrite the quality scores but adds them in the INFO field.

Best,
Svenja

prasundutta87 · 2022-04-22T09:51:32Z

Hi @smehringer,

Thank you so much for answering my queries. Is there a document anywhere that describes the algorithm or the workings of SViper? I am just trying to get my head around the algorithm (not the numerical/quantification part, but the general concept of polishing). For example, when you say that after polishing the variant is no longer visible in the corrected long reads, can this please be elaborated?

Thanks for the suggestion to use SViper after my final filtering.

Also, do we need to sort the BAMs by name? It seems to be specifically mentioned for utilities_merge_split_alignments tool. Currently I have coordinate sorted my BAMs.

This may be trivial, but the master version of SViper is 2.0.0, but the most recent version is 2.1.0. At least it shows in the help that it is 2.0.0. Could you kindly clarify this?

Regards,
Prasun

smehringer · 2022-04-28T12:53:16Z

Is there a document anywhere that describes the algorithm or the workings of SViper?

I've sent you an email with my thesis, which hopefully answers most of your questions.

Also, do we need to sort the BAMs by name? It seems to be specifically mentioned for utilities_merge_split_alignments tool. Currently I have coordinate sorted my BAMs.

For SViper, sorting by coordinate is fine (even required).
Only when using the utility utilities_merge_split_alignments do you need to have the BAM sorted by name. But do you want to use this utility at all? It's a rather advanced utility that I used in my validation pipelines.

This may be trivial, but the master version of SViper is 2.0.0, but the most recent version is 2.1.0. At least it shows in the help that it is 2.0.0. Could you kindly clarify this?

I'm very sorry for the confusion! This is a documentation bug. I forgot to change the version in the help page. I'll correct this.

This was referenced Aug 31, 2022

[FIX] Fix vcf output by adding the filter info for SKIP. #26

Merged

[FIX,DOC] Fix help page version to 2.1.0. #25

Merged
