Why do most of my somatic SV calls turn out to be TATATATATATA insertions? #61

@HaHaLiang666

1. Why do most of my somatic SV calls turn out to be TATATATATATA insertions?
2. The breakpoint_clusters*.tsv files in the somatic_SVs/ directory are empty, and the plots folder in somatic_SVs/ is also empty. Is this normal?
3. Would you mind checking whether anything is incorrect in my pipeline?

My somatic_SVs/severus_somatic.vcf looks like this:
#CHROM POS ID REF ALT QUAL FILTER INFO FORMAT T10.haplotagged
chr1 4835032 severus_INS8088 N TCTATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATAATATATATATATATATATATATA 60.0 PASS IMPRECISE;SVTYPE=INS;SVLEN=82;INSIDE_VNTR=TRUE;MAPQ=60.0;PHASESETID=4740369;HP=2 GT:VAF:hVAF:DR:DV 0/1:0.09:0.00,0.00,0.15:53:5
chr1 4932074 severus_INS8089 N TATATATATATATATATATATATATATATATATATATATATATATATATTATATATATATATATATATATA 60.0 PASS IMPRECISE;SVTYPE=INS;SVLEN=66;MAPQ=60.0;PHASESETID=4740369;HP=2 GT:VAF:hVAF:DR:DV 0/1:0.12:0.00,0.00,0.21:38:5
chr1 5722407 severus_INS8099 N ATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATAT 60.0 PASS PRECISE;SVTYPE=INS;SVLEN=137;INSIDE_VNTR=TRUE;MAPQ=60.0;PHASESETID=5161159;HP=1 GT:VAF:hVAF:DR:DV 0/1:0.29:0.00,0.56,0.00:24:10
chr1 7276304 severus_INS8112 N ATATATATATATATATAATATATATTATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATA 60.0 PASS PRECISE;SVTYPE=INS;SVLEN=133;INSIDE_VNTR=TRUE;MAPQ=60.0;PHASESETID=6502431;HP=1 GT:VAF:hVAF:DR:DV 0/1:0.18:0.00,0.38,0.00:41:9
chr1 7398200 severus_INS8113 N ATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATA 60.0 PASS IMPRECISE;SVTYPE=INS;SVLEN=82;INSIDE_VNTR=TRUE;MAPQ=60.0 GT:VAF:hVAF:DR:DV 0/1:0.26:0.00,0.35,0.22:31:11
chr1 7806835 severus_INS8116 N CTATATATATATATATATATATATATATAGATATATATATATATATATATATATATATATATTAT 60.0 PASS IMPRECISE;SVTYPE=INS;SVLEN=65;INSIDE_VNTR=TRUE;MAPQ=60.0;PHASESETID=7730932;HP=2 GT:VAF:hVAF:DR:DV 0/1:0.23:0.00,0.00,0.39:47:14
chr1 8233612 severus_INS8120 N ATATATTATATATATATATATATATATATATATATATATATATATATATAATATAT 60.0 PASS IMPRECISE;SVTYPE=INS;SVLEN=64;INSIDE_VNTR=TRUE;MAPQ=60.0;PHASESETID=7730932;HP=1 GT:VAF:hVAF:DR:DV 0/1:0.19:0.00,0.36,0.00:17:4
chr1 10997262 severus_INS8144 N ATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATATAT 60.0 PASS IMPRECISE;SVTYPE=INS;SVLEN=104;INSIDE_VNTR=TRUE;MAPQ=60.0 GT:VAF:hVAF:DR:DV 0/1:0.26:0.00,0.21,0.33:26:9
chr1 12396029 severus_INS8152 N TATATATATATATATATATATATATATATATATATATATATATATATATATATATATAATATATATATATATT 60.0 PASS PRECISE;SVTYPE=INS;SVLEN=68;MAPQ=60.0;PHASESETID=12341811;HP=1 GT:VAF:hVAF:DR:DV 0/1:0.10:0.00,0.62,0.00:44:5
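
To put a number on "most" rather than judging from the first few records, a minimal Python sketch like the one below could tally how many INS records have an ALT sequence made up almost entirely of A/T and how many carry the INSIDE_VNTR flag. The function names and the 0.9 threshold are my own assumptions for illustration; nothing here is part of Severus.

```python
from collections import Counter


def at_fraction(seq: str) -> float:
    """Fraction of bases in seq that are A or T (0.0 for an empty string)."""
    if not seq:
        return 0.0
    return sum(base in "AT" for base in seq.upper()) / len(seq)


def summarize_insertions(vcf_lines, min_at_fraction=0.9):
    """Tally INS records by AT-richness of ALT and by the INSIDE_VNTR flag.

    min_at_fraction is an arbitrary illustrative cutoff, not a Severus setting.
    """
    counts = Counter()
    for line in vcf_lines:
        # Skip blank lines and header lines.
        if not line.strip() or line.startswith("#"):
            continue
        fields = line.split()
        if len(fields) < 8:
            continue
        alt, info = fields[4], fields[7]
        info_items = info.split(";")
        if "SVTYPE=INS" not in info_items:
            continue
        counts["INS"] += 1
        if at_fraction(alt) >= min_at_fraction:
            counts["AT_rich"] += 1
        # Compare INFO keys only, so INSIDE_VNTR=TRUE matches as INSIDE_VNTR.
        if "INSIDE_VNTR" in {item.split("=")[0] for item in info_items}:
            counts["inside_vntr"] += 1
    return counts
```

Passing an open handle on somatic_SVs/severus_somatic.vcf to `summarize_insertions` returns a Counter with the keys `INS`, `AT_rich` and `inside_vntr`, which makes it easy to see what share of the insertions are AT repeats and whether they overlap the regions in the VNTR BED.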

##############################
I also found that the breakpoint_clusters*.tsv files in the somatic_SVs/ directory are empty, and the plots folder in somatic_SVs/ is also empty. Is this normal?
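
For reference, this is roughly how the emptiness can be checked: a small Python sketch that lists the size in bytes of every breakpoint_clusters*.tsv file and of every file under plots/. The helper name `report_outputs` is hypothetical; only the directory and file patterns come from my output folder.

```python
from pathlib import Path


def report_outputs(somatic_dir):
    """Return {relative path: size in bytes} for cluster TSVs and plot files."""
    root = Path(somatic_dir)
    sizes = {}
    # Cluster tables sit directly in the somatic_SVs/ directory.
    for tsv in sorted(root.glob("breakpoint_clusters*.tsv")):
        sizes[tsv.name] = tsv.stat().st_size
    # The plots folder may be missing or empty; both cases add no entries.
    plots = root / "plots"
    if plots.is_dir():
        for item in sorted(plots.rglob("*")):
            if item.is_file():
                sizes[item.relative_to(root).as_posix()] = item.stat().st_size
    return sizes
```

A size of 0 means the file is truly empty. A small non-zero size would usually mean only a header line was written, which is worth telling apart when reporting this.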

##############################
My log file looks like this:
[2025-07-30 10:41:08] INFO: Starting Severus 1.5
[2025-07-30 10:41:11] INFO: Parsing reads from T10.haplotagged.bam
[2025-07-30 10:45:41] INFO: Total read length: 160974838497
[2025-07-30 10:45:41] INFO: Total aligned length: 151183182196 (0.94)
[2025-07-30 10:45:41] INFO: Read N50 / N90: 21622 / 16779
[2025-07-30 10:45:41] INFO: Alignments N50 / N90: 21584 / 16733
[2025-07-30 10:45:41] INFO: Read error rate (Q25 / Q50 / Q75): 0.0010 / 0.0030 / 0.0060
[2025-07-30 10:45:41] INFO: Read mismatch rate (Q25 / Q50 / Q75): 0.0000 / 0.0010 / 0.0010
[2025-07-30 10:46:49] INFO: Parsing reads from P10.haplotagged.bam
[2025-07-30 10:51:11] INFO: Total read length: 130966052206
[2025-07-30 10:51:11] INFO: Total aligned length: 124014350014 (0.95)
[2025-07-30 10:51:11] INFO: Read N50 / N90: 20018 / 15098
[2025-07-30 10:51:11] INFO: Alignments N50 / N90: 19977 / 15060
[2025-07-30 10:51:11] INFO: Read error rate (Q25 / Q50 / Q75): 0.0010 / 0.0030 / 0.0060
[2025-07-30 10:51:11] INFO: Read mismatch rate (Q25 / Q50 / Q75): 0.0000 / 0.0000 / 0.0010
[2025-07-30 10:52:11] INFO: Computing read quality
[2025-07-30 10:53:10] INFO: Annotating reads
[2025-07-30 10:53:35] INFO: Computing coverage histogram
[2025-07-30 10:53:39] INFO: Median coverage by PASS reads for T10.haplotagged.bam (H1 / H2 / H0): 23.0 / 23.0 / 0.0
[2025-07-30 10:53:40] INFO: Median coverage by PASS reads for P10.haplotagged.bam (H1 / H2 / H0): 19.0 / 19.0 / 0.0
[2025-07-30 10:53:40] INFO: Extracting split alignments
[2025-07-30 10:53:50] INFO: Extracting clipped reads
[2025-07-30 10:53:52] INFO: Starting breakpoint detection
[2025-07-30 10:54:14] INFO: Clustering unmapped insertions
/root/user/miniconda3/envs/LongRead/lib/python3.12/site-packages/numpy/core/fromnumeric.py:3504: RuntimeWarning: Mean of empty slice.
return _methods._mean(a, axis=axis, dtype=dtype,
/root/user/miniconda3/envs/LongRead/lib/python3.12/site-packages/numpy/core/_methods.py:129: RuntimeWarning: invalid value encountered in scalar divide
ret = ret.dtype.type(ret / rcount)
[2025-07-30 10:56:00] INFO: Starting compute_bp_coverage
[2025-07-30 10:57:15] INFO: Filtering breakpoints
[2025-07-30 10:57:19] INFO: Writing breakpoints
[2025-07-30 10:57:20] INFO: Preparing outputs for all_SVs
[2025-07-30 10:57:20] INFO: Computing segment coverage
[2025-07-30 10:59:32] INFO: Total phased length: 2541429253
[2025-07-30 10:59:32] INFO: Phase blocks N50: 778539
[2025-07-30 10:59:32] INFO: Preparing graph
[2025-07-30 10:59:37] INFO: Writing vcf
[2025-07-30 10:59:37] INFO: Preparing outputs for somatic_SVs
[2025-07-30 10:59:37] INFO: Computing segment coverage
[2025-07-30 11:02:04] INFO: Total phased length: 2541429253
[2025-07-30 11:02:04] INFO: Phase blocks N50: 778539
[2025-07-30 11:02:04] INFO: Preparing graph
[2025-07-30 11:02:04] INFO: Writing vcf

I was using NumPy 1.26.4; after upgrading to 2.3.0, the same warning persisted:

/root/user/miniconda3/envs/severus_env/lib/python3.12/site-packages/numpy/_core/fromnumeric.py:3859: RuntimeWarning: Mean of empty slice.
return _methods._mean(a, axis=axis, dtype=dtype,
/root/user/miniconda3/envs/severus_env/lib/python3.12/site-packages/numpy/_core/_methods.py:144: RuntimeWarning: invalid value encountered in scalar divide
ret = ret.dtype.type(ret / rcount)
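
As far as I can tell, this warning is what NumPy emits whenever `np.mean` is called on an empty array, which would explain why changing the NumPy version makes no difference. A minimal sketch that reproduces it outside Severus is below; `mean_with_warnings` is a helper name I made up. The log prints the warning right after "Clustering unmapped insertions", but I do not know which Severus call receives the empty array.

```python
import warnings

import numpy as np


def mean_with_warnings(values):
    """Return (mean, list of warning messages) for np.mean over values."""
    with warnings.catch_warnings(record=True) as caught:
        # Record every warning instead of letting it print once and vanish.
        warnings.simplefilter("always")
        result = np.mean(np.asarray(values, dtype=float))
    return result, [str(w.message) for w in caught]


# An empty input yields NaN and triggers the "Mean of empty slice" warning.
empty_mean, messages = mean_with_warnings([])
```

Here `empty_mean` is NaN and `messages` contains the "Mean of empty slice" text, while a non-empty input such as `[1.0, 3.0]` produces no warnings at all.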

#########################
Would you mind checking if there’s anything incorrect in my pipeline?

severus --target-bam $whatshap_out/${tumor_sample_name}.haplotagged.bam \
--control-bam $whatshap_out/${normal_sample_name}.haplotagged.bam \
--out-dir $severus_out \
-t $THREADS --phasing-vcf $hiphase_out/${normal_sample_name}.hifi_hiphase.vcf.gz \
--vntr-bed $path_to_ref/$repbed
