Error in Scanning minor allele frequency data #31

Open
kawito opened this issue Oct 18, 2024 · 8 comments

Comments


kawito commented Oct 18, 2024

How can I fix the error below?

[2024-10-18][16:55:48][hificnv][INFO] Starting hificnv
[2024-10-18][16:55:48][hificnv][INFO] cmdline: hificnv --bam ./MFB_591C/MFB_591C_DV.haplotagged.bam --maf ./MFB_591C/MFB_591C_DV.phased.vcf.gz --ref /opt/biol/human_GRCh38_no_alt_analysis_set.fasta --exclude /opt/biol/human_GRCh38_no_alt_analysis_set/human_GRCh38_no_alt_analysis_set.excluded_regions.common_50.bed.gz --expected-cn /opt/biol/human_GRCh38_no_alt_analysis_set/human_GRCh38_no_alt_analysis_set.expected_cn.XY.bed --threads 16 --output-prefix MFB_591C_CNV
[2024-10-18][16:55:48][hificnv][INFO] Running on 16 threads
[2024-10-18][16:55:48][hificnv][INFO] Reading reference genome from file '/opt/biol/human_GRCh38_no_alt_analysis_set.fasta'
[2024-10-18][16:55:55][hificnv][INFO] Reading excluded regions from file '/opt/biol/human_GRCh38_no_alt_analysis_set/human_GRCh38_no_alt_analysis_set.excluded_regions.common_50.bed.gz'
[2024-10-18][16:55:55][hificnv][INFO] Reading expected CN regions from file '/opt/biol/human_GRCh38_no_alt_analysis_set/human_GRCh38_no_alt_analysis_set.expected_cn.XY.bed'
[2024-10-18][16:55:55][hificnv][INFO] Processing alignment file './MFB_591C/MFB_591C_DV.haplotagged.bam'
[2024-10-18][17:00:56][hificnv][INFO] Processed alignments on 1,611,114 of 3,099,922 ref genome kb (51%)
[2024-10-18][17:04:55][hificnv][INFO] Finished processing all alignments
[2024-10-18][17:04:55][hificnv][INFO] Getting depth bin GC content
[2024-10-18][17:04:57][hificnv][INFO] Writing depth track to bigwig file: 'MFB_591C_CNV.FF12761124.depth.bw'
[2024-10-18][17:04:58][hificnv][INFO] Scanning minor allele frequency data from file './MFB_591C/MFB_591C_DV.phased.vcf.gz'
thread 'main' panicked at src/maf_utils.rs:123:13:
assertion left == right failed
left: 1
right: 2
stack backtrace:
0: 0x63b736 - std::backtrace_rs::backtrace::libunwind::trace::hbee8a7973eeb6c93
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/../../backtrace/src/backtrace/libunwind.rs:104:5
1: 0x63b736 - std::backtrace_rs::backtrace::trace_unsynchronized::hc8ac75eea3aa6899
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
2: 0x63b736 - std::sys_common::backtrace::_print_fmt::hc7f3e3b5298b1083
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:68:5
3: 0x63b736 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hbb235daedd7c6190
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:44:22
4: 0x7294c0 - core::fmt::rt::Argument::fmt::h76c38a80d925a410
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/fmt/rt.rs:142:9
5: 0x7294c0 - core::fmt::write::h3ed6aeaa977c8e45
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/fmt/mod.rs:1120:17
6: 0x6396af - std::io::Write::write_fmt::h78b18af5775fedb5
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/io/mod.rs:1810:15
7: 0x63b514 - std::sys_common::backtrace::_print::h5d645a07e0fcfdbb
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:47:5
8: 0x63b514 - std::sys_common::backtrace::print::h85035a511aafe7a8
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:34:9
9: 0x63cbe7 - std::panicking::default_hook::{{closure}}::hcce8cea212785a25
10: 0x63c949 - std::panicking::default_hook::hf5fcb0f213fe709a
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:292:9
11: 0x63d078 - std::panicking::rust_panic_with_hook::h095fccf1dc9379ee
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:779:13
12: 0x63cf52 - std::panicking::begin_panic_handler::{{closure}}::h032ba12139b353db
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:657:13
13: 0x63bc36 - std::sys_common::backtrace::__rust_end_short_backtrace::h9259bc2ff8fd0f76
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:171:18
14: 0x63ccb0 - rust_begin_unwind
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:645:5
15: 0x40d525 - core::panicking::panic_fmt::h784f20a50eaab275
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:72:14
16: 0x40d8ab - core::panicking::assert_failed_inner::hbf94b40c37b92af0
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:342:17
17: 0x40321f - core::panicking::assert_failed::hb6516476849ac639
18: 0x42eade - hificnv::maf_utils::scan_maf_file::h89146bb176bd754d
19: 0x45605c - hificnv::main::h0fc30317dd29c333
20: 0x44a613 - std::sys_common::backtrace::__rust_begin_short_backtrace::heaacbb72de2fe6e0
21: 0x4428c9 - std::rt::lang_start::{{closure}}::ha105d5439a76d3a9
22: 0x633cc1 - core::ops::function::impls::<impl core::ops::function::FnOnce for &F>::call_once::h37600b1e5eea4ecd
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/ops/function.rs:284:13
23: 0x633cc1 - std::panicking::try::do_call::hb4bda49fa13a0c2b
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:552:40
24: 0x633cc1 - std::panicking::try::h8bbf75149211aaaa
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:516:19
25: 0x633cc1 - std::panic::catch_unwind::h8c78ec68ebea34cb
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panic.rs:142:14
26: 0x633cc1 - std::rt::lang_start_internal::{{closure}}::hffdf44a19fd9e220
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/rt.rs:148:48
27: 0x633cc1 - std::panicking::try::do_call::hcb3194972c74716d
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:552:40
28: 0x633cc1 - std::panicking::try::hcdc6892c5f0dba4c
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:516:19
29: 0x633cc1 - std::panic::catch_unwind::h4910beb4573f4776
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panic.rs:142:14
30: 0x633cc1 - std::rt::lang_start_internal::h6939038e2873596b
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/rt.rs:148:20
31: 0x45689f - main
32: 0x67f155 - __libc_start_main
at /mnt/tc-build/src/glibc-2.17/csu/libc-start.c:258:16
33: 0x40dfaa -
at /mnt/tc-build/src/glibc-2.17/csu/../sysdeps/x86_64/start.S:123
34: 0x0 -
(END)

Collaborator

holtjma commented Oct 18, 2024

Hi @kawito,

The assertion that's failing there checks for an AD tag with exactly 2 entries (typically the REF and ALT allele depths), and it runs after a filter for bi-allelic variants. It seems like your VCF file has some records with only 1 AD entry, despite having two alleles. This looks like a case of a malformed VCF.
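
A quick way to look for such records before rerunning is to scan the VCF directly. The following is a minimal sketch in Python, not HiFiCNV's own Rust code: the names `ad_entries`, `find_bad_ad_records`, and `scan_file` are made up for illustration, and it assumes a single-sample VCF with tab-separated columns. It treats a missing AD value (`.`) as absent rather than malformed.

```python
import gzip


def ad_entries(vcf_line):
    """Return the first sample's AD values as a list of strings, or None if absent."""
    fields = vcf_line.rstrip("\n").split("\t")
    if len(fields) < 10:
        return None  # no FORMAT/sample columns on this line
    keys = fields[8].split(":")
    values = fields[9].split(":")
    if "AD" not in keys:
        return None
    idx = keys.index("AD")
    if idx >= len(values) or values[idx] == ".":
        return None  # AD value dropped or explicitly missing
    return values[idx].split(",")


def find_bad_ad_records(lines):
    """Yield single-ALT records whose AD field does not have exactly 2 entries."""
    for line in lines:
        if line.startswith("#"):
            continue  # skip header lines
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 5:
            continue
        alt = fields[4]
        if alt == "." or "," in alt:
            continue  # keep only records with a single ALT allele
        ad = ad_entries(line)
        if ad is not None and len(ad) != 2:
            yield line.rstrip("\n")


def scan_file(path):
    """Collect offending records from a gzip-compressed VCF (hypothetical helper)."""
    with gzip.open(path, "rt") as handle:
        return list(find_bad_ad_records(handle))
```

Calling `scan_file` on the file passed to `--maf` and printing the result should list any single-ALT records whose AD entry count is not 2.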

Is your VCF different from a typical DeepVariant VCF file? I haven't encountered this issue in our standard runs.

Matt

Member

ctsa commented Oct 18, 2024

Thanks @kawito and @holtjma - This is an interesting behavior because we're already filtering for VCF sites with a single ALT, so the AD field (Number=R) should have 2 entries here per spec. I wouldn't be surprised if, e.g., a VCF post-processing/normalization tool were altering this into an inconsistent state, among other possibilities. As a first step I think we could certainly improve the error message to write out the offending VCF line so that we could more quickly make decisions about how to follow up on this type of event. We can get at least this part in place, and @kawito, if you can provide more information about the VCF per Matt's question, we might be able to understand whether another policy change makes sense.

Author

kawito commented Oct 19, 2024 via email

Member

ctsa commented Oct 22, 2024

Hi @kawito,

We've just released a bugfix update here which should either address your issue or provide a better error report:

https://github.com/PacificBiosciences/HiFiCNV/releases/tag/v1.0.1

This update fixes the parsing of the AD field to cover a few more cases in the spec, and will write out the full VCF line in case of any remaining error.

We haven't updated this tool in some time, so also please note that we have an updated license here:

https://github.com/PacificBiosciences/HiFiCNV/blob/main/LICENSE.md

Please let us know if the updated caller is still having difficulty with your VCF.

Author

kawito commented Oct 22, 2024 via email

Author

kawito commented Oct 22, 2024 via email

Member

ctsa commented Oct 22, 2024

Thanks for running your case again with the latest version, Tomoko,

The VCF record in the error report shows an invalid record for the AD field:

chr1    132138  .       C       T       12.2    PASS    .       GT:GQ:DP:AD:VAF:C       0/1:4:39:32:0.820513:DV

The AD entry here is just the number 32, so there isn't any way to parse the count of C and T observations from this entry.
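
To make the problem concrete, here is a small Python sketch that pairs the FORMAT keys with the sample values of the record above. It assumes the columns are tab-separated in the real file (the whitespace shown in this thread may differ), and `sample_fields` is a hypothetical helper, not part of any tool. As an aside, the lone AD value divided by DP matches the VAF field, which would be consistent with 32 being the ALT depth, but that is only an observation from this one record and it does not recover the REF count.

```python
# The record from the error report, with columns assumed to be tab-separated.
record = (
    "chr1\t132138\t.\tC\tT\t12.2\tPASS\t.\t"
    "GT:GQ:DP:AD:VAF:C\t0/1:4:39:32:0.820513:DV"
)


def sample_fields(vcf_line):
    """Map each FORMAT key to the first sample's value (illustrative helper)."""
    fields = vcf_line.split("\t")
    return dict(zip(fields[8].split(":"), fields[9].split(":")))


sample = sample_fields(record)
ad = sample["AD"].split(",")

# A single-ALT record needs one REF depth and one ALT depth; only one value is present.
print("AD entries:", ad, "count:", len(ad))

# Arithmetic aside: compare the lone AD value over DP with the reported VAF.
ratio = int(ad[0]) / int(sample["DP"])
print("AD/DP:", round(ratio, 6), "VAF:", float(sample["VAF"]))
```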

The best I can suggest is to trace the processing pipeline for this VCF to see which program introduced the invalid AD count in this record. If this is direct output from DeepVariant then I'd have to assume it's a DV bug. Can you describe more about the origin of this file?
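
One low-effort starting point for that tracing is the VCF header, since some tools record their version or command line in `##` meta-information lines. The Python sketch below filters out the structural definitions and keeps the rest. Whether a given file contains such lines depends on the tools involved, and `provenance_lines` and `read_provenance` are illustrative names only.

```python
import gzip

# Header lines that define the file's structure rather than its history.
STRUCTURAL_PREFIXES = (
    "##fileformat", "##contig", "##INFO", "##FORMAT", "##FILTER", "##ALT",
)


def provenance_lines(header_lines):
    """Return '##' header lines that are not structural definitions."""
    kept = []
    for line in header_lines:
        if not line.startswith("##"):
            break  # the '#CHROM' column line ends the meta-information block
        if not line.startswith(STRUCTURAL_PREFIXES):
            kept.append(line.rstrip("\n"))
    return kept


def read_provenance(path):
    """Read the candidate provenance lines from a gzip-compressed VCF."""
    with gzip.open(path, "rt") as handle:
        return provenance_lines(handle)
```

Printing the result of `read_provenance` for the VCF in question gives a short list of lines to compare against the expected pipeline steps.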

Member

ctsa commented Oct 22, 2024

It might also be helpful to note here that the --maf argument to HiFiCNV doesn't impact CNV segmentation quality at this point, so you can drop this argument and get the same CNV call output. The only difference is that you won't have the MAF bigwig track for visualization.
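
For reference, the command from the log at the top of this issue with `--maf` and its value removed can be derived as in the Python sketch below. All options are copied from that log; `drop_option` is a hypothetical helper, and the sketch only prints the command rather than running it.

```python
import shlex

# Command line as reported in the hificnv log at the top of this issue.
original = (
    "hificnv --bam ./MFB_591C/MFB_591C_DV.haplotagged.bam "
    "--maf ./MFB_591C/MFB_591C_DV.phased.vcf.gz "
    "--ref /opt/biol/human_GRCh38_no_alt_analysis_set.fasta "
    "--exclude /opt/biol/human_GRCh38_no_alt_analysis_set/"
    "human_GRCh38_no_alt_analysis_set.excluded_regions.common_50.bed.gz "
    "--expected-cn /opt/biol/human_GRCh38_no_alt_analysis_set/"
    "human_GRCh38_no_alt_analysis_set.expected_cn.XY.bed "
    "--threads 16 --output-prefix MFB_591C_CNV"
)


def drop_option(args, name):
    """Return args without the option `name` and the single value after it."""
    kept = []
    skip_next = False
    for arg in args:
        if skip_next:
            skip_next = False  # this is the value of the dropped option
            continue
        if arg == name:
            skip_next = True
            continue
        kept.append(arg)
    return kept


args = drop_option(shlex.split(original), "--maf")
print(shlex.join(args))
```

The resulting `args` list could be passed to `subprocess.run(args, check=True)` to launch the run without the MAF track.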
