Error in Scanning minor allele frequency data #31
Comments
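Hi @kawito, The assertion that's failing there is checking for an AD tag with exactly 2 entries (typically the REF and ALT allele depths). It seems like your VCF file has some with only 1 entry (I guess just the REF allele depth). Is your VCF different from a typical DeepVariant VCF file? I haven't encountered this issue in our standard runs. Matt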
Thanks @kawito and @holtjma. This is interesting behavior because we're already filtering for VCF sites with a single ALT, so the AD field (Number=R) should have 2 entries here per the spec. I wouldn't be surprised if, e.g., a VCF post-processing/normalization tool were altering this into an inconsistent state, among other possibilities. As a first step, I think we could certainly improve the error message to write out the offending VCF line so that we can more quickly decide how to follow up on this type of event. We can get at least this part in place, and @kawito, if you can provide more information about the VCF per Matt's question, we might be able to understand whether another policy change makes sense.
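For readers following along, here is a minimal Python sketch of the rule described above: with Number=R, the AD field carries one value per allele, so a site with a single ALT should have exactly two entries. This is not HiFiCNV's actual code; the function name and both example records are hypothetical, and the sketch assumes tab-separated VCF data lines with one or more sample columns.

```python
def ad_count_matches_alleles(vcf_line: str, sample_index: int = 0) -> bool:
    """Return True if the sample's AD has one entry per allele (REF + each ALT)."""
    fields = vcf_line.rstrip("\n").split("\t")
    alt_alleles = fields[4].split(",")        # ALT column, comma-separated
    format_keys = fields[8].split(":")        # FORMAT column, e.g. GT:DP:AD
    sample_values = fields[9 + sample_index].split(":")
    if "AD" not in format_keys:
        return False
    ad_index = format_keys.index("AD")
    if ad_index >= len(sample_values):
        return False                          # AD key declared but value missing
    ad_entries = sample_values[ad_index].split(",")
    # Number=R means: one value for REF plus one for each ALT allele.
    return len(ad_entries) == 1 + len(alt_alleles)


# Hypothetical records for illustration only.
good_record = "chr1\t1000\t.\tA\tG\t30.0\tPASS\t.\tGT:DP:AD\t0/1:20:11,9"
bad_record = "chr1\t2000\t.\tC\tT\t12.0\tPASS\t.\tGT:DP:AD\t0/1:20:11"

print(ad_count_matches_alleles(good_record))
print(ad_count_matches_alleles(bad_record))
```

The second record is the kind of line that would trip an "exactly 2 entries" assertion: one ALT allele but only one AD value.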
Dear Matt
Thank you for your reply.
I believe my VCF is a typical DeepVariant VCF file.
$ Ref=/nfs/nas31501/mfb/human_GRCh38_no_alt_analysis_set/human_GRCh38_no_alt_analysis_set.fasta
$ singularity exec --nv --bind /home /home/genebay/sif/pepper_deepvariant_r0.8-gpu.sif run_pepper_margin_deepvariant call_variant -b MFB_591C_GRCh38.bam -f $Ref -o MFB_591C -p MFB_591C_DV -t 8 --phased_out --hifi >> pepper.out 2>> pepper.err &
I'm afraid it might still be processing.
ls -l MFB_591C
drwxrwxr-x. 2 kawaito kawaito 4096 Oct 17 09:47 intermediate_files
drwxrwxr-x. 2 kawaito kawaito 153 Oct 17 04:11 logs
-rw-rw-r--. 1 kawaito kawaito 1279182 Oct 17 04:48 MFB_591C_DV.chunks.csv
-rw-rw-r--. 1 kawaito kawaito 340885769857 Oct 17 09:46 MFB_591C_DV.haplotagged.bam
-rw-r--r--. 1 kawaito kawaito 40183776 Oct 17 15:18 MFB_591C_DV.haplotagged.bam.bai
-rw-rw-r--. 1 kawaito kawaito 86007512 Oct 17 09:47 MFB_591C_DV.phased.vcf.gz <- This is the one.
-rw-rw-r--. 1 kawaito kawaito 1576394 Oct 17 09:47 MFB_591C_DV.phased.vcf.gz.tbi
-rw-rw-r--. 1 kawaito kawaito 981563 Oct 17 09:46 MFB_591C_DV.phaseset.bed
-rw-rw-r--. 1 kawaito kawaito 82739637 Oct 17 04:11 MFB_591C_DV.vcf.gz
-rw-rw-r--. 1 kawaito kawaito 1573741 Oct 17 04:11 MFB_591C_DV.vcf.gz.tbi
Tomoko
________________________________
From: Matt Holt ***@***.***>
Sent: October 18, 2024 22:36
To: PacificBiosciences/HiFiCNV ***@***.***>
CC: 河合 智子 ***@***.***>; Mention ***@***.***>
Subject: Re: [PacificBiosciences/HiFiCNV] Error in Scanning minor allele frequency data (Issue #31)
Hi @kawito,
The assertion that's failing there is checking for an AD tag with exactly 2 entries (typically the REF and ALT allele depths). It seems like your VCF file has some with only 1 entry (I guess just the REF allele depth). Is your VCF different from a typical DeepVariant VCF file? I haven't encountered this issue in our standard runs.
Matt
Hi @kawito,
We've just released a bugfix update here which should either address your issue or provide a better error report:
https://github.com/PacificBiosciences/HiFiCNV/releases/tag/v1.0.1
This update fixes the parsing of the AD field to cover a few more cases in the spec, and will write out the full VCF line in case of any remaining error.
We haven't updated this tool in some time, so please also note that we have an updated license here:
https://github.com/PacificBiosciences/HiFiCNV/blob/main/LICENSE.md
Please let us know if the updated caller is still having difficulty with your VCF.
Dear Chris,
Thank you for your quick response and all your efforts.
I will let you know about the results.
Best regards,
Tomoko
Dear Chris,
I still have problems.
Best regards,
Tomoko
########################
nohup: ignoring input
[2024-10-22][11:35:59][hificnv][INFO] Starting hificnv
[2024-10-22][11:35:59][hificnv][INFO] cmdline: ./hificnv --bam ../MFB_591C/MFB_591C_DV.haplotagged.bam --maf ../MFB_591C/MFB_591C_DV.phased.vcf.gz --ref /opt/biol/human_GRCh38_no_alt_analysis_set.fasta --exclude /opt/biol/human_GRCh38_no_alt_analysis_set/human_GRCh38_no_alt_analysis_set.excluded_regions.common_50.bed.gz --expected-cn /opt/biol/human_GRCh38_no_alt_analysis_set/human_GRCh38_no_alt_analysis_set.expected_cn.XY.bed --threads 16 --output-prefix MFB_591C_CNV
[2024-10-22][11:35:59][hificnv][INFO] Running on 16 threads
[2024-10-22][11:35:59][hificnv][INFO] Reading reference genome from file '/opt/biol/human_GRCh38_no_alt_analysis_set.fasta'
[2024-10-22][11:36:07][hificnv][INFO] Reading excluded regions from file '/opt/biol/human_GRCh38_no_alt_analysis_set/human_GRCh38_no_alt_analysis_set.excluded_regions.common_50.bed.gz'
[2024-10-22][11:36:07][hificnv][INFO] Reading expected CN regions from file '/opt/biol/human_GRCh38_no_alt_analysis_set/human_GRCh38_no_alt_analysis_set.expected_cn.XY.bed'
[2024-10-22][11:36:07][hificnv][INFO] Processing alignment file '../MFB_591C/MFB_591C_DV.haplotagged.bam'
[2024-10-22][11:41:08][hificnv][INFO] Processed alignments on 2,414,775 of 3,099,922 ref genome kb (77%)
[2024-10-22][11:42:44][hificnv][INFO] Finished processing all alignments
[2024-10-22][11:42:44][hificnv][INFO] Getting depth bin GC content
[2024-10-22][11:42:47][hificnv][INFO] Writing depth track to bigwig file: 'MFB_591C_CNV.FF12761124.depth.bw'
[2024-10-22][11:42:48][hificnv][INFO] Scanning minor allele frequency data from file '../MFB_591C/MFB_591C_DV.phased.vcf.gz'
thread 'main' panicked at src/maf_utils.rs:164:21:
Failed to process allele depth from minor allele frequency file record:
chr1 132138 . C T 12.2 PASS . GT:GQ:DP:AD:VAF:C 0/1:4:39:32:0.820513:DV
stack backtrace:
0: 0x6a0f9a - <std::sys::backtrace::BacktraceLock::print::DisplayBacktrace as core::fmt::Display>::fmt::h304520fd6a30aa07
1: 0x7c781b - core::fmt::write::hf5713710ce10ff22
2: 0x69e443 - std::io::Write::write_fmt::hda708db57927dacf
3: 0x6a2282 - std::panicking::default_hook::{{closure}}::he1ad87607d0c11c5
4: 0x6a1eee - std::panicking::default_hook::h81c8cd2e7c59ee33
5: 0x6a2b0f - std::panicking::rust_panic_with_hook::had2118629c312a4a
6: 0x6a27f7 - std::panicking::begin_panic_handler::{{closure}}::h7fa5985d111bafa2
7: 0x6a1479 - std::sys::backtrace::__rust_end_short_backtrace::h704d151dbefa09c5
8: 0x6a2484 - rust_begin_unwind
9: 0x40ca13 - core::panicking::panic_fmt::h3eea515d05f7a35e
10: 0x44f908 - hificnv::maf_utils::scan_maf_file::hb5efe2e5540234ee
11: 0x4438cd - hificnv::main::hcdf90b247eabe9c3
12: 0x43fc73 - std::sys::backtrace::__rust_begin_short_backtrace::hdff766743d57fb6e
13: 0x44ac09 - std::rt::lang_start::{{closure}}::h58604008dca8f1ab
14: 0x6971e0 - std::rt::lang_start_internal::h4d90db0530245041
15: 0x44427f - main
16: 0x6f1940 - __libc_start_main
17: 0x40d67e - _start
18: 0x0 - <unknown>
Thanks for running your case again with the latest version, Tomoko. The VCF record in the error report shows an invalid entry for the AD field: the sample's AD value is 32, a single number, whereas AD (Number=R) should have two entries (REF and ALT) at a site with a single ALT allele. The best I can suggest is to trace the processing pipeline for this VCF to see which program introduced the invalid AD count in this record. If this is direct output from DeepVariant then I'd have to assume it's a DV bug. Can you describe more about the origin of this file?
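One way to start tracing the pipeline is sketched below in Python. It reads a VCF (plain or gzip/bgzip-compressed), collects the header meta lines that are not structural definitions, and counts single-ALT records whose AD field does not have exactly two entries. The function names and the prefix list are assumptions made for this sketch. Whether the header records the tools that touched the file depends on those tools, so the provenance list may be empty or incomplete.

```python
import gzip

# Header lines with these prefixes define the file structure rather than its
# history, so they are skipped when collecting provenance (an assumption).
STRUCTURAL_PREFIXES = ("##fileformat", "##INFO", "##FORMAT", "##FILTER",
                       "##contig", "##ALT")


def open_text(path):
    """Open a plain or gzip/bgzip-compressed text file for reading."""
    return gzip.open(path, "rt") if path.endswith(".gz") else open(path, "rt")


def scan_vcf(path, max_examples=5):
    """Return (provenance header lines, bad AD record count, example records)."""
    provenance, examples, bad_count = [], [], 0
    with open_text(path) as handle:
        for raw in handle:
            line = raw.rstrip("\n")
            if line.startswith("##"):
                if not line.startswith(STRUCTURAL_PREFIXES):
                    provenance.append(line)
                continue
            if not line or line.startswith("#"):
                continue                      # column header or blank line
            fields = line.split("\t")
            if len(fields) < 10 or "," in fields[4]:
                continue                      # no sample column, or multi-ALT site
            keys = fields[8].split(":")
            values = fields[9].split(":")
            if "AD" not in keys:
                continue
            idx = keys.index("AD")
            ad = values[idx].split(",") if idx < len(values) else []
            if len(ad) != 2:                  # expected: REF depth, ALT depth
                bad_count += 1
                if len(examples) < max_examples:
                    examples.append(line)
    return provenance, bad_count, examples


# Usage, with the file name from this thread:
# provenance, bad_count, examples = scan_vcf("MFB_591C_DV.phased.vcf.gz")
```

Running the same scan on each intermediate VCF in the pipeline (for example the unphased `MFB_591C_DV.vcf.gz` and the phased output) would show at which step the count of malformed records first becomes nonzero.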
It might also be helpful to note here that the `--maf` argument to HiFiCNV doesn't impact CNV segmentation quality at this point, so you can drop this argument and get the same CNV call output. The only difference is that you won't have the MAF bigwig track for visualization.
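As a small illustration of that workaround, the Python sketch below rebuilds the command line reported in this thread without the `--maf` option. The helper name `drop_option` is hypothetical. Handling the `--maf=VALUE` spelling is a precaution rather than something confirmed for HiFiCNV.

```python
import shlex


def drop_option(argv, option="--maf"):
    """Return argv without `option` and its value (`--opt VALUE` or `--opt=VALUE`)."""
    result = []
    skip_next = False
    for arg in argv:
        if skip_next:              # this is the value that followed the option
            skip_next = False
            continue
        if arg == option:
            skip_next = True
            continue
        if arg.startswith(option + "="):
            continue
        result.append(arg)
    return result


# Arguments copied from the command line logged earlier in this thread.
original = shlex.split(
    "hificnv --bam ../MFB_591C/MFB_591C_DV.haplotagged.bam "
    "--maf ../MFB_591C/MFB_591C_DV.phased.vcf.gz "
    "--ref /opt/biol/human_GRCh38_no_alt_analysis_set.fasta "
    "--threads 16 --output-prefix MFB_591C_CNV"
)
trimmed = drop_option(original)
print(shlex.join(trimmed))
```

The `--exclude` and `--expected-cn` arguments from the original run are left out here only for brevity; they would pass through unchanged.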
How can I fix the below error?
[2024-10-18][16:55:48][hificnv][INFO] Starting hificnv
[2024-10-18][16:55:48][hificnv][INFO] cmdline: hificnv --bam ./MFB_591C/MFB_591C_DV.haplotagged.bam --maf ./MFB_591C/MFB_591C_DV.phased.vcf.gz --ref /opt/biol/human_GRCh38_no_alt_analysis_set.fasta --exclude /opt/biol/human_GRCh38_no_alt_analysis_set/human_GRCh38_no_alt_analysis_set.excluded_regions.common_50.bed.gz --expected-cn /opt/biol/human_GRCh38_no_alt_analysis_set/human_GRCh38_no_alt_analysis_set.expected_cn.XY.bed --threads 16 --output-prefix MFB_591C_CNV
[2024-10-18][16:55:48][hificnv][INFO] Running on 16 threads
[2024-10-18][16:55:48][hificnv][INFO] Reading reference genome from file '/opt/biol/human_GRCh38_no_alt_analysis_set.fasta'
[2024-10-18][16:55:55][hificnv][INFO] Reading excluded regions from file '/opt/biol/human_GRCh38_no_alt_analysis_set/human_GRCh38_no_alt_analysis_set.excluded_regions.common_50.bed.gz'
[2024-10-18][16:55:55][hificnv][INFO] Reading expected CN regions from file '/opt/biol/human_GRCh38_no_alt_analysis_set/human_GRCh38_no_alt_analysis_set.expected_cn.XY.bed'
[2024-10-18][16:55:55][hificnv][INFO] Processing alignment file './MFB_591C/MFB_591C_DV.haplotagged.bam'
[2024-10-18][17:00:56][hificnv][INFO] Processed alignments on 1,611,114 of 3,099,922 ref genome kb (51%)
[2024-10-18][17:04:55][hificnv][INFO] Finished processing all alignments
[2024-10-18][17:04:55][hificnv][INFO] Getting depth bin GC content
[2024-10-18][17:04:57][hificnv][INFO] Writing depth track to bigwig file: 'MFB_591C_CNV.FF12761124.depth.bw'
[2024-10-18][17:04:58][hificnv][INFO] Scanning minor allele frequency data from file './MFB_591C/MFB_591C_DV.phased.vcf.gz'
thread 'main' panicked at src/maf_utils.rs:123:13:
assertion `left == right` failed
  left: 1
 right: 2
stack backtrace:
0: 0x63b736 - std::backtrace_rs::backtrace::libunwind::trace::hbee8a7973eeb6c93
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/../../backtrace/src/backtrace/libunwind.rs:104:5
1: 0x63b736 - std::backtrace_rs::backtrace::trace_unsynchronized::hc8ac75eea3aa6899
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
2: 0x63b736 - std::sys_common::backtrace::_print_fmt::hc7f3e3b5298b1083
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:68:5
3: 0x63b736 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::hbb235daedd7c6190
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:44:22
4: 0x7294c0 - core::fmt::rt::Argument::fmt::h76c38a80d925a410
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/fmt/rt.rs:142:9
5: 0x7294c0 - core::fmt::write::h3ed6aeaa977c8e45
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/fmt/mod.rs:1120:17
6: 0x6396af - std::io::Write::write_fmt::h78b18af5775fedb5
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/io/mod.rs:1810:15
7: 0x63b514 - std::sys_common::backtrace::_print::h5d645a07e0fcfdbb
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:47:5
8: 0x63b514 - std::sys_common::backtrace::print::h85035a511aafe7a8
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:34:9
9: 0x63cbe7 - std::panicking::default_hook::{{closure}}::hcce8cea212785a25
10: 0x63c949 - std::panicking::default_hook::hf5fcb0f213fe709a
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:292:9
11: 0x63d078 - std::panicking::rust_panic_with_hook::h095fccf1dc9379ee
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:779:13
12: 0x63cf52 - std::panicking::begin_panic_handler::{{closure}}::h032ba12139b353db
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:657:13
13: 0x63bc36 - std::sys_common::backtrace::__rust_end_short_backtrace::h9259bc2ff8fd0f76
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/sys_common/backtrace.rs:171:18
14: 0x63ccb0 - rust_begin_unwind
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:645:5
15: 0x40d525 - core::panicking::panic_fmt::h784f20a50eaab275
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:72:14
16: 0x40d8ab - core::panicking::assert_failed_inner::hbf94b40c37b92af0
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/panicking.rs:342:17
17: 0x40321f - core::panicking::assert_failed::hb6516476849ac639
18: 0x42eade - hificnv::maf_utils::scan_maf_file::h89146bb176bd754d
19: 0x45605c - hificnv::main::h0fc30317dd29c333
20: 0x44a613 - std::sys_common::backtrace::__rust_begin_short_backtrace::heaacbb72de2fe6e0
21: 0x4428c9 - std::rt::lang_start::{{closure}}::ha105d5439a76d3a9
22: 0x633cc1 - core::ops::function::impls::<impl core::ops::function::FnOnce for &F>::call_once::h37600b1e5eea4ecd
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/core/src/ops/function.rs:284:13
23: 0x633cc1 - std::panicking::try::do_call::hb4bda49fa13a0c2b
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:552:40
24: 0x633cc1 - std::panicking::try::h8bbf75149211aaaa
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:516:19
25: 0x633cc1 - std::panic::catch_unwind::h8c78ec68ebea34cb
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panic.rs:142:14
26: 0x633cc1 - std::rt::lang_start_internal::{{closure}}::hffdf44a19fd9e220
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/rt.rs:148:48
27: 0x633cc1 - std::panicking::try::do_call::hcb3194972c74716d
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:552:40
28: 0x633cc1 - std::panicking::try::hcdc6892c5f0dba4c
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panicking.rs:516:19
29: 0x633cc1 - std::panic::catch_unwind::h4910beb4573f4776
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/panic.rs:142:14
30: 0x633cc1 - std::rt::lang_start_internal::h6939038e2873596b
at /rustc/07dca489ac2d933c78d3c5158e3f43beefeb02ce/library/std/src/rt.rs:148:20
31: 0x45689f - main
32: 0x67f155 - __libc_start_main
at /mnt/tc-build/src/glibc-2.17/csu/libc-start.c:258:16
33: 0x40dfaa -
at /mnt/tc-build/src/glibc-2.17/csu/../sysdeps/x86_64/start.S:123
34: 0x0 -