ValueError and Invalid Target End Point Errors during Mikado Serialization #458

Open
joseph144155 opened this issue Jul 17, 2024 · 0 comments


Description:

I encountered multiple errors while running the Mikado serialization process using the following command:

singularity exec --cleanenv ../../mikado_2.3.2.sandbox mikado serialise --json-conf mikado_2.3.2_custom.conf --xml mikado_prepared.blast.tsv --orfs mikado_prepared.fasta.transdecoder.bed --junctions portcullis_filtered.pass.junctions.bed

The errors captured in the SLURM log file are as follows:

Process Preparer-44:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/usr/local/lib/python3.10/site-packages/Mikado/serializers/blast_serializer/tabular_utils.py", line 387, in run
curr_hit, curr_hsps = prep_hit(key, rows)
File "/usr/local/lib/python3.10/site-packages/Mikado/serializers/blast_serializer/tabular_utils.py", line 247, in prepare_tab_hit
hit_dict["target_start"] = int(t_aligned.min())
File "/usr/local/lib/python3.10/site-packages/numpy/core/_methods.py", line 44, in _amin
return umr_minimum(a, axis, None, out, keepdims, initial, where)
ValueError: zero-size array to reduction operation minimum which has no identity

...

Process Preparer-45:
Traceback (most recent call last):
File "/usr/local/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
self.run()
File "/usr/local/lib/python3.10/site-packages/Mikado/serializers/blast_serializer/tabular_utils.py", line 387, in run
curr_hit, curr_hsps = prep_hit(key, rows)
File "/usr/local/lib/python3.10/site-packages/Mikado/serializers/blast_serializer/tabular_utils.py", line 250, in prepare_tab_hit
raise ValueError("Invalid target end point: {}, {}".format(hit_dict["target_end"], sends))
ValueError: Invalid target end point: 202, (449,)

...
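Since the excerpts above are elided, it may help to summarise how many worker processes failed and with which message. The following is a minimal sketch, not part of Mikado: `tally_errors` is a hypothetical helper that counts the final exception lines in a log and replaces digits with `N` so that messages differing only in coordinates are grouped together.

```python
import re
from collections import Counter


def tally_errors(log_text: str) -> Counter:
    """Count final exception lines (e.g. 'ValueError: ...') in a log."""
    counts = Counter()
    for line in log_text.splitlines():
        line = line.strip()
        # Matches lines such as "ValueError: Invalid target end point: 202, (449,)"
        match = re.match(r"^(\w+(?:Error|Exception)): (.*)$", line)
        if match:
            name, message = match.groups()
            # Replace digits so messages that differ only in coordinates group together
            counts[(name, re.sub(r"\d+", "N", message))] += 1
    return counts


# Usage (the log file name here is a placeholder, not taken from the report):
# with open("slurm-JOBID.out") as handle:
#     for (name, message), n in tally_errors(handle.read()).most_common():
#         print(n, name, message)
```

If only a handful of hits fail, the affected rows may be easier to locate in the BLAST table than if every worker fails.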
Scoring file used (YAML):

requirements:
  expression:
    - cdna_length and ((exon_num.multi and verified_introns_num and min_intron_length and max_intron_length) or (exon_num.mono and combined_cds_length))
  parameters:
    cdna_length: {operator: ge, value: 300}
    exon_num.multi: {operator: ge, value: 2}
    verified_introns_num: {operator: gt, value: 0}
    min_intron_length: {operator: ge, value: 5}
    max_intron_length: {operator: le, value: 2000}
    exon_num.mono: {operator: eq, value: 1}
    combined_cds_length: {operator: gt, value: 0}
scoring:
  snowy_blast_score: {rescaling: max}
  is_complete: {rescaling: target, value: true}
  has_start_codon: {rescaling: target, value: true}
  has_stop_codon: {rescaling: target, value: true}
  number_internal_orfs: {rescaling: target, value: 1}
  cds_not_maximal: {rescaling: min}
  cds_not_maximal_fraction: {rescaling: min}
  selected_cds_fraction: {rescaling: target, value: 0.7}
  selected_cds_length: {rescaling: max}
  selected_cds_intron_fraction: {rescaling: max}
  selected_cds_intron_fraction: {rescaling: max}
  cdna_length: {rescaling: max}
  exon_num: {rescaling: max, filter: {operator: ge, value: 3}}
  five_utr_num: {rescaling: target, value: 2, filter: {operator: lt, value: 4}}
  five_utr_length: {rescaling: target, value: 100, filter: {operator: le, value: 2500}}
  three_utr_num: {rescaling: target, value: 1, filter: {operator: lt, value: 3}}
  three_utr_length: {rescaling: target, value: 200, filter: {operator: lt, value: 2500}}
  proportion_verified_introns_inlocus: {rescaling: max}
  non_verified_introns_num: {rescaling: min}
  end_distance_from_junction: {rescaling: min, filter: {operator: lt, value: 55}}
as_requirements:
  expression: [cdna_length and three_utr_length and five_utr_length and utr_length and suspicious_splicing]
  parameters:
    cdna_length: {operator: ge, value: 200}
    utr_length: {operator: le, value: 2500}
    five_utr_length: {operator: le, value: 2500}
    three_utr_length: {operator: le, value: 2500}
    suspicious_splicing: {operator: ne, value: true}
not_fragmentary:
  expression: [((exon_num.multi and (cdna_length.multi or selected_cds_length.multi)), or, (exon_num.mono and ((snowy_blast_score and selected_cds_length.zero) or selected_cds_length.mono)))]
  parameters:
    selected_cds_length.zero: {operator: gt, value: 300} # 600
    exon_num.multi: {operator: gt, value: 2}
    cdna_length.multi: {operator: ge, value: 300}
    selected_cds_length.multi: {operator: gt, 250}
    exon_num.mono: {operator: eq, value: 1}
    snowy_blast_score: {operator: gt, value: 0} # 0.3
    selected_cds_length.mono: {operator: gt, value: 600} # 900
    exon_num.mono: {operator: le, value: 2}

Error Explanation:

ValueError: zero-size array to reduction operation minimum which has no identity:

NumPy raises this error when min() is called on an empty array. Here it means that t_aligned was empty when prepare_tab_hit evaluated int(t_aligned.min()) (tabular_utils.py, line 247).

ValueError: Invalid target end point:

prepare_tab_hit raises this error itself (tabular_utils.py, line 250). The message reports a computed target_end of 202 next to sends of (449,), so the end coordinate derived for the hit does not agree with the value(s) in sends, presumably the subject end coordinate(s) read from the input rows.
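The first message can be reproduced outside Mikado with a short NumPy sketch. This is only an illustration of the failure mode: `safe_min` is a hypothetical guard, not a function from Mikado, and it says nothing about why `t_aligned` ends up empty.

```python
import numpy as np


def safe_min(values):
    """Return the minimum as an int, or None when there is nothing to reduce."""
    arr = np.asarray(values)
    if arr.size == 0:
        return None
    return int(arr.min())


# Calling min() on an empty array raises the same ValueError as in the log
try:
    np.array([], dtype=int).min()
except ValueError as exc:
    message = str(exc)
```

A guard of this kind would only hide the symptom; the open question is which input rows produce an empty alignment in the first place.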

Questions:

What could be the underlying cause for these arrays being empty or target end points being invalid?
Are there any specific input checks or preprocessing steps that I might be missing to prevent these errors?
Is it safe to ignore these errors, or do they indicate a critical problem that needs addressing?
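As one possible preprocessing step, the BLAST table could be scanned for rows that are truncated or carry non-numeric subject coordinates. The sketch below is a generic sanity check under stated assumptions, not Mikado's own validation: `check_blast_tsv` is a hypothetical helper, and the defaults (at least 12 tab-separated columns, subject start and end at 0-based positions 8 and 9) follow the default BLAST tabular column order. The layout Mikado expects may contain further columns, so the parameters should be adjusted to the actual `-outfmt` string used. Whether malformed rows are the cause of the errors above is not confirmed.

```python
import csv
import io


def check_blast_tsv(handle, min_columns=12, start_col=8, end_col=9):
    """Yield (line_number, problem) for rows that look malformed."""
    reader = csv.reader(handle, delimiter="\t", quoting=csv.QUOTE_NONE)
    for line_number, row in enumerate(reader, start=1):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment/header lines
        if len(row) < min_columns:
            yield line_number, f"expected at least {min_columns} columns, found {len(row)}"
            continue
        for label, index in (("start", start_col), ("end", end_col)):
            value = row[index].strip()
            if not value.isdecimal() or int(value) < 1:
                yield line_number, f"subject {label} is not a positive integer: {value!r}"


# Small in-memory example: the second row is truncated
example = io.StringIO(
    "q1\ts1\t90.0\t100\t0\t0\t1\t100\t5\t104\t1e-5\t200\n"
    "q2\ts2\t90.0\n"
)
example_problems = list(check_blast_tsv(example))
```

To run it on the file from the command above, open `mikado_prepared.blast.tsv` and pass the handle to `check_blast_tsv`. A truncated final row can occur when a BLAST job was interrupted while writing its output.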

Additional Information:

The serialized.log file contains no errors; they appear only in the SLURM log file written when the job is submitted with sbatch.
I used docker pull quay.io/biocontainers/mikado:2.3.2--py37h9c5868f_0 to build the Singularity image.

Any guidance on how to resolve or debug these issues would be greatly appreciated.
Thank you for your assistance.
