Jellyfish failure leads to kSNP4 run failure #2

Open
kissake opened this issue Nov 19, 2022 · 1 comment
kissake commented Nov 19, 2022

If a genome data file is empty, or jellyfish otherwise fails to parse it into a .Jelly file, the kSNP4 run fails at a later stage, and the user is unlikely to be able to tell what went wrong.

The issue is in get_filtered_kmers.py:processJellyfishDumps()

It is debatable what the right behavior is here: the user asked for a file to be processed that cannot be processed, so no outcome can deliver exactly what was requested.

That said, it is not clear that all of the work done up to that point should be thrown away. Two things would be good, if possible: leaving the working directory in a state where the run can be resumed with a modified input file (one that removes, corrects, or replaces the flawed genome files), and alerting the user as early as possible (e.g. when the original jellyfish command fails).
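A minimal sketch of the "alert as soon as possible" idea, assuming the k-mer counter is launched from Python via `subprocess`. The names `run_counter` and `CounterError` are hypothetical and not taken from the kSNP4 source; the actual launch code may look quite different.

```python
import subprocess


class CounterError(RuntimeError):
    """Hypothetical error type: the k-mer counter failed for one input file."""


def run_counter(command, input_name):
    """Run a k-mer counting command and fail loudly if it does not succeed.

    `command` is the full argument list; `input_name` is only used to name
    the offending file in the error message.
    """
    result = subprocess.run(command, capture_output=True, text=True)
    # A non-zero status covers both an ordinary error exit and, on POSIX,
    # termination by a signal (reported as a negative return code).
    if result.returncode != 0:
        raise CounterError(
            f"k-mer counting failed for {input_name!r} "
            f"(exit status {result.returncode}): {result.stderr.strip()}"
        )
    return result
```

Called with the argument list from the DEBUG line below (`jellyfish count -C -o fsplit858.Jelly ... fsplit858`), a wrapper like this would stop at the failing split and name it, instead of letting the run continue to a later, less obvious failure. The caller could also catch `CounterError`, record the bad input, and carry on with the remaining genomes.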

Relevant error messages:

Initial jellyfish failure:

DEBUG:Running jellyfish: /home/jnisbet/Documents/Development/ksnp/find_snp-performance/kSNP4/binaries/jellyfish count -C -o fsplit858.Jelly -m 15 -s 1000000000 -t 8 fsplit858
terminate called after throwing an instance of 'std::runtime_error'
what(): Unsupported format

Subsequent get_filtered_kmers failure:

########################################################3

DEBUG:Looking for dump files for Escherichia_phage_LM33_P1-241
DEBUG:Processing (dump) fsplit302.Jelly
Traceback (most recent call last):
File "get_filtered_kmers.py", line 309, in <module>
File "get_filtered_kmers.py", line 189, in processJellyfishDumps
IndexError: list index out of range
[683986] Failed to execute script 'get_filtered_kmers' due to unhandled exception!
#########################################################
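One way to turn the opaque IndexError above into something actionable would be to check the counter outputs before processing any dumps. This is only a sketch: it assumes each split `<name>` should have produced a `<name>.Jelly` file in the working directory (as the `-o fsplit858.Jelly ... fsplit858` command line suggests), and `find_unusable_outputs` is a hypothetical helper, not part of get_filtered_kmers.py.

```python
from pathlib import Path


def find_unusable_outputs(split_names, workdir="."):
    """Return the split names whose '<name>.Jelly' file is missing or empty."""
    bad = []
    for name in split_names:
        out = Path(workdir) / f"{name}.Jelly"
        # A missing or zero-byte output suggests the counting step failed
        # for this split, so it should not be handed to the dump step.
        if not out.is_file() or out.stat().st_size == 0:
            bad.append(name)
    return bad
```

Reporting the full list of bad splits in one message would let the user fix or remove the corresponding genome files and resume, rather than discovering them one traceback at a time.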

kissake self-assigned this Nov 19, 2022
kissake commented Nov 19, 2022

For some cases, it may be possible to detect and correct this issue while parsing the original genome file, before Jellyfish runs (e.g. in merge_fasta_reads3.py or mergeFastaReads.py). The earlier the issue is detected, the better.
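As a rough illustration of what such an early check might look like, here is a sketch that assumes the inputs are plain FASTA. `check_fasta` is a hypothetical helper, not a function from merge_fasta_reads3.py or mergeFastaReads.py, and it only covers the cheapest cases (empty file, no header line, no sequence data).

```python
import os


def check_fasta(path):
    """Return None if `path` looks like usable FASTA, else a short reason.

    Only cheap checks are made: the file is non-empty, the first non-blank
    line is a '>' header, and at least one sequence line follows it.
    """
    if os.path.getsize(path) == 0:
        return "file is empty"
    with open(path, "r", errors="replace") as handle:
        for line in handle:
            if not line.strip():
                continue  # skip leading blank lines
            if line.startswith(">"):
                break  # found the first header
            return "first non-blank line is not a FASTA header"
        else:
            return "file contains only blank lines"
        # Continue reading from just after the header.
        for line in handle:
            if line.strip() and not line.startswith(">"):
                return None  # at least one sequence line exists
    return "no sequence data found after header"
```

A caller could run this over every genome file up front and print the file name with the returned reason, so the user sees the problem before any k-mer counting starts.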
