mhap fails on grid due to final query folder missing #2363
Comments
Is this a single run of canu, or did you re-launch it at least once? If it was run twice, did any parameters change between the first and second run? What's the exact command you're using, and can you post the full log of the canu run?
It was a single run of canu. However, I restarted (resumed) it, and it seems to have restarted all the various log files. Same parameters. I got the admins to update to canu 2.3, cancelled the whole thing, cleared the directory, and restarted with the same parameters to see if that solves the problem. It's currently running the cormhap process on 82 nodes, so we'll see pretty soon whether the latest version helps.
Actually, just checking the new run: the queries folder now goes up to 000193, where before it was only created up to 000192, so maybe the update does help?
Whelp, got my hopes up... but when I checked on it yesterday evening, the 000193 directory had disappeared, and of course, when mhap got to that one, it barfed again:
Command used to call canu:

The bottom of the mhap results folder looks like this while the last mhap processes are finishing, then after it fails (WORKING never disappears):
I don't see how the 193 queries subfolder can be there and then disappear; this folder is not removed until after all jobs have completed. Even if it is removed, the shell script ( So something is going on with your cluster that I think is outside of canu's control. Are you running in some kind of temp/scratch space where idle files are removed after a timeout? Can you post the full recursive contents of your 1-overlapper and canu-scripts folders? Also the
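One way to capture the recursive contents asked for above, including modification times that could hint at a scratch-space cleanup, is a short script like the following. This is only a sketch: the function name `list_tree` and the tab-separated output format are made up for illustration and are not part of canu.

```python
import os
import time


def list_tree(root):
    """Return sorted lines of 'relative path<TAB>size<TAB>mtime' for everything under root."""
    lines = []
    for dirpath, dirnames, filenames in os.walk(root):
        for name in dirnames + filenames:
            full = os.path.join(dirpath, name)
            st = os.lstat(full)  # lstat so dangling symlinks do not raise
            stamp = time.strftime("%Y-%m-%d %H:%M:%S", time.localtime(st.st_mtime))
            lines.append(f"{os.path.relpath(full, root)}\t{st.st_size}\t{stamp}")
    return sorted(lines)


if __name__ == "__main__":
    # Hypothetical location; point this at the 1-overlapper or canu-scripts folder.
    for line in list_tree("correction/1-overlapper"):
        print(line)
```

Running it once while the jobs are active and again after the failure, then diffing the two listings, would show which entries vanished in between.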
This is basically the same issue as: #1191
I have a couple of PacBio datasets I'm attempting to assemble, but any combination of the two datasets fails in the correction stage because the final query folder (where it's only supposed to compare the last block to itself) in correction/1-overlapper/queries is not created. It's always one fewer than needed. For instance, my current set creates 192 query folders, but in results I get 193 ovb files, with the last one failing.
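The mismatch described here (192 query folders against 193 results) can be checked with a small script. This is a rough sketch, not anything canu provides: `missing_query_dirs` is a hypothetical helper, and it assumes that the subfolders of queries have all-digit names such as 000193 and that each file in results starts with the same digit prefix.

```python
import re
from pathlib import Path


def missing_query_dirs(overlapper_dir):
    """Return result IDs that have no matching subfolder under queries/."""
    root = Path(overlapper_dir)
    queries, results = root / "queries", root / "results"

    # Assumption: query subfolders are named with digits only, e.g. 000193.
    query_ids = set()
    if queries.is_dir():
        query_ids = {p.name for p in queries.iterdir() if p.is_dir() and p.name.isdigit()}

    # Assumption: result files begin with the same zero-padded ID.
    result_ids = set()
    if results.is_dir():
        for p in results.iterdir():
            m = re.match(r"\d+", p.name)
            if m:
                result_ids.add(m.group(0))

    return sorted(result_ids - query_ids)


if __name__ == "__main__":
    for job_id in missing_query_dirs("correction/1-overlapper"):
        print("no queries folder for result", job_id)
```

If the layout assumptions hold, a run showing the problem would list only the last ID.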
This seems like a bug. The issue linked above has a workaround (I'm trying it now, but the run hasn't reached this step yet), but there definitely seems to be an off-by-one error here. I'm also running another assembly of the same set locally on a single node, without grid, to see whether this has something to do with my specific data, although that probably won't be done for a couple of weeks.
I can run PacBio HiFi assemblies with no problem in this same grid setup (using Slurm 23.11.8).