-
Notifications
You must be signed in to change notification settings - Fork 82
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
PwBaseWorkChain
: handler for BFGS history failure
#985
Conversation
@process_handler(priority=559, exit_codes=[ | ||
PwCalculation.exit_codes.ERROR_IONIC_CYCLE_BFGS_HISTORY_FAILURE, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should we remove this exit code from the handler above (with priority 560)? Otherwise it will be handled twice. Since the first handler will just set the restart type and report the error being handled, I don't think we need it. And do you maybe want to handle ERROR_IONIC_CYCLE_BFGS_HISTORY_AND_FINAL_SCF_FAILURE
also in this new way?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why it will be handled twice? If the first handler fixes the issue, the second one should't be called, should it?
Re ERROR_IONIC_CYCLE_BFGS_HISTORY_AND_FINAL_SCF_FAILURE
I wasn't unsure what the FINAL_SCF_FAILURE
actually meant. But indeed we can make it to be handled too
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are right, it returns ProcessHandlerReport(True)
where the True
is for the do_break
argument, which is False
by default. If set to True
all other handlers are skipped. But that makes the problem worse. This means that your new handler is never actually called, right? Because it first matches the handler with priority 560 and then breaks. So it will never get to the next one.
The ERROR_IONIC_CYCLE_BFGS_HISTORY_AND_FINAL_SCF_FAILURE
exit code means that first BFGS failed due to the history and then on top of that, the final scf also didn't converge. I think it makes sense to handle this as the normal BFGS history problem.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, interesting. But why it should stop first at the handler with priority 560
? I thought lower priority number meant it would be called sooner. Is it the opposite? If this is the case we can just change it to 561
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nope, they are called based on priority in reverse order:
https://github.com/aiidateam/aiida-core/blob/main/aiida/engine/processes/workchains/restart.py#L243
This means higher numbers first. See also the docs: https://aiida.readthedocs.io/projects/aiida-core/en/latest/howto/workchains_restart.html#multiple-process-handlers
The process handlers with a higher priority will be called first.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Alright, thanks! Then I move it to higher priority and commit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To nitpick a bit further here ^^
If the handle_relax_recoverable_ionic_convergence_bfgs_history_error
process handler is called first, and it switches away from bfgs
at some point, can't we just remove the ERROR_IONIC_CYCLE_BFGS_HISTORY_FAILURE
and ERROR_IONIC_CYCLE_BFGS_HISTORY_AND_FINAL_SCF_FAILURE
exit codes from the handle_relax_recoverable_ionic_convergence_error
process handler? There shouldn't be any BFGS history failures for damp
or fire
, I'd think?
Thanks @bastonero! Does this one fix #637? I remember I was testing this when @giovannipizzi first opened the issue (and then got distracted, as usual), and the damped was very effective for |
Thanks @mbercx !
Yes, that was the idea (I forgot the number of that issue, thanks for linking).
Ok, amazing, then it's also tested! We can then proceed as you suggest. |
I think we only support v6.6 and newer, but maybe for the next release we already even planned to drop support for v6 completely. @mbercx ? |
Looking into QE history it seems the |
According to our compatibility guarantees, we support the last 3 minor releases, as well as older versions for up to two years. As QE v6.8 was released in July 2021: We can also drop support for that in the next |
Nevertheless, I didn't understand whether this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @bastonero! I've left some nitpicks/questions. ^^
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @bastonero! I came up with some more comments. ^^
Sometimes the BFGS algorithm for ionic minimizaiton fails, and the current handler simply restart from scratch. This might work, but here we try to improve upon this simplistic solution, trying first to lower the trusted radius, and then trying to switch algorithm.
Co-authored-by: Marnik Bercx <mbercx@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @bastonero! I think this one is good to go.
Sometimes the BFGS algorithm for ionic minimizaiton fails, and the current handler simply restart from scratch. This might work, but here we try to improve upon this simplistic solution, trying first to lower the trusted radius, and then trying to switch algorithm.