
Improve plumpy integration in aiida-core #6754

Open
agoscinski opened this issue Feb 11, 2025 · 4 comments
Labels
design-issue A software design issue

Comments

@agoscinski
Contributor

agoscinski commented Feb 11, 2025

This is a collection of issues regarding plumpy that are not well fleshed out, but I think they should at least be mentioned somewhere to keep track of them. I will edit this issue to use more precise language once I start working on them.

plumpy WorkChain not correctly used in aiida-core

There is plumpy.workchains.WorkChain, but it is not actually used by aiida.engine.WorkChain. It only appears as a type hint in the Stepper classes, so the plumpy WorkChain should be an interface or Protocol that the aiida WorkChain implements.
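To illustrate the idea, here is a minimal sketch of how such a contract could be expressed with `typing.Protocol`. The method name and class names below are placeholders for illustration, not the actual plumpy or aiida-core API:

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class WorkChainProtocol(Protocol):
    """Illustrative interface; the real plumpy WorkChain API differs."""

    def do_step(self) -> None:
        """Execute the next step of the outline."""
        ...


class AiidaWorkChain:
    """Hypothetical implementer; structural typing needs no inheritance."""

    def do_step(self) -> None:
        pass


def advance(wc: WorkChainProtocol) -> None:
    """A Stepper-like helper could type-hint against the protocol."""
    wc.do_step()


advance(AiidaWorkChain())  # type-checks without any inheritance link
```

Because Protocols are structural, aiida-core would not need to inherit from plumpy's class at all; it only has to provide the required methods.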

Issues with nest-asyncio

By switching from nest-asyncio to greenlet (see unkcpz/plumpy#39) we can solve the test_disconnect timeout problem (see aiida-core tests/manage/test_manager.py::test_disconnect) and the Python recursion limit issue, and, most importantly, it allows us to run calcfunctions in a non-blocking manner.
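For context, the nested-event-loop restriction that nest-asyncio monkeypatches around can be reproduced with plain stdlib asyncio (a self-contained illustration, not aiida-core code):

```python
import asyncio


async def inner() -> int:
    return 42


def blocking_call() -> int:
    # Starting a second event loop on a thread whose loop is already
    # running raises RuntimeError. nest-asyncio patches this away;
    # a greenlet bridge sidesteps it without patching.
    return asyncio.run(inner())


async def outer() -> str:
    try:
        blocking_call()
        return "no error"
    except RuntimeError as exc:
        return f"RuntimeError: {exc}"


print(asyncio.run(outer()))
# → RuntimeError: asyncio.run() cannot be called from a running event loop
```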

Improved save and load interface

By refactoring save and load to work better with pickled data (unkcpz/plumpy#33), we can support some of workgraph's pickling functionality more natively in aiida-core.
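As a rough sketch of what a pickle-backed save/load interface could look like (the `Checkpoint` class and field names here are hypothetical stand-ins, not the plumpy API):

```python
import pickle
from dataclasses import dataclass, field


@dataclass
class Checkpoint:
    """Hypothetical stand-in for a process checkpoint payload."""

    process_label: str
    step: int
    inputs: dict = field(default_factory=dict)


def save(cp: Checkpoint) -> bytes:
    # pickle round-trips arbitrary Python objects, including instances
    # that a safe YAML loader could not reconstruct
    return pickle.dumps(cp)


def load(blob: bytes) -> Checkpoint:
    return pickle.loads(blob)


cp = load(save(Checkpoint("my_wc", 3, {"x": 1})))
print(cp.step)  # → 3
```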


agoscinski added the design-issue label on Feb 11, 2025
@unkcpz
Member

unkcpz commented Feb 11, 2025

> Refactor of save and load to make it work better with pickled data (unkcpz/plumpy#33) we can thus support

I guess there were historical security considerations behind using YAML to serialize/deserialize the data stored in the checkpoint (pinging @giovannipizzi @sphuber for confirmation). Since moving to pyyaml 5.4 we use UnsafeLoader anyway to support all kinds of user data, so from a security point of view it is similarly insecure. But since the checkpoint is cleaned up when the node is sealed, it is "fine". It was mentioned in #3709 (comment) that we should strip the checkpoint when importing data from an archive, but I believe that is not yet implemented in aiida-core?

There are some differences between storing the checkpoint in YAML and in pickle, which I noticed when I tried to move checkpoint storage from YAML to pickle.

  1. Pickle also stores some environment information, so a full benchmark is required to see how much the size increases when using pickle.
  2. The YAML format stores a pointer to where the function can be found, plus the parameters to pass when the class or function is recreated later. This means code changes to workchains/calcjobs take effect after restarting the daemon, whereas if the objects are pickled, restarting the daemon will not help. Actually this is quite good, since once a workchain is launched it is natural to expect its behavior to stay unchanged (I am not so sure about this part, correct me if I am wrong).
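One nuance worth noting here (my addition, hedged): the stdlib pickle serializes module-level functions and classes *by reference*, i.e. by qualified name, much like the YAML pointer described in point 2; only by-value serializers such as cloudpickle capture the code itself. This is easy to verify by inspecting the payload:

```python
import pickle


def restart_calculation() -> str:
    """Module-level function used only to inspect its pickle payload."""
    return "restarted"


blob = pickle.dumps(restart_calculation)

# The payload contains the *name* of the function, not its bytecode:
# unpickling re-imports it, so code edits still take effect on reload.
print(b"restart_calculation" in blob)  # → True
```

So whether pickled checkpoints freeze behavior depends on whether plain pickle or a by-value serializer is used for the callables involved.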

On top of that, storing objects like the process itself and run_fn as pickles makes it possible to run a process without adding it to the Python path, which is one of the powerful features demonstrated by workgraph. It deserves native support in aiida-core, using the same mechanism for storing checkpoints.

@unkcpz
Member

unkcpz commented Feb 11, 2025

> By switching from nest-asyncio to greenlet (see unkcpz/plumpy#39)

I also want to mention that the idea here is partially inspired by #4876 (comment): use greenlet to run synchronous code in an asynchronous context.

-                result = process.execute()
+                result = await_or_new_loop(coro_executor(process.execute))

What we actually need is to run asynchronous code from within synchronous code, i.e. inside the execute() call. That would fully solve the nested-event-loop problem and bridge the two colors of functions.
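The `await_or_new_loop` helper in the diff above is hypothetical. For comparison, a stdlib-only approximation (without greenlet) is to hand the coroutine to a fresh loop on a worker thread whenever a loop is already running on the current thread; greenlet, as proposed, would avoid the extra thread:

```python
import asyncio
import concurrent.futures


def await_or_new_loop(coro):
    """Illustrative sketch: run *coro* to completion from sync code.

    If no loop is running on this thread, use asyncio.run directly;
    otherwise run the coroutine on a fresh loop in a worker thread.
    Names and behavior are assumptions, not the plumpy implementation.
    """
    try:
        asyncio.get_running_loop()
    except RuntimeError:
        return asyncio.run(coro)
    with concurrent.futures.ThreadPoolExecutor(max_workers=1) as pool:
        return pool.submit(asyncio.run, coro).result()


async def compute() -> int:
    await asyncio.sleep(0)
    return 7


print(await_or_new_loop(compute()))  # → 7
```

The thread-based fallback blocks the outer loop while waiting, which is exactly the inefficiency the greenlet approach is meant to remove.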

@sphuber
Contributor

sphuber commented Feb 12, 2025

> It was mentioned in #3709 (comment) that we should strip the checkpoint when importing data from an archive, but I believe that is not yet implemented in aiida-core?

At the very least there is an implicit guarantee, because only terminated processes can be exported and when a process terminates its checkpoint is removed. That is only a soft guarantee, though, and easily skipped. But I vaguely remember there might also be an explicit check in the export code that drops the checkpoint attribute when exporting. Should be easy to find.

@unkcpz
Member

unkcpz commented Feb 13, 2025

> At the very least there is an implicit guarantee because only terminated processes can be exported

That is true, but if someone really wants to put malicious code in a data node, it is easy to get around.

> But I vaguely remember that there might be an explicit check in the export code to drop the checkpoint attribute when exporting.

Right! I thought it was only implemented in the export code (which it surely is) but not for import. But actually it is already there:

if data.get('node_type', '').startswith('process.'):
    # remove checkpoint from attributes of process nodes
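For illustration, the stripping logic might look roughly like this (a hedged sketch; apart from the `node_type` check quoted above, the field names and helper are assumptions, not the actual aiida-core implementation):

```python
def strip_checkpoint(data: dict) -> dict:
    """Drop the checkpoint attribute from process-node data on import.

    Sketch only: the 'node_type' check mirrors the snippet above, the
    'attributes'/'checkpoint' keys are assumed.
    """
    if data.get('node_type', '').startswith('process.'):
        data.get('attributes', {}).pop('checkpoint', None)
    return data


node = {'node_type': 'process.workflow.', 'attributes': {'checkpoint': '...'}}
print(strip_checkpoint(node)['attributes'])  # → {}
```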
