
Conversation

@orionarcher

Summary of Changes

This PR introduces support for the three high-level runners in TorchSim: optimize, integrate, and static. It attempts to provide an easy-to-use interface for quacc users that relies on minimal TorchSim imports. It also adopts the quacc nomenclature (relax_job, md_job, and static_job) and I/O conventions (input atoms, return dicts).

I initially implemented a TorchSim runner, but it ended up being just an extra unnecessary layer, so I decided recipes would be the best way forward.

Happy to take any feedback!

Requirements

Note: If you are an external contributor, you will see a comment from @buildbot-princeton. This is solely for the maintainers.

@buildbot-princeton
Collaborator

Can one of the admins verify this patch?

@orionarcher orionarcher mentioned this pull request Nov 4, 2025
@codecov

codecov bot commented Nov 5, 2025

Codecov Report

❌ Patch coverage is 87.15084% with 23 lines in your changes missing coverage. Please review.
✅ Project coverage is 97.62%. Comparing base (1e27489) to head (b723310).

Files with missing lines Patch % Lines
src/quacc/recipes/torchsim/_base.py 80.83% 23 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##             main    #3005      +/-   ##
==========================================
- Coverage   98.18%   97.62%   -0.57%     
==========================================
  Files          92       95       +3     
  Lines        3863     4042     +179     
==========================================
+ Hits         3793     3946     +153     
- Misses         70       96      +26     

☔ View full report in Codecov by Sentry.

Member

@Andrew-S-Rosen Andrew-S-Rosen left a comment


Hi @orionarcher! Thank you very much for this PR!

Overall, this is looking to be in relatively good shape. I have left some minor cleanup related comments below.

One larger question: by avoiding the use of runners entirely, there is no directory management. I assume there is some I/O being done? Everything will run in the current working directory at the moment, which would be problematic if there is I/O. You will see that the various runners have a setup and cleanup method that they call from the BaseRunner class to take care of file management. Is there I/O to worry about here, or is it all kept in-memory?

Another comment: some of the @job-decorated recipes take classes as arguments. If these are classes that are not instantiated, that's fine. But it looks like some of these arguments are instantiated classes (e.g. the autobatcher). This is problematic for workflow tools, especially those from the MP side, because it is not possible to (de)serialize an instantiated class. The workaround from MP tooling is, as you know, that such classes need to have as_dict() and from_dict() methods. It's okay if internally called functions take such classes, but the main @job-decorated function cannot unless they are monty serializable. Any thoughts about how to best address this?
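For reference, the (de)serialization contract being asked for looks roughly like this. AutoBatcherConfig below is a hypothetical stand-in for an instantiated autobatcher; the real TorchSim class and its constructor arguments will differ.

```python
class AutoBatcherConfig:
    """Hypothetical serializable stand-in for an instantiated autobatcher."""

    def __init__(self, max_memory_scaler: float = 0.8):
        self.max_memory_scaler = max_memory_scaler

    def as_dict(self) -> dict:
        # Round-trippable, monty-style representation a workflow engine can store
        return {
            "@module": self.__class__.__module__,
            "@class": self.__class__.__name__,
            "max_memory_scaler": self.max_memory_scaler,
        }

    @classmethod
    def from_dict(cls, d: dict) -> "AutoBatcherConfig":
        return cls(max_memory_scaler=d["max_memory_scaler"])


restored = AutoBatcherConfig.from_dict(AutoBatcherConfig(0.5).as_dict())
assert restored.max_memory_scaler == 0.5
```

In practice, inheriting from monty's MSONable gives this behavior for free when the constructor arguments are themselves serializable.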

Comment on lines 288 to 295
```python
model = model if isinstance(model, ModelInterface) else pick_model(*model)

state = ts.initialize_state(atoms, model.device, model.dtype)
if autobatcher:
    autobatcher = ts.runners._configure_batches_iterator(
        state, model, autobatcher=autobatcher
    )  # type: ignore
autobatcher_dict = _get_autobatcher_dict(autobatcher)
```
Member


If I understand correctly, this stuff is repeated through most of the recipes. To keep the recipes as minimal as possible, perhaps this is something that can be abstracted out into _base.py as a function?
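A minimal sketch of what such a shared helper in _base.py could look like. The stubs for pick_model, initialize_state, and the batching call are stand-ins for the TorchSim machinery; the real signatures will differ.

```python
def pick_model(name: str) -> dict:
    # Stand-in: the real helper resolves a model spec to a loaded model
    return {"name": name, "device": "cpu", "dtype": "float64"}


def initialize_state(atoms, device: str, dtype: str) -> dict:
    # Stand-in for ts.initialize_state(atoms, model.device, model.dtype)
    return {"atoms": atoms, "device": device, "dtype": dtype}


def setup_model_and_state(atoms, model, autobatcher=None):
    """Shared entry point so each recipe avoids repeating this block."""
    if not isinstance(model, dict):  # mimics the ModelInterface check
        model = pick_model(model)
    state = initialize_state(atoms, model["device"], model["dtype"])
    # Without an autobatcher, treat the whole state as a single batch
    batches = [state] if autobatcher is None else autobatcher(state, model)
    return model, state, batches


model, state, batches = setup_model_and_state(["H", "H"], "mace")
assert state["device"] == "cpu" and len(batches) == 1
```

Each recipe would then reduce to one call plus its runner-specific logic.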

Author


Done! Moved this all out to _base.py along with the type hints (some of which needed to be there anyway).

@orionarcher
Author

Hi Andrew, thanks for the comments!

Is there I/O to worry about here, or is it all kept in-memory?

Good catch. There are output files for these calculations. I'll add in calc_setup / calc_teardown calls to get the output files in the proper location. While it makes the code a little noisier, I still think it's preferable to having a whole extra set of classes and functions.

It's okay if internally called functions have such classes, but the main @job-decorated function cannot have them unless they are monty serializable. Any thoughts about how to best address this?

Very good point. I wasn't thinking about this, but it definitely needs a change. I am also working on an Atomate2 API where I have eliminated the "live" classes and replaced them with config dicts instead. I'll transfer that work over to this PR, which should solve the issue.

Will respond to your other comments as I address them!

@orionarcher
Author

I think I've addressed all the changes! Only the larger question remains:

One larger question: by avoiding the use of runners entirely, there is no directory management. I assume there is some I/O being done? Everything will run in the current working directory at the moment, which would be problematic if there is I/O. You will see that the various runners have a setup and cleanup method that they call from the BaseRunner class to take care of file management. Is there I/O to worry about here, or is it all kept in-memory?

There is no extraneous I/O here (i.e. input sets), but we do write out trajectory files as output. Users must provide names for the trajectories. If they use something like "file.traj", it will end up at "./file.traj"; of course, they could also specify "/exact/path/file.traj". In my mind, this is the correct behavior: workflow runners will write un-rooted paths to the current dir, and users still have control if they want to hard-code it. Does that make sense to you?

@Andrew-S-Rosen
Member

Andrew-S-Rosen commented Nov 21, 2025

Thanks, @orionarcher! I will review soon.

There is no extraneous I/O here (i.e. input sets), but we do write out trajectory files as output. Users must provide names for the trajectories. If they use something like "file.traj", it will end up at "./file.traj"; of course, they could also specify "/exact/path/file.traj". In my mind, this is the correct behavior: workflow runners will write un-rooted paths to the current dir, and users still have control if they want to hard-code it. Does that make sense to you?

Unfortunately, this will not work out because it requires the workflow runner to essentially cd into a new directory for every run, which is not what happens across the board. It is what happens with FireWorks but not for most other tools. In the more general case, we will have file.traj overwriting itself over and over. Everything related to paths must be handled by quacc and be an absolute path (to avoid multithreading issues where cd calls are a nightmare...), which is why the Runner classes inherit from the BaseRunner, which has a setup and cleanup method that automatically handles directory setup and cleanup. The added benefit of this approach is that it allows quacc to manage different directories for where calculations should be run vs. where they should be stored after the run.

I will try to provide some guidance about how to address this. I do not think it will be especially challenging. It probably means just creating a TorchSimRunner that inherits from BaseRunner with basically no extra features. We use the TorchSimRunner to set up the directory, pass the runtime directory path to the trajectory name like here, and don't allow the user to overwrite it. I'm happy to provide more direct suggestions next week!
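For concreteness, a minimal sketch of that suggestion, where BaseRunner below is a stub standing in for quacc's real BaseRunner (whose setup/cleanup handle the scratch and results directories described above):

```python
import tempfile
from pathlib import Path


class BaseRunner:
    """Stub for quacc's BaseRunner; the real setup/cleanup manage the
    run and storage directories."""

    def setup(self) -> None:
        self.tmpdir = Path(tempfile.mkdtemp())  # stand-in for the managed run dir

    def cleanup(self) -> None:
        pass  # the real method moves results to the final storage directory


class TorchSimRunner(BaseRunner):
    """Hypothetical runner with essentially no extra features."""

    def run(self, trajectory_name: str) -> Path:
        self.setup()
        # Root the user-supplied file name in the managed directory, so the
        # absolute path is controlled by quacc rather than the user.
        traj_path = self.tmpdir / trajectory_name
        traj_path.touch()  # the real recipe would write the trajectory here
        self.cleanup()
        return traj_path


path = TorchSimRunner().run("file.traj")
assert path.is_absolute() and path.name == "file.traj"
```

This keeps the user's chosen file name while guaranteeing every run writes to its own absolute directory, avoiding the file.traj collision described above.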

@orionarcher
Author

Unfortunately, this will not work out because it requires the workflow runner to essentially cd into a new directory for every run, which is not what happens across the board.

Got it! Understood. That was my (incorrect) expectation.

Yeah, I'm not sure I grok it, so a little more guidance would be helpful. I can create a runner if needed, or if there is a way to get away with just adding a copy_files arg to the optimize, integrate, and static functions, perhaps that could work too.
