Parallelism & chunking at convert step

It might be nice to allow the convert step to run in parallel over chunks of entries, with the per-chunk results combined at the end.

This should reduce peak memory usage (currently all entries need to fit in memory in the OPTIMADE format) and would also give us better control over concurrency. At the moment it seems that, for example, the pymatgen CIF reader will happily use all cores and can lock up a system.
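A minimal sketch of what chunked conversion with a capped worker count could look like, using only the standard library. Everything here is an assumption for illustration: `convert_entry` is a hypothetical stand-in for the real per-entry conversion, and `convert_in_chunks`, `chunk_size`, `max_workers` and `executor_cls` are made-up names, not part of the current code base.

```python
import itertools
import json
import shutil
import tempfile
from concurrent.futures import ProcessPoolExecutor, ThreadPoolExecutor
from pathlib import Path


def chunked(items, size):
    """Yield successive lists of at most `size` items."""
    it = iter(items)
    while chunk := list(itertools.islice(it, size)):
        yield chunk


def convert_entry(raw):
    """Hypothetical stand-in for the real conversion (e.g. parsing one CIF)."""
    return {
        "id": raw["id"],
        "type": "structures",
        "attributes": {"nsites": len(raw["sites"])},
    }


def convert_chunk(job):
    """Convert one chunk and write it to its own JSONL part file."""
    index, chunk, out_dir = job
    part = Path(out_dir) / f"part-{index:06d}.jsonl"
    with part.open("w") as f:
        for raw in chunk:
            f.write(json.dumps(convert_entry(raw)) + "\n")
    return str(part)


def convert_in_chunks(entries, out_path, chunk_size=1000, max_workers=2,
                      executor_cls=ProcessPoolExecutor):
    """Convert `entries` chunk by chunk, then concatenate the part files.

    `max_workers` caps how many chunks are converted at once.
    ThreadPoolExecutor can be passed as `executor_cls` for quick tests.
    Note: Executor.map submits every chunk up front, so the raw inputs
    should be small (e.g. file paths); only converted output is kept off
    the heap here.
    """
    with tempfile.TemporaryDirectory() as tmp:
        jobs = ((i, c, tmp) for i, c in enumerate(chunked(entries, chunk_size)))
        with executor_cls(max_workers=max_workers) as pool:
            # map() returns results in submission order, so entry order is kept.
            parts = list(pool.map(convert_chunk, jobs))
        with open(out_path, "wb") as out:
            for part in parts:
                with open(part, "rb") as f:
                    shutil.copyfileobj(f, out)
    return len(parts)
```

Writing each chunk to its own part file means no worker ever holds more than `chunk_size` converted entries. Whether a worker cap alone is enough to stop the CIF reader from using every core is something that would need checking, since this sketch only limits the number of worker processes.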

The only difficulty here will be how the properties are then assigned to each structure. We could consider changing this to a two-step process: first a bare `optimade.jsonl` is written containing only the structures, and then we loop through that file and add properties where appropriate, writing the results out to a new file.
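The two-step idea could be sketched roughly as follows. This is an illustration under assumptions, not the existing implementation: it assumes every line of the intermediate file is one entry with an `id`, that properties are merged into the entry's `attributes`, and that the properties fit in an in-memory dict keyed by `id`. The function names and the `_example_energy` property are hypothetical.

```python
import json


def write_structures(structures, path):
    """Step 1: write a bare JSONL file with one structure entry per line."""
    with open(path, "w") as f:
        for entry in structures:
            f.write(json.dumps(entry) + "\n")


def add_properties(src_path, dst_path, properties_by_id):
    """Step 2: stream the bare file and merge properties into matching entries.

    Only one entry is held in memory at a time. Entries without properties
    are copied through unchanged. Returns the number of entries updated.
    """
    matched = 0
    with open(src_path) as src, open(dst_path, "w") as dst:
        for line in src:
            if not line.strip():
                continue  # skip blank lines
            entry = json.loads(line)
            props = properties_by_id.get(entry["id"])
            if props:
                entry.setdefault("attributes", {}).update(props)
                matched += 1
            dst.write(json.dumps(entry) + "\n")
    return matched
```

For example, `add_properties("structures.jsonl", "optimade.jsonl", {"a": {"_example_energy": -1.5}})` would add the property to the entry with id `a` and leave all other entries as they were.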

Parallelism & chunking at convert step #68

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Parallelism & chunking at convert step #68

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions