Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Converting GAMESS output with multistate calculation #40

Open
wants to merge 7 commits into
base: master
Choose a base branch
from

Conversation

neelravi
Copy link
Contributor

This pull request offers a solution (issue #39) to handle the conversion of gamess or gaussian multistate calculation into multiple trexio files.

The summary of the work:

  • The original code is put under the loop over nstates for s in range(res.num_states):
  • Trexio files are generated for each state with names ending with _state_0.hdf5 or _state_1.hdf5
  • Tests are modified for these new naming convention
  • The following new block is added to store the state specific information:
        # State group
        # ---------
        trexio.write_state_num(trexio_file,res.num_states)
        trexio.write_state_current_label(trexio_file, "State "+ str(s) + " of " + str(res.num_states-1))
        trexio.write_state_label(trexio_file, [f"state{i}" for i in range(res.num_states)])
        trexio.write_state_file_name(trexio_file, [f"{trexio_basename}_state_{i}.hdf5" for i in range(res.num_states)])

        # CSF group
        # ---------
        try:
            num_csfs = len(res.csf_coefficients[s])
        except:
            num_csfs = len(res.det_coefficients[0])

        offset_file = 0
        trexio.write_csf_coefficient(trexio_file, offset_file, num_csfs, res.csf_coefficients[s])

@neelravi neelravi added the enhancement New feature or request label Oct 24, 2024
Copy link
Member

@q-posev q-posev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks @neelravi !

I left a few comments. My main concern is about custom _state suffix which breaks the way the converter used to work. Not sure if this is the way to go.

Also, if this PR closes issue #39 , you can add "- Closes #39" line to the PR description and this issue will be closed automatically when the PR is merged.

trexio convert-from -t pyscf -i data/water.chk -b hdf5 trexio_pyscf_h2o_sph.h5
trexio convert-from -t pyscf -i data/diamond_single_k.chk -b hdf5 trexio_pyscf_1k.h5
trexio convert-from -t pyscf -i data/diamond_k_grid.chk -b hdf5 trexio_pyscf_Nk.h5
trexio convert-from -t orca -i data/h2o.json -b hdf5 trexio_orca_h2o_sph.h5
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't understand why the -b hdf5 has to be removed here. The test should work as it used to.

# ------
# Get the basename of the TREXIO file
try:
trexio_basename = os.path.splitext(os.path.basename(trexio_filename))[0]
Copy link
Member

@q-posev q-posev Oct 24, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think it is a good idea to do it this way. The user provided trexio_filename via CLI for a reason - they would expect it to be the name of the output TREXIO file. The moment you start to modify this logic with custom suffixes (like _state used here) - it becomes confusing and requires extra effort from the user. Now the ground state trexio files are all suffixed with _state_0 which basically breaks how the converter used to work and might affect the user workflows.
I personally think that it might be better to provide the state ID via CLI and let the user do the the conversion N times where N is the desired number of states. Yes, it will require N calls to resultsFile but at least the user will have full control of the pipeline. @scemama @neelravi what do you think?

trexio.write_state_file_name(trexio_file, [f"{trexio_basename}_state_{i}.hdf5" for i in range(res.num_states)])

# CSF group
# ---------
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Are you sure it works as expected? What happens if resultsFile instance does not contain csf_coefficients array? What will be written in the TREXIO file in this case when you call write_csf_coefficient?

raise NotImplementedError(f"Please remove the {trexio_filename} directory manually.")
# Check if the TREXIO file already exists, remove it if it does
trexio_basename = os.path.splitext(os.path.basename(trexio_filename))[0]
existing_file = glob.glob(trexio_basename + "_state_" + "*.hdf5")
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same comment as above about custom suffix

@TREX-CoE TREX-CoE deleted a comment from q-posev Oct 31, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants