A question about how to read the *.nc file from a REMD simulation when the job is running or if the job is killed #754

Open
xiaowei-xie2 opened this issue Oct 4, 2024 · 1 comment

Comments

@xiaowei-xie2

Hi,

I am using MultiStateReporter("file.nc", checkpoint_interval=interval) as the storage file for a ReplicaExchangeSampler simulation. I can read the *.nc file with netCDF once the simulation is done, but I cannot read it while the simulation is still running or after the job has been killed. For example, opening the file with reporter = multistate.MultiStateReporter('file.nc', open_mode='r') gives the error below. Is there any way to retrieve the data written up to the point the job was killed and restart the simulation from there?

Warning: The openmmtools.multistate API is experimental and may change in future releases
Traceback (most recent call last):
  File "/nfs/working/deep_learn/xiaowei/water_project/new_test_set_dGsolv_mpqrnn_32_3/132_500ps_restart/../useful_scripts/check_nc_valid.py", line 10, in <module>
    reporter = multistate.MultiStateReporter(temp_file_path, open_mode='r')
  File "/nfs/working/deep_learn/xiaowei/miniconda3/lib/python3.10/site-packages/openmmtools/multistate/multistatereporter.py", line 140, in __init__
    self.open(open_mode)
  File "/nfs/working/deep_learn/xiaowei/miniconda3/lib/python3.10/site-packages/openmmtools/multistate/multistatereporter.py", line 282, in open
    self._storage_analysis = self._open_dataset_robustly(self._storage_analysis_file_path,
  File "/nfs/working/deep_learn/xiaowei/miniconda3/lib/python3.10/site-packages/openmmtools/multistate/multistatereporter.py", line 395, in _open_dataset_robustly
    return netcdf.Dataset(*args, **kwargs)
  File "src/netCDF4/_netCDF4.pyx", line 2464, in netCDF4._netCDF4.Dataset.__init__
  File "src/netCDF4/_netCDF4.pyx", line 2027, in netCDF4._netCDF4._ensure_nc_success
OSError: [Errno -101] NetCDF: HDF error: './remd.nc'

Thank you,
Xiaowei

@schuhmc

schuhmc commented Oct 8, 2024

You should be able to open the netCDF file even after the job has been killed. Are you sure that no other process has the file open at the same time? And why are your simulations being killed?

Also, are you trying to run this as an MPI job? If so, in my experience you need to set open_mode to None.
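
To follow up on whether another process still holds the file, one option is to inspect a copy of the storage files with the standard library only, before involving netCDF at all. The sketch below is a rough diagnostic, not a fix. The helper names (storage_paths, sniff_format, snapshot) are hypothetical, and the `<stem>_checkpoint<suffix>` naming of the checkpoint file is an assumption about the reporter's default that you should check against your own output directory. A correct signature does not prove the file is readable, and a copy taken while the job is writing may itself be inconsistent.

```python
import shutil
import tempfile
from pathlib import Path

# 8-byte signature at the start of an HDF5 file that has no user block;
# netCDF-4 files are stored as HDF5.
HDF5_SIGNATURE = b"\x89HDF\r\n\x1a\n"
# Classic netCDF files start with b"CDF" followed by a version byte.
NETCDF_CLASSIC_PREFIX = b"CDF"


def storage_paths(storage):
    """Return (analysis_path, checkpoint_path) for a reporter storage file.

    ASSUMPTION: the checkpoint file sits next to the analysis file and is
    named '<stem>_checkpoint<suffix>'. Adjust this if you passed a custom
    checkpoint_storage to MultiStateReporter.
    """
    storage = Path(storage)
    checkpoint = storage.with_name(storage.stem + "_checkpoint" + storage.suffix)
    return storage, checkpoint


def sniff_format(path):
    """Classify a file by its leading bytes without opening it through netCDF."""
    path = Path(path)
    if not path.is_file():
        return "missing"
    with path.open("rb") as handle:
        head = handle.read(8)
    if head == HDF5_SIGNATURE:
        return "hdf5"
    if head.startswith(NETCDF_CLASSIC_PREFIX):
        return "netcdf-classic"
    if not head:
        return "empty"
    return "unknown"


def snapshot(path, dest_dir=None):
    """Copy a file into a scratch directory so the copy can be inspected
    without touching the file the running job is writing to."""
    dest_dir = Path(dest_dir or tempfile.mkdtemp(prefix="nc_snapshot_"))
    dest_dir.mkdir(parents=True, exist_ok=True)
    return Path(shutil.copy2(path, dest_dir))
```

For example, calling sniff_format on each path returned by storage_paths("file.nc") shows whether both files exist and start with the expected bytes. If both report "hdf5" and opening them still fails with an HDF error, the problem is more likely in data that was never flushed to disk or in file locking than in a missing or truncated file.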
