Loading TCTracks with from_ibtracs_netcdf does not properly close the filestream #920

Open
spjuhel opened this issue Jul 16, 2024 · 2 comments
Labels
accepting pull request Contribute by raising a pull request to resolve this issue! bug enhancement


@spjuhel
Collaborator

spjuhel commented Jul 16, 2024

Describe the bug
When reading the IBTrACS.ALL.v04r00.nc file with from_ibtracs_netcdf(), the file is opened with xr.open_dataset(ibtracs_path), which opens a file handle that is never closed afterwards. This caused problems on Euler when I tried to access the file again at later stages, with the following error:

OSError: [Errno -101] NetCDF: HDF error: '/cluster/work/climate/sjuhel/climada/data/IBTrACS.ALL.v04r00.nc'

To Reproduce
I haven't been able to reproduce the issue on a local computer. It may also be related to the NetCDF library itself.

Expected behavior
The file is properly closed once the data is loaded.

See also: pydata/xarray#2887

Climada Version: 4.1.1

System Information (please complete the following information):

  • Operating system and version: Ubuntu 22.04 (euler)
  • Python version: 3.10.13 (custom venv, since CLIMADA is not yet installed on the cluster; this may be part of the problem)

Additional context
I don't think this is a critical problem, but the following would probably be a better way to handle the opening of the file:

From:

ibtracs_ds = xr.open_dataset(ibtracs_path)

to:

with xr.open_dataset(ibtracs_path) as ds:
    ibtracs_ds = ds.load()
@peanutfun
Member

Indeed, a context manager is the appropriate way to open files with xarray. However, note that ds.load() loads all data into memory, which is not the default behavior when opening a dataset and might be an issue for very large files.

It looks to me like the line in question can simply be replaced by the context manager. All following lines of the function must then be indented.
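To make the two options discussed above concrete, here is a minimal sketch, not the actual CLIMADA code. The function names and the variable name "wind" are hypothetical and chosen only for illustration. Variant A loads everything and closes the file; variant B keeps xarray's lazy loading and does all the work inside the `with` block, which is what indenting the remaining lines of the function would amount to.

```python
import xarray as xr


def max_wind_eager(path):
    """Variant A: load all data into memory, then close the file."""
    with xr.open_dataset(path) as ds:
        loaded = ds.load()  # reads every variable into memory
    # The file is closed here; `loaded` holds in-memory data and stays usable.
    # "wind" is a hypothetical variable name.
    return float(loaded["wind"].max())


def max_wind_lazy(path):
    """Variant B: keep lazy loading and do all work inside the block."""
    with xr.open_dataset(path) as ds:
        # Only the variables that are actually accessed are read from disk.
        return float(ds["wind"].max())
```

Variant B avoids holding the whole file in memory, but any lazily loaded data must be fully consumed before the block ends, because the file is closed on exit.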

@peanutfun peanutfun added the accepting pull request Contribute by raising a pull request to resolve this issue! label Jul 17, 2024
@spjuhel
Collaborator Author

spjuhel commented Sep 26, 2024

I noticed there are actually other places where datasets are opened without a context manager, so I will make a PR that addresses all of these occurrences instead of just this one.
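One possible way to locate such occurrences is a small script based on Python's standard `ast` module. This is only a rough sketch and not necessarily how the occurrences were found here; the helper name `find_unmanaged_open_dataset` is hypothetical. It flags `<something>.open_dataset(...)` calls that are not used directly as the expression of a `with` statement.

```python
import ast


def find_unmanaged_open_dataset(source):
    """Return line numbers of `*.open_dataset(...)` calls that are not
    used directly as the context expression of a `with` statement."""
    tree = ast.parse(source)

    # Collect every expression that is opened by a `with` statement.
    managed = set()
    for node in ast.walk(tree):
        if isinstance(node, (ast.With, ast.AsyncWith)):
            for item in node.items:
                managed.add(id(item.context_expr))

    # Flag attribute-style calls such as `xr.open_dataset(...)` elsewhere.
    hits = []
    for node in ast.walk(tree):
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr == "open_dataset"
            and id(node) not in managed
        ):
            hits.append(node.lineno)
    return sorted(hits)
```

The sketch only matches the attribute form (for example `xr.open_dataset`), not a bare `open_dataset(...)` after a `from` import, and it reports calls that are closed by other means (such as an explicit `ds.close()`), so its output would still need manual review.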
