* made arguments explicit in notebook and added epsg comment
* improved docstring and comments in preprocess_ERA5
* update comments in test_download_era5
* prevented incorrect times including testcase for download_ERA5
dfm_tools/xarray_helpers.py
Lines changed: 36 additions & 12 deletions
@@ -96,25 +96,49 @@ def preprocess_hisnc(ds):
 def preprocess_ERA5(ds):
     """
-    Reduces the expver dimension in some of the ERA5 data (mtpr and other variables), which occurs in files with very recent data. The dimension contains the unvalidated data from the latest month in the second index in the expver dimension. The reduction is done with mean, but this is arbitrary, since there is only one valid value per timestep and the other one is nan.
+    Aligning ERA5 datasets before merging them. These operations are currently
+    (2025) only required when (also) using previously retrieved ERA5 data.
+
+    In recent datasets retrieved from ERA5 the time dimension and variable are
+    now called valid_time. This is inconvenient since it causes issues when
+    merging with previously retrieved datasets. However, it is not necessary
+    for successfully running a Delft3D FM simulation.
+
+    Reducing the expver dimension: In the past, the expver dimension was
+    present if you downloaded ERA5 data that consisted of a mix of ERA5 and
+    ERA5T data. This dimension was also present in the data variables, so it
+    broke code. Therefore this dimension is reduced with a mean operation.
+    Any reduction operation would do the trick since there is only one valid
+    value per timestep and the other one is nan. In datasets downloaded
+    currently (2025) the expver dimension is not present anymore,
+    but an expver variable is present defining whether the data comes
+    from ERA5 (1) or ERA5T (5).
+
+    Removing scale_factor and add_offset: In the past, the ERA5 data was
+    supplied as integers with a scaling and offset that was different for
+    each downloaded file. This caused serious issues with merging files,
+    since the scaling/offset from the first file was assumed to be valid
+    for the others also, leading to invalid values. Only relevant for old
+    files. More info at https://github.com/Deltares/dfm_tools/issues/239.
     """
-    if 'expver' in ds.dims:
-        # TODO: this drops int encoding which leads to unzipped float32 netcdf files: https://github.com/Deltares/dfm_tools/issues/781
-        ds = ds.mean(dim='expver')
 
-    # datasets retrieved with new cds-beta have valid_time instead of time dimn/varn
 [...]
-    # Prevent writing to (incorrectly scaled) int, since it might mess up mfdataset (https://github.com/Deltares/dfm_tools/issues/239)
-    # By dropping scaling/offset encoding and converting to float32 (will result in a larger dataset)
-    # ERA5 datasets retrieved with the new CDS-beta are zipped float32 instead of scaled int, so this is only needed for backwards compatibility with old files.
+    # reduce the expver dimension (not present in newly retrieved files)
+    if 'expver' in ds.dims:
+        ds = ds.mean(dim='expver')
+
+    # drop scaling/offset encoding if present and converting to float32. Not
+    # present in newly retrieved files, variables are zipped float32 instead
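
The two backward-compatibility steps described in the docstring can be illustrated with plain Python arithmetic. This is only a minimal sketch under stated assumptions, not the dfm_tools implementation: the helper names (`pack`, `unpack`, `nanmean`), the two encodings and all values are hypothetical and chosen purely for illustration.

```python
import math


def pack(values, scale_factor, add_offset):
    """Pack floats into integers, as a scaled-integer encoding does."""
    return [round((v - add_offset) / scale_factor) for v in values]


def unpack(packed, scale_factor, add_offset):
    """Decode packed integers back to floats."""
    return [p * scale_factor + add_offset for p in packed]


def nanmean(values):
    """Mean over the non-NaN entries (NaN if there are none)."""
    valid = [v for v in values if not math.isnan(v)]
    return sum(valid) / len(valid) if valid else math.nan


# Hypothetical encodings of two separately downloaded files.
enc_a = {"scale_factor": 0.5, "add_offset": 280.0}
enc_b = {"scale_factor": 0.25, "add_offset": 300.0}

# Values stored in file B, packed with file B's own encoding.
temps_b = [295.0, 301.5, 310.0]
packed_b = pack(temps_b, **enc_b)

# Decoding with the matching encoding recovers the data; decoding with
# file A's encoding (the "first file" assumption) yields invalid values.
right = unpack(packed_b, **enc_b)
wrong = unpack(packed_b, **enc_a)

# expver reduction: each row is one timestep with two expver entries,
# of which exactly one is valid and the other is NaN.
with_expver = [[1.5, math.nan], [2.0, math.nan], [math.nan, 2.5]]
reduced = [nanmean(row) for row in with_expver]

print(right, wrong, reduced)
```

Because each timestep has exactly one non-NaN entry, a NaN-skipping mean simply returns that entry, which is why the choice of reduction is arbitrary. In xarray, `ds.mean(dim='expver')` skips NaN for float data by default, so it plays the role of `nanmean` here.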