netCDF
NetCDF is a binary file format used to store multidimensional data. This document describes the conventions used for model interoperability.
NetCDF files are built from named components: dimensions, variables and attributes. Each name should normally use only lowercase letters (with a few exceptions), underscores and numbers, with no spaces, hyphen-minus or other special characters. Essentially, these names should look as they would in a programming environment like Python. Note also that these names are case sensitive.
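As a rough illustration of the naming rule above, the check below accepts lowercase letters, digits and underscores, plus a small allow-list for the exceptions. The function name `is_conventional_name` and the contents of `EXCEPTIONS` are assumptions made for this sketch, not part of the agreed conventions.

```python
import re

# Hypothetical allow-list for the "few exceptions" mentioned above;
# both names appear elsewhere in this document.
EXCEPTIONS = {"Conventions", "_FillValue"}

# Lowercase letters, digits and underscores, not starting with a digit.
NAME_PATTERN = re.compile(r"[a-z_][a-z0-9_]*")


def is_conventional_name(name: str) -> bool:
    """Return True if name follows the naming rule sketched above."""
    return name in EXCEPTIONS or NAME_PATTERN.fullmatch(name) is not None


for candidate in ["river_flow_rate", "river-flow-rate", "River Flow", "Conventions"]:
    print(candidate, is_conventional_name(candidate))
```

Only the first and last candidates pass: the second contains hyphen-minus characters and the third contains a space and uppercase letters.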
Dimensions are used within variables to indicate their shape and size. The order specified is generally T, Z, Y and X.
Here is a list of common dimensions, in the order of how they should be used within variables:
- time - usually UNLIMITED, even if there is only one depth
- latitude/northing
- longitude/easting
- reach/id - or any type of unique identifier
- Any other dimension, such as ensemble or scenario, which is often of size one
A netCDF file generally has a variable to describe each dimension, often using the same name (e.g. float time(time)).
A variable has both dimension and attribute properties (see "Dimensions" and "Attributes").
Each variable should have the following four attributes:
- cdsm_name - the CDSM name, as agreed upon within the interoperable model group; must be unique within this file.
- standard_name - if defined by the CF Standard Names, then use this. However, if not defined, use "(no standard name)"
- long_name - a descriptive name for the variable, for example "easting" or "evapotranspiration flux"
- units - e.g. "m", "m3 s-1", "dimensionless", "hours since 1970-01-01 00:00:00" (see Time variable)
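A minimal sketch of how the attributes listed above might be assembled, including the "(no standard name)" fallback and the uniqueness requirement on cdsm_name. The helper names and the example cdsm_name values are made up for illustration and are not agreed CDSM names.

```python
NO_STANDARD_NAME = "(no standard name)"


def variable_attributes(cdsm_name, long_name, units, standard_name=None):
    """Build the attributes listed above for one variable."""
    return {
        "cdsm_name": cdsm_name,
        "standard_name": standard_name or NO_STANDARD_NAME,
        "long_name": long_name,
        "units": units,
    }


def duplicate_cdsm_names(all_attrs):
    """Return cdsm_name values that are used by more than one variable."""
    seen, duplicates = set(), set()
    for attrs in all_attrs:
        name = attrs["cdsm_name"]
        (duplicates if name in seen else seen).add(name)
    return sorted(duplicates)


# Illustrative values only; standard_name is left unset for the second variable.
temp = variable_attributes("air_temp", "air temperature", "K",
                           standard_name="air_temperature")
ident = variable_attributes("reach_id", "reach identifier", "dimensionless")

print(ident["standard_name"])
print(duplicate_cdsm_names([temp, ident]))
```

The resulting dictionaries could then be attached to a variable with netCDF4's `setncatts` method.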
Data types in netCDF include external and user-defined types. External data types start with "NC_" and are compatible with those in other programming languages. For example, "NC_INT" is a 32-bit signed integer.
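To make the correspondence with other languages concrete, the table below pairs some common external numeric types with the dtype strings accepted by netCDF4's `createVariable` and with the equivalent codes from Python's `struct` module. The table is a partial sketch, and `size_in_bytes` is a helper invented here.

```python
import struct

# External type name -> (netCDF4 dtype string, struct format code).
EXTERNAL_TYPES = {
    "NC_BYTE": ("i1", "b"),
    "NC_SHORT": ("i2", "h"),
    "NC_INT": ("i4", "i"),
    "NC_FLOAT": ("f4", "f"),
    "NC_DOUBLE": ("f8", "d"),
}


def size_in_bytes(nc_type):
    """Return the storage size of one value of the given external type."""
    _, fmt = EXTERNAL_TYPES[nc_type]
    # The "<" prefix selects the platform-independent standard sizes.
    return struct.calcsize("<" + fmt)


for nc_type in EXTERNAL_TYPES:
    print(nc_type, size_in_bytes(nc_type) * 8, "bits")
```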
While not required, it is a good habit to assign the following attributes to a variable of any data type:
- _FillValue - a special value that indicates a missing value (e.g. NA or blank)
- valid_min - lower limit for variable, e.g. 0
- valid_max - upper limit for variable, e.g. 1e+6
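One way a reader of the file might use these three attributes is sketched below. The fill value of -9999.0 and the treatment of both limits as inclusive are assumptions made for this example.

```python
def is_valid(value, attrs):
    """Return True if value is not the fill value and lies within the limits."""
    if value == attrs.get("_FillValue"):
        return False
    low = attrs.get("valid_min", float("-inf"))
    high = attrs.get("valid_max", float("inf"))
    return low <= value <= high  # limits treated as inclusive (assumption)


# Assumed attribute values, for illustration only.
attrs = {"_FillValue": -9999.0, "valid_min": 0.0, "valid_max": 1e6}
values = [12.5, -9999.0, -3.0, 1e6]
flags = [is_valid(v, attrs) for v in values]
print(flags)
```

The second value is rejected as missing and the third as below valid_min.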
Date/times are represented numerically, by referencing a special units attribute, which is agreed to be fixed at "hours since 1970-01-01 00:00:00", with calendar specified as gregorian or standard (i.e. real dates). The data type can then be any numeric type.
To convert to/from these dates in Python, see netCDF4's date2num and num2date functions.
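For the fixed unit used here, the conversion is simple enough to sketch with the standard library alone. The helper names `to_hours` and `from_hours` are made up for this example, and naive datetime values are assumed to share the epoch's time zone. netCDF4's own functions remain the better choice when other units or calendars are involved.

```python
from datetime import datetime, timedelta

# Epoch taken from the units string "hours since 1970-01-01 00:00:00".
EPOCH = datetime(1970, 1, 1)


def to_hours(moment):
    """Convert a datetime to hours since the epoch."""
    return (moment - EPOCH) / timedelta(hours=1)


def from_hours(hours):
    """Convert hours since the epoch back to a datetime."""
    return EPOCH + timedelta(hours=hours)


stamp = datetime(2020, 3, 1, 6, 30)
hours = to_hours(stamp)
print(hours, from_hours(hours))
```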
The cell_methods attribute describes the characteristic of a field that is represented by cell values (see CF conventions for cell methods). Its value is a string of the form name: method. For example, the cell_methods of variable river_flow_rate(time,site) can be time: mean, indicating that each value of river_flow_rate is the mean river flow rate over the given time period. See more methods in Cell Methods.
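The name: method form can be split into pairs with a short regular expression, as sketched below. This is a deliberately simplified parser, with a function name invented here. It does not handle the fuller CF syntax, such as several names sharing one method or extra information in parentheses.

```python
import re


def parse_cell_methods(value):
    """Split a simple cell_methods string into (name, method) pairs."""
    return re.findall(r"(\w+):\s*(\w+)", value)


single = parse_cell_methods("time: mean")
double = parse_cell_methods("time: mean site: maximum")
print(single)
print(double)
```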
Global attributes describe general information. A minimal set of global attributes should include:
- title - a short description of the file contents, e.g. "simulated streamflow"
- institution - e.g. "NIWA"
- Conventions (note the uppercase "C") - normally "CF-1.7"
- source - normally the name of the simulation software
- comments - any relevant notes on the data which could be useful
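A small check for the minimal set above could look like the following sketch. The function name and the example attribute values are placeholders. The lookup is case sensitive, so a lowercase "conventions" key does not satisfy the requirement.

```python
REQUIRED_GLOBAL_ATTRIBUTES = (
    "title", "institution", "Conventions", "source", "comments",
)


def missing_global_attributes(attrs):
    """Return the required global attribute names absent from attrs."""
    return [name for name in REQUIRED_GLOBAL_ATTRIBUTES if name not in attrs]


# Placeholder values; note the wrongly cased "conventions" key.
global_attrs = {
    "title": "simulated streamflow",
    "institution": "NIWA",
    "conventions": "CF-1.7",
    "source": "example_model",
}
missing = missing_global_attributes(global_attrs)
print(missing)
```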
- netCDF4 - Python module used to read/write files
- NetCDF CF Metadata Conventions - Climate and Forecast metadata conventions