ETOPO2022 artifact: elevation, bathymetry, land-sea masks at high resolutions (up to 15 arc-second) #47

Draft
wants to merge 2 commits into
base: main

@akshaysridhar akshaysridhar commented Oct 3, 2024

Checklist:

  • I created a new folder $artifact_name
    • I added a README.md in that folder that
      • describes the data and processing done to it
      • lists the sources of the raw data
      • lists the required citation, licenses
    • If applicable (e.g., for Creative Commons), I added a LICENSE file
    • I added the scripts that retrieve, process, and produce the artifact
    • I added the environment used for such scripts (typically, Project.toml
      and Manifest.toml)
    • I added the OutputArtifacts.toml file containing the information
      needed for package developers to add $artifact_name to their package
  • I uploaded the artifact folder to the Caltech cluster (in
    /groups/esm/ClimaArtifacts/artifacts/$artifact_name)
  • I added the relevant code to the Overrides.toml on the Caltech cluster
    (in /groups/esm/ClimaArtifacts/artifacts/Overrides.toml)
  • I added a link to the main README.md to point to the new artifact

	new file:   create_artifacts.jl
@akshaysridhar akshaysridhar changed the title ETOPO2022 artifacts: elevation, bathymetry, land-sea masks at high resolutions ETOPO2022 artifact: elevation, bathymetry, land-sea masks at high resolutions Oct 3, 2024
@akshaysridhar akshaysridhar changed the title ETOPO2022 artifact: elevation, bathymetry, land-sea masks at high resolutions ETOPO2022 artifact: elevation, bathymetry, land-sea masks at high resolutions (up to 15 arc-second) Oct 3, 2024

@Sbozzolo Sbozzolo left a comment

Can you provide an example of how this would be used?

Here are some things to note:

  • HDF5 files are tied to ClimaCore and the specifics of the target Space. Changing parameters (e.g., the number of quadrature points or the type of mesh) would probably require regenerating the files.
  • There is no guarantee that different versions of ClimaCore behave the same and are compatible (e.g., things I am currently fixing to support restarts). People who change ClimaCore would have to know that they might have to regenerate these files.
  • The generation of files containing topography maps should not require ClimaAtmos
  • How is consistency enforced with the Space that is being simulated? For example, what happens when one reads a file generated with SpaceFillingCurve but is not using a space with such topology?

I think the easiest way to read topography data is from a standard NetCDF file and resample it onto the desired grid. (The challenge that needs to be addressed here is the size of the fine-resolution NetCDF files.) In addition to this, we can provide files for standard resolutions, but I think that should be seen as an optimization rather than the main way to accomplish this.
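
To make the "read a lon-lat file and resample" idea concrete, here is a minimal sketch in plain Julia. It is not the ClimaUtilities or ClimaCore implementation: the function name `bilinear_sample`, the synthetic `elevation` array, and the target points are all assumptions made for illustration, and the source grid is assumed to be uniformly spaced.

```julia
# Minimal sketch: bilinear resampling from a uniform lon-lat grid.
# All names (bilinear_sample, lons, lats, elevation) are illustrative only.

# Interpolate data[i, j], defined at (lons[i], lats[j]), to the point (lon, lat).
# Assumes lons and lats are uniformly spaced and increasing, and that the
# point lies within the grid extents.
function bilinear_sample(lons, lats, data, lon, lat)
    Δlon = lons[2] - lons[1]
    Δlat = lats[2] - lats[1]
    # Index of the lower-left corner of the enclosing cell, clamped to stay in bounds
    i = clamp(floor(Int, (lon - lons[1]) / Δlon) + 1, 1, length(lons) - 1)
    j = clamp(floor(Int, (lat - lats[1]) / Δlat) + 1, 1, length(lats) - 1)
    # Fractional position inside the cell
    tx = (lon - lons[i]) / Δlon
    ty = (lat - lats[j]) / Δlat
    return (1 - tx) * (1 - ty) * data[i, j] +
           tx * (1 - ty) * data[i + 1, j] +
           (1 - tx) * ty * data[i, j + 1] +
           tx * ty * data[i + 1, j + 1]
end

# Synthetic "elevation" that is linear in lon and lat, so bilinear
# interpolation reproduces it up to rounding.
lons = collect(-180.0:1.0:180.0)
lats = collect(-90.0:1.0:90.0)
elevation = [2.0 * lon + 3.0 * lat for lon in lons, lat in lats]

# Resample onto a few arbitrary target points (stand-ins for model grid nodes)
targets = [(10.25, 45.5), (-120.75, -33.1), (0.0, 0.0)]
resampled = [bilinear_sample(lons, lats, elevation, lon, lat) for (lon, lat) in targets]
```

With a real 15 arc-second file the source array would be far too large to hold in memory at once, which is the size challenge mentioned above; the sketch only shows the per-point resampling step.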

akshaysridhar commented Oct 3, 2024

> Can you provide an example of how this would be used?

(1) Generated artifacts are stored following the ClimaArtifacts guidelines. Source code is available in the create_artifacts.jl file for users who wish to regenerate this data or trace its origins. Generated artifacts (maps) will include:

  • Topography (15 arc-second resolution)
  • Binary land-mask
  • Binary ocean-mask
  • (Future updates may include high resolution sea-ice and inland-lakes)

(2) The user downloads the high-resolution .nc file following the standard ClimaArtifacts procedure for their specific use case, typically within ClimaAtmos.jl (this follows the discussion that HDF5 would not necessarily be the best option).

(3) SpaceVaryingInputs replaces the stored spline object in ClimaAtmos type_getters to generate context-aware grid information. Smoothing functions are applied directly on the generated spectral grid with ClimaCore Operators.

> Here are some things to note:
>
> * HDF5 files are tied to ClimaCore and the specifics of the target Space. Changing parameters (e.g., the number of quadrature points or the type of mesh) would probably require regenerating the files.

Noted, these can be re-written as lon-lat regridded .nc files if this is more appropriate. Is regridding from a high resolution cubed-sphere representation onto modified quadrature / mesh representations using spectral interpolation operators not an option in ClimaCore?

> * There is no guarantee that different versions of ClimaCore behave the same and are compatible (e.g., things I am currently fixing to support restarts). People who change ClimaCore would have to know that they might have to regenerate these files.

Writing .nc outputs instead of HDF5 files might then be sufficient; the intent is to preserve the artifact creation metadata for reproducibility. ClimaUtilities.SpaceVaryingInputs can be applied directly in an instance of ClimaAtmos to get context-aware regridding.

> * The generation of files containing topography maps should not require ClimaAtmos

I've used convenient wrappers from within these packages to generate the desired horizontal cubed-sphere space. The current draft generates a cubed-sphere representation (as you point out, tied to the specific version of ClimaCore used to generate the target space) for spectral regridding. I'll remove the ClimaAtmos functions (or move them to a test environment wherein users can visualise generated information without generating a simulation object) and convert outputs to a single nc file.

> * How is consistency enforced with the Space that is being simulated? For example, what happens when one reads a file generated with SpaceFillingCurve but is not using a space with such topology?
>
> I think the easiest way to read topography data is from a standard NetCDF file and resample it onto the desired grid.

This is reasonable if we can update DataHandler to handle patches with do-nothing operations outside the extents of a given panel. (I'm not sure whether collapsing 288 panels into a single NC file is the best way to do this.)

> (The challenge that needs to be addressed here is the size of the fine-resolution NetCDF files.) In addition to this, we can provide files for standard resolutions, but I think that should be seen as an optimization rather than the main way to accomplish this.

Yes, a loop over panels seemed reasonable here (only required once when generating the artifact), at least in generating this single HDF5 (draft) / nc (finalised) file. This is being tested on serial launches only for now, but I expect this operation can be parallelized.
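
As a rough illustration of a panel loop with do-nothing behaviour outside each panel's extents, the sketch below uses two synthetic tiles in place of the 288 panels. The tile layout, the `in_extent` helper, and the constant per-tile value (a placeholder for an actual interpolation) are all assumptions for illustration; this is not how DataHandler currently works.

```julia
# Sketch: process source tiles one at a time and skip target points that
# fall outside the current tile. Tile layout and values are synthetic.
tiles = [
    (lon = (-180.0, 0.0), lat = (-90.0, 90.0), value = 1.0),
    (lon = (0.0, 180.0), lat = (-90.0, 90.0), value = 2.0),
]

# True when (lon, lat) lies inside the tile (half-open in longitude so that
# neighbouring tiles do not both claim a shared edge)
in_extent(tile, lon, lat) =
    tile.lon[1] <= lon < tile.lon[2] && tile.lat[1] <= lat <= tile.lat[2]

targets = [(-90.0, 10.0), (45.0, -30.0), (170.0, 80.0)]
result = fill(NaN, length(targets))

for tile in tiles  # only one tile needs to be held in memory at a time
    for (k, (lon, lat)) in enumerate(targets)
        in_extent(tile, lon, lat) || continue  # do nothing outside this tile
        result[k] = tile.value                 # placeholder for interpolation
    end
end
```

Each target point is written by exactly one tile, so the outer loop is independent across tiles and could in principle be distributed.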

yvalues = CC.Fields.coordinate_field(space).lat
xmin, xmax = extrema(data_x)
ymin, ymax = extrema(data_y)
Δx = diff(data_x)[1]
Member Author

Assumes data resolution is uniform.
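
One way to make that assumption explicit is to check the spacing before taking the first increment. This is a sketch only: `uniform_spacing` is a hypothetical helper that is not part of the PR, and the tolerance is an arbitrary choice.

```julia
# Sketch: verify that a coordinate vector is uniformly spaced before
# using its first increment as the grid spacing.
# uniform_spacing is a hypothetical helper, not part of the PR.
function uniform_spacing(coords; rtol = 1e-6)
    increments = diff(coords)
    Δ = first(increments)
    all(d -> isapprox(d, Δ; rtol = rtol), increments) ||
        error("coordinate spacing is not uniform; cannot use a single Δ")
    return Δ
end

data_x = collect(range(-180.0, 180.0; length = 721))  # synthetic, 0.5° spacing
Δx = uniform_spacing(data_x)
```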


Outputs:
HDF5 format
- Land-sea mask (on 256 `h_elem` cubed-sphere)
Member

Can we use a single UInt32 instead of (4) separate masks?

Member Author

Yes - they'll be integer representations in each file (multiple such versions, following the NCAR information for different levels of down-sampling if necessary)
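
For reference, a bit-flag encoding along these lines could look like the sketch below. The flag names and bit positions are assumptions chosen for illustration (land, ocean, sea ice, inland lakes); the actual layout in the artifact may differ.

```julia
# Sketch: pack several binary masks into one UInt32 per grid point.
# Flag names and bit positions are illustrative assumptions, not the PR's layout.
FLAGS = (land = 0x00000001, ocean = 0x00000002, sea_ice = 0x00000004, lake = 0x00000008)

# Return `bit` when the mask is set at this point, otherwise zero
flag(bit::UInt32, on::Bool) = on ? bit : zero(bit)

# Combine the four booleans for a single grid point into one UInt32
pack_point(land, ocean, sea_ice, lake) =
    flag(FLAGS.land, land) | flag(FLAGS.ocean, ocean) |
    flag(FLAGS.sea_ice, sea_ice) | flag(FLAGS.lake, lake)

# Test whether a packed value carries a given flag
has_flag(value::UInt32, bit::UInt32) = (value & bit) != 0

# Tiny synthetic 2×2 masks
land    = Bool[1 0; 1 0]
ocean   = Bool[0 1; 0 1]
sea_ice = Bool[0 1; 0 0]
lake    = Bool[0 0; 1 0]

packed = pack_point.(land, ocean, sea_ice, lake)  # Matrix{UInt32}
land_again = has_flag.(packed, FLAGS.land)        # recover one mask
```

A single integer field keeps the file to one variable, and further categories can be added later by assigning unused bits.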
