Proposed Recipes for NOAA WHOI Sea Surface Temperature #222

Closed
kathrynberger opened this issue Nov 16, 2022 · 7 comments

Comments

@kathrynberger
Contributor

kathrynberger commented Nov 16, 2022

Dataset Name

NOAA Sea Surface Temperature - WHOI

Dataset URL

https://registry.opendata.aws/noaa-cdr-oceanic/

Description

An initial Pangeo Forge recipe for the daily sea surface temperature product from the Woods Hole Oceanographic Institution (WHOI), which provides a 3-hourly, 0.25° resolution grid over the global ice-free oceans from January 1988 to present, updated monthly. The resultant sea surface temperature (SST) data are produced by modeling the diurnal variability in combination with AVHRR SST observations. Potential use cases of this climate data record include analysis of extreme climate events, examining seasonal and inter-annual variability of fishing yields, and exploring coral reef bleaching patterns.

License

Open Data (https://registry.opendata.aws/noaa-cdr-oceanic/)

Data Format

NetCDF

Data Format (other)

No response

Access protocol

S3

Source File Organization

There is one directory per year, with one file per day (each file containing one time step = 1 day). From the algorithm documentation we know that:

The NetCDF filenames are SEAFLUX-OSB-CDR_V02R00_{ATMOS, SST,
FLUX} _D<YYYYMMDD>_C<YYYYMMDD>.nc, where D<YYYYMMDD> is the date of the data
contained in the file and C<YYYYMMDD> is the create date of the file. 

so we will have to use a wildcard to capture file creation date as shown below:
s3://noaa-cdr-sea-surface-temp-whoi-pds/data/YYYY/SEAFLUX-OSB-CDR_V02R00_SST_DYYYYMMDD_C*.nc
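
As a rough illustration of the wildcard approach, the sketch below builds the glob pattern for a given data date and checks it against the example URL from this issue. The helper name `sst_glob_for_day` is made up for this example and is not part of any recipe; in practice the pattern would presumably be handed to an S3-aware glob (for instance the `glob` method of an s3fs filesystem) to resolve the creation date.

```python
from datetime import date
from fnmatch import fnmatchcase

# Bucket prefix taken from the URL template in this issue.
BUCKET_PREFIX = "s3://noaa-cdr-sea-surface-temp-whoi-pds/data"


def sst_glob_for_day(day: date) -> str:
    """Return a glob pattern for the SST file covering `day`.

    The creation date (C<YYYYMMDD>) is unknown ahead of time, so it is
    left as a wildcard. Hypothetical helper, for illustration only.
    """
    stamp = day.strftime("%Y%m%d")
    return f"{BUCKET_PREFIX}/{day.year}/SEAFLUX-OSB-CDR_V02R00_SST_D{stamp}_C*.nc"


# Example URL quoted in this issue.
example_url = (
    "s3://noaa-cdr-sea-surface-temp-whoi-pds/data/1988/"
    "SEAFLUX-OSB-CDR_V02R00_SST_D19880101_C20160820.nc"
)

pattern = sst_glob_for_day(date(1988, 1, 1))
matches = fnmatchcase(example_url, pattern)  # local check, no S3 access needed
```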

Example URLs

s3://noaa-cdr-sea-surface-temp-whoi-pds/data/1988/SEAFLUX-OSB-CDR_V02R00_SST_D19880101_C20160820.nc

Authorization

No response

Transformation / Processing

N/A

Target Format

Reference Filesystem (Kerchunk)

Comments

Relates to these issues:
#208
and
https://github.com/developmentseed/aws-asdi/issues/21

@kathrynberger kathrynberger changed the title Proposed Recipes for [Dataset Name] Proposed Recipes for NOAA WHOI Sea Surface Temperature Nov 16, 2022
@kathrynberger
Contributor Author

kathrynberger commented Nov 22, 2022

Upon further investigation, the dataset does not extend to the present; it is only available through August 2021. I have sent an email to find out more and will follow up here.

@kathrynberger
Contributor Author

Received the following reply after contacting NOAA regarding this dataset; posting with Patrick's permission. I will keep this issue updated with any follow-up.

Kathryn, 

We have been working with the CDR teams to update various CDRs. Our team will take a look at this and get back to you. 

Thanks, 

Patrick Keown
Program Manager, NOAA Open Data Dissemination (NODD)
Office of the Chief Information Officer (OCIO)

@kathrynberger
Contributor Author

A further missing component of the dataset (November 2019) was identified in this issue, and an additional email was sent to NOAA to verify whether this is an expected anomaly. Will update with NOAA's response.

@kathrynberger
Contributor Author

kathrynberger commented Dec 20, 2022

At the recommendation of Patrick Keown, I contacted the WHOI contact (ocean_bundle_contacts@noaa.gov) and received the following response (copied below and shared with their permission):

Hello Kathryn,

Thank you for your interests in the NOAA WHOI CDR dataset.  However, due to lack of funding, this CDR is no longer operationally sustained.  This also impacts the ability to retrieve the missing 2019 data you mention. However, it may be worth looking at the NCEI data access point to see if perhaps there was just a transfer issue to AWS.   Additional information can be on the [SST - WHOI CDR page](https://www.ncei.noaa.gov/products/climate-data-records/sea-surface-temperature-whoi), [Ocean Heat Fluxes CDR](https://www.ncei.noaa.gov/products/climate-data-records/ocean-heat-fluxes) and [Ocean Near-Surface Atmospheric Properties CDR](https://www.ncei.noaa.gov/products/climate-data-records/ocean-near-surface-atmosphere).

Thank you,
Candace

I will check whether this was the result of a transfer issue to AWS (as suggested above) and report back here.

@cisaacstern
Member

Thanks, @kathrynberger. If it turns out the 2019 data is irrecoverable, we could add logic to the recipe to drop the missing date(s) from the file pattern.
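
A minimal sketch of what "dropping the missing dates" could look like, assuming the recipe enumerates one date per day. The function name and the specific missing dates are hypothetical and chosen only for illustration; the exact extent of the 2019 gap is not stated in this thread.

```python
from datetime import date, timedelta


def daily_dates(start: date, end: date, missing: set[date]) -> list[date]:
    """List every day from `start` to `end` inclusive, skipping `missing`."""
    n_days = (end - start).days + 1
    all_days = (start + timedelta(days=i) for i in range(n_days))
    return [d for d in all_days if d not in missing]


# Hypothetical gap, for illustration only (not the real missing dates).
missing = {date(2019, 11, 2), date(2019, 11, 3)}
kept = daily_dates(date(2019, 11, 1), date(2019, 11, 5), missing)
```

The resulting list could then drive whatever function formats each date into a file URL.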

@sharkinsspatial
Contributor

@cisaacstern Can you elaborate on how modifying the file pattern might address this issue? This recipe uses pattern_from_file_sequence (https://github.com/pangeo-forge/aws-noaa-whoi-feedstock/blob/main/feedstock/recipe.py#L28), so the missing dates are already accounted for. I may be misunderstanding, but I think the issue is more related to how MultiZarrToZarr is inferring contiguous chunks. Can we also push this conversation over to pangeo-forge/aws-noaa-whoi-feedstock#2 for easier tracking?
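
When the pattern is built from an explicit file list, absent files never enter the pattern, but the time axis still has a gap that downstream combining has to cope with. The sketch below, which is not the feedstock's code, parses the data date out of each file name (following the naming convention quoted in this issue) and reports where consecutive files are more than one day apart. The file names in the example, including their creation dates, are invented.

```python
import re
from datetime import date, datetime, timedelta

# Matches the D<YYYYMMDD>_C<YYYYMMDD>.nc suffix described in this issue.
DATE_RE = re.compile(r"_D(\d{8})_C\d{8}\.nc$")


def data_date(url: str) -> date:
    """Extract the data date (the D<YYYYMMDD> part) from a file name."""
    match = DATE_RE.search(url)
    if match is None:
        raise ValueError(f"unrecognized file name: {url}")
    return datetime.strptime(match.group(1), "%Y%m%d").date()


def find_gaps(urls: list[str]) -> list[tuple[date, date]]:
    """Return (last day before gap, first day after gap) pairs."""
    days = sorted(data_date(u) for u in urls)
    one_day = timedelta(days=1)
    return [(a, b) for a, b in zip(days, days[1:]) if b - a > one_day]


# Invented file names, for illustration only.
urls = [
    "SEAFLUX-OSB-CDR_V02R00_SST_D20191030_C20200101.nc",
    "SEAFLUX-OSB-CDR_V02R00_SST_D20191031_C20200101.nc",
    "SEAFLUX-OSB-CDR_V02R00_SST_D20191201_C20200101.nc",
]
gaps = find_gaps(urls)
```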

@cisaacstern
Member

Oh yes, I actually hadn't realized that we weren't on that thread. IIUC, this issue should actually be closed, because the associated staged-recipes PR has already been merged?
