Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Benchmarking recipes (Lauer et al.) #3598

Open
wants to merge 113 commits into
base: main
Choose a base branch
from

Conversation

axel-lauer
Copy link
Contributor

@axel-lauer axel-lauer commented May 16, 2024

Description

This PR implements a set of benchmarking recipes for comparison of different metrics (RMSE, bias, correlation, EMD) calculated for a given model simulation to the results from an ensemble of (model) datasets:

For this, the existing monitoring diagnostics monitoring/monitor.py and monitor/multi_datasets.py have been extended.

The new diurnal cycle plot has also been added to the following existing recipes:

Documentation for the benchmarking recipes is available in recipes/recipe_benchmarking.rst, the documentation for monitoring and model evaluation have been updated to include the diurnal cycle plots.

Closes #3498

Note for testing

The benchmarking recipes require the new preprocessor functions local_solar_time and distance_metric and the extended version of preprocessor resample_hours.
The recipe creating the portrait diagram additionally requires PR #3551 providing the diagnostic script for this plot.

Checklist

It is the responsibility of the author to make sure the pull request is ready to review. The icons indicate whether the item will be subject to the 🛠 Technical or 🧪 Scientific review.

New or updated recipe/diagnostic

Copy link
Contributor

@schlunma schlunma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just pushed some commits to make this PR up-to-date with other developments of the monitoring diagnostic that have been merged while this PR was open. The merge with main only adapted existing lines, but not those added via this PR. I also fixed some minor issues with the doc about allowed options for these diagnostics.

I also fixed a bug which lead to no output from the boxplot diagnostic. Now all recipes run fine and produce the expected output (just tested) 🎉

The only remaining comment (apart from two minor ones on the code) I have is about the lauer25gmd recipes. These include data from the EMAC model, which is not easily available. Usually we don't include these kind of recipes in the main repository because they are not really reproducible without the data. The alternative would be to put them on Zenodo (e.g., https://zenodo.org/records/7254313).

I know that the paper is almost published and it's probably hard to make any additional changes to it, but if that's possible, putting the recipe on Zenodo would be the preferred solution. If that's not possible, it would be great to add a note about that (ideally with a link to the data). We also need to make sure that those recipes are not part of the automated recipe test workflow.

doc/sphinx/source/recipes/recipe_benchmarking.rst Outdated Show resolved Hide resolved
@axel-lauer
Copy link
Contributor Author

The only remaining comment (apart from two minor ones on the code) I have is about the lauer25gmd recipes. These include data from the EMAC model, which is not easily available. Usually we don't include these kind of recipes in the main repository because they are not really reproducible without the data. The alternative would be to put them on Zenodo (e.g., https://zenodo.org/records/7254313).

Very good point! I removed the 'lauer25gmd' recipes from the branch and put them on Zenodo. The recipes are available at 10.5281/zenodo.11198444, this will be updated in the paper once we get the proofs.

Copy link
Contributor

@alistairsellar alistairsellar left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the great work everyone. I'm happy that my comments have been addressed (thanks!) so I'll approve now.

A note for future reviews - it really helps the reviewer if when you address a comment with e.g. "I have changed x" or "I have added this to y", your message also includes a link to the commit that addresses the comment. It makes it quicker for the reviewer to check what has changed in response.

Copy link
Contributor

@schlunma schlunma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Axel for addressing all my comments and sorry for pushing while you were working on it, I think my changes broke the tests, so I wanted to fix them quickly.

I have two tiny comments, then will approve! Tests are failing because this requires a new ESMValCore release. We can merge once the first release candidate is out.

Cheers! 🚀

@axel-lauer
Copy link
Contributor Author

I have two tiny comments, then will approve! Tests are failing because this requires a new ESMValCore release. We can merge once the first release candidate is out.

Should be all done now!! Thanks for your review and help with this PR!

Copy link
Contributor

@schlunma schlunma left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Awesome, thanks @axel-lauer and everyone!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
looking for technical reviewer metric new recipe Use this label if you are adding a new recipe requires new ESMValCore release A new release of ESMValCore is needed to solve this issue/merge this pull request..
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add diurnal cycle plot (variable vs. hour of day) to diagnostic monitor/multi_datasets.py
7 participants