Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add expected value CLI and plugin #1719

Merged
merged 13 commits into from
May 31, 2022
Merged

Conversation

tjtg
Copy link
Contributor

@tjtg tjtg commented May 16, 2022

Add a plugin and CLI to calculate the expected value from a probability distribution.

The expected value is the mean of random outcomes (eg. ensemble members) and can be used to produce a deterministic "best guess" forecast from a probabilistic forecast as processed by IMPROVER. The expected value will often be similar to the 50th percentile, but may differ, such as in the case of a positively or negatively skewed (asymmetrical) distribution.

The calculation of expected value for threshold probability data added here is a quick to implement method using existing IMPROVER functionality for conversion to percentiles - this has the correct input and output interfaces, but has high memory usage and has an impact on the accuracy of the output data. I expect to add direct calculation from threshold data (via numerical integration over the threshold values) in a future pull request.

Testing:

  • Ran tests and they passed OK
  • Added new tests for the new feature(s)

@codecov
Copy link

codecov bot commented May 16, 2022

Codecov Report

Merging #1719 (4880cde) into master (b18f795) will decrease coverage by 0.11%.
The diff coverage is 100.00%.

@@            Coverage Diff             @@
##           master    #1719      +/-   ##
==========================================
- Coverage   98.13%   98.01%   -0.12%     
==========================================
  Files         111      113       +2     
  Lines       10190    10332     +142     
==========================================
+ Hits        10000    10127     +127     
- Misses        190      205      +15     
Impacted Files Coverage Δ
improver/expected_value.py 100.00% <100.00%> (ø)
improver/metadata/probabilistic.py 100.00% <100.00%> (ø)
improver/feels_like_temperature.py 100.00% <0.00%> (ø)
improver/calibration/dataframe_utilities.py 100.00% <0.00%> (ø)
improver/calibration/rainforest_calibration.py 39.28% <0.00%> (ø)
improver/utilities/spatial.py 98.86% <0.00%> (+0.04%) ⬆️
...erate_ancillaries/generate_derived_solar_fields.py 98.48% <0.00%> (+9.19%) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update b18f795...4880cde. Read the comment docs.

@tjtg tjtg changed the title WIP: Add expected value CLI and plugin Add expected value CLI and plugin May 18, 2022
anja-bom
anja-bom previously approved these changes May 18, 2022
Copy link
Contributor

@anja-bom anja-bom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great work Tom, it was amazing to see how easily we can switch from thresholds to percentiles.

@MoseleyS
Copy link
Contributor

Something for a reviewer to consider. This PR calculates the mean value of the ensemble members, and the meta-data is updated to say exactly this. An "Expected Value" is a weighted mean, but this PR provides no mechanism for weighting the members. Should the CLI / plugin be renamed?

@tjtg tjtg added the MO review required PRs opened by non-Met Office developers that require a Met Office review label May 18, 2022
Copy link
Contributor

@benowen-bom benowen-bom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a good start to adding in the capability to evaluate the mean from an ensemble forecast. There are a couple of points below that need some consideration, but hopefully these should be reasonably straight forward to address.

I think renaming the CLI to ensemble_mean or something similar would be warranted, given (as@MoseleyS highlights) it is the mean being calculated here and strictly speaking not the expectation value.

One thing worth considering is the way multiple ensemble dimensions are handled. Currently the case of two percentile dimensions is identified through a ValueError, but this case is treated as not being a percentile cube. I appreciate that taking the expected value is ambiguous without knowing which percentile dimension to perform the mean over, but I think this case should raise an exception to highlight this ambiguity. The issue becomes even more complicated when one factors in the possibility of mixed ensemble dimensions (for example, both threshold, realization dims present).

improver/metadata/probabilistic.py Outdated Show resolved Hide resolved
improver/metadata/probabilistic.py Show resolved Hide resolved
improver/expected_value.py Outdated Show resolved Hide resolved
improver_tests/metadata/test_probabilistic.py Outdated Show resolved Hide resolved
improver_tests/metadata/test_probabilistic.py Show resolved Hide resolved
benowen-bom
benowen-bom previously approved these changes May 25, 2022
Copy link
Contributor

@benowen-bom benowen-bom left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for updates, and your consideration of other points and raising associated issues. This all looks good to me now.

@tjtg tjtg removed their assignment May 26, 2022
@fionaRust fionaRust self-assigned this May 30, 2022
Copy link
Contributor

@fionaRust fionaRust left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks Tom, could you let me know where I can find your new acceptance test data, so I can take a quick look and have it ready to merge in when this PR gets merged.

improver/metadata/probabilistic.py Outdated Show resolved Hide resolved
@fionaRust fionaRust assigned tjtg and unassigned fionaRust May 30, 2022
Co-authored-by: fionaRust <fiona.rust@metoffice.gov.uk>
@fionaRust fionaRust merged commit 493f9e4 into metoppv:master May 31, 2022
@tjtg tjtg deleted the expectedvalue branch June 24, 2022 03:05
MoseleyS pushed a commit to MoseleyS/improver that referenced this pull request Aug 22, 2024
* Skeleton for expected value

* Update style and copyright

* Add basic implementation

* Add tests for is_percentile

* Add expected value tests

* Fix imports and tests

* Add handling of threshold data via conversion to percentiles

* Update tests for threshold calculation

* Add acceptance tests

* Fix black making flake8 fail

* Changes from review comments

* Fix unused import

* Docstring fix

Co-authored-by: fionaRust <fiona.rust@metoffice.gov.uk>

Co-authored-by: fionaRust <fiona.rust@metoffice.gov.uk>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
MO review required PRs opened by non-Met Office developers that require a Met Office review
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants