@grassesi
Contributor

@grassesi grassesi commented Oct 15, 2025

Description

Store the source window as an attribute of the zarr group corresponding to one OutputDataset. When an OutputDataset is converted to xarray, the source window is available as a coordinate over the "sample" axis. Lead time can then be calculated from the source window and the absolute time.
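
As a rough illustration of the last point, the sketch below derives lead times from a source window and a set of absolute (valid) times. It uses only the standard library; `SourceWindow`, `lead_time`, and the convention of measuring lead time from the end of the source window are assumptions made for this example, not names or behaviour taken from this PR.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass(frozen=True)
class SourceWindow:
    """Hypothetical container for the time interval the model received as input."""

    start: datetime
    end: datetime


def lead_time(window: SourceWindow, valid_time: datetime) -> timedelta:
    """Lead time, assumed here to be measured from the end of the source window."""
    return valid_time - window.end


# Illustrative values only: a 6-hour source window and two forecast valid times.
window = SourceWindow(start=datetime(2025, 1, 1, 0), end=datetime(2025, 1, 1, 6))
valid_times = [datetime(2025, 1, 1, 12), datetime(2025, 1, 1, 18)]

# Dividing two timedeltas yields a float, here the lead time in hours.
lead_hours = [lead_time(window, t) / timedelta(hours=1) for t in valid_times]
print(lead_hours)  # [6.0, 12.0]
```

If the project measures lead time from a different reference point (for example the start of the window), only `lead_time` would need to change.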

Issue Number

closes #929

Is this PR a draft? Mark it as draft.

Checklist before asking for review

  • I have performed a self-review of my code
  • My changes comply with basic sanity checks:
    • I have fixed formatting issues with ./scripts/actions.sh lint
    • I have run unit tests with ./scripts/actions.sh unit-test
    • I have documented my code and I have updated the docstrings.
    • I have added unit tests, if relevant
  • I have tried my changes with data and code:
    • I have run the integration tests with ./scripts/actions.sh integration-test
    • (bigger changes) I have run a full training and I have written in the comment the run_id(s): launch-slurm.py --time 60
    • (bigger changes and experiments) I have shared a hedgedoc in the GitHub issue with all the configurations and runs for these experiments
  • I have informed and aligned with people impacted by my change:
    • for config changes: the MatterMost channels and/or a design doc
    • for changes of dependencies: the MatterMost software development channel

Collaborator

@tjhunter tjhunter left a comment

@grassesi a few comments. Something does not seem right in one piece.

Also, your code is not backward compatible. @iluise : is this fine for you that all the samples are invalidated? We can make it backward compatible but I am not sure it is worth it.

return list(example_stream.group_keys())

@functools.cached_property
def lead_times(self) -> list[int]:
Collaborator

Are you using it? I don't see a place where you would need it.

Also, let's not use cached_property unless it is vital to do so. Most people will be confused.

@grassesi
Contributor Author

> @grassesi a few comments. Something does not seem right in one piece.

Yes, it is very much untested. I need to do that today; I just wanted to get it out there.

> Also, your code is not backward compatible. @iluise : is this fine for you that all the samples are invalidated? We can make it backward compatible but I am not sure it is worth it.

I would very much avoid implementing backward compatibility for this one 😅

@iluise
Collaborator

iluise commented Oct 15, 2025

> @grassesi a few comments. Something does not seem right in one piece.
>
> Also, your code is not backward compatible. @iluise : is this fine for you that all the samples are invalidated? We can make it backward compatible but I am not sure it is worth it.

Yes, no problem for the backward compatibility. If we have the checkpoints we can always re-run inference so I'd not worry about it.

@grassesi grassesi force-pushed the sgrasse/develop/issue_929_lead_time_in_output branch from 2618ce0 to 50fc8ab on October 27, 2025 14:10
@grassesi grassesi changed the title from "[929][evalution/io] Make lead time available to evaluation via OutputDataset" to "[929][evalution/io] Make lead time and source windew available to evaluation via OutputDataset" on Oct 29, 2025
@grassesi grassesi requested review from iluise and tjhunter on October 29, 2025 15:42
@grassesi grassesi force-pushed the sgrasse/develop/issue_929_lead_time_in_output branch 2 times, most recently from ac8a281 to 9d6232a on November 4, 2025 15:58
@grassesi grassesi force-pushed the sgrasse/develop/issue_929_lead_time_in_output branch from 48971bf to 16e27f9 on November 4, 2025 19:01
@grassesi grassesi force-pushed the sgrasse/develop/issue_929_lead_time_in_output branch from e581e93 to c8986d6 on November 5, 2025 07:56
@grassesi grassesi marked this pull request as ready for review on November 5, 2025 08:20
Collaborator

@iluise iluise left a comment

Hi Simon,

I tested the PR and we are close. Small requests:

  • Can we remove the 0 from zio.forecast_steps when it is not predicted? Otherwise we will always get forecast_steps = [0-x], even if the zarr has no target or prediction for step 0.

  • I can't retrieve source from forecast step 0. I get the following:

out = zio.get_data('0', stream, '0')
*** ValueError: Missing target dataset for item: 0/ERA5/0

Is there a better way to retrieve the source directly, without going through zio.get_data('0', stream, '0')?
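
A minimal sketch of the behaviour requested in the first point, assuming a purely hypothetical in-memory layout that maps each forecast step to the kinds of dataset stored for it. The names `stored` and `predicted_steps` are invented for illustration and are not part of ZarrIO.

```python
# Hypothetical layout: forecast step -> kinds of dataset stored for that step.
# Step 0 holds only the source, so it has neither a target nor a prediction.
stored: dict[int, set[str]] = {
    0: {"source"},
    1: {"target", "prediction"},
    2: {"target", "prediction"},
}


def predicted_steps(stored: dict[int, set[str]]) -> list[int]:
    """Return only the steps that carry both a target and a prediction."""
    required = {"target", "prediction"}
    # `required <= kinds` is a subset test: every required kind must be present.
    return sorted(step for step, kinds in stored.items() if required <= kinds)


print(predicted_steps(stored))  # [1, 2]
```

With a filter of this kind, step 0 would drop out of the reported forecast steps whenever it was not predicted, while the source stored under it would remain reachable through a separate accessor.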

Development

Successfully merging this pull request may close these issues.

multiply forecast_step by len_hrs in ZarrIO

4 participants