
Feature/mask NaNs in training loss function #56

Draft · wants to merge 8 commits into base: develop

Conversation

@sahahner (Member) commented Oct 2, 2024

Variables with missing values that are imputed by the imputer should not be considered in the loss.

The NaN masks are prepared in the imputer. The remapper contains a new function to remap the NaN masks from the imputer.

This goes together with PR #72 from anemoi-training.
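The intent described above can be sketched as a loss that simply excludes imputed positions. This is a minimal illustration, not the actual anemoi-models API; the function name and the convention that the mask is True where the input was NaN are assumptions:

```python
import torch


def masked_mse(pred: torch.Tensor, target: torch.Tensor, nan_mask: torch.Tensor) -> torch.Tensor:
    """MSE over valid (non-NaN) entries only; imputed positions contribute nothing."""
    # nan_mask is True where the original data was NaN, i.e. where the imputer filled a value
    valid = ~nan_mask
    sq_err = (pred - target) ** 2
    # zero out imputed positions and normalise by the number of valid entries,
    # so the loss scale stays comparable across batches with different NaN counts
    return (sq_err * valid).sum() / valid.sum().clamp(min=1)


# example: the middle entry was NaN and imputed to 0.0
target = torch.tensor([1.0, 0.0, 3.0])
nan_mask = torch.tensor([False, True, False])
pred = torch.tensor([1.0, 5.0, 3.0])
loss = masked_mse(pred, target, nan_mask)
print(float(loss))  # 0.0 — the large error at the imputed position is ignored
```

Without the mask, the same prediction would be penalised heavily for disagreeing with an imputed placeholder value, which is exactly what this PR avoids.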

@codecov-commenter commented Oct 2, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.84%. Comparing base (f96bcf9) to head (87647b7).

Additional details and impacted files
@@           Coverage Diff            @@
##           develop      #56   +/-   ##
========================================
  Coverage    99.84%   99.84%           
========================================
  Files           23       23           
  Lines         1301     1301           
========================================
  Hits          1299     1299           
  Misses           2        2           


@floriankrb (Member)

This functionality seems to be related to ecmwf/anemoi-training#79. Perhaps the masks.py created by @JPXKQX should move into anemoi-models, and a [refactored version of] OutputMask could be used here?

@JPXKQX (Member) commented Oct 15, 2024

I see some similarities between the output masking and the post-processors, but the part that doesn't fit is that the post-processors are applied only at the end of the rollout. The masking, by contrast, is called not only at the end but also between all the rollout steps (to roll out the boundary forcing). So I am not sure whether it is better to include it as a special post-processor or to leave it in anemoi-training.

I would say we can do the loss masking here, similarly to the imputer, but I think the output masking should remain in anemoi-training.
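A minimal sketch of why the output mask does not fit the end-of-rollout post-processor pattern: it has to be re-applied inside the rollout loop so boundary points are overwritten with the prescribed forcing at every step. All names here are hypothetical, not the anemoi-training implementation:

```python
import torch


def rollout(model, x, boundary_mask, boundary_forcing, n_steps):
    """Toy rollout that re-imposes boundary forcing after *every* model step."""
    preds = []
    for step in range(n_steps):
        x = model(x)
        # overwrite boundary points with the prescribed forcing for this step --
        # this in-loop call is what a plain end-of-rollout post-processor cannot do
        x = torch.where(boundary_mask, boundary_forcing[step], x)
        preds.append(x)
    return preds


# toy usage: a "model" that adds 1 everywhere, with zero boundary forcing
model = lambda t: t + 1
x0 = torch.zeros(4)
mask = torch.tensor([True, False, False, False])  # first point is a boundary point
forcing = torch.zeros(2, 4)                       # prescribed boundary values per step
preds = rollout(model, x0, mask, forcing, n_steps=2)
print(preds[-1])  # tensor([0., 2., 2., 2.]) — the boundary point is held at its forcing
```

If the mask were applied only once at the end, the boundary value would still have drifted through the intermediate steps and contaminated the interior predictions.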
