Release 0.2 #89

gmertes · 2024-10-15T10:43:07Z

Merge develop to main and release 0.2

Waiting for these PRs, necessary to facilitate the transition to the new mlflow server:

Add AnemoiMlflowClient with auth support #86

Optional but nice to include:

This one will go in the next release:

Fix/mlflow sync tag #83

Encompasses MSE loss and grad_scaler.

Co-authored-by: Jesper Dramsch <jesper.dramsch@ecmwf.int>
Co-authored-by: Simon Lang <simon.lang@ecmwf.int>
Co-authored-by: Matthew Chantry <matthew.chantry@ecmwf.int>
Co-authored-by: mihai.alexe <mihai.alexe@ecmwf.int>


Provides a set of tools for monitoring and evaluating the performance of machine learning models. The module includes a set of classes and functions for logging, profiling, and visualizing model performance.

Co-authored-by: Jesper Dramsch <jesper.dramsch@ecmwf.int>
Co-authored-by: Matthew Chantry <matthew.chantry@ecmwf.int>
Co-authored-by: Simon Lang <simon.lang@ecmwf.int>
Co-authored-by: Mihai Alexe <mihai.alexe@ecmwf.int>
Co-authored-by: Sara Hahner <sara.hahner@ecmwf.int>
Co-authored-by: Ana Prieto Nemesio <ana.prietonemesio@ecmwf.int>

Allows Distributed Data Parallel and Distributed Model Parallel training.

Co-authored-by: Jesper Dramsch <jesper.dramsch@ecmwf.int>
Co-authored-by: Simon Lang <simon.lang@ecmwf.int>
Co-authored-by: Matthew Chantry <matthew.chantry@ecmwf.int>
Co-authored-by: Mihai Alexe <mihai.alexe@ecmwf.int>

Co-authored-by: Jesper Dramsch <jesper.dramsch@ecmwf.int>
Co-authored-by: Matthew Chantry <matthew.chantry@ecmwf.int>
Co-authored-by: Simon Lang <simon.lang@ecmwf.int>
Co-authored-by: Mihai Alexe <mihai.alexe@ecmwf.int>
Co-authored-by: Ana Prieto Nemesio <ana.prietonemesio@ecmwf.int>
Co-authored-by: Mario Santa Cruz <mario.santacruz@ecmwf.int>


* fix: change pre-commit autoupdate schedule to monthly
* fix: change the merge strategy for Changelog to Union
* fix: add .envrc to .gitignore
* ci: exclude pre-commit and readthedocs yaml from changelog ci
* ci: fix downstream-ci-hpc workflow call
* chore: update pre-commit
* chore: move gitignore
* fix: hypothesis dependency
* feat: add codeowners
* ci: add hpc-config
* chore: add python 3.10
* docs: update changelog

Co-authored-by: Jesper Dramsch <jesper.dramsch@ecmwf.int>

@lzampier

* feature: long rollout plots
* incorporate review from Lorenzo: do not treat ocean variables as pressure-level variables, and fix the grouping of fewer than 15 variables in the loss contribution histogram
* backward compatibility for config files without longrolloutplot configuration

Reviewers: @lzampier, @theissenhelen, @JesperDramsch


* feat: authentication support for mlflow sync
* chore: formatting
* chore: changelog
* chore: changelog add link
* fix: sync authentication flag
* refactor: move `health_check` to submodule top level
* feat: add health check
* chore: update error msg
* refactor: mlflow utils


FussyDuck · 2024-10-15T10:43:20Z

All committers have signed the CLA.


* [pre-commit.ci] pre-commit autoupdate

  updates:
  - [github.com/pre-commit/pre-commit-hooks: v4.6.0 → v5.0.0](pre-commit/pre-commit-hooks@v4.6.0...v5.0.0)
  - [github.com/astral-sh/ruff-pre-commit: v0.6.4 → v0.6.9](astral-sh/ruff-pre-commit@v0.6.4...v0.6.9)
  - [github.com/tox-dev/pyproject-fmt: 2.2.3 → 2.2.4](tox-dev/pyproject-fmt@2.2.3...2.2.4)
  - [github.com/jshwi/docsig: v0.60.1 → v0.64.0](jshwi/docsig@v0.60.1...v0.64.0)
* fix: pre-commit stages

Co-authored-by: pre-commit-ci[bot] <66853113+pre-commit-ci[bot]@users.noreply.github.com>
Co-authored-by: Gert Mertes <gert.mertes@ecmwf.int>

theissenhelen and others added 30 commits May 17, 2024 15:42

Add commands

4d4f301

Merge branch 'develop' of https://github.com/ecmwf/anemoi-training in…

a5993a9

…to develop

chore: add dependencies

3a46b4e

Update documentation tools

479b222

Merge branch 'develop' of github.com:ecmwf/anemoi-training into develop

6f5a6f8

Better rst formatter

6014703

Run fmt on documentation stub

0c51f4e

Update inference checkpoint

3b9f655

Update inference checkpoint

2837c0b

Work on cli

11dc23f

Work on documentation

5004cdc

Work on documentation

f36b52f

test: add initial tests for data module, checkpoint, and trainer

1e7729e

Co-authored-by: Jesper Dramsch <jesper.dramsch@ecmwf.int>

update documentation and pre-commit

9e76a46

tidy conf.py

b993520

Update dependencies

3531281

Add mlflow subpackage

1c95835

Add auth module

624fabb

Use unix time

fbe8d0a

Store new refresh token in memory

a05a374

Use anemoi.utils.config

7dec83c

Add authenticate fn

683069b

Load config on init

7e5bb01

Refactor login logic

70bf8c3

Add force credentials arg

224897d

Check refresh token expiry on auth

460d598

JPXKQX and others added 19 commits September 13, 2024 08:57

Merge pull request #48 from ecmwf/40-support-dataset-with-missing-tim…

91d8c6e

…esteps 40 support dataset with missing timesteps

Fix filename typo

2643d89

Merge branch 'develop' into docs/expandIntro

5d31cbb

Merge pull request #46 from ecmwf/docs/expandIntro

a985297

Expanded intro docs and examples

fix: triggering event in QA

e808f64

[fix] Capture Anemoi Training subcommands in MLFlow (#61)

98b506d

* fix: modify sysargv with subcommands
* docs: add changelog

fix version pinning (#66)

6c56348

* fix version pinning
* chore: run pre-commit.
* fix version

Co-authored-by: Florian Prill <63-m300196@users.noreply.gitlab.dkrz.de>

Fix mlflow auth on python 3.9 (#62)

977c26b

* fix: remove staticmethod from `TokenAuth.enabled`
* chore: changelog
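A possible reading of this fix, sketched under assumptions: if `enabled` is used as a decorator inside the class body, wrapping it in `@staticmethod` fails on Python 3.9, because `staticmethod` objects only became directly callable in Python 3.10. The class below, `TokenAuthSketch`, is a hypothetical stand-in and not the real `TokenAuth`; it only shows the plain-function form that works on both versions.

```python
import functools


class TokenAuthSketch:
    """Hypothetical stand-in for illustration; not the real TokenAuth."""

    def __init__(self, enabled: bool = True) -> None:
        self._enabled = enabled

    # Plain function used as a decorator inside the class body.
    # Decorating it with @staticmethod would make the `@enabled` line
    # below raise TypeError on Python 3.9, where staticmethod objects
    # are not callable.
    def enabled(fn):
        @functools.wraps(fn)
        def wrapper(self, *args, **kwargs):
            # Skip the wrapped call entirely when auth is disabled.
            if self._enabled:
                return fn(self, *args, **kwargs)
            return None

        return wrapper

    @enabled
    def login(self) -> str:
        return "logged in"
```

With this shape, `TokenAuthSketch(enabled=False).login()` returns `None` without running the method body.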

fix: hidden entrypoint for interactive ddp

3163eb1

chore: docstring

0df8311

add link to transform

42c1e2e

New mlflow authentication API (#78)

da0fd0d

* fix: mlflow auth use web seed token
* feat: make target env var an optional argument
* chore: docstrings
* fix: tests
* chore: add comment
* chore: changelog
* chore: docstring

Merge branch 'develop' into fix/interactive-ddp

45d714e

chore: changelog

e92a121

Allow for longer truncation when mlflow > 1.28

5479314
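As a rough illustration of version-dependent truncation, the sketch below picks a maximum parameter length from the MLflow version string and cuts values to fit. The limits `OLD_LIMIT` and `NEW_LIMIT`, the helper names, and the strict "> 1.28" comparison on major.minor are all assumptions made for this example; they are not taken from the commit, so check the MLflow documentation for the limits that apply to your version.

```python
import re
from typing import Tuple

# Assumed limits, for illustration only.
OLD_LIMIT = 250
NEW_LIMIT = 500


def parse_major_minor(version: str) -> Tuple[int, int]:
    """Extract (major, minor) from a version string such as '2.9.2'."""
    match = re.match(r"(\d+)\.(\d+)", version)
    if match is None:
        raise ValueError(f"Unrecognised version string: {version!r}")
    return int(match.group(1)), int(match.group(2))


def truncate_param(value: str, mlflow_version: str) -> str:
    """Truncate a parameter value to the limit assumed for this version."""
    # Follows the commit title literally: longer values only above 1.28.
    newer = parse_major_minor(mlflow_version) > (1, 28)
    limit = NEW_LIMIT if newer else OLD_LIMIT
    return value[:limit]
```

In practice the version string would come from `mlflow.__version__`; it is passed in explicitly here so the sketch runs without MLflow installed.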

Merge pull request #81 from ecmwf/doc/add-link-to-transform

6c78956

add link to transform

HCookie and others added 8 commits October 15, 2024 14:45

Merge branch 'develop' into fix/mlflow-log_params-string-truncation

4a5fcc3

Merge pull request #88 from ecmwf/fix/mlflow-log_params-string-trunca…

78bf1bc

…tion Allow for longer truncation when mlflow > 1.28

Update CODEOWNERS

f15b53b

Merge pull request #90 from ecmwf/update/codeowners

3676f60

Update CODEOWNERS

Merge branch 'develop' into fix/interactive-ddp

acbab76

Merge pull request #82 from ecmwf/fix/interactive-ddp

39c309d

Fix interactive multi-GPU training

Add AnemoiMlflowClient with auth support (#86)

98292b1

* feat: anemoi mlflow client with authentication
* fix: recursion on anemoi_auth
* chore: add tests
* chore: changelog

gmertes marked this pull request as ready for review October 16, 2024 10:23

gmertes merged commit d64cf6e into main Oct 16, 2024
191 checks passed
