The code is currently run using scripts as main entry points (cmip26.py and trainScript.py).
This is not ideal for distribution: they are long scripts that a user has to edit to change details ('magic numbers') before running.
These will also not work in the proposed src/ layout.
Scripts should be broken up into functions with a simple entry point/call in the main package, and grouped with other relevant code.
However, there is a larger discussion that needs to be held to decide how to break up the code.
Separate libraries within this repo for data acquisition/processing and model?
Separate repositories for data acquisition/processing and model?
This starts to feel verbose, but is it what is required for Hugging Face?
cmip26.py probably needs to be grouped with data acquisition and processing under src/gz_ocean_momentum/data/
trainScript.py probably needs to be grouped under src/gz_ocean_momentum/train/
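
As a rough illustration of "functions with a simple entry point", the sketch below shows one way a long script's magic numbers could become named, overridable settings. It is not taken from trainScript.py: `TrainConfig`, `train`, `main`, and the three parameters and their defaults are hypothetical placeholders.

```python
import argparse
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class TrainConfig:
    """Hypothetical settings that would otherwise be hard-coded in the script."""

    batch_size: int = 4
    epochs: int = 100
    learning_rate: float = 1e-3


def train(config: TrainConfig) -> dict:
    """Placeholder for the body of the training script.

    The real training loop would live here; this stub only echoes the
    settings it received so the call path can be exercised.
    """
    return {"epochs": config.epochs, "batch_size": config.batch_size}


def build_parser() -> argparse.ArgumentParser:
    """Expose every former magic number as a command-line option."""
    defaults = TrainConfig()
    parser = argparse.ArgumentParser(description="Train the model (sketch).")
    parser.add_argument("--batch-size", type=int, default=defaults.batch_size)
    parser.add_argument("--epochs", type=int, default=defaults.epochs)
    parser.add_argument("--learning-rate", type=float, default=defaults.learning_rate)
    return parser


def main(argv: Optional[Sequence[str]] = None) -> dict:
    """Thin entry point: parse arguments, build a config, call the function."""
    args = build_parser().parse_args(argv)
    config = TrainConfig(
        batch_size=args.batch_size,
        epochs=args.epochs,
        learning_rate=args.learning_rate,
    )
    return train(config)
```

With this shape, `train` can be imported and called from other code or tests, while `main` stays small enough to register as a console entry point once the src/ layout is in place.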
This is the next large refactoring that we should look at. Mostly a question of how we configure runs, and optionally MLflow. (The training stage in particular I think depends on MLflow to locate data to process. The data processing stage cmip26.py I was able to run without MLflow.)
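
One possible way to make MLflow optional when locating data is sketched below, under the assumption that a run's input can be given either as a plain local directory or as a tracking run ID. `DataSource`, `resolve_data_dir`, and the `fetch_from_tracking` callback are hypothetical names; the callback stands in for whatever MLflow lookup the training stage currently performs, which is not reproduced here.

```python
from dataclasses import dataclass
from pathlib import Path
from typing import Callable, Optional


@dataclass
class DataSource:
    """Hypothetical description of where a stage reads its input from."""

    local_dir: Optional[Path] = None        # plain directory, no tracking needed
    tracking_run_id: Optional[str] = None   # ID of a tracked run holding the data


def resolve_data_dir(
    source: DataSource,
    fetch_from_tracking: Optional[Callable[[str], str]] = None,
) -> Path:
    """Return the directory to read from, preferring an explicit local path.

    `fetch_from_tracking` is an assumed hook that maps a run ID to a local
    path; it is only required when no local directory is given.
    """
    if source.local_dir is not None:
        return Path(source.local_dir)
    if source.tracking_run_id is not None:
        if fetch_from_tracking is None:
            raise ValueError("a tracking run ID was given but no fetch hook")
        return Path(fetch_from_tracking(source.tracking_run_id))
    raise ValueError("either local_dir or tracking_run_id must be set")
```

Keeping the lookup behind a callback means the training code itself would not need to import MLflow when a local directory is supplied.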
We're not splitting up the code into separate repositories. Lots of coupling remains between steps, and it would be a bit complicated to untangle this (and would mean lots more maintenance stress).