Art generated via neural style transfer over a blue marble.
This repository hosts the Model Intercomparison Suite for EarthNet2021 (ENMIS).
It incorporates the models that have entered the challenge as submodules under src/models,
along with the persistence baseline. It can generate a wide range of plots and analyses across the different test tracks.
- Full documentation on EarthNet Website
- Video: Why predict Earth surface?
- EarthNet toolkit on PyPI
- EarthNet2021 paper on arXiv or at ClimateChangeAI @NeurIPS2020
- Clone the repo:
git clone --recursive https://github.com/earthnet2021/earthnet.git
Why --recursive? Because we have git submodules for hosting models. This means you'll need to run git submodule update when updating your remote.
- We recommend setting up a Docker container using our Dockerfile. Run:
docker build . --tag earthnet2021:1.0
- Run a Docker container based on the image you just built, e.g.
docker run -it earthnet2021:1.0
Use the util to run the container with all heavy directories properly attached.
- Run JupyterLab:
jupyter lab
Port 8888 is forwarded to the one defined in docker_run.sh.
- You can then start working at
hostname:8000
However, a Docker/Linux bug may make JupyterLab unresponsive because of the port forwarding. If that happens, try
ssh -N -f -L localhost:8000:localhost:8000 username@hostname
Some submodules, such as tf_template, might require their own conda environment. If the environment was not created during the Docker build, create it with
conda create --name ENtf115py36 python=3.6
then activate it with
source activate ENtf115py36
and install the libraries with
pip install -r ./src/models/tf_template/requirements.txt
- src contains the submodules for the datasets and the machine learning models. These are git projects in their own right that live in symbiosis inside EarthNet2021. Development can continue on those submodules in the same way as before; recent commits on their master branch will be pulled into EarthNet2021.
- utils is the place for useful functions, e.g. npz-to-tfrecords.
- scripts is the place for miscellaneous scripts to move data between machines, keep the working environment clean, set up the environments, etc.
- data is the place for all of the heavy files.
- data/temp contains only temporary files. For example, data/temp/checkpoints/ can hold images, GIFs and TensorBoard logs for models during training, but only the models that make the cut are moved to pretrained.
- data/outputs holds /<model_name>/<experiment_name>/<data_split_name> with NumPy datacubes of the predictions generated over the test set by our trained models. This is the only directory our evaluation pipeline should need to access for inputs.
- data/pretrained has the weights of fully trained and tested models that make their way into the publication.
- data/results is where our evaluation pipeline should store all relevant figures, tables and animations.
- data/datasets is the directory where we drop the EarthNet2021 datasets /release
- Ideally, data/ is mounted into the Docker container from a large disk, as it will get big quickly.
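The data/outputs layout described above can be walked with a few lines of Python. This is a minimal sketch and not part of the suite: the function name `list_prediction_cubes` is hypothetical, and the `.npz` suffix is an assumption about how the NumPy datacubes are saved. Only the `/<model_name>/<experiment_name>/<data_split_name>` nesting comes from the description above.

```python
from collections import defaultdict
from pathlib import Path


def list_prediction_cubes(outputs_dir, suffix=".npz"):
    """Group prediction files by (model, experiment, split).

    Assumes <outputs_dir>/<model_name>/<experiment_name>/<data_split_name>/...
    and that cubes carry the given suffix (an assumption; adjust as needed).
    """
    root = Path(outputs_dir)
    cubes = defaultdict(list)
    for path in sorted(root.rglob(f"*{suffix}")):
        parts = path.relative_to(root).parts
        if len(parts) < 4:
            # Too shallow to carry model/experiment/split information.
            continue
        model, experiment, split = parts[:3]
        cubes[(model, experiment, split)].append(path)
    return dict(cubes)
```

Calling `list_prediction_cubes("data/outputs")` would return a dict keyed by `(model, experiment, split)`, which an evaluation step could iterate over without knowing in advance which models are present.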
Your main entry point is run.py.
configs/ is the location for the submodules' configuration files. Each configuration is saved in .yml format and defines how to call a standalone model, added as a submodule, from run.py. The config.yml file defines which script to run, in which conda environment, the arguments to pass and the experiment settings (as JSON), if any.
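To make the role of a configuration file more concrete, here is a minimal sketch of how an entry point could turn one into a command line. It is not taken from run.py: the key names (`script`, `conda_env`, `args`, `settings`), the `--settings` flag, the script path and the function `build_command` are all hypothetical, and a plain dict stands in for the parsed .yml file. `conda run -n` is used here as one possible way to execute a script inside a named environment.

```python
import json


def build_command(config):
    """Build an argv list that runs config["script"] inside a conda environment."""
    cmd = ["conda", "run", "-n", config["conda_env"], "python", config["script"]]
    for name, value in config.get("args", {}).items():
        cmd += [f"--{name}", str(value)]
    settings = config.get("settings")
    if settings:
        # Experiment settings travel as a single JSON string (hypothetical flag).
        cmd += ["--settings", json.dumps(settings, sort_keys=True)]
    return cmd


# Stand-in for a parsed configuration file; every value is illustrative.
example_config = {
    "script": "src/models/tf_template/train.py",  # hypothetical script path
    "conda_env": "ENtf115py36",
    "args": {"mode": "test"},
    "settings": {"epochs": 1},
}
```

The resulting list can be handed to `subprocess.run(cmd, check=True)`. Building an argv list rather than a shell string avoids quoting problems with the JSON settings.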