A hierarchical modeling framework, including symbolic regression, to discover new machine-learning-based equations for cloud cover
Grundner, A., Beucler, T., Gentine, P., & Eyring, V. (2023). Data-Driven Equation Discovery of a Cloud Cover Parameterization. Preprint
Author: Arthur Grundner, arthur.grundner@dlr.de
The current release on Zenodo can be found here:
- Fig 1, Code: Comparison of the coarse-grained DYAMOND and ERA5 data
- Fig 2, Code: All cloud cover schemes in a performance x complexity plot
- Fig 3, Code: Predicted cloud cover distributions
- Fig 4, Code: Transfer to higher resolutions
- Fig 5.1, Code: Transfer learning to ERA5 data (selected schemes)
- Fig 5.2, Code: Transfer learning to ERA5 data (polynomials & NNs)
- Fig 6.1, Code: Plots of the terms I_1, I_2, I_3
- Fig 6.2, Code: Conditional average w.r.t. RH and T
- Fig 6.3, Code: Conditional average w.r.t. dzRH
- Fig 7.1, Code: Contour plot of dzRH
- Fig 7.2, Code: Cloud cover w.r.t. RH with and without modification to satisfy the RH-physical constraint
- Fig 8, Code: Ablation study of our analytic scheme on DYAMOND and ERA5 data
- Fig A1.1, Fig A1.2, Fig A1.3, Fig A1.4, Fig A1.5, Fig A1.6, Code: Maps of I_1, I_2, I_3 on a specific vertical layer at ~1490 m, averaged over 10 days of DYAMOND data; maps of the a5-term at three different altitudes
- Fig B1.1, Fig B1.2, Code: The distributions of cloud water and cloud ice on storm-resolving scales
To reproduce the results, you first need access to an account on DKRZ/Levante. You can then coarse-grain and preprocess the DYAMOND and ERA5/ERA5.1 data sets:
- Guide on how to coarse-grain the DYAMOND data: strategy.md
- Notebook to then preprocess the DYAMOND data: preprocessing.ipynb
- Scripts to coarse-grain ERA5 data (1979-2021, first day of every quarter): horizontally, vertically
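
The ERA5 time selection mentioned above (first day of every quarter, 1979-2021) can be enumerated with a short sketch. It assumes that "quarter" means calendar quarter, i.e. 1 January, 1 April, 1 July, and 1 October; the function name `quarter_start_dates` is hypothetical and not part of this repository.

```python
from datetime import date


def quarter_start_dates(first_year=1979, last_year=2021):
    """Return the first day of every calendar quarter, both years inclusive."""
    # Assumption: quarters start in January, April, July, and October.
    return [
        date(year, month, 1)
        for year in range(first_year, last_year + 1)
        for month in (1, 4, 7, 10)
    ]


dates = quarter_start_dates()
print(len(dates), dates[0].isoformat(), dates[-1].isoformat())
```

Under this assumption, the 43 years from 1979 to 2021 give 172 dates. The coarse-graining scripts themselves may select the time steps differently.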
It suffices to coarse-grain the variables: clc/cc, cli/ciwc, clw/clwc, hus/q, pa, ta/t, ua/u, va/v, zg/z
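
To keep the two naming conventions apart, the variable list can be written as a lookup table. This is only a sketch: it assumes the first name in each pair is the DYAMOND name and the second is the ERA5 short name, and the descriptions in the comments are the conventional meanings of these names rather than something stated in this README. `VARIABLE_PAIRS` and `variables_for` are hypothetical names.

```python
# Assumed reading of the pairs: DYAMOND name first, ERA5 short name second.
# "pa" is listed without a second name, so it maps to None here.
VARIABLE_PAIRS = {
    "clc": "cc",    # cloud cover
    "cli": "ciwc",  # cloud ice
    "clw": "clwc",  # cloud water
    "hus": "q",     # specific humidity
    "pa": None,     # pressure
    "ta": "t",      # temperature
    "ua": "u",      # zonal wind
    "va": "v",      # meridional wind
    "zg": "z",      # height (DYAMOND) / geopotential (ERA5)
}


def variables_for(dataset):
    """Return the variable names to coarse-grain for 'dyamond' or 'era5'."""
    if dataset == "dyamond":
        return list(VARIABLE_PAIRS)
    if dataset == "era5":
        # Skip entries that have no ERA5 counterpart in the list above.
        return [name for name in VARIABLE_PAIRS.values() if name is not None]
    raise ValueError(f"unknown dataset: {dataset!r}")


print(variables_for("dyamond"))
print(variables_for("era5"))
```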
The results were produced with the following package versions:
- PySR 0.10.1 [https://github.com/MilesCranmer/PySR]
- GP-GOMEA [https://github.com/marcovirgolin/GP-GOMEA]
- mlxtend 0.20.0 [https://github.com/rasbt/mlxtend]
- scikit-learn 1.0.2 [https://scikit-learn.org/]
- SymPy 1.10.1 [https://github.com/sympy]
- SciPy 1.8.1 [https://github.com/scipy/]
- TensorFlow 2.7.0 [https://tensorflow.org/]
To create a working environment, you can run the following line:
conda install -c conda-forge tensorflow==2.7.0 scipy==1.8.1 sympy==1.10.1 scikit-learn==1.0.2 mlxtend==0.20.0 pysr==0.10.1
To install the GP-GOMEA dependency, please refer to the GP-GOMEA repository linked above.
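
After setting up the environment, it can be useful to confirm that the installed versions match the ones listed above. The following is a minimal sketch using only the standard library (`importlib.metadata`, Python 3.8+). It assumes the distribution names `pysr`, `mlxtend`, `scikit-learn`, `sympy`, `scipy`, and `tensorflow`; GP-GOMEA is left out because no version is listed for it. `PINNED` and `check_versions` are hypothetical names, not part of this repository.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions copied from the list above; the keys are assumed distribution names.
PINNED = {
    "pysr": "0.10.1",
    "mlxtend": "0.20.0",
    "scikit-learn": "1.0.2",
    "sympy": "1.10.1",
    "scipy": "1.8.1",
    "tensorflow": "2.7.0",
}


def check_versions(pinned=PINNED):
    """Map each package name to (installed version or None, matches pin?)."""
    report = {}
    for name, wanted in pinned.items():
        try:
            found = version(name)
        except PackageNotFoundError:
            found = None  # package is not installed in this environment
        report[name] = (found, found == wanted)
    return report


if __name__ == "__main__":
    for name, (found, ok) in check_versions().items():
        status = "ok" if ok else "MISMATCH"
        print(f"{name}: expected {PINNED[name]}, found {found} [{status}]")
```

A mismatch does not necessarily mean the code will fail; it only signals that the environment differs from the one used to produce the results.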
This code is released under the Apache 2.0 license. See LICENSE for more information.