This repository contains the code used for the analysis in the paper:
Zhang et al. 2022. A CNN-LSTM Model for Soil Organic Carbon Content Prediction with Long Time Series of MODIS-Based Phenological Variables. Remote Sensing 14, 4441.
https://doi.org/10.3390/rs14184441
- Python3
- numpy
- pandas
- scikit-learn
- pytorch
- seaborn
- matplotlib
The CNN-LSTM deep learning model performs soil organic carbon (SOC) predictive mapping from static and dynamic environmental covariates. A CNN extracts the spatially contextual features from static variables (e.g. topographic variables), while an LSTM extracts the temporal features from dynamic variables (e.g. vegetation phenology over a long period of time). The extracted spatial and temporal features are concatenated and passed to fully connected layers that compute the outputs (predicted SOC values).
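The two-branch design described above can be sketched in PyTorch as follows. This is only an illustrative sketch, not the architecture defined in models.py: the class name `CNNLSTMSketch`, the layer sizes, the patch size, and the tensor layouts are all assumptions chosen for the example.

```python
import torch
import torch.nn as nn


class CNNLSTMSketch(nn.Module):
    """Hypothetical two-branch network: CNN for static patches, LSTM for time series."""

    def __init__(self, n_static_channels=8, n_dynamic_features=4, hidden_size=32):
        super().__init__()
        # CNN branch: spatially contextual features from a patch of static
        # covariates centered on the sample location (sizes are assumptions).
        self.cnn = nn.Sequential(
            nn.Conv2d(n_static_channels, 16, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.Conv2d(16, 32, kernel_size=3, padding=1),
            nn.ReLU(),
            nn.AdaptiveAvgPool2d(1),  # (batch, 32, 1, 1)
            nn.Flatten(),             # (batch, 32)
        )
        # LSTM branch: temporal features from the dynamic covariates,
        # given as (batch, time_steps, n_dynamic_features).
        self.lstm = nn.LSTM(n_dynamic_features, hidden_size, batch_first=True)
        # Fully connected layers on the concatenated features.
        self.head = nn.Sequential(
            nn.Linear(32 + hidden_size, 32),
            nn.ReLU(),
            nn.Linear(32, 1),
        )

    def forward(self, x_static, x_dynamic):
        spatial = self.cnn(x_static)                   # (batch, 32)
        _, (h_n, _) = self.lstm(x_dynamic)             # h_n: (layers, batch, hidden)
        temporal = h_n[-1]                             # last layer's final hidden state
        fused = torch.cat([spatial, temporal], dim=1)  # concatenate both branches
        return self.head(fused).squeeze(-1)            # (batch,) predicted SOC


model = CNNLSTMSketch()
x_static = torch.randn(5, 8, 9, 9)  # 5 samples, 8 static layers, 9x9 patch
x_dynamic = torch.randn(5, 18, 4)   # 5 samples, 18 time steps, 4 variables
pred = model(x_static, x_dynamic)
print(pred.shape)
```

Using the final hidden state of the LSTM as the temporal feature vector is one common choice; the model in this repository may summarize the sequence differently.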
- data (directory):
Place the pickle files of the input data (X and y) for training the CNN-LSTM model here. The required data include:
- The table file (e.g. in CSV format) of the sample data. This file should include columns for the sample location (longitude and latitude) and the value of the target soil property, e.g. soil organic carbon. We recommend that users use their own sample data or simulated data for testing. The sample dataset collected in this study is not publicly available but is available from the author on reasonable request.
- The pickle file of input data (X) for CNN (e.g., climate and topographic data with spatially contextual information).
- The pickle file of input data (X) for LSTM (e.g., EVI data with temporally dynamic information).
- The pickle file of input data (X) for LSTM (e.g., phenological data with temporally dynamic information).
- model (directory): The folder for storing the model.
- config.py: The configuration file for setting the data locations and model hyperparameters.
- models.py: The core functions for generating the CNN-LSTM model for the soil prediction.
- train.py: Implements data preparation, model initialization and the model training procedure.
- pred.py: Predicts the target values using the saved model and evaluates model performance on the test set.
- utils.py: Contains functions for loading the data and generating X and y as inputs for model training and validation.
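For testing without the original samples, simulated inputs for the data directory can be generated along the following lines. This is a sketch under stated assumptions: the file names, column names, and array shapes are hypothetical, and the layout actually expected by utils.py may differ, so adjust them to match config.py.

```python
import os
import pickle
import tempfile

import numpy as np

rng = np.random.default_rng(0)
n_samples = 100

# Simulated sample table: location and target soil property (hypothetical ranges).
lon = rng.uniform(110.0, 120.0, n_samples)
lat = rng.uniform(30.0, 40.0, n_samples)
soc = rng.uniform(2.0, 40.0, n_samples)

# Simulated model inputs (shapes are assumptions, not the repository's format).
x_cnn = rng.normal(size=(n_samples, 8, 9, 9)).astype(np.float32)      # static patches
x_lstm_evi = rng.normal(size=(n_samples, 18, 1)).astype(np.float32)   # EVI series
x_lstm_phen = rng.normal(size=(n_samples, 18, 4)).astype(np.float32)  # phenology series

# A temporary folder stands in for the data directory in this example.
data_dir = tempfile.mkdtemp()

np.savetxt(
    os.path.join(data_dir, "samples.csv"),
    np.column_stack([lon, lat, soc]),
    delimiter=",",
    header="longitude,latitude,soc",
    comments="",
)

arrays = {
    "X_cnn.pkl": x_cnn,
    "X_lstm_evi.pkl": x_lstm_evi,
    "X_lstm_phen.pkl": x_lstm_phen,
    "y.pkl": soc.astype(np.float32),
}
for name, array in arrays.items():
    with open(os.path.join(data_dir, name), "wb") as f:
        pickle.dump(array, f)

# Read one file back to confirm the round trip.
with open(os.path.join(data_dir, "X_cnn.pkl"), "rb") as f:
    reloaded = pickle.load(f)
print(reloaded.shape)
```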
All model parameters can be set in config.py, such as the learning rate, batch size, number of layers, etc.
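As an illustration of the kind of settings involved, a configuration could be grouped as in the sketch below. The class `TrainConfig` and every field name and default value here are hypothetical; check config.py for the names and values actually used.

```python
from dataclasses import asdict, dataclass


@dataclass
class TrainConfig:
    # Data and model locations (hypothetical names).
    data_dir: str = "data"
    model_dir: str = "model"
    # Hyperparameters (hypothetical defaults, not the values from the paper).
    learning_rate: float = 1e-3
    batch_size: int = 32
    num_lstm_layers: int = 2
    num_epochs: int = 100


# Override a single setting while keeping the other defaults.
cfg = TrainConfig(learning_rate=5e-4)
print(asdict(cfg))
```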
python train.py
The program saves the model parameters in the model directory.
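Saving and restoring parameters in PyTorch is commonly done through the model's `state_dict`. The sketch below shows that pattern with a small stand-in network; the file name `cnn_lstm.pt` is an assumption, and train.py may store its checkpoint differently.

```python
import os
import tempfile

import torch
import torch.nn as nn

# Stand-in network; in practice this would be the trained CNN-LSTM model.
model = nn.Linear(4, 1)

# A temporary folder stands in for the model directory in this example.
model_dir = tempfile.mkdtemp()
checkpoint_path = os.path.join(model_dir, "cnn_lstm.pt")  # hypothetical file name

# Save only the learned parameters, not the whole Python object.
torch.save(model.state_dict(), checkpoint_path)

# Later: rebuild the same architecture and load the parameters into it.
restored = nn.Linear(4, 1)
restored.load_state_dict(torch.load(checkpoint_path))
restored.eval()  # switch to inference mode before predicting
```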
python pred.py
The saved model can be loaded and evaluated on the test set.
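Evaluation on the test set amounts to comparing predicted and observed SOC values. The helper below is a minimal sketch using common regression metrics (RMSE, MAE, R²); the function name `regression_metrics` is hypothetical and the metrics reported by pred.py may differ.

```python
import numpy as np


def regression_metrics(y_true, y_pred):
    """Return RMSE, MAE and R^2 for observed and predicted values."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    residuals = y_true - y_pred
    rmse = float(np.sqrt(np.mean(residuals ** 2)))
    mae = float(np.mean(np.abs(residuals)))
    ss_res = float(np.sum(residuals ** 2))                 # residual sum of squares
    ss_tot = float(np.sum((y_true - y_true.mean()) ** 2))  # total sum of squares
    r2 = 1.0 - ss_res / ss_tot
    return {"rmse": rmse, "mae": mae, "r2": r2}


# Toy values for illustration only, not results from the paper.
metrics = regression_metrics([10.0, 20.0, 30.0], [12.0, 18.0, 33.0])
print(metrics)
```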
- The climate data are available at: https://www.worldclim.org
- The topographic data are available at: https://doi.org/10.5069/G91R6NPX
- The MODIS land surface phenology product (MCD12Q2) and EVI data (calculated from MOD09GA) are available at: https://ladsweb.modaps.eosdis.nasa.gov
The code and data shared in this study by Lei Zhang are licensed under CC BY-NC 4.0
For questions and support, please contact the author: Lei Zhang 张磊 (lei.zhang.geo@outlook.com)
Lei Zhang's Homepage