Crop yield information plays a pivotal role in ensuring food security. Advances in Earth Observation technology and the availability of historical yield records have promoted the use of machine learning for yield prediction. Significant research efforts have been made in this direction, differing in their choice of yield determinants and, in particular, in how spatial and temporal information is encoded. However, these efforts are often conducted under diverse experimental setups, which makes them difficult to compare. In this paper, we present our findings on multiple strategies for encoding spatial-spectral information at the county level (average pixel values, pixel sampling, and image histograms), alongside approaches for encoding temporal information, including recurrent neural networks, temporal convolutions, and attention mechanisms.
- conf - configuration management files for the experiments
- data_preparation - dataset classes for transforming data (time x channel x pixels) into different data structures
- models - model classes for classical and deep learning architectures
- train_XXX.py - scripts to execute training of deep learning models according to input data structure
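
The three spatial-spectral encodings named above (average pixel values, pixel sampling, and image histograms) can be illustrated with a minimal sketch. This is not the repository's `data_preparation` code: the function names, the nested-list layout `cube[t][c][p]`, the bin range, and the toy values are all assumptions made for illustration, and a real implementation would more likely operate on NumPy arrays or PyTorch tensors.

```python
import random
from statistics import fmean


def average_pixels(cube):
    """Collapse the pixel axis to its mean: time x channel."""
    return [[fmean(band) for band in step] for step in cube]


def sample_pixels(cube, n_samples, seed=0):
    """Keep the same randomly chosen pixels at every time step:
    time x channel x n_samples."""
    n_pixels = len(cube[0][0])
    idx = random.Random(seed).sample(range(n_pixels), n_samples)
    return [[[band[i] for i in idx] for band in step] for step in cube]


def histogram_pixels(cube, n_bins, lo=0.0, hi=1.0):
    """Bin pixel values per band into normalised counts:
    time x channel x n_bins. The [lo, hi] range is an assumption."""
    width = (hi - lo) / n_bins
    result = []
    for step in cube:
        step_hist = []
        for band in step:
            counts = [0] * n_bins
            for value in band:
                # Clamp out-of-range values into the first or last bin.
                b = min(max(int((value - lo) / width), 0), n_bins - 1)
                counts[b] += 1
            step_hist.append([c / len(band) for c in counts])
        result.append(step_hist)
    return result


# Toy county: 4 time steps, 3 channels, 50 pixels (made-up values).
rng = random.Random(42)
cube = [[[rng.random() for _ in range(50)] for _ in range(3)] for _ in range(4)]

avg = average_pixels(cube)
smp = sample_pixels(cube, n_samples=10)
hst = histogram_pixels(cube, n_bins=8)
print(len(avg), len(avg[0]))
print(len(smp), len(smp[0]), len(smp[0][0]))
print(len(hst), len(hst[0]), len(hst[0][0]))
```

Averaging discards the within-county distribution, sampling keeps individual pixels but not their spatial arrangement, and histograms keep the distribution of values in a fixed-size structure regardless of county size.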
The United States of America (USA) is the world’s largest producer of corn, accounting for approximately one-third of global production. We conduct a case study focusing on the USA’s top five corn-producing states: Iowa, Illinois, Indiana, Nebraska, and Minnesota. Together, they accounted for over one-half of the USA’s corn (grain) production in 2021.
Fig. 1: Map of the study area showing the difference in corn yield between 2012 (drought) and 2011 (pre-drought)

The effectiveness of machine learning for crop yield prediction also depends on selecting appropriate sets of features. To determine our predictors, we review selected studies that apply machine learning (ML) to remote sensing data for predicting crop yields. The figure below shows that the selected features adequately capture variations in yield.
- MODIS surface reflectance (MODIS
- Meteorological factors (temperature and precipitation)
- Spectral indices (NDVI and NDWI)
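
Both spectral indices are normalized differences of two reflectance bands. The sketch below shows the arithmetic only. It assumes the NIR/SWIR formulation of NDWI, which is common for vegetation water content; a green/NIR variant also exists, and this README does not state which one the study uses. The function names, the zero-denominator fallback, and the reflectance values are illustrative assumptions.

```python
def normalized_difference(a, b):
    """Return (a - b) / (a + b); fall back to 0.0 when a + b is zero
    (the fallback is an assumption of this sketch)."""
    denom = a + b
    return (a - b) / denom if denom != 0 else 0.0


def ndvi(nir, red):
    """Normalized Difference Vegetation Index."""
    return normalized_difference(nir, red)


def ndwi(nir, swir):
    """Normalized Difference Water Index.
    Assumption: NIR/SWIR formulation, not the green/NIR variant."""
    return normalized_difference(nir, swir)


# Made-up surface reflectance values for one vegetated pixel.
nir, red, swir = 0.5, 0.1, 0.3
print(round(ndvi(nir, red), 3), round(ndwi(nir, swir), 3))
```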
The data used for the study is available on Hugging Face.
The workflow compresses or transforms the spatio-temporal information into other data structures. A suitable machine learning model is then applied according to the type of structure.
Fig. 3: Demonstrating the data preparation workflow and experiment setup

The figure presents the percentage difference between observed and predicted crop yield for the year 2021. Interested readers are encouraged to consult the main paper for additional evaluation metrics for the extensive list of models compared.
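
As a rough illustration of the quantity mapped in the figure, the percentage difference for a county can be computed as below. The sign convention (predicted minus observed, relative to observed) is an assumption, as are the function name, the county names, and the yield values; the paper's exact definition may differ.

```python
def percentage_difference(observed, predicted):
    """Signed percentage difference relative to the observed yield.
    Positive values mean the model over-predicted (assumed convention)."""
    if observed == 0:
        raise ValueError("observed yield must be non-zero")
    return 100.0 * (predicted - observed) / observed


# Hypothetical (observed, predicted) yields for two counties.
yields = {"county_a": (180.0, 171.0), "county_b": (200.0, 210.0)}
diffs = {
    name: percentage_difference(obs, pred)
    for name, (obs, pred) in yields.items()
}
for name, diff in diffs.items():
    print(f"{name}: {diff:+.1f}%")
```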
Fig. 4: Demonstrating the data preparation workflow and experiment setup

Please cite our work as:
TBD
Model implementations from this work are sourced from: