Update paper, meta data. Update test_gaea. Improve vignettes
mengqi-z committed Aug 3, 2024
1 parent 22f2609 commit 06aaaff
Showing 14 changed files with 361 additions and 31 deletions.
6 changes: 4 additions & 2 deletions README.md
@@ -15,7 +15,7 @@

`gaea` is an open-source R package designed to estimate crop yield shocks in response to annual climate variations and CO2 concentrations at the country scale for 12 major crops. `gaea` streamlines the workflow from raw climate data processing to the production of different forms of yield shock, such as agricultural productivity changes at the region-basin level, which can be directly integrated into the latest Global Change Analysis Model (GCAM).

<br>
<br />

<p align="center">
<a href="https://jgcri.github.io/gaea/" target="_blank"><img src="https://github.com/JGCRI/jgcricolors/blob/main/vignettes/button_user_guide.PNG?raw=true"
@@ -31,6 +31,8 @@ alt="https://jgcri.github.io/gaea/" height="60"/></a>

> Zhao, M., Waldhoff, S., Tebaldi, C., Snyder, A. 2024. gaea: An R package to estimate crop yield responses to temperature and precipitation. (In progress) Journal of Open Source Software, DOI: XXXX
<br/>

<!-- ------------------------>
<!-- ------------------------>
# <a name="InstallGuide"></a>Installation Guide
@@ -64,7 +66,7 @@ remotes::install_github("JGCRI/gaea")

> Waldhoff, S.T., Wing, I.S., Edmonds, J., Leng, G. and Zhang, X., 2020. Future climate impacts on global agricultural yields over the 21st century. Environmental Research Letters, 15(11), p.114010. https://doi.org/10.1088/1748-9326/abadcb

<br/>

<!-- ------------------------>
<!-- ------------------------>
2 changes: 1 addition & 1 deletion _pkgdown.yml
@@ -19,7 +19,7 @@ navbar:
text: "News"
href: news/index.html
- text: "Contribute"
href: CONTRIBUTE.html
href: CONTRIBUTING.html
- icon: fas fa-quote-left
text: Citation
href: CITATION.html
14 changes: 7 additions & 7 deletions paper/paper.md
@@ -8,13 +8,13 @@ authors:
orcid: 0000-0001-5385-2758
affiliation: 1
- name: Stephanie Waldhoff
orcid:
orcid: 0000-0002-8073-0868
affiliation: 2
- name: Claudia Tebaldi
orcid:
orcid: 0000-0001-9233-8903
affiliation: 2
- name: Abigail Snyder
orcid:
orcid: 0000-0002-9034-9948
affiliation: 2

affiliations:
@@ -28,7 +28,7 @@ bibliography: paper.bib

# Summary

`gaea` is an open-source R package designed to estimate crop yield shocks in response to annual climate variations and CO2 concentrations at the country scale for 12 major crops. This innovative tool streamlines the workflow from raw data processing to the production of different forms of yield shock, such as agricultural productivity changes at the basin level, which can be directly integrated into the latest Global Change Analysis Model (GCAM) [@Bond-Lamberty_2024; @Calvin_2019]. At its core, `gaea` employs an empirical econometric model that leverages historical crop yield data (e.g., from [FAOSTAT](https://www.fao.org/faostat/en/#data/QCL)) and climate data for robust empirical fitting across diverse country-crop-climate combinations. `gaea` is fully compatible with the widely-used climate data from the Coupled Model Intercomparison Project Phase 6 (CMIP6), bias-adjusted by the Inter-Sectoral Impact Model Intercomparison Project ([ISIMIP](https://www.isimip.org/)). For future projections, the package aggregates global gridded precipitation and temperature data to the national level, weighted by cropland area derived from MIRCA [@Portmann_2010]. This approach enables the projection of annual yield shocks under various future climate scenarios, differentiated by crop type, country, and time. More broadly, `gaea` serves as a lightweight yet powerful model that equips researchers with the tools to explore the possibility space of global crop yields responses to future climate uncertainties, enhancing human-Earth system analysis capabilities.
`gaea` is an open-source R package designed to estimate crop yield shocks in response to annual climate variations and CO2 concentrations at the country scale for 12 major crops. This innovative tool streamlines the workflow from raw climate data processing to the production of different forms of yield shock, such as agricultural productivity changes at the region-basin level, which can be directly integrated into the latest Global Change Analysis Model (GCAM) [@Bond-Lamberty_2024; @Calvin_2019]. At its core, `gaea` employs an empirical econometric model that leverages historical crop yield data (e.g., from [FAOSTAT](https://www.fao.org/faostat/en/#data/QCL)) and climate data for robust empirical fitting across diverse country-crop-climate combinations. `gaea` is fully compatible with the widely used climate data from the Coupled Model Intercomparison Project Phase 6 (CMIP6), bias-adjusted by the Inter-Sectoral Impact Model Intercomparison Project ([ISIMIP](https://www.isimip.org/)). For future projections, the package aggregates global gridded precipitation and temperature data to the national level, weighted by cropland area derived from MIRCA [@Portmann_2010]. This approach enables the projection of annual yield shocks under various future climate scenarios, differentiated by crop type, country, and time. More broadly, `gaea` serves as a lightweight yet powerful model that equips researchers with the tools to explore the possibility space of global crop yield responses to future climate uncertainties, enhancing human-Earth system analysis capabilities.


# Statement of need
@@ -46,15 +46,15 @@ The exploration of future climate change impacts on agricultural production is c

The primary functionality of `gaea` is encapsulated in the `yield_impact` wrapper function, which executes the entire workflow from climate data processing to yield shock estimation. Users can also execute individual functions to work through the main steps of the process (\autoref{fig:workflow}). Detailed instructions on `gaea` can be accessed at https://jgcri.github.io/gaea.

1. `weighted_climate`: Processes CMIP-ISIMIP climate NetCDF data and calculates cropland-weighted precipitation and temperature at the country level, differentiated by crop type, irrigation type, and country. The function accepts both daily or monthly climate data.
1. `weighted_climate`: Processes CMIP-ISIMIP climate NetCDF data and calculates cropland-weighted precipitation and temperature at the country level, differentiated by crop type and irrigation type. The function accepts either daily or monthly climate data.
2. `crop_calendars`: Generates crop planting months for each country based on crop calendar data [@Sacks_2010].
3. `data_aggregation`: Calculates crop growing seasons using climate variables processed by `weighted_climate` and crop calendars for both historical and projected periods. This function prepares climate and yield data for subsequent model fitting.
4. `yield_regression`: Performs regression analysis based on historical crop yields and climate variations. The default econometric model employed in `gaea` is derived from @Waldhoff_2020.
4. `yield_regression`: Performs regression analysis based on historical crop yields and climate variations. The default econometric model applied in `gaea` is from @Waldhoff_2020. Users can provide other formulas as long as they use the same data from `data_aggregation`.
5. `yield_projections`: Projects yield shocks for future climate scenarios using the fitted model.
6. `gcam_agprodchange`: Remaps country-level yield shocks to GCAM-required spatial scales (i.e., region and basin intersections) based on harvest area and converts crops to GCAM commodities. This function calculates agricultural productivity growth (a key parameter for GCAM to estimate future yield) and creates ready-to-use XML outputs for GCAM.
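
The cropland weighting described in step 1 can be illustrated with a small base R sketch. This is not `gaea`'s implementation: the data frame, its column names (`tas_degC`, `crop_ha`), and all values are made up for illustration, and the real function works on NetCDF grids and MIRCA cropland areas rather than a four-row table.

```r
# Hypothetical toy grid: three cells in country "A", one in country "B".
# All numbers are invented for illustration only.
cells <- data.frame(
  country  = c("A", "A", "A", "B"),
  tas_degC = c(10, 20, 30, 15),   # grid-cell mean temperature
  crop_ha  = c(100, 300, 0, 50)   # cropland area, used as the weight
)

# Cropland-weighted mean temperature per country.
# A cell with zero cropland (the 30 degC cell) does not affect the result.
weighted_by_country <- sapply(
  split(cells, cells$country),
  function(d) weighted.mean(d$tas_degC, w = d$crop_ha)
)

print(weighted_by_country)
```

The weighting keeps climate in non-agricultural cells from influencing the country-level series, which is why the result for country "A" sits closer to the cell with the most cropland than a plain average would.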


![The gaea workflow showing the functions and the steps of modeling crop yield shocks to climate variations using an empirical econometric model. \label{fig:workflow}](workflow.jpg)
![The gaea workflow showing the functions and the corresponding outputs of modeling crop yield shocks to climate variations using an empirical econometric model. \label{fig:workflow}](workflow.jpg)


# Acknowledgements
Binary file modified paper/workflow.jpg
126 changes: 126 additions & 0 deletions tests/testthat/helper.R
@@ -0,0 +1,126 @@
library(dplyr)

# -------------------------------
# Prepare example data
# -------------------------------

# output directory
output_dir_i <- file.path(getwd(), 'output')

# download data
data_dir <- gaea::get_example_data(
download_url = 'https://zenodo.org/records/13179630/files/weighted_climate.zip?download=1',
data_dir = output_dir_i)

# setup variables
climate_hist_dir_i <- file.path(data_dir, 'canesm5_hist')
climate_impact_dir_i <- file.path(data_dir, 'canesm5')

climate_model_i <- 'canesm5'
climate_scenario_i <- 'gcam-ref'
member_i = 'r1i1p1f1'
bias_adj_i = 'w5e5'

cfe_i = 'no-cfe'
gcam_version_i = 'gcam7'

base_year_i = 2015
start_year_i = 2015
end_year_i = 2100
smooth_window_i = 20

diagnostics_i <- TRUE
use_default_coeff_i <- FALSE


# -------------------------------
# Functions
# -------------------------------

# Step 1: weighted_climate

# Step 2: crop_calendars
run_crop_calendars <- function(output_dir = output_dir_i){

output <- gaea::crop_calendars(output_dir = output_dir)

return(output)
}

# Step 3: data_aggregation
run_data_aggregation <- function(climate_hist_dir = climate_hist_dir_i,
climate_impact_dir = climate_impact_dir_i,
climate_model = climate_model_i,
climate_scenario = climate_scenario_i,
output_dir = output_dir_i){

output <- gaea::data_aggregation(climate_hist_dir = climate_hist_dir,
climate_impact_dir = climate_impact_dir,
climate_model = climate_model,
climate_scenario = climate_scenario,
output_dir = output_dir)

return(output)
}


# Step 4: yield_regression
run_yield_regression <- function(diagnostics = diagnostics_i,
output_dir = output_dir_i){

gaea::yield_regression(diagnostics = diagnostics,
output_dir = output_dir)

}


# Step 5: yield_projections

run_yield_projections <- function(use_default_coeff = use_default_coeff_i,
climate_model = climate_model_i,
climate_scenario = climate_scenario_i,
base_year = base_year_i,
start_year = start_year_i,
end_year = end_year_i,
smooth_window = smooth_window_i,
diagnostics = diagnostics_i,
output_dir = output_dir_i){

output <- gaea::yield_projections(use_default_coeff = use_default_coeff,
climate_model = climate_model,
climate_scenario = climate_scenario,
base_year = base_year,
start_year = start_year,
end_year = end_year,
smooth_window = smooth_window,
diagnostics = diagnostics,
output_dir = output_dir)

return(output)

}

# Step 6: gcam_agprodchange

run_gcam_agprodchange <- function(data = NULL,
climate_model = climate_model_i,
climate_scenario = climate_scenario_i,
member = member_i,
bias_adj = bias_adj_i,
cfe = cfe_i,
gcam_version = gcam_version_i,
diagnostics = diagnostics_i,
output_dir = output_dir_i){

output <- gaea::gcam_agprodchange(data = data,  # pass the function argument, not an undefined global
climate_model = climate_model,
climate_scenario = climate_scenario,
member = member,
bias_adj = bias_adj,
cfe = cfe,
gcam_version = gcam_version,
diagnostics = diagnostics,
output_dir = output_dir)

return(output)
}
78 changes: 65 additions & 13 deletions tests/testthat/test-gaea.R
@@ -1,19 +1,71 @@
library(gaea); library(testthat);
library(sf)

timestep = 'monthly'
# climate_model = 'canesm5'
# climate_scenario = 'gcam-ref'
# time_periods = seq(2015, 2020, 1)
output_dir = file.path(getwd(), 'output')
testthat::skip_on_cran()
testthat::skip_on_travis()
testthat::skip_on_ci()

# Run tests for each function
test_that("weighted_climate runs correctly", {
testthat::expect_error(gaea::weighted_climate(timestep = NULL))
# ------------------------------------
# Testing Outputs from Major Functions
# ------------------------------------
#
# test_that("crop_calendars runs correctly", {
#
# out_crop_calendars <- run_crop_calendars()
#
# testthat::expect_snapshot_file(
# testthat::test_path('output', 'data_processed', 'crop_calendar.csv'))
# })
#
# test_that("data_aggregation runs correctly", {
#
# out_data_aggregation <- run_data_aggregation()
#
# testthat::expect_snapshot_file(
# testthat::test_path('output', 'data_processed', 'weather_canesm5_gcam-ref_wheat.csv'))
# testthat::expect_snapshot_file(
# testthat::test_path('output', 'data_processed', 'historic_vars_wheat.csv'))
# })
#
#
# test_that("yield_regression runs correctly", {
#
# out_yield_regression <- run_yield_regression()
#
# testthat::expect_snapshot_file(
# testthat::test_path('output', 'data_processed', 'reg_out_wheat_fit_lnyield_mmm_quad_noco2_nogdp.csv'))
# testthat::expect_snapshot_file(
# testthat::test_path('output', 'data_processed', 'weather_yield_wheat.csv'))
# testthat::expect_snapshot_file(
# testthat::test_path('output', 'figures', 'model_wheat_fit_lnyield_mmm_quad_noco2_nogdp.pdf'))
# })

test_that("yield_projections runs correctly", {

out_yield_projections <- run_yield_projections()

testthat::expect_snapshot_file(
testthat::test_path('output', 'yield_impacts_annual', 'yield_canesm5_gcam-ref_wheat.csv'))
testthat::expect_snapshot_file(
testthat::test_path('output', 'yield_impacts_smooth', 'yield_canesm5_gcam-ref_wheat.csv'))
testthat::expect_snapshot_file(
testthat::test_path('output', 'data_processed', 'format_yield_change_rel2015.csv'))
testthat::expect_snapshot_file(
testthat::test_path('output', 'figures', 'annual_projected_climate_impacts_canesm5_gcam-ref_wheat_fit_lnyield_mmm_quad_noco2_nogdp.pdf'))
testthat::expect_snapshot_file(
testthat::test_path('output', 'figures', 'smooth_projected_climate_impacts_canesm5_gcam-ref_wheat_fit_lnyield_mmm_quad_noco2_nogdp.pdf'))
testthat::expect_snapshot_file(
testthat::test_path('output', 'maps', 'map_canesm5_gcam-ref_wheat_2090.pdf'))
})

# Run tests for each function
test_that("crop_calendars runs correctly", {
crop_cal <- gaea::crop_calendars(output_dir = output_dir)

test_that("gcam_agprodchange runs correctly", {

out_gcam_agprodchange <- run_gcam_agprodchange(data = out_yield_projections)

testthat::expect_snapshot_file(
testthat::test_path('output', 'gcam7_agprodchange_no-cfe', 'agyield_impact_canesm5_r1i1p1f1_w5e5v2_gcam-ref.xml'))
testthat::expect_snapshot_file(
testthat::test_path('output', 'gcam7_agprodchange_no-cfe', 'figures_yield_impacts', 'Wheat.png'))
testthat::expect_snapshot_file(
testthat::test_path('output', 'data_processed', 'crop_calendar.csv'))
testthat::test_path('output', 'gcam7_agprodchange_no-cfe', 'figures_agprodchange', 'Wheat.png'))
})
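
The snapshot file names in the tests above (for example `reg_out_wheat_fit_lnyield_mmm_quad_noco2_nogdp.csv`) suggest a log-yield model with quadratic climate terms and no CO2 or GDP terms. The following is a minimal sketch of that kind of fit on synthetic data, under that assumption. It is not the specification from Waldhoff et al. (2020) and not the code in `yield_regression`; the variable names and coefficients are invented.

```r
# Synthetic data: all values and coefficients are made up for illustration.
set.seed(1)
n      <- 200
temp   <- runif(n, 10, 30)     # growing-season mean temperature (degC)
precip <- runif(n, 20, 200)    # growing-season precipitation (mm/month)

# Assumed "true" response: log yield is quadratic in temperature and precipitation.
ln_yield <- 0.5 + 0.2 * temp - 0.005 * temp^2 +
  0.01 * precip - 0.00003 * precip^2 + rnorm(n, sd = 0.05)

toy <- data.frame(ln_yield, temp, precip)

# Quadratic log-yield regression; I() protects the squared terms in the formula.
fit <- lm(ln_yield ~ temp + I(temp^2) + precip + I(precip^2), data = toy)

print(coef(fit))
```

A negative coefficient on the squared temperature term gives the inverted-U response that quadratic specifications are typically chosen to capture: yields rise with warming up to an optimum and fall beyond it.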
48 changes: 48 additions & 0 deletions vignettes/vignette.R
@@ -0,0 +1,48 @@
## ----eval=FALSE---------------------------------------------------------------
#
# # load gaea
# library(gaea)
#
# # NOTE: please change `data_dir` to your desired location for downloaded data
# data_dir <- gaea::get_example_data(
# download_url = 'https://zenodo.org/records/13179630/files/gaea_example_climate.zip?download=1',
# data_dir = 'path/to/desired/location'
# )
#
# # Path to the climate NetCDF files
# # NOTE: Each variable can have more than one file
# # historical climate data
# pr_historical_file <- file.path(data_dir, 'pr_monthly_canesm5_w5e5_rcp7_1950_2014.nc')
# tas_historical_file <- file.path(data_dir, 'tas_monthly_canesm5_w5e5_rcp7_1950_2014.nc')
#
# # projected climate data
# pr_projection_file <- file.path(data_dir, 'pr_monthly_canesm5_w5e5_rcp7_2015_2100.nc')
# tas_projection_file <- file.path(data_dir, 'tas_monthly_canesm5_w5e5_rcp7_2015_2100.nc')
#
# # Run gaea
# # The full run with raw climate data can take up to an hour
# gaea::yield_impact(
# pr_hist_ncdf = pr_historical_file,
# tas_hist_ncdf = tas_historical_file,
# pr_proj_ncdf = pr_projection_file,
# tas_proj_ncdf = tas_projection_file,
# timestep = 'monthly', # specify the time step of the NetCDF data (monthly or daily)
# historical_periods = c(1950:2014), # vector of historical years selected for fitting
# climate_model = 'canesm5', # label of climate model name
# climate_scenario = 'gcam-ref', # label of climate scenario name
# member = 'r1i1p1f1', # label of ensemble member name
# bias_adj = 'w5e5', # label of climate data for bias adjustment
# cfe = 'no-cfe', # label of CO2 fertilization effect in the formula (default is no CFE)
# gcam_version = 'gcam7', # output is different depending on the GCAM version (gcam6 or gcam7)
# use_default_coeff = FALSE, # set to TRUE when there is no historical climate data available
# base_year = 2015, # GCAM base year
# start_year = 2015, # start year of the projected climate data
# end_year = 2100, # end year of the projected climate data
# smooth_window = 20, # number of years as smoothing window
# co2_hist = NULL, # historical annual CO2 concentration. If NULL, will use default value
# co2_proj = NULL, # projected annual CO2 concentration. If NULL, will use default value
# diagnostics = TRUE, # set to TRUE to output diagnostic plots
# output_dir = 'path/to/output/folder' # path to the output folder
# )
#
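
The `base_year` and `smooth_window` arguments in the vignette can be pictured with a short sketch on a synthetic yield series. How `gaea` smooths internally is not shown in this commit, so the centered moving average below is an assumption chosen for illustration, and the series itself is invented.

```r
# Synthetic annual yield series, 2015-2100 (values are made up).
years <- 2015:2100
set.seed(42)
yield <- 3 + 0.005 * (years - 2015) + rnorm(length(years), sd = 0.2)

# Yield shock expressed relative to the base year (base year = 1).
base_year    <- 2015
shock_annual <- yield / yield[years == base_year]

# Assumed smoothing: a centered moving average over `smooth_window` years.
# Years where the full window does not fit are returned as NA.
smooth_window <- 20
ma_weights    <- rep(1 / smooth_window, smooth_window)
shock_smooth  <- as.numeric(stats::filter(shock_annual, ma_weights, sides = 2))

head(data.frame(year = years, annual = shock_annual, smooth = shock_smooth), 12)
```

Smoothing removes year-to-year weather noise, so the smoothed series reflects the longer-term climate signal, which is the quantity a long-horizon model such as GCAM would usually consume.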
