Replication materials for the paper "Glyphosate exposure and GM seed rollout unequally reduced perinatal health"
We run the analysis using make (version 4.4.1) and R (version 4.4.1). We use renv to manage packages. To get started, install renv and run renv::restore() to download and install the package versions used in this project, which are recorded in the renv.lock file.
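The setup step above can be sketched as a small shell helper. This is only an illustration under stated assumptions: the function name setup_renv is hypothetical, Rscript is assumed to be on PATH, and installing renv from the CRAN cloud mirror is our assumption about how renv gets installed. The block only defines the function; nothing runs until you call it.

```shell
# Hypothetical helper: restore the R package library recorded in renv.lock.
# Assumes it is called from the repository root.
setup_renv() {
  if ! command -v Rscript >/dev/null 2>&1; then
    echo "Rscript not found; install R first" >&2
    return 1
  fi
  if [ ! -f renv.lock ]; then
    echo "renv.lock not found; run this from the repository root" >&2
    return 1
  fi
  # Install renv only if it is missing (the CRAN mirror is an assumption),
  # then restore the package versions pinned in renv.lock.
  Rscript -e 'if (!requireNamespace("renv", quietly = TRUE)) install.packages("renv", repos = "https://cloud.r-project.org")' &&
    Rscript -e 'renv::restore()'
}
```

After sourcing the snippet, call setup_renv from the repository root.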
make data-clean will generate all of the intermediate files we need for the analysis, which takes about 4 minutes to run. It does not run two categories of targets: downloading data and the water ML pipeline. Both take a while to run, so their outputs are grouped together with the data we downloaded manually in the data/download-manual, data/download-script, and data/watershed directories.
make data-raw creates all of the files in the data/raw directory, make data-clean creates all of the files in the data/clean directory, make water creates all of the files in the data/watershed directory, and make data-download creates all of the files in the data/download-script directory.
For analysis:
- make desc-figs will create the descriptive time series plots and maps, as well as some appendix figures
- make cnty-results runs the county level analysis (event studies, DiD, TSLS for many outcomes and different treatment vars) and creates the event study figures. It also does the Ag district level analysis
- make predict-bw runs scripts to train birthweight prediction models and generate predictions for each birth
- make micro-mods runs the birth-level analysis
- make micro-results creates figures from the birth-level analysis
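To run several of these targets in one go, a wrapper along the following lines may help. It is a sketch only: run_targets and the MAKE_CMD override are hypothetical names that are not part of this repository, and the order shown in the example call is simply the order the targets are listed above, which we assume (but have not confirmed) is a valid execution order.

```shell
# Hypothetical wrapper: run the named make targets in order and stop at the
# first failure. MAKE_CMD can be overridden (for example with a stub) for testing.
run_targets() {
  for target in "$@"; do
    echo "==> make $target"
    ${MAKE_CMD:-make} "$target" || {
      echo "target failed: $target" >&2
      return 1
    }
  done
}

# Example call (left commented out so that sourcing the file runs nothing):
# run_targets data-clean desc-figs cnty-results predict-bw micro-mods micro-results
```

Stopping at the first failed target avoids producing figures from stale intermediate files.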
The following API keys are required. Save them to .Renviron with usethis::edit_r_environ():
- USDA QuickStats API, saved in .Renviron as NASS_KEY
- Census API, saved in .Renviron as CENSUS_KEY
- BEA API, saved in .Renviron as BEA_KEY
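Before running the download targets, it can be useful to confirm that all three keys are present. The sketch below assumes the keys live in the user-level ~/.Renviron (the file that usethis::edit_r_environ() opens by default) as plain KEY=value lines; check_renviron is a hypothetical helper name, not something shipped with this repository.

```shell
# Hypothetical check: report which required API keys are missing from an
# .Renviron file. Expected entries look like:
#   NASS_KEY=your-key-here
#   CENSUS_KEY=your-key-here
#   BEA_KEY=your-key-here
check_renviron() {
  file="${1:-$HOME/.Renviron}"
  missing=0
  for key in NASS_KEY CENSUS_KEY BEA_KEY; do
    # Require the key name at the start of a line followed by a non-empty value.
    if ! grep -q "^${key}=." "$file" 2>/dev/null; then
      echo "missing: $key"
      missing=1
    fi
  done
  return "$missing"
}
```

Calling check_renviron with no argument inspects ~/.Renviron; pass a path to check a project-level file instead. Remember to restart R after editing the file so the new values are picked up.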
Most data required for our analysis is included in this repository, but some files are too large or not allowed to be shared publicly.
Instructions on how to get access to the restricted Births (Natality) and Deaths (Mortality) files are on the NCHS website. Our primary analysis uses the natality files between 1990 and 2013. We do supplemental analysis that uses the mortality files over the same time period. Once obtained, the raw natality and mortality files go into data/health-restricted/raw.
Download the HydroBASINS data for North America and copy the contents into data/watersheds/hydrobasins.
There are two pieces of data from the USGS's gridded soil survey needed for water analysis, which the USGS hosts on Box here:
- Download MUKEY Grids (TIF)/FY2021_gNATSGO_mukey_grid.zip and place the contents here: data/watersheds/soil-quality/gNATSGO_mukey_grid
- Download MUKEY Grids (TIF)/FY2021_gNATSGO_Tabular_CSV.zip and place the contents here: data/watersheds/soil-quality/gNATSGO_Tabular_CSV
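The manually obtained data described above lands in four directories. As a rough aid, the following sketch creates them and lists the ones that are still empty. The helper name prepare_manual_dirs is hypothetical, and the directory paths are copied as written in this README; the helper does not check that the files inside are the right ones.

```shell
# Hypothetical helper: create the directories for manually obtained data under
# a repository root (default: current directory) and report the empty ones.
prepare_manual_dirs() {
  root="${1:-.}"
  for dir in \
    data/health-restricted/raw \
    data/watersheds/hydrobasins \
    data/watersheds/soil-quality/gNATSGO_mukey_grid \
    data/watersheds/soil-quality/gNATSGO_Tabular_CSV
  do
    mkdir -p "$root/$dir" || return 1
    # ls -A lists every entry except . and .., so empty output means empty dir.
    if [ -z "$(ls -A "$root/$dir")" ]; then
      echo "empty: $dir"
    fi
  done
}
```

Running prepare_manual_dirs from the repository root before and after copying the files gives a quick view of what is still outstanding.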
GAEZ data for Attainable Yield is in data/download-manual/attainable-yield/. These can also be downloaded using the following links:
- Soy high: res05/CRUTS32/Hist/8110H/ylHr_soy.tif
- Soy low: res05/CRUTS32/Hist/8110H/ylLr_soy.tif
- Corn high: res05/CRUTS32/Hist/8110H/ylHr_mze.tif
- Corn low: res05/CRUTS32/Hist/8110H/ylLr_mze.tif
- Cotton high: res05/CRUTS32/Hist/8110H/ylHr_cot.tif
- Cotton low: res05/CRUTS32/Hist/8110H/ylLr_cot.tif
The USDA Agricultural Statistics District to County FIPS crosswalk is here.