Greenness segregation shapes mental health racial inequality in the U.S.

Introduction

This repo contains the source code for the paper "Greenness segregation shapes mental health racial inequality in the U.S."

Folder Structure

├── MentalHealthInequity
│   ├── Data
│   │   ├── ACS
│   │   │   ├── CensusTract
│   │   │   │   ├── DP02-CT
│   │   │   │   │     ├── ACSDP5Y2019.DP02-Data.csv
│   │   │   │   ├── DP03-CT
│   │   │   │   ├── DP05-CT
│   │   │   ├── County
│   │   │   │   ├── DP02-County
│   │   │   │   ├── DP03-County
│   │   │   │   ├── DP05-County
│   │   ├── Boundary
│   │   │   ├── cb_2019_us_bg_500k
│   │   │   ├── cb_2019_us_county_5m
│   │   │   ├── cb_2019_us_tract_500k
│   │   │   ├── cb_2019_us_nation_5m
│   │   ├── PLACES (Please put the PLACES data here)
│   │   ├── Trust_for_Public_Land
│   │   │   ├── ParkServe_Shapefiles (Please put the ParkServe data here)
│   │   ├── WorldCover
│   │   │   ├── US (Please put the WorldCover data here)
│   │   ├── WorldPop
│   │   │   ├── usa_ppp_2019.tif (Please put the WorldPop data here)
│   ├── src
│   │   ├── data_download
│   │   ├── preprocess
│   │   ├── spark_for_safegraph
│   │   ├── Fig1_ABC.py
│   │   ├── Fig1_D.py
│   │   ├── Fig2_BC.py
│   │   ├── Fig2_EF.py
│   │   ├── Fig3.py
│   │   ├── Fig4.py
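
Before running anything, the layout above can be checked with a short script. This is a minimal sketch, not part of the repository: the helper name `missing_dirs` is hypothetical, and the list covers only the directories named in the tree above, given relative to the project root.

```python
from pathlib import Path

# Directories copied from the folder tree above, relative to the project root.
EXPECTED_DIRS = [
    "Data/ACS/CensusTract/DP02-CT",
    "Data/ACS/CensusTract/DP03-CT",
    "Data/ACS/CensusTract/DP05-CT",
    "Data/ACS/County/DP02-County",
    "Data/ACS/County/DP03-County",
    "Data/ACS/County/DP05-County",
    "Data/Boundary/cb_2019_us_bg_500k",
    "Data/Boundary/cb_2019_us_county_5m",
    "Data/Boundary/cb_2019_us_tract_500k",
    "Data/Boundary/cb_2019_us_nation_5m",
    "Data/PLACES",
    "Data/Trust_for_Public_Land/ParkServe_Shapefiles",
    "Data/WorldCover/US",
    "Data/WorldPop",
    "src/data_download",
    "src/preprocess",
    "src/spark_for_safegraph",
]


def missing_dirs(root, expected=EXPECTED_DIRS):
    """Return the expected sub-directories that do not exist under `root`."""
    root = Path(root)
    return [d for d in expected if not (root / d).is_dir()]
```

Calling `missing_dirs(".")` from the project root returns an empty list when every directory is in place, and otherwise lists the ones still to be created or downloaded.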

System Requirement

Installation Guide

On a modern computer with a fast internet connection, installation typically takes about 10 minutes.

  1. Download Anaconda following the instructions on the official website, for example with the command below (newer versions of Anaconda should also work).
wget -c https://repo.anaconda.com/archive/Anaconda3-2023.09-0-Linux-x86_64.sh
  2. Install Anaconda by following the command-line prompts. Allow conda init when asked.
bash ./Anaconda3-2023.09-0-Linux-x86_64.sh
  3. Close the current terminal window and open a new one. You should see (base) at the start of your command prompt.

  4. Use the following commands to create the Python environment.

conda create -n MentalHealth python=3.11
conda activate MentalHealth
pip install ipython pandas==2.1.3 matplotlib statsmodels plotly geopandas seaborn pathlib shapely rasterio scipy

(Optional) If you need to leave the environment to work on another project, use the following command.

conda deactivate 
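
To confirm that the environment is usable, one option is to check that each installed package can be found by the interpreter. The following is a small sketch rather than part of the repository: `missing_packages` is a hypothetical helper, and the names are the import names of the packages in the pip command above (`ipython` is imported as `IPython`; `pathlib` is omitted because it ships with Python 3.11).

```python
import importlib.util

# Import names for the packages installed by the pip command above.
PACKAGES = [
    "IPython", "pandas", "matplotlib", "statsmodels", "plotly",
    "geopandas", "seaborn", "shapely", "rasterio", "scipy",
]


def missing_packages(names=PACKAGES):
    """Return the top-level package names that cannot be located."""
    return [n for n in names if importlib.util.find_spec(n) is None]
```

Running `missing_packages()` inside the activated MentalHealth environment should return an empty list; any name it returns needs to be installed again.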

Prepare the Data

[Necessary] For reproducing the main results

We provide the necessary data at this Google Drive link. Please download it and place it in the root directory of this project.

[Optional] Starting from the beginning

ACS data

Please download the 2019 ACS 5-year estimates for DP02, DP03, DP04, and DP05 at both the census tract and county levels, uncompress them, and put the main data file of each in Data/ACS following the layout shown in Folder Structure.
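
For orientation, a downloaded ACS data profile file can be loaded along the following lines. This is a hedged sketch and not the repository's preprocessing code: it assumes the usual export layout, in which the first row holds column codes, the second row repeats them as descriptive labels, and a `GEO_ID` column holds identifiers such as `1400000US01001020100`. The function name `read_acs_profile` is hypothetical.

```python
import pandas as pd


def read_acs_profile(source):
    """Load an ACS data profile CSV and add a plain FIPS `GEOID` column.

    Assumes row 0 holds the column codes and row 1 holds descriptive
    labels, so row 1 is skipped.
    """
    # Read everything as text so leading zeros in identifiers are kept.
    df = pd.read_csv(source, skiprows=[1], dtype=str)
    # Keep only the part of GEO_ID after "US" (the state/county/tract FIPS code).
    df["GEOID"] = df["GEO_ID"].str.split("US").str[-1]
    return df
```

For example, `read_acs_profile("Data/ACS/CensusTract/DP02-CT/ACSDP5Y2019.DP02-Data.csv")` would return one row per census tract, with `GEOID` ready to join against the boundary files.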

Boundary data

Please download the 2019 cartographic boundary files for census block groups, census tracts, counties, and the nation from the United States Census Bureau, and place them in Data/Boundary as shown in Folder Structure.

PLACES data

Please download the PLACES data at the census tract level. Note that there is a 2-year lag between when the data were sampled and the release date.

ParkServe data

Please download the ParkServe data shapefile.

WorldCover data

We provide a Python script that downloads the WorldCover data automatically.

Run the following command from the root directory:

python ./src/data_download/download_worldcover_data.py

WorldPop data

Please download the 2019 WorldPop data for the United States.

SafeGraph data

The SafeGraph data can be purchased from Dewey.

Run the Code

Preprocess the data

Note: this step is optional. The preprocessed data files are already included in the Google Drive download described earlier.

python ./src/preprocess/0_make_all_ct_data.py
python ./src/preprocess/1_process_park_data_census_tract.py
python ./src/preprocess/2_process_landuse_data_census_tract.py
python ./src/preprocess/3_post_process_dynamic_visit.py  # Requires processed SafeGraph data; PySpark code for generating it is in ./src/spark_for_safegraph

After performing these steps, you will get the following files. These files are available in the above Google Drive link.

File Name
census_tract_data_all_with_park_2019.parquet
census_tract_data_all_with_park_with_landuse_2019.parquet
tract_visit_all_US_within_county_2019.parquet
tract_visit_selected_county_with_google_2019.parquet
park_visit_all_US_within_county_2019.parquet
park_tract_bipart_all_us_within_county_2019.parquet

Running the following commands produces the corresponding *.pdf files that reproduce the figures in our main manuscript. The expected run time is a few seconds.

Figure 1

python ./src/Fig1_ABC.py
python ./src/Fig1_D.py

Figure 2

python ./src/Fig2_BC.py
python ./src/Fig2_EF.py

Figure 3

python ./src/Fig3.py

Figure 4

python ./src/Fig4.py
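
The figure commands above can also be run in one go. The sketch below is a convenience wrapper and not part of the repository: `build_commands` and `run_all` are hypothetical names, and it assumes the scripts are launched from the project root, as in the commands above.

```python
import subprocess
import sys
from pathlib import Path

# Figure scripts in the order listed above.
FIGURE_SCRIPTS = [
    "Fig1_ABC.py", "Fig1_D.py",
    "Fig2_BC.py", "Fig2_EF.py",
    "Fig3.py", "Fig4.py",
]


def build_commands(scripts=FIGURE_SCRIPTS):
    """Return one [interpreter, script path] command per figure script."""
    return [[sys.executable, str(Path("src") / name)] for name in scripts]


def run_all(root="."):
    """Run every figure script from `root`, stopping at the first failure."""
    for cmd in build_commands():
        # check=True raises CalledProcessError if a script exits with an error.
        subprocess.run(cmd, cwd=root, check=True)
```

Calling `run_all()` from the project root inside the MentalHealth environment runs the six scripts in sequence with the same interpreter.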