This repo contains the files for the Graduate Project for DATA 200 at Berkeley.
- `/analysis` contains all Jupyter notebooks used for data processing, modeling, and analysis.
  - files named `initial_*` are uncleaned initial data exploration
  - a subfolder `/analysis_figures` contains any photos included in the analysis notebooks
- `/data` contains all initial and processed data
- `/figures` contains all figures created in the analysis notebooks that were saved for the report (`plt.savefig` was used to save them)
- `/narrative` contains the `.tex` files used to compile the report, along with `report.pdf`, which is the final write-up
Since some notebooks process data and then save the output to be used in other notebooks, the data processing pipeline is included below to enable replication:
- `covid_data_processing.ipynb`, `vaccine_data_processing.ipynb`, and `weather_data_processing.ipynb` read and process the initial data
- `data_merge_and_eda.ipynb` combines the data output by the notebooks above and performs some exploratory data analysis
- `baseline_modeling.ipynb`, `weather_modeling.ipynb`, and `death_rate_models.ipynb` perform the data analysis described in the report using the aggregated data from the previous step
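
The order above could be scripted instead of run by hand. The following is a minimal sketch, not part of the repo: it assumes `jupyter nbconvert` is installed, that the notebooks sit directly in `/analysis`, and that each one runs top to bottom without manual input. The names `PIPELINE_STAGES`, `build_commands`, and `run_pipeline` are hypothetical helpers introduced here for illustration.

```python
import subprocess
from pathlib import Path

# Notebook names are taken from the pipeline description above.
# The three inner lists mirror its three steps (an illustrative grouping).
PIPELINE_STAGES = [
    [
        "covid_data_processing.ipynb",
        "vaccine_data_processing.ipynb",
        "weather_data_processing.ipynb",
    ],
    ["data_merge_and_eda.ipynb"],
    [
        "baseline_modeling.ipynb",
        "weather_modeling.ipynb",
        "death_rate_models.ipynb",
    ],
]


def build_commands(analysis_dir="analysis"):
    """Return one nbconvert command per notebook, in pipeline order."""
    commands = []
    for stage in PIPELINE_STAGES:
        for name in stage:
            notebook = Path(analysis_dir) / name
            commands.append([
                "jupyter", "nbconvert",
                "--to", "notebook",
                "--execute",   # run every cell
                "--inplace",   # write results back into the same file
                str(notebook),
            ])
    return commands


def run_pipeline(analysis_dir="analysis"):
    """Execute the notebooks in order, stopping at the first failure."""
    for command in build_commands(analysis_dir):
        # check=True raises CalledProcessError if a notebook fails,
        # so later steps never run on missing or stale data.
        subprocess.run(command, check=True)
```

Calling `run_pipeline()` from the repo root would then execute all seven notebooks in sequence. Note that `--inplace` overwrites the saved cell outputs in each notebook, so drop that flag if the committed outputs should be preserved.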