| Table of Contents | |
| --- | --- |
| 1. Project Overview | 5. Visualizations |
| 2. Data Description | 6. Conclusion |
| 3. Data Cleaning | 7. References |
| 4. Methodology | |
This project uses data from Lucas Davis's 2004 paper, "The Effect of Health Risk on Housing Values: Evidence from a Cancer Cluster", published in the American Economic Review. The paper investigates the effect of a childhood cancer cluster in Churchill County on housing prices, estimating residents' willingness to pay to avoid environmental health risks. The goal of this project is to replicate the analysis using real estate transaction data from two counties: Churchill and Lyon.
The data can be found by following the link on the AER's website, which leads to the ICPSR data repository, or in this repository's data folder.
The dataset contains the following columns:
- `sales`: The sales price of the home.
- `sale_yr`: The year the house was sold.
- `cc`: County identifier (Churchill or Lyon).
- `home_type`: Type of home (single-family, multi-family, etc.).
`sale_yr` will be used to identify the timeline surrounding the cancer cluster event (year 2000). `cc` will help us differentiate between Churchill and Lyon counties.
Before proceeding with the analysis, we clean the dataset:
- Remove missing or invalid values.
- Convert columns to appropriate data types (e.g., `sale_yr` to integer).
- Filter the data to only include home sales between 1995 and 2005 (around the cancer cluster event).
- Create an indicator variable for Churchill County.
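# Drop rows with a missing sale date.
temp1 <- temp1[!is.na(temp1$date), ]
# Keep only records with use code 20.
temp1 <- temp1[temp1$usecode == 20, ]
# Keep sales dated up to the cutoff; `date` appears to be a numeric YYYYMMDD value.
temp1 <- temp1[temp1$date <= 20001300, ]
# Generate two new variables: a Churchill County indicator, cc, and a Lyon County indicator, lc.
temp1$cc <- 1
temp1$lc <- 0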
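The cleaning steps listed above (type conversion, the 1995 to 2005 window, and the county indicator) can be sketched in one place. This is only an illustration under stated assumptions: `prepare_sales` and the toy tables are hypothetical names that do not appear in the project scripts, and the sketch assumes each county table carries a numeric `sales` column and a `sale_yr` column.

```r
# Hypothetical helper (not from the original scripts): stack the two county
# tables and apply the cleaning steps listed above.
# Assumes each table has a numeric `sales` column and a `sale_yr` column.
prepare_sales <- function(churchill, lyon, first_yr = 1995L, last_yr = 2005L) {
  churchill$cc <- 1L  # Churchill County indicator
  lyon$cc <- 0L       # Lyon County rows get 0
  df <- rbind(churchill, lyon)
  df$sale_yr <- as.integer(df$sale_yr)  # e.g. "1998" becomes 1998L
  # Keep rows with a valid positive price and a sale year inside the window.
  keep <- !is.na(df$sales) & !is.na(df$sale_yr) & df$sales > 0 &
    df$sale_yr >= first_yr & df$sale_yr <= last_yr
  df[keep, , drop = FALSE]
}

# Toy inputs, made up purely for illustration.
churchill_toy <- data.frame(sales = c(120000, NA, 95000),
                            sale_yr = c("1998", "2001", "1993"),
                            stringsAsFactors = FALSE)
lyon_toy <- data.frame(sales = c(-1, 150000),
                       sale_yr = c("2000", "2004"),
                       stringsAsFactors = FALSE)
cleaned <- prepare_sales(churchill_toy, lyon_toy)
```

Only the 1998 Churchill sale and the 2004 Lyon sale survive in the toy example: the others have a missing price, a non-positive price, or a year outside the window.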
The core analysis uses a Difference-in-Differences (DiD) approach to estimate the impact of the cancer cluster on housing prices. The DiD approach compares the change in housing prices in Churchill County (the treatment group) before and after the cancer cluster emerged with the corresponding change in Lyon County (the control group), which is assumed to be unaffected.
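The model is specified as follows:

$$
\log(\text{sale\_price}) = \beta_0 + \beta_1 \, \text{Post} + \beta_2 \, \text{cc} + \beta_3 \, (\text{Post} \times \text{cc}) + \epsilon
$$

Where: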
- `log(sale_price)`: Log-transformed sales price of the home.
- `Post`: Indicator variable for years after 1999.
- `cc`: Indicator variable for Churchill County.
- `Post * cc`: Interaction term capturing the differential effect on Churchill County post-cancer cluster.
- `epsilon`: Error term.
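As a minimal sketch of how a specification with these terms could be estimated, the example below fits the interaction model with `lm()` on simulated data. The data frame `sim`, the column names `post` and `log_price`, and the "true" coefficients are all made up for illustration; they are not the project's data or results.

```r
set.seed(42)
n <- 4000

# Simulated sales: random assignment to period and county (illustration only).
sim <- data.frame(
  post = rbinom(n, 1, 0.5),  # 1 if sold after 1999
  cc   = rbinom(n, 1, 0.5)   # 1 if Churchill County
)

# Made-up "true" coefficients, chosen only to show that lm() recovers them.
sim$log_price <- 11.5 + 0.20 * sim$post - 0.05 * sim$cc -
  0.08 * sim$post * sim$cc + rnorm(n, sd = 0.3)

# `post * cc` expands to post + cc + post:cc, matching the terms defined above.
fit <- lm(log_price ~ post * cc, data = sim)

# The coefficient on the interaction is the DiD estimate.
did <- coef(fit)[["post:cc"]]
```

Calling `summary(fit)` would show all four coefficients with their standard errors; the interaction row is the one of interest for the DiD comparison.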
The sales prices are adjusted for inflation using the Nevada Home Price Index (`nvhpi`), available for each quarter. The real sales prices are calculated as:
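library(haven)  # provides read_dta() for Stata .dta files
# Load the quarterly Nevada Home Price Index.
nvhpi <- read_dta("price.dta")
nvhpi <- as.data.frame(nvhpi)
# Attach the index for each sale's year and quarter, keeping every sale.
tempn <- merge(temp, nvhpi[, c("year", "quarter", "nvhpi")],
               by.x = c("sale_yr", "q"),
               by.y = c("year", "quarter"),
               all.x = TRUE)
# Real (index-adjusted) sale price.
tempn$adj_index <- (tempn$sales * 100) / tempn$nvhpi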
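The merge above relies on a sale year (`sale_yr`) and a quarter (`q`) for each transaction. One way these keys might be derived is sketched below, assuming the `date` column is a numeric YYYYMMDD value (for example 19990415) and that quarters in the index are coded 1 to 4. Both are assumptions, and `add_year_quarter` is a hypothetical helper name.

```r
# Hypothetical helper: derive merge keys from a numeric YYYYMMDD date.
# The YYYYMMDD layout is an assumption about how `date` is stored.
add_year_quarter <- function(df) {
  month <- (df$date %/% 100) %% 100  # middle two digits
  df$sale_yr <- df$date %/% 10000    # leading four digits
  df$q <- (month - 1) %/% 3 + 1      # months 1-3 map to 1, ..., 10-12 map to 4
  df
}

# Toy dates for illustration.
toy_dates <- data.frame(date = c(19990415, 20001231, 19980101))
toy_dates <- add_year_quarter(toy_dates)
```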
- Regression Model: This model includes county, year, and other factors as independent variables to predict home prices.
- The regression analysis provides insights into the effect of the cancer cluster on housing prices, particularly in Churchill County after the emergence of the cancer cluster in 2000.
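β0: Houses in the control group (Lyon County) have an average log real sale price of 11.519 before the cancer cluster.
β1: After the cancer cluster emerged, real sale prices in the control group are approximately 23.1% higher.
β2: Being in the treatment group (Churchill County) is associated with approximately 4.8% lower real sale prices before the cancer cluster.
β3: The interaction of Post and Churchill County (the DiD estimate) indicates that real sale prices in Churchill County fell by approximately 7.6% relative to Lyon County after the cancer cluster emerged.
These percentages read the log-point coefficients as approximate percentage changes in price.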
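Because the outcome is in logs, a coefficient b corresponds to an exact percentage change of 100 * (exp(b) - 1), which moves away from 100 * b as b grows. The short sketch below applies that conversion. The values in `betas` are assumed log-point coefficients inferred from the percentages quoted above, not numbers taken from regression output.

```r
# Assumed log-point coefficients, inferred from the percentages quoted above.
betas <- c(post = 0.231, cc = -0.048, post_cc = -0.076)

# Exact percentage change implied by each log-point coefficient.
pct_exact <- (exp(betas) - 1) * 100

# Side-by-side comparison with the usual 100 * b approximation.
comparison <- data.frame(
  approx_pct = betas * 100,
  exact_pct  = round(pct_exact, 2)
)
```

For small coefficients the two columns nearly coincide; the gap is most visible for the largest coefficient.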
- Difference-in-Differences (DiD): We apply the DiD methodology to analyze if the cancer cluster event had a differential impact on home prices in Churchill County relative to Lyon County.
- Key coefficients from the DiD regressions are interpreted to assess how prices in Churchill County diverged from Lyon County after the cancer cluster began.
- A plot illustrating the trend of average home prices over time in both counties is provided, with confidence intervals for the estimates.
- A plot illustrating the estimated effects from the event study is provided.
The first visualization tracks the trend of home prices over time for both Churchill and Lyon counties. We check whether there was any significant price deviation around the year 2000, which corresponds to the identified cancer cluster event.
We perform an event study to analyze the impact of the cancer cluster event in 2000 on home prices. We define two periods: Pre-Event (before 2000) and Post-Event (2000 onward). The key steps are:
- Define the Event Window: Homes are categorized based on their sale year (pre and post 2000).
- Calculate Abnormal Returns: We compare the mean sales price of homes before and after the event for both counties.
- Plot Results: We display the mean home prices with confidence intervals to visually identify any significant changes.
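The steps above can be sketched as a small summary routine that computes the mean price and a 95% confidence interval for each county and period. This is a rough illustration, not the project's plotting code: `summarise_event`, the `price` column, and the toy data are hypothetical, and the sketch treats 2000 as the first post-event year, in line with the `Post` indicator (years after 1999).

```r
# Hypothetical helper: mean price and 95% CI by county (cc) and period.
summarise_event <- function(df, event_yr = 2000) {
  df$period <- ifelse(df$sale_yr >= event_yr, "post", "pre")
  groups <- split(df$price, list(df$cc, df$period), drop = TRUE)
  rows <- lapply(names(groups), function(g) {
    x <- groups[[g]]
    key <- strsplit(g, ".", fixed = TRUE)[[1]]  # e.g. "1.post" -> "1", "post"
    # Half-width of a t-based 95% confidence interval for the mean.
    half <- qt(0.975, df = length(x) - 1) * sd(x) / sqrt(length(x))
    data.frame(cc = as.integer(key[1]), period = key[2], n = length(x),
               mean = mean(x), lower = mean(x) - half, upper = mean(x) + half)
  })
  out <- do.call(rbind, rows)
  out[order(out$cc, out$period != "pre"), ]  # pre before post within county
}

# Toy data, made up for illustration.
toy <- data.frame(
  cc      = c(0, 0, 0, 0, 1, 1, 1, 1),
  sale_yr = c(1999, 1999, 2001, 2001, 1999, 1999, 2001, 2001),
  price   = c(100, 110, 120, 130, 100, 104, 95, 105)
)
res <- summarise_event(toy)

# Raw difference-in-differences of the group means (Churchill minus Lyon change).
m <- res$mean
did_raw <- (m[4] - m[3]) - (m[2] - m[1])
```

The `lower` and `upper` columns can be passed to any plotting routine to draw error bars around each group mean.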
The analysis provides insights into the possible effects of the cancer cluster on home prices. By combining regression models, the Difference-in-Differences approach, and an event study, we can assess whether the cancer cluster had a statistically significant impact on housing prices in Churchill County compared to Lyon County.
- Investigating additional variables (e.g., distance from the cancer cluster) might help refine the results.
- Expanding the sample to include more counties in the control and treatment groups might also help.
Davis, Lucas W. “The Effect of Health Risk on Housing Values: Evidence from a Cancer Cluster.” American Economic Review, vol. 94, no. 5, Nov. 2004, pp. 1693–1704, https://doi.org/10.1257/0002828043052358.