This project is a comprehensive analysis of the real estate market in Seattle. Using a Jupyter Notebook, we explore various trends, pricing factors, and the impact of economic indicators on the housing market.
- Identify patterns and trends in Seattle real estate prices.
- Analyze the relationship between housing features and market prices.
- Quantify the relationship between housing prices and above-ground square footage.
- Compare statistical metrics of properties with and without waterfront views.
The analysis is conducted using a dataset that includes information on housing features, sale prices, and dates of transactions. The dataset contains the following key columns:
- Price: Sale price of the home
- Date: Date of the sale
- Bedrooms: Number of bedrooms
- Bathrooms: Number of bathrooms
- Sqft_Living: Square footage of the living space
- Sqft_Lot: Square footage of the lot
- Floors: Number of floors
- Waterfront: Whether the home is on the waterfront
- View: Quality of the view from the home
- Condition: Overall condition of the home
- Grade: Overall grade given to the housing unit
- Sqft_Above: Square footage of the house apart from the basement
- Sqft_Basement: Square footage of the basement
- Year_Built: Year when the house was built
- Year_Renovated: Year when the house was renovated
- Zipcode: Zip code area
- Lat: Latitude coordinate
- Long: Longitude coordinate
The notebook proceeds through the following steps:
- Data cleaning
- Data wrangling
- Exploratory data analysis (EDA) to understand the distributions of the variables and the relationships between them
- Linear regression
- Coefficient of determination (R^2)
- Ridge regression
- Boxplots to find the minimum, first quartile, median, third quartile, and maximum
- Visualizations and statistical methods to support the analysis and draw conclusions
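The boxplot step can be sketched numerically as follows. This is a minimal illustration, not the notebook's actual code: the rows are made up, and only the `Waterfront` and `Price` column names come from the dataset description above.

```python
import pandas as pd

# Made-up toy rows for illustration only; not taken from the project's dataset.
homes = pd.DataFrame(
    {
        "Waterfront": [0, 0, 0, 0, 0, 1, 1, 1],
        "Price": [300_000, 350_000, 400_000, 450_000, 500_000,
                  900_000, 1_200_000, 1_500_000],
    }
)

# Five-number summary of Price for each Waterfront group.
# A boxplot draws the same quartiles; its whiskers may stop short of the
# min/max when outliers are present.
summary = (
    homes.groupby("Waterfront")["Price"]
    .quantile([0.0, 0.25, 0.5, 0.75, 1.0])
    .unstack()
)
summary.columns = ["min", "q1", "median", "q3", "max"]
print(summary)
```

Grouping before summarizing keeps the waterfront and non-waterfront distributions side by side, which is the comparison the boxplots are meant to show.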
- Jupyter Notebook: The coding environment for Python programming.
- Excel: The main data file from which the dataframe is extracted and manipulated.
- Pandas: For data manipulation and analysis.
- Matplotlib/Seaborn: For creating static, interactive, and informative visualizations.
- sklearn: For linear and polynomial regression modeling, machine learning, and standardization.
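As a rough sketch of how the sklearn pieces listed above can fit together, the example below chains standardization, polynomial features, and ridge regression in one pipeline. The data is synthetic, and the polynomial degree, `alpha` value, and train/test split are assumptions chosen for illustration, not settings taken from the notebook.

```python
import numpy as np
from sklearn.linear_model import Ridge
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import PolynomialFeatures, StandardScaler

# Synthetic data for illustration only (not the project's dataset).
rng = np.random.default_rng(0)
sqft = rng.uniform(500, 4000, size=200).reshape(-1, 1)
price = 60_000 + 270 * sqft[:, 0] + rng.normal(0, 50_000, size=200)

X_train, X_test, y_train, y_test = train_test_split(
    sqft, price, test_size=0.25, random_state=0
)

# Standardize, expand to degree-2 polynomial terms, then fit a ridge model.
model = Pipeline(
    [
        ("scale", StandardScaler()),
        ("poly", PolynomialFeatures(degree=2, include_bias=False)),
        ("ridge", Ridge(alpha=0.1)),  # alpha is an assumed value
    ]
)
model.fit(X_train, y_train)

# score() returns the coefficient of determination (R^2) on held-out data.
r2_test = model.score(X_test, y_test)
print(f"R^2 on held-out data: {r2_test:.3f}")
```

Wrapping the steps in a `Pipeline` means the scaler is fitted on the training split only, so no information from the test split leaks into the model.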
The relationship between home price and above-ground square footage can be modeled by the following function: f(x) = 268.47x + 59953.19, where x is the above-ground square footage and f(x) is the predicted price in dollars. Assuming all other factors remain constant, this means that house prices in King County increase by approximately $268.47 for each additional square foot of above-ground space.
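To make the fitted line concrete, the short sketch below evaluates it at two example sizes. The slope and intercept are the values reported above; the function name `predicted_price` and the example sizes are chosen here for illustration.

```python
SLOPE = 268.47        # dollars per additional above-ground square foot
INTERCEPT = 59953.19  # dollars

def predicted_price(sqft_above: float) -> float:
    """Evaluate f(x) = 268.47x + 59953.19 for a given square footage."""
    return SLOPE * sqft_above + INTERCEPT

for sqft in (1000, 2000):
    print(f"{sqft} sqft -> ${predicted_price(sqft):,.2f}")
# 1000 sqft -> $328,423.19
# 2000 sqft -> $596,893.19
```

The two predictions differ by $268,470, which is 1,000 additional square feet times the slope.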
R^2 = 49.29% indicates that the square footage of the living space (independent variable 'sqft_living') explains nearly half of the observed variation in house prices.
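For readers unfamiliar with the metric, R^2 is one minus the ratio of the residual sum of squares to the total sum of squares. The sketch below computes it from that definition on four made-up prices and predictions; the numbers are illustrative and unrelated to the project's results.

```python
def r_squared(y_true, y_pred):
    """Coefficient of determination: 1 - SS_res / SS_tot."""
    mean_y = sum(y_true) / len(y_true)
    ss_res = sum((y - p) ** 2 for y, p in zip(y_true, y_pred))  # unexplained
    ss_tot = sum((y - mean_y) ** 2 for y in y_true)             # total
    return 1 - ss_res / ss_tot

# Made-up values for illustration only.
prices = [200_000, 300_000, 400_000, 500_000]
predictions = [250_000, 250_000, 450_000, 450_000]

print(f"R^2 = {r_squared(prices, predictions):.2f}")
# R^2 = 0.80
```

An R^2 of 0.80 here means the predictions leave 20% of the variation in the toy prices unexplained; by the same reading, a value near 0.49 leaves about half unexplained.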