House Price Prediction Model

Overview

This project implements a machine learning model to predict house prices using various features such as year built, number of bedrooms, amenities, and location data. The model uses Random Forest Regression and compares its performance with OLS (Ordinary Least Squares) and Decision Tree approaches.

Data Description

The dataset contains housing data from 2016-2017 with the following key features:

Property characteristics (bedrooms, bathrooms, square footage)
Building information (year built, number of floors)
Financial details (maintenance costs, taxes, common charges)
Amenities (garage, pets allowed, fuel type)
Property type (coop/condo)
Kitchen and dining room specifications

Technical Implementation

Data Preprocessing

Feature selection, reducing the dataset to 25 key features
Binary variable conversion to 0/1 dummies
Min-Max scaling for year built feature
Missing value imputation using Random Forest
Categorical variable encoding
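
The preprocessing steps above could look roughly like the following. This is a minimal sketch, not the notebook's actual code: the column names (`garage_exists`, `coop_condo`, `approx_year_built`, `num_bedrooms`, `sq_footage`) and the toy rows are assumptions made for illustration, and the helper functions `encode_and_scale` and `impute_with_forest` are hypothetical.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Hypothetical toy rows; the column names are assumptions, not the real schema.
toy = pd.DataFrame({
    "garage_exists": ["yes", "no", "yes", "no", "yes"],
    "coop_condo": ["co-op", "condo", "co-op", "condo", "condo"],
    "approx_year_built": [1950, 1985, 2010, 1970, 1995],
    "num_bedrooms": [1, 2, 3, 2, 3],
    "sq_footage": [650.0, 900.0, np.nan, 850.0, 1200.0],
})


def encode_and_scale(df):
    """Binary -> 0/1, min-max scale year built, one-hot encode categoricals."""
    out = df.copy()
    # Binary yes/no column becomes a 0/1 dummy.
    out["garage_exists"] = (out["garage_exists"] == "yes").astype(int)
    # Min-max scaling maps the year built onto [0, 1].
    year = out["approx_year_built"]
    out["approx_year_built"] = (year - year.min()) / (year.max() - year.min())
    # One-hot encode the remaining categorical column.
    return pd.get_dummies(out, columns=["coop_condo"], dtype=int)


def impute_with_forest(df, target):
    """Fill missing values of `target` with Random Forest predictions.

    Assumes every other column is numeric and has no missing values.
    """
    out = df.copy()
    missing = out[target].isna()
    if not missing.any():
        return out
    predictors = out.columns.drop(target)
    model = RandomForestRegressor(n_estimators=50, random_state=42)
    model.fit(out.loc[~missing, predictors], out.loc[~missing, target])
    out.loc[missing, target] = model.predict(out.loc[missing, predictors])
    return out


processed = impute_with_forest(encode_and_scale(toy), target="sq_footage")
print(processed)
```

The sketch scales and imputes on the full frame for brevity. To avoid leaking test-set information, the scaling range and the imputation model would normally be fitted on the training split only.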

Models Implemented

Random Forest Regressor (Primary Model)
OLS (Ordinary Least Squares) Regression
Decision Tree Regressor (for comparison)
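
One way to compare the three models on the same held-out split is sketched below. The data here is synthetic and stands in for the preprocessed housing features. scikit-learn's `LinearRegression`, which also fits by ordinary least squares, is swapped in for the statsmodels OLS used in the project. The tree depth and the split ratio are assumptions.

```python
import numpy as np
from sklearn.ensemble import RandomForestRegressor
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split
from sklearn.tree import DecisionTreeRegressor

# Synthetic stand-in for the preprocessed features and sale prices.
rng = np.random.default_rng(42)
X = rng.uniform(0, 1, size=(300, 4))
y = 200_000 + 150_000 * X[:, 0] + 80_000 * X[:, 1] ** 2 + rng.normal(0, 10_000, 300)

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

models = {
    "Random Forest": RandomForestRegressor(n_estimators=100, random_state=42),
    "OLS": LinearRegression(),  # stand-in for statsmodels OLS
    "Decision Tree": DecisionTreeRegressor(max_depth=5, random_state=42),
}

# Fit each model on the training split and score it on the test split.
results = {}
for name, model in models.items():
    model.fit(X_train, y_train)
    pred = model.predict(X_test)
    rmse = float(np.sqrt(mean_squared_error(y_test, pred)))
    results[name] = {"r2": float(r2_score(y_test, pred)), "rmse": rmse}
    print(f"{name:14s} R^2={results[name]['r2']:.3f}  RMSE={rmse:,.0f}")
```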

Key Features

Automated missing value handling
Feature importance analysis
Model performance comparison
Visualization of decision trees
Price prediction functionality for new properties
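
As an illustration of the feature importance analysis and tree visualization listed above, the sketch below ranks features by a fitted forest's `feature_importances_` and prints the top layers of a shallow tree as text with `export_text`. The project itself renders trees with graphviz; the text output is a dependency-free substitute. The data and column names are synthetic assumptions.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import RandomForestRegressor
from sklearn.tree import DecisionTreeRegressor, export_text

# Synthetic data: price depends strongly on square footage, weakly on bedrooms.
rng = np.random.default_rng(0)
X = pd.DataFrame({
    "sq_footage": rng.uniform(400, 2000, 200),
    "num_bedrooms": rng.integers(1, 5, 200),
    "noise": rng.normal(0, 1, 200),
})
y = 300 * X["sq_footage"] + 5_000 * X["num_bedrooms"] + rng.normal(0, 5_000, 200)

# Rank features by impurity-based importance from the fitted forest.
forest = RandomForestRegressor(n_estimators=100, random_state=42).fit(X, y)
importances = pd.Series(
    forest.feature_importances_, index=X.columns
).sort_values(ascending=False)
print(importances)

# Show only the top two layers of a single regression tree.
tree = DecisionTreeRegressor(max_depth=2, random_state=42).fit(X, y)
tree_text = export_text(tree, feature_names=list(X.columns))
print(tree_text)
```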

Model Performance

The project evaluates the models using the following:

R-squared score
Root Mean Squared Error (RMSE)
Training and testing set evaluations
Feature importance rankings
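
For reference, the two headline metrics can be computed by hand. RMSE is the square root of the mean squared residual, and R-squared is one minus the ratio of the residual sum of squares to the total sum of squares. The prices below are made-up numbers for illustration, not results from this project.

```python
import numpy as np


def rmse(y_true, y_pred):
    """Root Mean Squared Error: sqrt(mean((y - y_hat)^2))."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.sqrt(np.mean((y_true - y_pred) ** 2)))


def r_squared(y_true, y_pred):
    """R-squared: 1 - SS_res / SS_tot."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return float(1.0 - ss_res / ss_tot)


# Made-up prices, for illustration only.
actual = [300_000, 450_000, 500_000, 650_000]
predicted = [310_000, 440_000, 520_000, 630_000]

print(f"RMSE: {rmse(actual, predicted):,.2f}")       # RMSE: 15,811.39
print(f"R^2:  {r_squared(actual, predicted):.3f}")   # R^2:  0.984
```

Computing both metrics on the training set and on the test set makes overfitting visible: a large gap between the two usually means the model has memorized the training data.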

Requirements

- pandas
- numpy
- statsmodels
- scikit-learn
- matplotlib
- graphviz
- Pillow (imported as PIL)

Usage

Load and preprocess the data:

import pandas as pd
data_frame = pd.read_csv("housing_data_2016_2017.csv")

Train the model:

from sklearn.ensemble import RandomForestRegressor
rf_model = RandomForestRegressor(n_estimators=100, random_state=42)
rf_model.fit(X_train, y_train)

Make predictions:

# new_house_df must contain the same feature columns as X_train
predicted_price = rf_model.predict(new_house_df)
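
A new property has to be shaped into a one-row DataFrame whose columns match the training features before it is passed to `predict`. The sketch below shows one way to do that with `reindex`. The feature names, the tiny training set, and the choice to fill absent columns with 0 are all assumptions for illustration.

```python
import pandas as pd
from sklearn.ensemble import RandomForestRegressor

# Hypothetical training data; the real project uses about 25 features.
X_train = pd.DataFrame({
    "num_bedrooms": [1, 2, 3, 2, 4],
    "sq_footage": [600, 850, 1200, 900, 1600],
    "garage_exists": [0, 1, 1, 0, 1],
})
y_train = pd.Series([250_000, 340_000, 480_000, 355_000, 620_000])

rf_model = RandomForestRegressor(n_estimators=100, random_state=42)
rf_model.fit(X_train, y_train)

# Describe the new property; keys may be in any order and may omit columns.
new_house = {"sq_footage": 1000, "num_bedrooms": 2}

# Align to the training columns; absent columns are filled with 0 (an assumption).
new_house_df = pd.DataFrame([new_house]).reindex(
    columns=X_train.columns, fill_value=0
)

predicted_price = float(rf_model.predict(new_house_df)[0])
print(f"Predicted price: ${predicted_price:,.0f}")
```

Any scaling or encoding applied to the training data would also need to be applied to the new row, using the parameters learned from the training set.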

File Structure

main.ipynb: Main Jupyter notebook containing all analysis and model implementation
- Data cleaning and preprocessing
- Model training and evaluation
- Visualization and analysis
- Prediction functionality

Future Improvements

Feature engineering optimization
Hyperparameter tuning
Additional model architectures
Cross-validation implementation
Enhanced visualization features

Contributors

This project is maintained by Carlos Vega.
