This project focuses on predicting household energy consumption using smart metering data and machine learning models. It leverages temporal patterns, seasonal effects, and household behavior insights to forecast daily energy usage accurately.
The project uses the eMARC: Insights from Smart Metering Data dataset, which contains:
- `eMARC daily consumption.csv` – Daily energy usage in kWh for each household.
- `eMARC load blocks.csv` – Granular load block readings across intervals.
- `Household-Deployment basic info.csv` – Metadata about households, regions, and deployment configurations.
Source: eMARC Dataset
- Merged daily consumption data with household metadata.
- Engineered time-based features: `Day`, `Month`, `Weekday`, `Is_Weekend`, `Season`.
- Handled missing values and normalized consumption values.
- One-hot encoding for seasons.
- Optionally included household size and appliance count as custom features.
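The preprocessing steps above can be sketched roughly as follows. This is a minimal illustration on toy data, not the project's actual pipeline: the column names (`household_id`, `date`, `kwh`, `region`), the month-to-season mapping, the per-household median imputation, and the min-max normalization are all assumptions chosen for the example.

```python
import pandas as pd

# Hypothetical toy frames; the real eMARC column names may differ.
daily = pd.DataFrame({
    "household_id": [1, 1, 2, 2],
    "date": pd.to_datetime(["2021-01-02", "2021-01-04", "2021-07-10", "2021-07-12"]),
    "kwh": [5.0, None, 8.0, 12.0],
})
households = pd.DataFrame({"household_id": [1, 2], "region": ["A", "B"]})

# Merge daily consumption with household metadata.
df = daily.merge(households, on="household_id", how="left")

# Time-based features.
df["Day"] = df["date"].dt.day
df["Month"] = df["date"].dt.month
df["Weekday"] = df["date"].dt.dayofweek          # Monday = 0, Sunday = 6
df["Is_Weekend"] = (df["Weekday"] >= 5).astype(int)

# Placeholder month-to-season mapping (an assumption; adjust to the region).
SEASON_BY_MONTH = {
    12: "Winter", 1: "Winter", 2: "Winter",
    3: "Spring", 4: "Spring", 5: "Spring",
    6: "Summer", 7: "Summer", 8: "Summer",
    9: "Autumn", 10: "Autumn", 11: "Autumn",
}
df["Season"] = df["Month"].map(SEASON_BY_MONTH)

# Fill missing consumption with each household's median (one possible choice).
df["kwh"] = df["kwh"].fillna(df.groupby("household_id")["kwh"].transform("median"))

# Min-max normalization of consumption (one possible choice of scaling).
kwh_min, kwh_max = df["kwh"].min(), df["kwh"].max()
df["kwh_norm"] = (df["kwh"] - kwh_min) / (kwh_max - kwh_min)

# One-hot encode the season column.
df = pd.get_dummies(df, columns=["Season"], prefix="Season")
```

In a real train/test setup, the normalization bounds and imputation statistics would normally be computed on the training split only, to avoid leaking information from the evaluation data.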
- Trained and evaluated the following models:
- XGBoost Regressor
- LSTM (Long Short-Term Memory)
- Facebook Prophet
- Root Mean Square Error (RMSE) used for model performance comparison:
| Model | RMSE |
|---|---|
| XGBoost | 0.0721 |
| LSTM | 0.3623 |
| Prophet | 3.3566 |
✅ XGBoost achieved the lowest RMSE (0.0721), ahead of LSTM (0.3623) and Prophet (3.3566).
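For reference, RMSE is the square root of the mean squared difference between actual and predicted values. The sketch below shows how such a comparison could be computed; the `rmse` helper, the model names, and all numbers are hypothetical and unrelated to the results in the table.

```python
import math

def rmse(y_true, y_pred):
    """Root mean square error between two equal-length sequences."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))

# Hypothetical held-out targets and predictions from two models.
y_true = [0.2, 0.4, 0.6, 0.8]
preds = {
    "model_a": [0.2, 0.5, 0.6, 0.7],
    "model_b": [0.4, 0.4, 0.4, 0.4],
}

# Score every model on the same targets and pick the lowest RMSE.
scores = {name: rmse(y_true, p) for name, p in preds.items()}
best = min(scores, key=scores.get)
print(best, round(scores[best], 4))
```

RMSE values are only comparable when every model is scored on the same target scale and the same held-out data.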
- Seasonal trends and weekend patterns strongly influence energy usage.
- Household metadata correlated strongly with consumption.
- The final XGBoost model was deployed in a user-friendly interface using Streamlit (not included in this repo).