This project demonstrates an end-to-end machine learning workflow built around a Random Forest classifier, with an interactive Streamlit dashboard for real-time prediction and visualization.
The model predicts whether a customer will purchase a product based on two inputs:
- Age
- Estimated Salary
Key Features:
- 🔹 Random Forest Classification (Entropy Criterion)
- 🔹 Feature Scaling using StandardScaler
- 🔹 Confusion Matrix Visualization
- 🔹 ROC Curve & AUC Score
- 🔹 Bias–Variance Analysis
- 🔹 Decision Boundary Visualization (Train & Test)
- 🔹 Model Serialization using Pickle
- 🔹 Interactive Streamlit Dashboard with Dark UI
Dataset: Social Network Ads
Features Used:
- Age
- Estimated Salary
Target Variable:
- Purchased (0 = No, 1 = Yes)
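As a minimal sketch of how the two features and the target might be separated, the snippet below assumes the columns are named `Age`, `EstimatedSalary`, and `Purchased`; the actual CSV headers may differ. The small DataFrame is made-up sample data for illustration only, not rows from the real dataset.

```python
import pandas as pd

# Assumed column names; the actual CSV headers may differ.
FEATURE_COLUMNS = ["Age", "EstimatedSalary"]
TARGET_COLUMN = "Purchased"


def split_features_target(df: pd.DataFrame):
    """Return the feature matrix X and the target vector y as NumPy arrays."""
    X = df[FEATURE_COLUMNS].to_numpy(dtype=float)
    y = df[TARGET_COLUMN].to_numpy(dtype=int)
    return X, y


# Tiny made-up sample standing in for the real dataset (illustration only).
sample = pd.DataFrame(
    {
        "Age": [19, 35, 47, 52],
        "EstimatedSalary": [19000, 20000, 25000, 150000],
        "Purchased": [0, 0, 1, 1],
    }
)

X, y = split_features_target(sample)
print(X.shape, y.tolist())  # (4, 2) [0, 0, 1, 1]
```

With the real data, the DataFrame would presumably come from `pd.read_csv(...)` pointed at the dataset file instead of the inline sample.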
Workflow:
- Data Loading & Preprocessing
- Train–Test Split (80% / 20%)
- Feature Scaling
- Random Forest Model Training
- Model Evaluation
  - Accuracy
  - Confusion Matrix
  - ROC Curve & AUC
  - Bias vs Variance
- Decision Boundary Visualization
- Model & Scaler Serialization
- Streamlit Dashboard Deployment
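The training steps above can be sketched roughly as follows. This is an illustration under stated assumptions, not the project's actual script: synthetic two-feature data from `make_classification` stands in for Age and Estimated Salary, and `n_estimators=100` and the `random_state` values are assumed rather than taken from the project.

```python
import pickle

from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic two-feature data standing in for Age and Estimated Salary.
X, y = make_classification(
    n_samples=400, n_features=2, n_informative=2, n_redundant=0, random_state=0
)

# 80% / 20% train-test split.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=0
)

# Fit the scaler on the training data only, then apply it to both sets.
scaler = StandardScaler()
X_train_scaled = scaler.fit_transform(X_train)
X_test_scaled = scaler.transform(X_test)

# Random Forest with the entropy criterion; n_estimators is an assumed value.
model = RandomForestClassifier(
    n_estimators=100, criterion="entropy", random_state=0
)
model.fit(X_train_scaled, y_train)

accuracy = accuracy_score(y_test, model.predict(X_test_scaled))
print(f"Test accuracy: {accuracy:.3f}")

# Serialize the model and the scaler together so both can be restored later.
payload = pickle.dumps({"model": model, "scaler": scaler})
restored = pickle.loads(payload)
```

Serializing the scaler alongside the model matters because new inputs must be transformed with the same statistics the model was trained on. Writing to disk would use `pickle.dump` with a file opened in binary mode instead of `pickle.dumps`.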
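Model Performance:
- Accuracy Score: displayed in the dashboard
- Bias: reported as the training-set score (a high training score suggests low bias)
- Variance: reported as the test-set score (a large gap below the training score suggests high variance)
- AUC Score: computed from the ROC curve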
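A rough sketch of how these numbers could be computed is shown below. It assumes synthetic data in place of the real dataset and skips feature scaling to stay short, so the values it prints say nothing about the project's actual results.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

# Synthetic stand-in data; not the Social Network Ads dataset.
X, y = make_classification(
    n_samples=400, n_features=2, n_informative=2, n_redundant=0, random_state=1
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=1
)

model = RandomForestClassifier(criterion="entropy", random_state=1)
model.fit(X_train, y_train)

# Training accuracy: the figure reported under "Bias".
train_score = model.score(X_train, y_train)
# Test accuracy: the figure reported under "Variance".
test_score = model.score(X_test, y_test)
# The gap between the two is a common informal indicator of overfitting.
gap = train_score - test_score

# AUC is computed from predicted probabilities of the positive class.
auc_score = roc_auc_score(y_test, model.predict_proba(X_test)[:, 1])

print(f"Training score: {train_score:.3f}")
print(f"Test score:     {test_score:.3f}")
print(f"Gap:            {gap:.3f}")
print(f"AUC:            {auc_score:.3f}")
```

Reading the two scores together is more informative than either alone: a training score near 1.0 with a noticeably lower test score points to overfitting rather than to a well-generalizing model.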
Visualizations:
- Confusion Matrix
- ROC Curve
- Decision Boundary (Training Set)
- Decision Boundary (Test Set)
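The plots listed above could be produced along these lines. This is a sketch on synthetic data with assumed styling choices (colormap, grid step of 0.05, figure size); the project's own plotting code may look different.

```python
import matplotlib

matplotlib.use("Agg")  # headless backend; drop this line for interactive use
import matplotlib.pyplot as plt
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import auc, confusion_matrix, roc_curve
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; not the Social Network Ads dataset.
X, y = make_classification(
    n_samples=400, n_features=2, n_informative=2, n_redundant=0, random_state=2
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.20, random_state=2
)
scaler = StandardScaler().fit(X_train)
X_train_s, X_test_s = scaler.transform(X_train), scaler.transform(X_test)
model = RandomForestClassifier(criterion="entropy", random_state=2)
model.fit(X_train_s, y_train)

fig, axes = plt.subplots(1, 3, figsize=(15, 4))

# Confusion matrix drawn as a heatmap with the counts written in each cell.
cm = confusion_matrix(y_test, model.predict(X_test_s), labels=[0, 1])
axes[0].imshow(cm, cmap="Blues")
for (row, col), count in np.ndenumerate(cm):
    axes[0].text(col, row, str(count), ha="center", va="center")
axes[0].set(title="Confusion Matrix", xlabel="Predicted", ylabel="Actual")

# ROC curve from positive-class probabilities, with the chance diagonal.
fpr, tpr, _ = roc_curve(y_test, model.predict_proba(X_test_s)[:, 1])
roc_auc = auc(fpr, tpr)
axes[1].plot(fpr, tpr, label=f"AUC = {roc_auc:.2f}")
axes[1].plot([0, 1], [0, 1], linestyle="--")
axes[1].set(title="ROC Curve", xlabel="False Positive Rate",
            ylabel="True Positive Rate")
axes[1].legend()

# Decision boundary: predict on a dense grid covering the scaled test points.
x_min, x_max = X_test_s[:, 0].min() - 1, X_test_s[:, 0].max() + 1
y_min, y_max = X_test_s[:, 1].min() - 1, X_test_s[:, 1].max() + 1
xx, yy = np.meshgrid(np.arange(x_min, x_max, 0.05),
                     np.arange(y_min, y_max, 0.05))
grid_pred = model.predict(np.c_[xx.ravel(), yy.ravel()]).reshape(xx.shape)
axes[2].contourf(xx, yy, grid_pred, alpha=0.3)
axes[2].scatter(X_test_s[:, 0], X_test_s[:, 1], c=y_test, edgecolor="k", s=20)
axes[2].set(title="Decision Boundary (Test Set)")

fig.tight_layout()
```

Calling `fig.savefig(...)` afterwards writes the figure to disk. The training-set boundary plot follows the same pattern with `X_train_s` and `y_train` in place of the test arrays.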
Dashboard Features:
- 🎚️ Sidebar sliders for Age & Salary
- 🔮 Real-time Purchase Prediction
- 📊 Model Metrics Display
- 📈 ROC Curve Visualization
- 🧮 Confusion Matrix
- 🌙 Modern Dark-Theme UI
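One possible shape for the slider-driven prediction is sketched below. The function names `predict_purchase` and `run_dashboard` are hypothetical, and the slider ranges and defaults are assumed values rather than ones taken from the project. Streamlit is imported inside `run_dashboard` so the prediction helper can be used and tested without it.

```python
import numpy as np


def predict_purchase(model, scaler, age: float, salary: float) -> int:
    """Scale one (age, salary) pair and return the predicted class (0 or 1)."""
    features = scaler.transform(np.array([[age, salary]], dtype=float))
    return int(model.predict(features)[0])


def run_dashboard(model, scaler) -> None:
    """Render the sidebar sliders and the prediction inside a Streamlit app."""
    import streamlit as st  # lazy import: only needed when the app runs

    st.title("Customer Purchase Prediction")

    # Slider ranges and defaults are assumptions, not the project's values.
    age = st.sidebar.slider("Age", min_value=18, max_value=60, value=30)
    salary = st.sidebar.slider(
        "Estimated Salary",
        min_value=15000,
        max_value=150000,
        value=50000,
        step=1000,
    )

    # Streamlit reruns the script on every slider change, so the
    # prediction below updates as the user moves either control.
    if predict_purchase(model, scaler, age, salary) == 1:
        st.success("Prediction: will purchase")
    else:
        st.info("Prediction: will not purchase")
```

In an app script, the pickled model and scaler would be loaded first and passed to `run_dashboard(model, scaler)`, and the script would be started with `streamlit run`. Keeping the scaling step inside `predict_purchase` guards against feeding unscaled slider values to a model trained on scaled features.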
Tech Stack:
- Python
- NumPy
- Pandas
- Matplotlib
- Scikit-learn
- Streamlit
- Pickle
Future Improvements:
- Add cross-validation
- Tune hyperparameters
- Compare multiple classifiers
- Deploy on Streamlit Cloud / Hugging Face Spaces
- Visualize feature importance
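As a starting point for the first two items and the last one, the sketch below shows one way cross-validation, a small grid search, and feature importances might fit together in scikit-learn. None of this is part of the current project; the data is synthetic and the parameter grid is an arbitrary example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import GridSearchCV, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in data; not the Social Network Ads dataset.
X, y = make_classification(
    n_samples=300, n_features=2, n_informative=2, n_redundant=0, random_state=3
)

# A pipeline refits the scaler inside each fold, which avoids leaking
# statistics from the validation fold into training.
pipeline = make_pipeline(
    StandardScaler(),
    RandomForestClassifier(criterion="entropy", random_state=3),
)

# 5-fold cross-validated accuracy.
cv_scores = cross_val_score(pipeline, X, y, cv=5)
print(f"CV accuracy: {cv_scores.mean():.3f} +/- {cv_scores.std():.3f}")

# Small example grid; the candidate values are arbitrary.
param_grid = {
    "randomforestclassifier__n_estimators": [25, 50],
    "randomforestclassifier__max_depth": [3, None],
}
search = GridSearchCV(pipeline, param_grid, cv=3)
search.fit(X, y)
print("Best parameters:", search.best_params_)

# Importances of the two input features in the best forest found.
best_forest = search.best_estimator_.named_steps["randomforestclassifier"]
importances = best_forest.feature_importances_
print("Feature importances:", importances)
```

With only two input features, the importance values mainly show which of Age or Estimated Salary the forest relies on more; a bar chart of `importances` would cover the visualization item.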