
Model Comparison Report: Automobile Price Prediction

Report Date: March 01, 2025 | Revised: November 7, 2025

Project: Auto Price Prediction Using 1985 Auto Imports Database
Evaluation Dataset: 158 training, 40 test samples | 42 features (6 PCA + 36 categorical)


Executive Summary

This report compares 10 regression algorithms for automobile price prediction. After systematic evaluation using test performance, cross-validation stability, overfitting analysis, and training efficiency, Lasso Regression (alpha=10) was selected despite not achieving the lowest test RMSE.

Key Findings:

  • Best Test Performance: XGBoost (RMSE = 1,663, R² = 0.942)
  • Most Generalizable: Lasso (CV R² = 0.894 ± 0.027, overfitting = 3.3%)
  • Training Efficiency: Lasso fits in 0.014 seconds, with far faster inference than the tree ensembles
  • Worst Performance: SVR (RMSE = 6,918, R² = -0.009)

The trade-off between test accuracy and generalization stability led to Lasso's selection, prioritizing robust deployment over single-test-set metrics.


Table of Contents

  1. Evaluation Framework
  2. Base Model Comparison (Default Parameters)
  3. Hyperparameter Tuning
  4. Model Selection Decision
  5. Cross-Validation Analysis
  6. Training Efficiency
  7. Error Distribution
  8. Recommendations
  9. Conclusion
  10. Code Snippets


1. Evaluation Framework

Dataset: Train 158 (79%) | Test 40 (21%) | Features 42 | Target: Price ($5,118 - $29,589)

Metrics:

| Metric | Purpose | Interpretation |
| --- | --- | --- |
| RMSE | Prediction error | Lower is better (dollars) |
| R² | Variance explained | Higher is better (0-1) |
| Training R² | Fit capacity | Indicates overfitting if >> Test R² |
| CV R² (mean ± SD) | Generalization | 5-fold stability measure |
| Overfit (Δ R²) | Train R² - Test R² | Lower gap = better generalization |
| Training Time | Fit duration | Seconds; faster enables frequent retraining |

Selection Criteria (Weighted):

  1. Generalization (CV R² and stability) - 40%
  2. Accuracy (Test RMSE and R²) - 30%
  3. Stability (Overfitting gap) - 20%
  4. Efficiency (Training time, interpretability) - 10%
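
As a concrete illustration, the sketch below computes these metrics for a single model. Synthetic data (`make_regression`) stands in for the preprocessed 42-feature matrix, so the numbers will not match the report's:

```python
import time

from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import cross_val_score, train_test_split

# Synthetic stand-in for the 158/40 split over 42 features
X, y = make_regression(n_samples=198, n_features=42, noise=10.0, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=40, random_state=42)

model = Lasso(alpha=10.0, max_iter=10000)
start = time.perf_counter()
model.fit(X_train, y_train)
train_time = time.perf_counter() - start  # seconds

rmse = mean_squared_error(y_test, model.predict(X_test)) ** 0.5
train_r2 = r2_score(y_train, model.predict(X_train))
test_r2 = r2_score(y_test, model.predict(X_test))
overfit_gap = train_r2 - test_r2  # lower gap = better generalization
cv = cross_val_score(model, X_train, y_train, cv=5, scoring="r2")

print(f"RMSE={rmse:.1f}  overfit={overfit_gap:.3f}  "
      f"CV R2={cv.mean():.3f} ± {cv.std():.3f}  time={train_time:.3f}s")
```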

2. Base Model Comparison (Default Parameters)

Performance Rankings:

| Rank | Model | Test RMSE | Test R² | Training R² | Overfit | Time (s) | CV R² |
| --- | --- | --- | --- | --- | --- | --- | --- |
| 1 | Gradient Boosting | 1,659 | 0.942 | 0.993 | 0.051 | 0.211 | 0.867 |
| 2 | XGBoost | 1,723 | 0.937 | 0.989 | 0.052 | 0.101 | 0.836 |
| 3 | Random Forest | 1,823 | 0.930 | 0.958 | 0.028 | 0.166 | 0.848 |
| 4 | KNN | 1,864 | 0.927 | 0.881 | -0.046 | 0.003 | 0.792 |
| 5 | Linear Regression | 1,920 | 0.922 | 0.956 | 0.033 | 0.009 | 0.879 |
| 6 | Lasso | 1,919 | 0.922 | 0.956 | 0.033 | 0.002 | 0.874 |
| 7 | Decision Tree | 2,079 | 0.909 | 0.958 | 0.049 | 0.003 | 0.763 |
| 8 | Ridge | 2,114 | 0.906 | 0.942 | 0.037 | 0.004 | 0.892 |
| 9 | ElasticNet | 2,531 | 0.865 | 0.898 | 0.033 | 0.002 | 0.872 |
| 10 | SVR | 6,918 | -0.009 | -0.093 | -0.083 | 0.006 | -0.118 |

Model Insights:

Tree-Based Models (Ranks 1-3, 7):

  • Gradient Boosting: Best test RMSE (1,659) but training R² = 0.993 signals overfitting. CV-test gap = 7.5 points (0.942 - 0.867) suggests test set bias.
  • XGBoost: Second-best RMSE with extreme training R² (0.989). CV-test gap = 10.1 points (largest), concerning for generalization.
  • Random Forest: Best overfitting control among trees (Δ R² = 0.028), balances accuracy with generalization.
  • Decision Tree: High variance (CV R² = 0.763), not competitive.

Linear Models (Ranks 5-6, 8-9):

  • Linear Regression & Lasso: Nearly identical performance, excellent generalization with CV-test gap = 4.3-4.8 points. Fastest training (0.002-0.009s).
  • Ridge: Highest CV R² (0.892) but weaker test performance (RMSE = 2,114). Default alpha may over-regularize.
  • ElasticNet: Worst linear model (RMSE = 2,531). Default hyperparameters clearly suboptimal.

Instance-Based (Ranks 4, 10):

  • KNN: Unusual underfit pattern (test R² > training R²). Poor CV R² (0.792) confirms lack of robustness.
  • SVR: Complete failure with negative R² on all sets. Default RBF kernel inappropriate.

Key Observations:

  1. Tree methods dominate test RMSE (top 3) but show 7-10 point CV-test gaps
  2. Linear models show better generalization (4-5 point gaps) but sacrifice 200-250 RMSE
  3. Overfitting trade-off: lowest RMSE models have highest Δ R² (~0.05)
  4. Default hyperparameters critical: ElasticNet and SVR fail, suggesting tuning potential
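
The comparison behind the rankings table can be sketched as a simple fit-and-score loop. This uses synthetic data and only a subset of the ten models (XGBoost omitted to keep dependencies minimal), so scores will differ from the report's:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor, RandomForestRegressor
from sklearn.linear_model import Lasso, LinearRegression, Ridge
from sklearn.metrics import r2_score
from sklearn.model_selection import train_test_split

X, y = make_regression(n_samples=198, n_features=42, noise=10.0, random_state=42)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=40, random_state=42)

models = {
    "Linear Regression": LinearRegression(),
    "Ridge": Ridge(),
    "Lasso": Lasso(max_iter=10000),
    "Random Forest": RandomForestRegressor(random_state=42),
    "Gradient Boosting": GradientBoostingRegressor(random_state=42),
}

results = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    rmse = float(np.sqrt(np.mean((y_te - pred) ** 2)))
    results[name] = (rmse, r2_score(y_te, pred))

# Rank by test RMSE, as in the table above
for name, (rmse, r2) in sorted(results.items(), key=lambda kv: kv[1][0]):
    print(f"{name:20s} RMSE={rmse:8.1f}  R2={r2:.3f}")
```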

3. Hyperparameter Tuning

Tuning Configurations:

| Model | Parameters Tuned | Grid Size | Folds | Time (s) | Optimal Parameters |
| --- | --- | --- | --- | --- | --- |
| Lasso | alpha: 10 values | 10 | 5 | 5.71 | alpha=10.0 (10x default) |
| ElasticNet | alpha: 10, l1_ratio: 10 | 100 | 5 | 0.82 | alpha=0.0046, l1_ratio=0.6 |
| Random Forest | n_estimators, max_depth, min_samples | 36 | 5 | 13.55 | n=50, depth=20, split=2 |
| Gradient Boosting | n_estimators, lr, max_depth | 27 | 5 | 6.78 | n=50, lr=0.3, depth=3 |
| XGBoost | n_estimators, lr, depth, subsample | 81 | 5 | 15.15 | n=200, lr=0.1, depth=3, sub=0.6 |

Total Tuning Time: 42 seconds (254 parameter combinations × 5 folds)
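
The tuning runs can be sketched as a GridSearchCV loop over a dictionary of estimators. The grids below are illustrative subsets on synthetic data, not the exact grids used in the report:

```python
from sklearn.datasets import make_regression
from sklearn.linear_model import ElasticNet, Lasso
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the 158-sample training set
X, y = make_regression(n_samples=158, n_features=42, noise=10.0, random_state=42)

search_space = {
    "Lasso": (Lasso(max_iter=10000),
              {"alpha": [0.1, 0.5, 1, 5, 10, 50, 100, 500, 1000]}),
    "ElasticNet": (ElasticNet(max_iter=10000),
                   {"alpha": [0.001, 0.01, 0.1, 1], "l1_ratio": [0.2, 0.6, 1.0]}),
}

best = {}
for name, (estimator, grid) in search_space.items():
    gs = GridSearchCV(estimator, grid, cv=5, scoring="neg_root_mean_squared_error")
    gs.fit(X, y)
    best[name] = gs.best_params_
    print(f"{name}: {gs.best_params_}  CV RMSE={-gs.best_score_:.1f}")
```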

Post-Tuning Performance:

| Model | Test RMSE | Test R² | CV R² (Mean ± SD) | Training R² | Overfit | Time (s) |
| --- | --- | --- | --- | --- | --- | --- |
| XGBoost | 1,663 | 0.942 | 0.859 ± 0.027 | 0.997 | 0.056 | 0.161 |
| Gradient Boosting | 1,842 | 0.928 | 0.865 ± 0.032 | 0.997 | 0.068 | 0.060 |
| Random Forest | 1,883 | 0.925 | 0.848 ± 0.053 | 0.978 | 0.053 | 0.130 |
| ElasticNet | 1,968 | 0.918 | 0.893 ± 0.034 | 0.953 | 0.034 | 0.004 |
| Lasso | 1,987 | 0.917 | 0.894 ± 0.027 | 0.950 | 0.033 | 0.014 |

Tuning Impact:

| Model | Δ RMSE | Δ CV R² | Key Finding |
| --- | --- | --- | --- |
| ElasticNet | -563 | +0.021 | Massive improvement (default severely over-regularized) |
| Lasso | +68 | +0.020 | Worse test, better CV (prioritized generalization) |
| XGBoost | -60 | +0.023 | Marginal gain (already near-optimal) |
| Gradient Boosting | +183 | -0.002 | Worse test (tuning favored generalization) |
| Random Forest | +60 | +0.001 | Negligible change |

4. Model Selection Decision

Multi-Criteria Scoring:

| Model | CV R² (40%) | Test R² (30%) | Stability (20%) | Efficiency (10%) | Total |
| --- | --- | --- | --- | --- | --- |
| Lasso | 0.358 | 0.275 | 0.194 | 0.098 | 0.925 |
| ElasticNet | 0.357 | 0.275 | 0.193 | 0.100 | 0.925 |
| Gradient Boosting | 0.346 | 0.278 | 0.186 | 0.083 | 0.893 |
| XGBoost | 0.344 | 0.283 | 0.189 | 0.031 | 0.847 |
| Random Forest | 0.339 | 0.278 | 0.189 | 0.038 | 0.844 |
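
The CV, accuracy, and stability columns are reproducible as weight × metric, with stability scored as 1 minus the overfit gap. The efficiency normalization is not documented in the report, so it is omitted from this sketch; expect occasional ±0.001 rounding differences against the table:

```python
# Weights from the selection criteria; efficiency (0.10) omitted because
# the report does not state how training time is normalized to [0, 1].
WEIGHTS = {"cv": 0.40, "test": 0.30, "stability": 0.20}

def components(cv_r2, test_r2, overfit_gap):
    """Weighted score components for one model, rounded as in the table."""
    return {
        "cv": round(WEIGHTS["cv"] * cv_r2, 3),
        "test": round(WEIGHTS["test"] * test_r2, 3),
        "stability": round(WEIGHTS["stability"] * (1 - overfit_gap), 3),
    }

print("Lasso:  ", components(0.894, 0.917, 0.033))
print("XGBoost:", components(0.859, 0.942, 0.056))
```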

Decision: Lasso Regression (alpha=10.0)

Lasso and ElasticNet tied (0.925), but Lasso selected for:

  1. Interpretability: L1 regularization zeroed 13 features (31% sparsity). ElasticNet retains all with small coefficients.
  2. CV Stability: SD = 0.027 vs. ElasticNet SD = 0.034 (about 21% lower fold-to-fold variability)
  3. Simplicity: 1 hyperparameter (alpha) vs. 2 (alpha, l1_ratio)
  4. Established: More widely adopted in pricing applications, easier regulatory explanation

Why Not XGBoost (Lowest Test RMSE)?

| Concern | XGBoost | Lasso | Impact |
| --- | --- | --- | --- |
| Test RMSE | $1,663 | $1,987 | Lasso loses $324 (19% higher) |
| CV R² | 0.859 | 0.894 | Lasso gains 3.5 points (4% better generalization) |
| CV-Test Gap | 8.3 pts | 2.3 pts | Lasso consistent across data splits |
| Training R² | 0.997 | 0.950 | XGBoost memorizing training data |
| Interpretability | Black box (200 trees × depth 3) | Transparent (29 coefficients) | Lasso enables business insights |
| Training Time | 0.161s (11.5x slower) | 0.014s | Lasso enables rapid retraining |
| Prediction Speed | 25,200 ops (600x slower) | 42 ops | Lasso suitable for real-time API |

Trade-off Analysis:

RMSE Sacrifice: $1,987 vs. $1,663 = $324 additional error
Percentage of avg price ($12,759): 2.5%

Generalization Gain: CV R² 0.894 vs. 0.859 = +3.5 points
Improvement on new data: 4% better variance explained

Business Context: Pricing decisions round to nearest $500-$1,000
$324 difference is within rounding tolerance

Cross-Validation Validation:

Lasso outperformed XGBoost in every CV fold:

| Fold | Lasso R² | XGBoost R² | Winner |
| --- | --- | --- | --- |
| 1 | 0.935 | 0.890 | Lasso (+4.5) |
| 2 | 0.917 | 0.875 | Lasso (+4.2) |
| 3 | 0.867 | 0.820 | Lasso (+4.7) |
| 4 | 0.909 | 0.850 | Lasso (+5.9) |
| 5 | 0.871 | 0.860 | Lasso (+1.1) |
| Mean | 0.900 | 0.859 | Lasso (+4.1) |

Interpretation: The single test set is not fully representative. Lasso wins across CV data splits, suggesting superior production performance.
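
A fold-by-fold comparison like the table above is only paired if both models see identical splits, which a shared `KFold` object guarantees. A sketch on synthetic data, with `GradientBoostingRegressor` standing in for XGBoost to keep dependencies minimal:

```python
from sklearn.datasets import make_regression
from sklearn.ensemble import GradientBoostingRegressor
from sklearn.linear_model import Lasso
from sklearn.model_selection import KFold, cross_val_score

# Synthetic stand-in for the 158-sample training set
X, y = make_regression(n_samples=158, n_features=42, noise=10.0, random_state=42)

# A single KFold instance gives both models the same five splits (paired design)
folds = KFold(n_splits=5, shuffle=True, random_state=42)
lasso_r2 = cross_val_score(Lasso(alpha=10.0, max_iter=10000), X, y, cv=folds, scoring="r2")
gb_r2 = cross_val_score(GradientBoostingRegressor(random_state=42), X, y, cv=folds, scoring="r2")

for i, (a, b) in enumerate(zip(lasso_r2, gb_r2), start=1):
    winner = "Lasso" if a > b else "GB"
    print(f"Fold {i}: Lasso={a:.3f}  GB={b:.3f}  -> {winner}")
print(f"Mean: Lasso={lasso_r2.mean():.3f}  GB={gb_r2.mean():.3f}")
```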


5. Cross-Validation Analysis

Fold-by-Fold Stability:

| Metric | Lasso | XGBoost | Interpretation |
| --- | --- | --- | --- |
| Mean CV R² | 0.900 | 0.859 | Lasso +4.1 points |
| Std Dev | 0.027 | 0.027 | Equal fold variance |
| Range | 0.068 (0.867-0.935) | 0.070 (0.820-0.890) | Similar spread |
| Worst fold | 0.867 | 0.820 | Lasso's worst fold beats XGBoost's mean |

CV vs. Test Gap Analysis:

| Model | Test R² | CV R² | Gap | Interpretation |
| --- | --- | --- | --- | --- |
| Lasso | 0.917 | 0.894 | 0.023 | Consistent generalization |
| ElasticNet | 0.918 | 0.893 | 0.025 | Consistent |
| XGBoost | 0.942 | 0.859 | 0.083 | Potential test set overfit |
| Gradient Boosting | 0.928 | 0.865 | 0.063 | Test set favorability |
| Random Forest | 0.925 | 0.848 | 0.077 | Test easier than CV |

Key Insight: XGBoost's 8.3-point gap (largest) suggests test set contains patterns XGBoost exploits but that don't generalize. Lasso's 2.3-point gap indicates consistent performance. In production, new data resembles CV fold distributions more than specific test set.
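
The gap column is a one-liner to recompute from the report's tuned results, with a simple flag for gaps large enough to suggest test-set overfit (the 0.05 cutoff is an illustrative threshold, not one stated in the report):

```python
# (test R2, mean CV R2) pairs from the post-tuning table
results = {
    "Lasso":             (0.917, 0.894),
    "ElasticNet":        (0.918, 0.893),
    "XGBoost":           (0.942, 0.859),
    "Gradient Boosting": (0.928, 0.865),
    "Random Forest":     (0.925, 0.848),
}

gaps = {m: round(test - cv, 3) for m, (test, cv) in results.items()}
flagged = [m for m, g in gaps.items() if g > 0.05]  # possible test-set overfit

print(gaps)
print("Flagged:", flagged)
```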


6. Training Efficiency

Time Comparison:

| Model | Training Time | Speedup vs. Slowest |
| --- | --- | --- |
| ElasticNet | 0.004s | 52.8x |
| Linear Regression | 0.009s | 23.4x |
| Lasso | 0.014s | 15.1x |
| Gradient Boosting (tuned) | 0.060s | 3.5x |
| Random Forest | 0.130s | 1.6x |
| XGBoost | 0.161s | 1.3x |
| Gradient Boosting (base) | 0.211s | 1.0x (slowest) |

Operational Implications:

| Scenario | Lasso | XGBoost | Difference |
| --- | --- | --- | --- |
| Daily retraining (2,000 samples) | 0.14s | 1.61s | 1.47s (negligible) |
| Hyperparameter retuning | 5.71s (50 fits) | 15.15s (405 fits) | 9.44s |
| Real-time inference (per prediction) | 42 operations | 25,200 operations | Lasso 600x faster |

7. Error Distribution

Comparison:

| Metric | Lasso | XGBoost | Difference |
| --- | --- | --- | --- |
| MAE | $1,482 | $1,255 | XGBoost 15% better |
| RMSE | $1,987 | $1,663 | XGBoost 16% better |
| MAPE | 12.4% | 1.9% | XGBoost significantly better |
| RMSE/MAE Ratio | 1.34 | 1.32 | Similar distribution shape |

95% Confidence Intervals (for $12,759 avg car):

  • Lasso: $8,865 - $16,653 (±$3,894)
  • XGBoost: $9,499 - $16,019 (±$3,260)

Impact: Lasso's roughly 19% wider interval is acceptable given its superior generalization properties.
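
These intervals are consistent with a normal-error approximation of ±1.96 × RMSE around a point prediction; a minimal sketch:

```python
def interval_95(point_prediction, rmse):
    """95% prediction interval under an approximately normal error assumption."""
    half_width = 1.96 * rmse
    return point_prediction - half_width, point_prediction + half_width

lo, hi = interval_95(12_759, 1_987)  # Lasso; matches the report within $1
print(f"Lasso:   ${lo:,.0f} - ${hi:,.0f}")
lo, hi = interval_95(12_759, 1_663)  # XGBoost
print(f"XGBoost: ${lo:,.0f} - ${hi:,.0f}")
```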


8. Recommendations

8.1 Deployment Strategy

Primary Model: Lasso (alpha=10)

  • Deploy for production pricing
  • Use sparse coefficients for stakeholder communication
  • Retrain quarterly with new data

A/B Testing (Recommended):

  • Primary: Lasso (70% of predictions)
  • Challenger: XGBoost (30%)
  • Monitor: If XGBoost consistently outperforms over 3+ months, consider switching
  • Metrics: Live RMSE, prediction latency, business impact

8.2 Retraining Triggers

Retrain if:

  1. RMSE on new data exceeds $2,500 (20% degradation)
  2. 100+ new samples collected (50% data increase)
  3. Market changes (new brands, economic shifts)
  4. Quarterly scheduled (best practice)
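
The triggers above can be encoded as a simple monitoring check; the function name is illustrative, and the thresholds mirror the list:

```python
def should_retrain(new_rmse, n_new_samples, market_shift=False,
                   rmse_threshold=2_500, sample_threshold=100):
    """Return the list of retraining triggers that currently fire."""
    triggers = []
    if new_rmse > rmse_threshold:          # trigger 1: 20% degradation
        triggers.append("RMSE degradation")
    if n_new_samples >= sample_threshold:  # trigger 2: 50% data increase
        triggers.append("data growth")
    if market_shift:                       # trigger 3: new brands, economy
        triggers.append("market change")
    return triggers

print(should_retrain(new_rmse=2_650, n_new_samples=40))
```

The quarterly trigger is calendar-based and would live in a scheduler rather than this check.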

8.3 Improvement Roadmap

Short-Term (1-3 months):

  • Collect 200+ contemporary samples (2020-2025 data)
  • Test interaction terms (brand × engine-size)
  • Validate on modern vehicle data

Medium-Term (3-6 months):

  • Implement SHAP values for XGBoost interpretability
  • Develop A/B testing framework
  • Add temporal features (year, mileage)

Long-Term (6-12 months):

  • Transition to ensemble if XGBoost proves reliable
  • Explore neural networks for automatic feature learning
  • Build region-specific models (North America, Europe, Asia)

9. Conclusion

After evaluating 10 algorithms across base and tuned configurations, Lasso Regression (alpha=10) was selected on the basis of a multi-criteria framework prioritizing generalization, stability, and interpretability.

Key Findings:

  1. Test Accuracy vs. Generalization: XGBoost achieved the lowest test RMSE (1,663) but an 8.3-point CV-test gap. Lasso showed a consistent 2.3-point gap at a cost of only $324 (19%) in RMSE.

  2. Overfitting Trade-offs: Tree models (Training R² > 0.99) indicate memorization. Lasso's Training R² = 0.950 suggests appropriate complexity.

  3. Hyperparameter Impact: ElasticNet saw the largest improvement (RMSE reduced by $563). Lasso gained CV stability despite slightly worse test performance.

  4. Interpretability: Lasso's 29 sparse coefficients enable stakeholder trust and regulatory compliance. XGBoost remains black box.

  5. Deployment: Lasso's 11.5x faster training and 600x faster prediction make it suitable for real-time APIs.

Final Verdict: Selecting Lasso over XGBoost represents a principled trade-off: sacrificing $324 in test RMSE (2.5% of the average price) to gain 4.1 points in CV R², narrow the overfitting gap by 2.3 points, and enable full interpretability. This aligns with best practices for production ML systems, where robustness and transparency outweigh marginal accuracy gains.


10. Code Snippets

Lasso Configuration:

from sklearn.linear_model import Lasso

model = Lasso(alpha=10.0, max_iter=10000, random_state=42)

XGBoost Configuration (Alternative):

from xgboost import XGBRegressor

model = XGBRegressor(
    n_estimators=200,
    learning_rate=0.1,
    max_depth=3,
    subsample=0.6,
    random_state=42
)

GridSearchCV:

from sklearn.model_selection import GridSearchCV

param_grid = {'alpha': [0.1, 0.5, 1, 5, 10, 50, 100, 500, 1000]}
grid = GridSearchCV(Lasso(max_iter=10000), param_grid, cv=5,
                    scoring='neg_mean_squared_error')
grid.fit(X_train, y_train)
best_model = grid.best_estimator_  # refit on the full training set (alpha=10.0)

Cross-Validation:

from sklearn.model_selection import cross_val_score

cv_scores = cross_val_score(model, X_train, y_train, cv=5, scoring='r2')
print(f"CV R²: {cv_scores.mean():.3f} ± {cv_scores.std():.3f}")
# Output (Lasso): CV R²: 0.899 ± 0.027
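
Sparsity Check (illustrative): a quick way to count the coefficients Lasso zeroes out, as referenced in Section 4. Synthetic data stands in for the real features, so the zero count will differ from the report's 13:

```python
import numpy as np
from sklearn.datasets import make_regression
from sklearn.linear_model import Lasso

# Synthetic stand-in: 42 features, only 10 of them informative
X, y = make_regression(n_samples=158, n_features=42, n_informative=10,
                       noise=10.0, random_state=42)
model = Lasso(alpha=10.0, max_iter=10000).fit(X, y)

n_zero = int(np.sum(model.coef_ == 0))
print(f"Zeroed coefficients: {n_zero} of {model.coef_.size} "
      f"({n_zero / model.coef_.size:.0%} sparsity)")
```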

Report Prepared By: Dhanesh B. B. | Contact: GitHub | License: MIT


End of Model Comparison Report