Solution to the Otto Group Product Classification Challenge hosted on Kaggle. The final submission scored 0.39871, which placed 9th on the leaderboard.
We used an ensemble of three layers:
- Layer 1: 21 base learners
- Layer 2: 3 meta learners
- Layer 3: Non-linear combination of meta learners
The predictions of the Layer 1 base learners were used as training features for the Layer 2 meta learners, stacked according to Variant A as detailed here (a minimal sketch of this scheme follows the table below).
| Model | Data Transformation | Model | Data Transformation |
|------------|---------------------|---------------------|---------------------|
| Deep NN    |                     | LightGBM DART       |                     |
| Deep NN    | log(X+1)            | LightGBM GBDT       |                     |
| Deep NN | Standard Scaled | SKLearn MLP | |
| Deep NN | 0-1 Scaled | Naïve Bayes | |
| CatBoost | | Naïve Bayes | Standard Scaled |
| ExtraTrees | | Random Forest | |
| KNN | 0-1 Scaled | Softmax | |
| KNN | 0-1 Scaled | XGBoost | |
| KNN | 0-1 Scaled | Logistic Regression | |
| KNN | 0-1 Scaled | Logistic Regression | Standard Scaled |
| KNN | 0-1 Scaled | | |
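
As a rough illustration of the Variant A stacking referenced above: each base learner is fit on k-1 folds and predicts the held-out fold, so every training row gets an out-of-fold probability vector, while test predictions are averaged over the fold models. The sketch below is an assumption of how this can be wired up; the helper name, the 5-fold split, and the seed are not taken from the notebooks.

```python
import numpy as np
from sklearn.base import clone
from sklearn.model_selection import StratifiedKFold

def stack_variant_a(model, X_train, y_train, X_test, n_classes=9, n_splits=5, seed=0):
    """Return (train_meta, test_meta) class-probability features for one base learner."""
    skf = StratifiedKFold(n_splits=n_splits, shuffle=True, random_state=seed)
    train_meta = np.zeros((X_train.shape[0], n_classes))
    test_meta = np.zeros((X_test.shape[0], n_classes))
    for tr_idx, val_idx in skf.split(X_train, y_train):
        fold_model = clone(model)
        fold_model.fit(X_train[tr_idx], y_train[tr_idx])
        train_meta[val_idx] = fold_model.predict_proba(X_train[val_idx])  # out-of-fold predictions
        test_meta += fold_model.predict_proba(X_test) / n_splits          # averaged over the folds
    return train_meta, test_meta

# The Layer 2 feature matrix is then the horizontal stack of all base-learner outputs, e.g.:
# layer2_train = np.hstack([stack_variant_a(m, X, y, X_test)[0] for m in base_learners])
```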
Three meta learners were used in Layer 2:
- Deep Neural Net (PyTorch)
- XGBoost (XGB)
- Calibrated Random Forest (SKLearn)
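
The calibrated random forest can be set up with scikit-learn's `CalibratedClassifierCV`, which calibrates the forest's predicted probabilities and so helps directly with the multiclass log-loss metric. The sketch below is illustrative; the hyperparameters and variable names are assumptions, not the values used in `Otto.ipynb`.

```python
from sklearn.ensemble import RandomForestClassifier
from sklearn.calibration import CalibratedClassifierCV

# Illustrative configuration of the Layer 2 calibrated random forest
rf = RandomForestClassifier(n_estimators=500, n_jobs=-1, random_state=0)
calibrated_rf = CalibratedClassifierCV(rf, method="isotonic", cv=5)
# calibrated_rf.fit(layer2_train, y)                  # layer2_train: stacked Layer 1 probabilities
# rf_pred = calibrated_rf.predict_proba(layer2_test)
```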
Non-linear combination of the three meta learners according to the equation below. Best results were obtained with $a = 0.995$, $b = 1/3$, $c = 2/3$, $d = 0.05$:

$$a \times (NN^b \times XGB^c) + d \times RF$$
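
A minimal sketch of this blend, assuming `nn_pred`, `xgb_pred`, and `rf_pred` hold the class-probability matrices from the three meta learners (the variable names and the final row renormalization are assumptions, not code from the notebooks):

```python
import numpy as np

a, b, c, d = 0.995, 1 / 3, 2 / 3, 0.05

def blend(nn_pred, xgb_pred, rf_pred):
    combined = a * (nn_pred ** b) * (xgb_pred ** c) + d * rf_pred
    # Renormalize each row so the submission is a valid probability distribution
    return combined / combined.sum(axis=1, keepdims=True)
```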
Notebooks:
- `Otto.ipynb`: Data preprocessing, the Layer 1 base models, and the XGB and RF meta learners.
- `base-MLP.ipynb`: Deep NN in PyTorch, used for the Layer 1 base models.
- `meta-MLP.ipynb`: Deep NN in PyTorch, used as a Layer 2 meta learner.
- `dim_reduction_otto.ipynb`: Dimensionality reduction analysis of the Otto dataset.