Otto Kaggle Challenge

Solution to the Otto Group Product Classification Challenge hosted on Kaggle. The final submission scored 0.39871 (multi-class log loss), which placed 9th on the leaderboard.

Structure

We used an ensemble of three layers.

  • Layer 1: 21 base learners
  • Layer 2: 3 meta learners
  • Layer 3: Non-linear combination of meta learners

The predictions of Layer 1 were used as training features for the Layer 2 meta learners, stacked according to Variant A as detailed here.
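The split between out-of-fold predictions on the train set and predictions on the test set is the crux of stacking. Below is a minimal sketch of the idea; the fold count, refit strategy, and function names are assumptions, not the exact setup used here (whether test predictions come from a full refit or from averaging the fold models is precisely the Variant A/B distinction in the linked guide):

```python
import numpy as np
from sklearn.model_selection import KFold

def stack_oof(models, X, y, X_test, n_classes=9, n_splits=5):
    """Build Layer 2 features: out-of-fold predictions on the train set,
    full-refit predictions on the test set."""
    kf = KFold(n_splits=n_splits, shuffle=True, random_state=0)
    train_meta = np.zeros((len(X), len(models) * n_classes))
    test_meta = np.zeros((len(X_test), len(models) * n_classes))
    for m, model in enumerate(models):
        cols = slice(m * n_classes, (m + 1) * n_classes)
        for tr, val in kf.split(X):
            model.fit(X[tr], y[tr])
            train_meta[val, cols] = model.predict_proba(X[val])
        model.fit(X, y)  # refit on the full train set for test predictions
        test_meta[:, cols] = model.predict_proba(X_test)
    return train_meta, test_meta
```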

Layer 1 Models

The 21 base learners, listed two per row for compactness (a blank transformation means the model was trained on the raw features):

| Model      | Data Transformation | Model               | Data Transformation |
|------------|---------------------|---------------------|---------------------|
| Deep NN    |                     | LGBM Dart           |                     |
| Deep NN    | log(X+1)            | LGBM GBDT           |                     |
| Deep NN    | Standard Scaled     | SKLearn MLP         |                     |
| Deep NN    | 0-1 Scaled          | Naïve Bayes         |                     |
| CatBoost   |                     | Naïve Bayes         | Standard Scaled     |
| ExtraTrees |                     | Random Forest       |                     |
| KNN        | 0-1 Scaled          | Softmax             |                     |
| KNN        | 0-1 Scaled          | XGBoost             |                     |
| KNN        | 0-1 Scaled          | Logistic Regression |                     |
| KNN        | 0-1 Scaled          | Logistic Regression | Standard Scaled     |
| KNN        | 0-1 Scaled          |                     |                     |
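The transformations in the table correspond to standard preprocessing steps. A short sketch of how they might be produced; the Poisson stand-in data is illustrative, though the real dataset does consist of 93 non-negative count features:

```python
import numpy as np
from sklearn.preprocessing import StandardScaler, MinMaxScaler

# Stand-in for the 93 non-negative count features of the Otto dataset.
X = np.random.default_rng(0).poisson(2.0, size=(1000, 93)).astype(float)

X_raw = X                                   # blank cells in the table: raw features
X_log = np.log1p(X)                         # log(X+1)
X_std = StandardScaler().fit_transform(X)   # Standard Scaled: zero mean, unit variance
X_01 = MinMaxScaler().fit_transform(X)      # 0-1 Scaled: min-max to [0, 1]
```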

Layer 2 Models

Three models were used in the second layer:

  • Deep Neural Net (PyTorch)
  • XGBoost (XGB)
  • Calibrated Random Forest (SKLearn)
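Of the three, the calibrated random forest maps most directly onto scikit-learn. A hedged sketch with illustrative hyperparameters; the stand-in `train_meta` mirrors the stacking sketch above (21 base learners × 9 classes):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.calibration import CalibratedClassifierCV

# Stand-ins for the stacked Layer 1 outputs and the labels.
rng = np.random.default_rng(0)
train_meta = rng.random((1000, 21 * 9))   # 21 base learners x 9 Otto classes
y = rng.integers(0, 9, size=1000)

rf = RandomForestClassifier(n_estimators=500, n_jobs=-1, random_state=0)
cal_rf = CalibratedClassifierCV(rf, method="isotonic", cv=5)  # calibrates predicted probabilities
cal_rf.fit(train_meta, y)
probs = cal_rf.predict_proba(train_meta)  # would be test_meta in practice
```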

Layer 3

Non-linear combination of the three meta learners according to the equation below, where NN, XGB, and RF denote the class-probability predictions of the three Layer 2 models. Best results were obtained with a = 0.995, b = 1/3, c = 2/3, d = 0.05.

$a \times (NN^b \times XGB^c) + d \times RF$
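In code, the blend multiplies the NN and XGB probability matrices elementwise (a geometric weighting) and adds a small RF term. The row renormalization in this sketch is my assumption, since the raw formula's rows need not sum to 1:

```python
import numpy as np

def blend(nn, xgb, rf, a=0.995, b=1/3, c=2/3, d=0.05):
    """Layer 3: a * (NN^b * XGB^c) + d * RF, elementwise over the
    (n_samples, 9) class-probability matrices of the meta learners."""
    p = a * (nn ** b) * (xgb ** c) + d * rf
    return p / p.sum(axis=1, keepdims=True)  # renormalize rows (assumption)
```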

Files

Otto.ipynb
Data preprocessing, base models, and the XGB and RF meta learners.

base-MLP.ipynb
Deep NN in PyTorch, used for the Layer 1 base models.

meta-MLP.ipynb
Deep NN in PyTorch, used as the Layer 2 meta learner.

dim_reduction_otto.ipynb
Dimensionality reduction analysis of the Otto dataset.
