Commit
Use grid_search in notebook and add visualization
Addresses issues with example notebook brought up at July 26 meetup:

1. Standardize training and testing separately
2. Use AUROC on continuous rather than binary predictions

Clean up variable names. Simplify to testing/training terminology. No more "hold out".

Use sklearn.grid_search.GridSearchCV to optimize hyperparameters. Expand range of l1_ratio and alpha. Specify random_state in GridSearchCV, which should prevent having to set the seed manually using the random module. Grid search should enable a more modular architecture, allowing different algorithms to be swapped in as long as their `param_grid` is defined.

Add exploratory analysis of predictions.

Add parallel processing using joblib to speed up cross validation.

Remove median absolute deviation feature selection. This step had to be removed or modified because it used testing data for feature selection.
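To illustrate why AUROC should be computed on continuous rather than binary predictions, here is a minimal sketch in plain Python. It is not the notebook's code: the `auroc` helper, the labels, the scores, and the 0.5 threshold are all hypothetical, chosen only to show that thresholding discards ranking information.

```python
def auroc(y_true, scores):
    """Rank-based AUROC: the probability that a randomly chosen positive
    scores higher than a randomly chosen negative (ties count as 0.5)."""
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUROC needs at least one positive and one negative")
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in pos
        for n in neg
    )
    return wins / (len(pos) * len(neg))


# Hypothetical labels and continuous classifier scores (illustration only).
y_true = [0, 0, 1, 0, 1, 1]
scores = [0.10, 0.40, 0.45, 0.60, 0.70, 0.90]

# Thresholding at 0.5 (an assumed cutoff) collapses the scores to two values.
binary = [1 if s >= 0.5 else 0 for s in scores]

auroc_continuous = auroc(y_true, scores)  # uses the full ranking
auroc_binary = auroc(y_true, binary)      # most pairs become ties or flips

print(f"continuous: {auroc_continuous:.3f}, binary: {auroc_binary:.3f}")
```

On this toy data the continuous scores give an AUROC of 8/9, while the thresholded predictions give 6/9, because binarizing turns many correctly ordered pairs into ties.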