This is my solution for the precisionFDA Brain Cancer Predictive Modeling and Biomarker Discovery Challenge.
The main steps in my solution are as follows:
- Features are selected with L0Learn, 'Fast Best Subset Selection' (Hazimeh et al., 2018), for both the gene expression and the copy number variation (CNV) data.
- Gradient-boosted decision tree models are then trained as predictive models on the features selected by L0Learn.
- Three different boosting models are used: XGBoost (Chen and Guestrin, 2016), LightGBM (Ke et al., 2017), and CatBoost (Prokhorenkova et al., 2018).
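
The two-stage idea in the steps above (select a small subset of features, then fit boosted trees on only those columns) can be illustrated with a minimal, standard-library-only sketch. Everything in it is an assumption made for illustration: the toy data, the helper names (`solve`, `rss`, `best_subset`, `fit_stump`, `boost`), the squared-error regression target, and the hyperparameters. It uses brute-force best subset selection and hand-written boosted stumps as simple stand-ins; it does not call L0Learn, XGBoost, LightGBM, or CatBoost, and it is not the code used in this solution.

```python
import itertools
import random


def solve(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            f = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= f * M[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = sum(M[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (M[i][n] - s) / M[i][i]
    return x


def rss(X, y, cols):
    """Residual sum of squares of a least-squares fit (with intercept) on cols."""
    Z = [[1.0] + [row[c] for c in cols] for row in X]
    p = len(cols) + 1
    A = [[sum(z[i] * z[j] for z in Z) for j in range(p)] for i in range(p)]
    b = [sum(z[i] * yi for z, yi in zip(Z, y)) for i in range(p)]
    beta = solve(A, b)
    return sum((yi - sum(bj * zj for bj, zj in zip(beta, z))) ** 2
               for z, yi in zip(Z, y))


def best_subset(X, y, k):
    """Exhaustive best subset selection: the k columns with the lowest RSS."""
    n_features = len(X[0])
    return min(itertools.combinations(range(n_features), k),
               key=lambda cols: rss(X, y, cols))


def fit_stump(X, residuals, cols):
    """Find the one-split regression stump that best fits the residuals."""
    best = None
    for c in cols:
        for threshold in sorted({row[c] for row in X}):
            left = [r for row, r in zip(X, residuals) if row[c] <= threshold]
            right = [r for row, r in zip(X, residuals) if row[c] > threshold]
            if not right:  # splitting at the maximum value leaves one side empty
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            err = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or err < best[0]:
                best = (err, c, threshold, lm, rm)
    return best[1:]


def boost(X, y, cols, rounds=50, lr=0.1):
    """Gradient boosting with squared-error loss, using stumps on cols only."""
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(rounds):
        residuals = [yi - pi for yi, pi in zip(y, pred)]
        c, t, lm, rm = fit_stump(X, residuals, cols)
        stumps.append((c, t, lm, rm))
        pred = [p + lr * (lm if row[c] <= t else rm) for row, p in zip(X, pred)]
    return base, stumps, pred


# Toy data (hypothetical): 6 features, only columns 1 and 4 drive the target.
rng = random.Random(0)
X = [[rng.gauss(0, 1) for _ in range(6)] for _ in range(60)]
y = [3 * row[1] - 2 * row[4] + rng.gauss(0, 0.1) for row in X]

selected = best_subset(X, y, k=2)          # stage 1: feature selection
base, stumps, pred = boost(X, y, selected)  # stage 2: boosting on selected columns

mse_before = sum((yi - base) ** 2 for yi in y) / len(y)
mse_after = sum((yi - pi) ** 2 for yi, pi in zip(y, pred)) / len(y)
print("selected columns:", selected)
print("training MSE, mean-only vs boosted:", mse_before, mse_after)
```

Exhaustive search is only feasible for a handful of features; for data with thousands of genes, an approximate solver such as L0Learn is what makes best subset selection practical, and a library implementation replaces the hand-written stumps.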