CatBoost is a machine learning algorithm that uses gradient boosting on decision trees. It is available as a powerful open-source library built by Yandex. This notebook takes an in-depth look at how to apply the CatBoost algorithm to a dataset and how to improve the accuracy of the resulting model.
Main advantages of CatBoost:
- Superior quality when compared with other GBDT libraries on many datasets.
- Best-in-class prediction speed.
- Support for both numerical and categorical features.
- Fast GPU and multi-GPU support for training out of the box.
- Visualization tools included.
Implementing CatBoost in Python on the breast cancer dataset
- All CatBoost documentation is available here
- dataset link
- https://catboost.ai/docs/concepts/python-reference_parameters-list.html
Anna Veronika Dorogush, Andrey Gulin, Gleb Gusev, Nikita Kazeev, Liudmila Ostroumova Prokhorenkova, Aleksandr Vorobev "Fighting biases with dynamic boosting". arXiv:1706.09516, 2017.
Anna Veronika Dorogush, Vasily Ershov, Andrey Gulin "CatBoost: gradient boosting with categorical features support". Workshop on ML Systems