This mini-project aims to build a classifier that recognizes the type of harmfulness in tweets: non-harmful, cyberbullying, or hate speech. The task originates from the PolEval 2019 competition (Task 6-2).
The project explores:
- lemmatization
- feature extraction methods and their parameters
- class balancing methods
- different classification algorithms
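
As an illustration of the class balancing item above, here is a minimal sketch of random oversampling in plain Python. It is not taken from main.ipynb or utils.py: the function name `random_oversample`, the toy texts, and the label strings are hypothetical and only show the general idea of duplicating minority-class samples until all classes are the same size.

```python
import random
from collections import Counter, defaultdict


def random_oversample(texts, labels, seed=0):
    """Duplicate randomly chosen minority-class samples until every
    class has as many samples as the largest class."""
    rng = random.Random(seed)

    # Group the samples by their class label.
    by_class = defaultdict(list)
    for text, label in zip(texts, labels):
        by_class[label].append(text)

    # Every class is grown to the size of the largest one.
    target = max(len(items) for items in by_class.values())

    out_texts, out_labels = [], []
    for label, items in by_class.items():
        # Draw the missing samples with replacement from the same class.
        extra = [rng.choice(items) for _ in range(target - len(items))]
        for text in items + extra:
            out_texts.append(text)
            out_labels.append(label)
    return out_texts, out_labels


# Hypothetical toy data: placeholder texts with an imbalanced label set.
texts = ["t1", "t2", "t3", "t4", "t5", "t6"]
labels = ["non-harmful"] * 4 + ["cyberbullying", "hate-speech"]

balanced_texts, balanced_labels = random_oversample(texts, labels)
print(Counter(balanced_labels))
```

Oversampling of this kind should be applied to the training split only, so that duplicated samples do not leak into the evaluation data.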
The main.ipynb notebook contains the progress, results, and comments, along with a summary and conclusions. The utils.py file contains utilities for the project.