Use key NLP techniques to classify news articles into categories: bag-of-words (TF-IDF), word embeddings, and the BERT language model
The original project was published on Towards Data Science here. The code for the BERT language modeling part was adapted for this project; the rest comes from the original author of that article.
The dataset can be found on Kaggle here. It contains around 200k news headlines from 2012 to 2018, obtained from HuffPost. Each headline has a corresponding category (politics, entertainment, science, ...). There are 30 categories in total.
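As a rough illustration of the data described above, the following sketch reads headline/category pairs and counts how many headlines fall into each category. It assumes the file is in JSON Lines format (one JSON object per line) with "headline" and "category" fields; that layout, the load_headlines helper, and the sample records are assumptions made for illustration, not details confirmed here.

```python
import json
from collections import Counter

# Made-up records for illustration only; they mimic the ASSUMED JSON Lines
# layout (one object per line with "category" and "headline" fields).
SAMPLE = """\
{"category": "POLITICS", "headline": "Senate Passes Budget Bill"}
{"category": "ENTERTAINMENT", "headline": "New Film Tops Box Office"}
{"category": "POLITICS", "headline": "Governor Announces Re-Election Bid"}
"""


def load_headlines(lines):
    """Parse an iterable of JSON Lines strings into (headline, category) pairs.

    Hypothetical helper: blank lines are skipped, every other line must be
    a JSON object containing the two assumed fields.
    """
    pairs = []
    for line in lines:
        line = line.strip()
        if not line:
            continue
        record = json.loads(line)
        pairs.append((record["headline"], record["category"]))
    return pairs


pairs = load_headlines(SAMPLE.splitlines())

# Count headlines per category to inspect the class balance.
category_counts = Counter(category for _, category in pairs)
print(category_counts.most_common())
```

With the real download, the same helper could be fed an open file object instead of the sample string, since iterating over a text file yields one line at a time. Checking the category counts first is useful because heavily imbalanced classes affect how the classifiers should be evaluated.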