In this practice project, a labeled email text dataset (spam or ham) is processed with the Bag of Words (BoW) method, and a Naive Bayes classifier is trained on the processed dataset.
- The dataset is loaded and preprocessed using the BoW method available in sklearn (sklearn.feature_extraction.text.CountVectorizer).
- A Naive Bayes model is built.
- The model is trained with the processed dataset.
- An F1 score of 0.95 is achieved.
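
The steps above can be sketched roughly as follows. This is a minimal illustration, not the project's actual code: the tiny in-line dataset is hypothetical and stands in for the real labeled emails, and `MultinomialNB` is an assumed choice, since the Naive Bayes variant used in the project is not stated. The score on this toy data has no relation to the 0.95 reported above.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.metrics import f1_score
from sklearn.naive_bayes import MultinomialNB

# Hypothetical toy data standing in for the real labeled email dataset.
train_texts = [
    "win a free prize now",
    "free cash offer click now",
    "claim your free prize today",
    "meeting at noon tomorrow",
    "please review the attached report",
    "lunch tomorrow with the team",
]
train_labels = [1, 1, 1, 0, 0, 0]  # 1 = spam, 0 = ham

test_texts = ["free prize offer", "team meeting tomorrow"]
test_labels = [1, 0]

# Bag of Words: learn the vocabulary on the training texts only,
# then map both splits to sparse token-count matrices.
vectorizer = CountVectorizer()
X_train = vectorizer.fit_transform(train_texts)
X_test = vectorizer.transform(test_texts)

# Assumed variant: multinomial Naive Bayes, which suits count features.
model = MultinomialNB()
model.fit(X_train, train_labels)

# Evaluate with the F1 score, treating spam (label 1) as the positive class.
predictions = model.predict(X_test)
score = f1_score(test_labels, predictions)
print(f"F1 score on the toy test set: {score:.2f}")
```

Fitting the vectorizer on the training split only, and reusing it with `transform` on the test split, keeps test-set vocabulary from leaking into the features.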