Kaggle Toxic Comment Classification Challenge: text analytics and modeling with different approaches to text classification.

abyanjan/Toxic-Comment-Classification

Toxic-Comment-Classification

This is a text classification project that tackles the Toxic Comment Classification Challenge on Kaggle.

Because this is a multi-label classification task, several different approaches are applied to solve it.

  • For text vectorization, bag of words and word embeddings are applied
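
As a rough illustration of the bag-of-words idea mentioned above, the sketch below turns each comment into a vector of token counts over a shared vocabulary. It is a minimal standard-library example, not the project's actual pipeline: the helper names `build_vocabulary` and `bag_of_words`, the whitespace tokenization, and the toy comments are all assumptions made for illustration. Word embeddings, by contrast, map each token to a dense learned vector and are not shown here.

```python
from collections import Counter


def build_vocabulary(comments):
    """Map each distinct lowercase token to a column index, in sorted order."""
    tokens = {tok for text in comments for tok in text.lower().split()}
    return {tok: idx for idx, tok in enumerate(sorted(tokens))}


def bag_of_words(text, vocabulary):
    """Return a count vector for one comment; tokens outside the vocabulary are ignored."""
    counts = Counter(text.lower().split())
    vector = [0] * len(vocabulary)
    for tok, n in counts.items():
        if tok in vocabulary:
            vector[vocabulary[tok]] = n
    return vector


# Made-up toy comments (hypothetical, not taken from the Kaggle data).
comments = ["you are great", "you are you"]
vocab = build_vocabulary(comments)
vectors = [bag_of_words(c, vocab) for c in comments]
```

Each row of `vectors` has one entry per vocabulary token, so word order is discarded and only counts remain.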

The blog post written for the project can be viewed at https://ajay-byan.medium.com/toxic-comments-classification-8d8a9a9b99e6

Modeling Approaches

  • Binary Relevance
  • Classifier Chain
  • LSTM
  • Transfer Learning with BERT
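
The first two approaches differ in how they treat label dependence: binary relevance trains one independent classifier per label, while a classifier chain feeds the earlier labels in the chain to each later classifier as extra features. The sketch below contrasts them on synthetic data. It uses scikit-learn's `MultiOutputClassifier` and `ClassifierChain` as stand-ins; the repository lists scikit-multilearn, whose API differs, so this is not necessarily how the project implements them. The data, the logistic regression base model, and the chain order are assumptions.

```python
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.multioutput import ClassifierChain, MultiOutputClassifier

# Synthetic stand-in data: 200 samples, 5 features, 3 labels (not the Kaggle data).
rng = np.random.RandomState(0)
X = rng.normal(size=(200, 5))
Y = np.column_stack([
    (X[:, 0] > 0).astype(int),
    (X[:, 1] > 0).astype(int),
    (X[:, 0] + X[:, 1] > 0).astype(int),  # third label correlates with the first two
])

# Binary relevance: one independent binary classifier per label column.
binary_relevance = MultiOutputClassifier(LogisticRegression(max_iter=1000))
binary_relevance.fit(X, Y)

# Classifier chain: each classifier also sees the labels earlier in the chain.
chain = ClassifierChain(LogisticRegression(max_iter=1000), order=[0, 1, 2])
chain.fit(X, Y)

br_pred = binary_relevance.predict(X)
chain_pred = chain.predict(X)
```

Both models return one prediction per sample and label, so their outputs can be scored with the same multi-label metric.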

Libraries

  • pandas==1.1.5
  • numpy==1.19.5
  • matplotlib
  • seaborn==0.11.1
  • scikit-learn
  • scikit-multilearn==0.2.0
  • texthero==1.0.9
  • wordcloud==1.8.1
  • gensim==3.6.0
  • tensorflow==2.4.1
  • tensorflow-hub==0.11.0
  • tensorflow-text==2.4.3
  • tf-models-official==2.4.0

The final score obtained on Kaggle with the BERT model is a mean ROC AUC of 0.98172.
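
The challenge scores submissions by mean column-wise ROC AUC: the AUC is computed separately for each label column and the results are averaged. The sketch below shows that computation in plain Python on made-up labels and scores. The function names `roc_auc` and `mean_column_auc` are hypothetical, and the pairwise formulation is chosen for readability rather than speed; it is not the code used to produce the score above.

```python
def roc_auc(y_true, y_score):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_column_auc(true_columns, score_columns):
    """Average the per-label AUC over all label columns."""
    aucs = [roc_auc(t, s) for t, s in zip(true_columns, score_columns)]
    return sum(aucs) / len(aucs)


# Made-up values for two label columns (illustrative only).
true_columns = [[1, 0, 1, 0], [0, 0, 1, 1]]
score_columns = [[0.9, 0.2, 0.4, 0.6], [0.1, 0.3, 0.8, 0.7]]

score = mean_column_auc(true_columns, score_columns)
print(score)  # 0.875: the column AUCs are 0.75 and 1.0
```

Because each column is scored on ranking alone, the metric does not depend on any classification threshold.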
