Identify and classify toxic online comments using NLP and Machine-Learning algorithms

Kyziridis/Toxic-Comment-Classification-Challenge

Toxic Comment Classification Challenge

Identify and classify toxic online comments.

Abusive comments, often categorized as "hate", "toxic", "insulting" or "threat", are a growing problem on social media platforms and portals such as Yahoo! Answers, Quora and Wikipedia. The first instinct is to persuade people not to post this kind of content, but since we cannot reach every single internet user, it is more practical to design and implement a system that prevents users from posting abusive comments.

Contents

  • Data Exploration
  • Text-Preprocess
  • Feature extraction
  • Manipulation of Imbalanced-Data
  • Multilabel Classification approaches (Binary Relevance, Classifier Chains, etc.)
  • Classification Methods (Logistic Regression, Naive-Bayes)
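
To make the Binary Relevance idea from the list above concrete, here is a minimal sketch in plain Python. It trains one independent binary classifier per label, using a small hand-written multinomial Naive Bayes with Laplace smoothing. The label names, the toy comments and all helper names (`TinyNaiveBayes`, `fit_binary_relevance`, `predict_binary_relevance`) are assumptions made up for illustration; the repository's actual implementation is not shown in this README and may well use a library instead.

```python
import math
from collections import Counter

LABELS = ["toxic", "insult"]  # hypothetical label names, for illustration only


def tokenize(text):
    return text.lower().split()


class TinyNaiveBayes:
    """Multinomial Naive Bayes with Laplace smoothing for one binary label."""

    def fit(self, docs, y):
        self.counts = {0: Counter(), 1: Counter()}
        self.n_docs = {0: 0, 1: 0}
        for doc, label in zip(docs, y):
            self.counts[label].update(tokenize(doc))
            self.n_docs[label] += 1
        self.vocab = set(self.counts[0]) | set(self.counts[1])
        return self

    def _log_joint(self, tokens, c):
        total = sum(self.counts[c].values()) + len(self.vocab)
        # Smoothed class prior plus smoothed per-token log-likelihoods.
        logp = math.log((self.n_docs[c] + 1) / (sum(self.n_docs.values()) + 2))
        for tok in tokens:
            if tok in self.vocab:  # ignore words never seen in training
                logp += math.log((self.counts[c][tok] + 1) / total)
        return logp

    def predict_proba(self, doc):
        tokens = tokenize(doc)
        l0 = self._log_joint(tokens, 0)
        l1 = self._log_joint(tokens, 1)
        return 1.0 / (1.0 + math.exp(l0 - l1))  # P(label = 1 | doc)


def fit_binary_relevance(docs, Y):
    """Binary Relevance: fit one independent classifier per label column."""
    return [
        TinyNaiveBayes().fit(docs, [row[j] for row in Y])
        for j in range(len(LABELS))
    ]


def predict_binary_relevance(models, doc):
    """Return a {label: probability} mapping for a single comment."""
    return {name: m.predict_proba(doc) for name, m in zip(LABELS, models)}


# Toy training data (invented); columns of train_Y follow LABELS.
train_docs = [
    "you are a fool",
    "thanks for the helpful edit",
    "this page is garbage",
    "have a nice day",
]
train_Y = [[1, 1], [0, 0], [1, 0], [0, 0]]

models = fit_binary_relevance(train_docs, train_Y)
print(predict_binary_relevance(models, "you fool"))
```

Binary Relevance ignores correlations between labels. A Classifier Chain would differ by feeding the predictions for earlier labels into the classifiers for later ones.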

Goal

  • Optimize Multilabel-Classification
  • Optimize Mean-Area-Under-ROC-Curve
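
As an illustration of the second goal, the sketch below computes a mean ROC AUC in plain Python, assuming the mean is taken over the label columns of a multilabel prediction. Each per-label AUC is computed by counting correctly ordered positive/negative pairs, with ties counting as half. The function names and the toy labels and scores are hypothetical and not taken from the repository.

```python
from typing import Sequence


def roc_auc(y_true: Sequence[int], y_score: Sequence[float]) -> float:
    """ROC AUC for one binary label via pairwise comparison of scores."""
    pos = [s for t, s in zip(y_true, y_score) if t == 1]
    neg = [s for t, s in zip(y_true, y_score) if t == 0]
    if not pos or not neg:
        raise ValueError("AUC is undefined when only one class is present")
    # Fraction of (positive, negative) pairs ranked correctly; ties count 0.5.
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def mean_roc_auc(Y_true, Y_score) -> float:
    """Average the per-label AUC over all label columns."""
    n_labels = len(Y_true[0])
    aucs = [
        roc_auc([row[j] for row in Y_true], [row[j] for row in Y_score])
        for j in range(n_labels)
    ]
    return sum(aucs) / n_labels


# Toy example (invented): 4 comments, 2 labels.
Y_true = [[1, 0], [0, 0], [1, 1], [0, 1]]
Y_score = [[0.9, 0.2], [0.3, 0.75], [0.6, 0.8], [0.4, 0.7]]

print(mean_roc_auc(Y_true, Y_score))
```

The pairwise count is quadratic in the number of comments, which is fine for a sketch. A rank-based formulation would scale better on a full dataset.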

For more information, please read Report.pdf.

Support GNU/Linux >_