Hate Speech Detection on Twitter

This project detects hate speech on Twitter using natural language processing (NLP) and machine learning techniques. It explores several feature extraction methods, including TF-IDF, sentiment analysis, and Doc2Vec, and evaluates multiple machine learning models to determine the most effective approach.

Feature Extraction

Word Cloud

Visualizing the most commonly used words in the dataset through a word cloud.
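A word cloud sizes each word by how often it occurs, so the underlying step is a frequency count. The following is a minimal sketch of that count using only the standard library; the function name `word_frequencies`, the regex tokenizer, and the sample tweets are assumptions made for illustration and are not taken from the project notebook.

```python
import re
from collections import Counter

def word_frequencies(texts, stopwords=frozenset()):
    """Count lowercase word tokens across all texts, skipping stopwords."""
    counts = Counter()
    for text in texts:
        # Simple tokenizer (an assumption): runs of letters and apostrophes.
        tokens = re.findall(r"[a-z']+", text.lower())
        counts.update(t for t in tokens if t not in stopwords)
    return counts

# Hypothetical sample tweets, not taken from the project's dataset.
tweets = ["Good morning, good people!", "Morning run done"]
freqs = word_frequencies(tweets, stopwords={"done"})
top = freqs.most_common(2)  # the words a word cloud would draw largest
```

A plotting library can then render these counts as a cloud, with font size proportional to each count.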

TF-IDF

Applying TF-IDF (term frequency-inverse document frequency) to transform the text data into numerical features.
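To make the weighting concrete, here is a small pure-Python sketch of TF-IDF. It assumes raw term counts and a smoothed inverse document frequency, ln((1 + n) / (1 + df)) + 1; the project may use a library implementation with different settings, and the function name `tfidf` and the toy documents are hypothetical.

```python
import math
from collections import Counter

def tfidf(docs):
    """Return one {term: weight} dict per tokenized document.

    Assumes raw term frequency and the smoothed IDF
    ln((1 + n) / (1 + df)) + 1, with no vector normalization.
    """
    n = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    vectors = []
    for doc in docs:
        tf = Counter(doc)
        vectors.append(
            {t: tf[t] * (math.log((1 + n) / (1 + df[t])) + 1.0) for t in tf}
        )
    return vectors

# Hypothetical toy documents, not taken from the project's dataset.
docs = [
    "good morning everyone".split(),
    "good night".split(),
    "morning run".split(),
]
weights = tfidf(docs)
```

Terms that appear in fewer documents (such as "everyone" here) receive a higher weight than terms shared across documents (such as "good").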

Sentiment Analysis

Using VADER sentiment analysis to extract sentiment-related features from the tweets.
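VADER reports four scores per text (neg, neu, pos, and compound). As a rough illustration of the compound score only, the sketch below normalizes a sum of word valences into the range (-1, 1) using x / sqrt(x^2 + alpha) with alpha = 15. It deliberately omits VADER's lexicon lookup and its rules for negation, intensifiers, and punctuation; the function name and the valence values are hypothetical.

```python
import math

def vader_style_compound(valences, alpha=15.0):
    """Squash a sum of word valences into (-1, 1).

    Illustrates only the normalization step; the lexicon and the
    heuristic adjustments of the real analyzer are not modeled here.
    """
    total = sum(valences)
    return total / math.sqrt(total * total + alpha)

# Hypothetical valences for the words of one tweet.
compound = vader_style_compound([1.9, -0.5])
```

A score like this, together with the neg, neu, and pos proportions, can be appended to the other feature columns for each tweet.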

Doc2Vec

Training a Doc2Vec model and extracting document vectors to represent the tweets in vector space.

Model Training and Evaluation

Logistic Regression

Training and evaluating a logistic regression model for hate speech detection.
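For readers unfamiliar with the model, the following is a minimal sketch of binary logistic regression trained with batch gradient descent. It is an illustration under stated assumptions, not the notebook's code: the two-class setup, the function names, the learning rate, and the toy feature vectors are all hypothetical, and a library implementation would normally be used in practice.

```python
import math

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def train_logreg(X, y, lr=0.5, epochs=500):
    """Fit weights and bias by batch gradient descent on the log loss."""
    w = [0.0] * len(X[0])
    b = 0.0
    n = len(X)
    for _ in range(epochs):
        grad_w = [0.0] * len(w)
        grad_b = 0.0
        for xi, yi in zip(X, y):
            # Prediction error for this sample drives the gradient.
            err = sigmoid(sum(wj * xj for wj, xj in zip(w, xi)) + b) - yi
            for j, xj in enumerate(xi):
                grad_w[j] += err * xj
            grad_b += err
        w = [wj - lr * g / n for wj, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b

def predict(w, b, x):
    """Return 1 if the predicted probability is at least 0.5, else 0."""
    return 1 if sigmoid(sum(wj * xj for wj, xj in zip(w, x)) + b) >= 0.5 else 0

# Hypothetical two-feature vectors standing in for extracted tweet features.
X = [[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.8]]
y = [1, 1, 0, 0]
w, b = train_logreg(X, y)
```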

Random Forest

Training and evaluating a random forest classifier for hate speech detection.

Naive Bayes

Training and evaluating a Naive Bayes classifier for hate speech detection.
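As an illustration of the idea, here is a compact multinomial Naive Bayes sketch with add-one (Laplace) smoothing, written with the standard library only. Which Naive Bayes variant the project uses is not stated, so the multinomial choice is an assumption, as are the function names, the label names, and the toy documents.

```python
import math
from collections import Counter, defaultdict

def train_nb(docs, labels):
    """Collect class counts, per-class word counts, and the vocabulary."""
    class_counts = Counter(labels)
    word_counts = defaultdict(Counter)
    for doc, label in zip(docs, labels):
        word_counts[label].update(doc)
    vocab = {t for doc in docs for t in doc}
    return class_counts, word_counts, vocab

def predict_nb(model, doc):
    """Return the class with the highest log posterior for a tokenized doc."""
    class_counts, word_counts, vocab = model
    total_docs = sum(class_counts.values())
    best_label, best_score = None, -math.inf
    for label, count in class_counts.items():
        score = math.log(count / total_docs)  # log prior
        denom = sum(word_counts[label].values()) + len(vocab)
        for t in doc:
            if t in vocab:  # ignore words never seen in training
                score += math.log((word_counts[label][t] + 1) / denom)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

# Hypothetical toy corpus and labels, not taken from the project's dataset.
docs = [
    ["you", "are", "awful"],
    ["awful", "people"],
    ["nice", "day"],
    ["have", "a", "nice", "day"],
]
labels = ["offensive", "offensive", "neutral", "neutral"]
model = train_nb(docs, labels)
```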

Support Vector Machine (SVM)

Training and evaluating a Support Vector Machine for hate speech detection.

Comparison of Models

Visualizing the accuracy of different models to compare their performance.
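The comparison rests on computing one accuracy value per model over the same test labels. A minimal sketch follows; the model names `model_a` and `model_b` and all label values are placeholders invented for illustration and do not represent the project's results.

```python
def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)

# Placeholder labels and predictions, not results from the project.
y_true = [1, 0, 1, 1, 0]
predictions = {
    "model_a": [1, 0, 1, 0, 0],
    "model_b": [1, 1, 0, 0, 0],
}
scores = {name: accuracy(y_true, preds) for name, preds in predictions.items()}
# Sort from most to least accurate, ready to pass to a bar chart.
ranked = sorted(scores.items(), key=lambda kv: kv[1], reverse=True)
```

Because accuracy can be misleading when one class dominates the dataset, per-class metrics are a useful complement to a chart like this.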

Results

The accuracy of each model is presented in a comparison chart. The logistic regression and support vector machine models achieved higher accuracy than the random forest and Naive Bayes models.

Visualization

Word Cloud

Word clouds for the entire dataset and for hate and offensive speech specifically.

Confusion Matrix

The confusion matrix helps to understand the misclassifications made by the model. It provides insights into the performance of the model by showing the true and predicted values for each class.
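A confusion matrix can be built by counting (true, predicted) label pairs. The sketch below uses rows for true classes and columns for predicted classes; the class names and the example labels are assumptions chosen for illustration, not values from the project.

```python
from collections import Counter

def confusion_matrix(y_true, y_pred, labels):
    """Rows are true classes, columns are predicted classes."""
    counts = Counter(zip(y_true, y_pred))
    return [[counts[(t, p)] for p in labels] for t in labels]

# Hypothetical class names and predictions.
labels = ["hate", "offensive", "neither"]
y_true = ["hate", "hate", "offensive", "offensive", "neither", "neither"]
y_pred = ["hate", "offensive", "offensive", "offensive", "neither", "offensive"]
matrix = confusion_matrix(y_true, y_pred, labels)
# Diagonal entries are correct predictions; off-diagonal entries are errors.
correct = sum(matrix[i][i] for i in range(len(labels)))
```

In this toy example, the off-diagonal entry in the first row shows one "hate" tweet misclassified as "offensive", which is the kind of error the matrix makes visible.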

Conclusion

This project demonstrates the effectiveness of various NLP techniques and machine learning models in detecting hate speech on Twitter. The results highlight the importance of feature extraction methods and model selection in achieving high accuracy and reliable performance in hate speech detection tasks.
