Júlia Rodríguez Sánchez | August 2021
- Project description
- Business objectives
  - 2.1. Long-term goal
  - 2.2. Short-term goal
- Datasets
  - 3.1. Training datasets
  - 3.2. Evaluation dataset
- BERT fine-tuning
- Evaluation
  - 5.1. Test split
  - 5.2. Evaluation data
- Web app
- Next steps
- Project structure
The Aggressive Tweet Analyzer is a web app that takes a Twitter handle as input and analyzes the last 100 tweets of that user. It runs the tweets through an NLP model (based on BERT) that classifies them as aggressive or not. Then it returns a results page that contains an aggressiveness score for the user and the tweets classified as aggressive.
The ultimate aim of this project is to serve as a cyber-bullying detector for FITA Fundación, an organization that works to prevent mental illness in Spain. After working with people affected by eating disorders, conduct disorders, addictions and other mental health problems, they have come to realize that bullying is a major risk factor for developing them. They are currently working on tools to detect bullying cases earlier and support the victims to prevent them from becoming ill.
The model developed in this project detects comments that are very aggressive, so it serves as a first step to detect open attacks. However, cyber-bullying is often subtler, and the model needs to be further improved to detect these cases.
The most immediate goal of this project is to detect aggressive comments. I have deployed the model as a standalone app, but it could also be used in any website, forum or social media platform that wants to ban aggressive language from its posts or comments section.
All the data used for this project can be found here. See the Project structure to understand how the folders are organized.
The training datasets contain comments that are classified as different types of bullying (aggression, racism, sexism, etc) and that are sourced from different social media platforms.
The long-term application of this model will be to detect cyber-bullying cases on social media. The two most popular platforms among teenagers are Instagram and TikTok, so I decided to evaluate the model with original Instagram data that I scraped myself. Using the Instaloader library, I obtained the comments of Kevin Spacey's last 3 Instagram posts. I chose this user because he receives many comments from haters as well as supporters. The evaluation dataset consists of 22k+ comments, 17k+ after deleting the ones that contained only emojis or words shorter than two characters.
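The filtering step described above can be sketched as follows. This is a minimal illustration, not the notebook's actual code: the function names and the regular expression are assumptions, and "word" is taken to mean a run of letters, so emojis, digits and punctuation never count.

```python
import re

# Assumption: a "word" is a run of letters (any alphabet, accents included);
# emojis, digits, punctuation and whitespace do not count as words.
WORD_PATTERN = re.compile(r"[^\W\d_]+")


def is_informative(comment: str, min_word_length: int = 2) -> bool:
    """Return True if the comment has at least one word of min_word_length letters."""
    return any(len(word) >= min_word_length for word in WORD_PATTERN.findall(comment))


def drop_uninformative(comments):
    """Keep only the comments that contain at least one usable word."""
    return [c for c in comments if is_informative(c)]


# Hypothetical sample comments, not taken from the evaluation dataset.
sample = ["🔥🔥🔥", "a", "You are great!", "👏 ok", "❤️ u r"]
print(drop_uninformative(sample))
```

With this rule, a comment survives as soon as it contains a single word of two or more letters, even if the rest is emojis.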
The model I used as a basis for this project is BERT, a transformer. Transformers use an attention mechanism that learns contextual relations between words (or sub-words) in a text. BERT is pre-trained on a large corpus of English text data.
To fine-tune the model I used the prebuilt `TFBertForSequenceClassification` class and trained it. My intention was to use all the datasets to train a binary classifier that would sort comments as "bullying" or "not bullying". However, as I trained the model with more datasets its score decreased, so I decided to train it with the aggression data only. A multi-class classifier would be more appropriate to train with several datasets.
The weights of the fine-tuned models I tested in the notebooks can be found here.
When predicting the labels of the test split of the aggression dataset, the model achieved the following metrics:
Since the evaluation data was obtained directly from Instagram, it was not pre-labeled.
- Out of 17,749 comments, 83% were predicted as class 0 ("not aggressive") and 17% as class 1 ("aggressive").
- For the comments predicted as class 1, I manually checked if the label was correct and found that in 86% of the cases it was correct. In most of the cases where it was incorrect, the comment contained swear words used in a friendly way.
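The manual check above amounts to measuring precision on the class 1 predictions. A minimal sketch of that calculation follows, assuming each row carries the model's prediction in a hypothetical `label` column and that `wrong_label` is 1 when the manual review disagreed with the model and 0 otherwise (the encoding of `wrong_label` is an assumption):

```python
def positive_precision(rows, label_key="label", wrong_key="wrong_label"):
    """Share of comments predicted as class 1 whose label survived the manual check."""
    predicted_positive = [r for r in rows if int(r[label_key]) == 1]
    if not predicted_positive:
        raise ValueError("no comments were predicted as class 1")
    correct = sum(1 for r in predicted_positive if int(r[wrong_key]) == 0)
    return correct / len(predicted_positive)


# Hypothetical rows for illustration; the real data lives in
# labeled_evaluation_data_checked.csv.
rows = [
    {"label": 1, "wrong_label": 0},
    {"label": 1, "wrong_label": 0},
    {"label": 1, "wrong_label": 1},
    {"label": 1, "wrong_label": 0},
    {"label": 0, "wrong_label": 0},
]
print(positive_precision(rows))  # 3 of the 4 class 1 predictions were correct
```

Because only the class 1 predictions were reviewed, this figure says nothing about aggressive comments that the model labeled as class 0.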
I deployed the model into production using Flask and the Twitter API.
- Flask is a micro web framework well suited for building lightweight apps.
- The Twitter API allows developers to programmatically access public Twitter data.
On the home page of the app, users can enter a Twitter handle to analyze:
When they hit "Analyze", the app makes a request to the Twitter API to get the last 100 tweets of that username, including replies but excluding retweets. It preprocesses and tokenizes the tweets, then runs them through the model to classify them. Finally, it renders the results page, which includes an aggressiveness score and the tweets labeled as aggressive:
The user can click on the tweets and navigate to the original tweet URL or hit "Try again" (at the bottom of the page) and go back to the home page.
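The request flow above can be sketched in plain Python. This is an illustration under stated assumptions, not the code in the web_app folder: `analyze_handle`, `fake_fetch` and `fake_classify` are hypothetical names, the Twitter API call and the BERT model are replaced by stand-ins, and the aggressiveness score is assumed to be the percentage of analyzed tweets labeled as aggressive.

```python
from typing import Callable, Dict, List


def analyze_handle(
    handle: str,
    fetch_tweets: Callable[[str, int], List[str]],
    classify: Callable[[List[str]], List[int]],
    max_tweets: int = 100,
) -> Dict[str, object]:
    """Fetch recent tweets, classify them and summarize the result."""
    username = handle.lstrip("@")
    tweets = fetch_tweets(username, max_tweets)
    labels = classify(tweets) if tweets else []
    aggressive = [t for t, label in zip(tweets, labels) if label == 1]
    # Assumption: the score is the percentage of analyzed tweets labeled aggressive.
    score = 100 * len(aggressive) / len(tweets) if tweets else 0.0
    return {"handle": username, "score": score, "aggressive_tweets": aggressive}


# Stand-ins for the Twitter API request and the fine-tuned model (both hypothetical).
def fake_fetch(handle, limit):
    return ["have a nice day", "I hate you", "see you soon", "you are trash"][:limit]


def fake_classify(tweets):
    return [1 if ("hate" in t or "trash" in t) else 0 for t in tweets]


result = analyze_handle("@some_user", fake_fetch, fake_classify)
print(result["score"], result["aggressive_tweets"])
```

Passing the fetching and classification steps in as functions keeps the summary logic testable without network access or model weights; in a Flask app, a view function would call something like this and hand the returned dictionary to the results template.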
- Train a multi-class classifier to detect more types of bullying such as racism and sexism.
- Train the model with comments that contain swear words but are not aggressive.
- Build the Keras layers from scratch using `TFBertModel` and test more parameters.
- Create spiders to monitor Instagram and TikTok in order to find victims of cyber-bullying.
GitHub repository:
- web_app folder: contains the code for the local deployment of the Aggressive Tweet Analyzer. To execute the code you need to download the model weights from the drive and create a json file with your Twitter API credentials.
- 1_EDA_Preprocessing_Tokenization.ipynb
- 2_BERT_Fine_tuning.ipynb
- 3_Evaluation.ipynb
- 4_Web_app.ipynb
Drive:
- data:
  - 1_raw_data: training datasets obtained from Mendeley Data. (original source)
  - 2_clean_data: preprocessed training datasets.
  - 3_tokenized_data: pickle files containing input ids, attention masks and labels ready to be fed into the BERT model. Includes training and evaluation data.
  - 4_evaluation_data:
    - raw_comments: 3 json files containing comments from 3 Instagram posts, extracted directly from the platform with Instaloader.
    - clean_evaluation_data.csv: preprocessed comments.
    - labeled_evaluation_data.csv: comments labeled by the model.
    - labeled_evaluation_data_checked.csv: same as the previous file but with an added column `wrong_label`.
- models: saved weights of the fine-tuned BERT models.
  - histories: saved histories for the fine-tuning of each model.