Modeling Offensive Language as a Distinct Class for Hate Speech Detection

This repository is part of my master's thesis project, "Modeling Offensive Language as a Distinct Class for Hate Speech Detection" (Kim, 2025), supervised by Dr. Antske Fokkens and Dr. Hennie van der Vliet. The project explored how modeling offensive (but not hateful) language as a distinct class affects hate speech detection. Using a ternary classification scheme (Hateful, Offensive, Clean), I fine-tuned and evaluated a RoBERTa-base model in the full three-class setup and in binary variants where two classes are merged or the offensive class is removed (Hate vs. Non-hate, Non-clean vs. Clean, and Hate vs. Clean). The code used in this study builds on the code of Khurana et al. (2025), with my modifications and extensions.
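The binary variants amount to relabeling the ternary gold labels. As a minimal sketch, assuming the integer ids 0 = Hateful, 1 = Offensive, 2 = Clean (the actual ids may differ, and only the "3class" mode name is confirmed by the commands further below):

# Hypothetical label remapping for the four setups described above.
# Assumes 0 = Hateful, 1 = Offensive, 2 = Clean; mode names other than
# "3class" are illustrative, not the repo's actual flags.
from typing import Optional

def remap_label(label: int, mode: str) -> Optional[int]:
    if mode == "3class":               # Hateful / Offensive / Clean
        return label
    if mode == "hate_vs_nonhate":      # merge Offensive into Clean
        return 0 if label == 0 else 1
    if mode == "nonclean_vs_clean":    # merge Offensive into Hateful
        return 0 if label in (0, 1) else 1
    if mode == "hate_vs_clean":        # drop Offensive examples entirely
        return None if label == 1 else (0 if label == 0 else 1)
    raise ValueError(f"unknown mode: {mode}")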

HateCheck-XR

In the project, to probe model behavior beyond set-internal performance, I revised both HateCheck (Röttger et al., 2021) and an existing extension by Khurana et al. (2025), aligning them with the ternary system by re-annotating them and correcting errors present in the extension. The resulting dataset, HateCheck-XR, is available in this repository under dataset/hatecheck/ in CSV format.
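For a quick look at the dataset, something like the following works; the file path matches the folder structure below, but any label column name is an assumption about the CSV:

# Inspect HateCheck-XR. Column names are not documented here, so print
# them first rather than assuming a particular schema.
import pandas as pd

df = pd.read_csv("dataset/hatecheck/hatecheck-xr.csv")
print(df.shape)
print(df.columns.tolist())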

Folder structure

Project
├─ hs_generalization/
│  ├─ __init__.py
│  ├─ modes.py
│  ├─ train.py
│  ├─ test.py
│  └─ utils.py
├─ tools/
│  └─ run_many.py
├─ configs/
│  └─ example.json
├─ dataset/
│  ├─ davidson/
│  └─ hatecheck/hatecheck-xr.csv
├─ requirements.txt
└─ README.md

Set-up

Set up the environment as follows:

# Create environment.
conda create -n hs-generalization python=3.9
conda activate hs-generalization

# Install packages.
python setup.py develop
pip install -r requirements.txt

Training

Create a config file and run the following:

python -m hs_generalization.train -c configs/train/example.json
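For orientation, a config typically holds the values that otherwise appear on the command line. The sketch below writes a hypothetical config; every key is an assumption mirroring the options used in this README, so consult configs/example.json for the actual schema:

# Write a hypothetical training config. All keys below are assumptions;
# the real schema lives in configs/example.json.
import json

config = {
    "model_name": "RoBERTa-base",
    "dataset": "davidson",
    "train_mode": "3class",
    "seed": 5,
    "output_dir": "outputs/davidson/RoBERTa-base/3class",
}

with open("configs/train/example.json", "w") as f:
    json.dump(config, f, indent=2)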

Evaluation

Create a config file and run as in the following examples:

# Run a single seed/checkpoint.
python -m hs_generalization.test -c configs/test/example.json --dataset davidson --eval-mode 3class --train-mode 3class --seed 5 --checkpoint "outputs/davidson/RoBERTa-base/3class/RoBERTa-base_0.pt"

# To launch multiple runs (e.g., several seeds/checkpoints) at once, use run_many.py:
python tools/run_many.py ^
  -c configs/test/test.json ^
  --dataset hatecheck_xr ^
  --eval-mode 3class ^
  --train-mode 3class ^
  --seeds 7 222 550 999 3111 ^
  --ckpt-pattern "outputs/davidson/RoBERTa-base/3class/*.pt" ^
  --hatecheck-csv dataset/hatecheck/hatecheck-xr.csv
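Conceptually, a batch runner like this pairs each seed with a matching checkpoint and calls hs_generalization.test once per pair. The sketch below is illustrative only and assumes a one-to-one seed/checkpoint pairing; the real tools/run_many.py may behave differently:

# Illustrative sketch of batching evaluation runs (not the actual
# implementation of tools/run_many.py).
import glob
import subprocess

seeds = [7, 222, 550, 999, 3111]
checkpoints = sorted(glob.glob("outputs/davidson/RoBERTa-base/3class/*.pt"))

for seed, ckpt in zip(seeds, checkpoints):
    subprocess.run(
        ["python", "-m", "hs_generalization.test",
         "-c", "configs/test/test.json",
         "--dataset", "hatecheck_xr",
         "--eval-mode", "3class",
         "--train-mode", "3class",
         "--seed", str(seed),
         "--checkpoint", ckpt],
        check=True,
    )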

Update (January 2026):

I am currently turning this master's thesis project on hate speech detection into a production-ready content moderation platform. The fine-tuned RoBERTa classifier has been extended with a multi-mode REST API (FastAPI), RAG-powered policy explanations using ChromaDB, containerization with Docker, experiment tracking via MLflow, and CI/CD automation with GitHub Actions. The system supports three classification modes: ternary (hateful/offensive/clean), binary hate detection, and toxicity filtering. This makes it adaptable to different moderation use cases. The repository will be shared here as soon as the work is complete.
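As a rough illustration of the multi-mode API (the production code is not yet public, so every name, route, and response field here is hypothetical), a FastAPI endpoint could dispatch on the requested mode like this:

# Hypothetical mode-aware moderation endpoint; names, routes, and the
# response shape are assumptions, since the platform repo is unreleased.
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel

app = FastAPI()

class ModerationRequest(BaseModel):
    text: str
    mode: str = "ternary"  # "ternary" | "binary_hate" | "toxicity"

@app.post("/classify")
def classify(req: ModerationRequest):
    if req.mode not in {"ternary", "binary_hate", "toxicity"}:
        raise HTTPException(status_code=400, detail="unknown mode")
    # A real implementation would run the fine-tuned RoBERTa model here.
    return {"mode": req.mode, "label": "clean", "score": 0.0}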
