This repository is part of my master's thesis project, "Modeling Offensive Language as a Distinct Class for Hate Speech Detection" (Kim, 2025), supervised by Dr. Antske Fokkens and Dr. Hennie van der Vliet. The project explored how modeling offensive (but not hateful) language as a distinct class affects hate speech detection. Using a ternary classification scheme (Hateful, Offensive, Clean), I fine-tuned and evaluated a RoBERTa-base model in the full three-class setup and in binary variants where two classes are merged or the offensive class is removed (Hate vs. Non-hate, Non-clean vs. Clean, and Hate vs. Clean). The code used in this study includes my modifications and extensions of the code of Khurana et al. (2025).
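The binary setups can be derived from the ternary labels by a simple remapping. Below is a minimal sketch of that idea; the label strings, mode names, and merge conventions are illustrative assumptions, not the exact code from this repository:

```python
# Sketch: deriving the binary label setups from the ternary scheme.
# Label names and mode names are illustrative assumptions.
from typing import Optional

def remap(label: str, mode: str) -> Optional[str]:
    """Map a ternary label (hateful/offensive/clean) to the label
    used in a given setup. Returns None when the example is dropped
    (offensive examples in the Hate vs. Clean setup)."""
    if mode == "3class":
        return label
    if mode == "hate_vs_nonhate":        # merge offensive + clean
        return "hate" if label == "hateful" else "non-hate"
    if mode == "nonclean_vs_clean":      # merge hateful + offensive
        return "clean" if label == "clean" else "non-clean"
    if mode == "hate_vs_clean":          # drop offensive entirely
        if label == "offensive":
            return None
        return "hate" if label == "hateful" else "clean"
    raise ValueError(f"unknown mode: {mode}")

print(remap("offensive", "hate_vs_nonhate"))  # non-hate
print(remap("offensive", "hate_vs_clean"))    # None
```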
In the project, to probe model behavior beyond in-dataset performance, I revised both HateCheck (Röttger et al., 2021) and an existing extension by Khurana et al. (2025), aligning them with the ternary scheme by re-annotating them and correcting errors present in the extension. The resulting dataset, HateCheck-XR, is available in this repository under `dataset/` in CSV format.
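The CSV can be loaded with standard tooling. A minimal sketch, assuming the file has a header row; the column names (`text`, `label`) are assumptions about the schema, not guaranteed by the repository:

```python
# Sketch: loading HateCheck-XR with the stdlib csv module.
# Column names ("text", "label") are assumed for illustration.
import csv
from collections import Counter

def load_hatecheck_xr(path="dataset/hatecheck/hatecheck-xr.csv"):
    """Read the CSV into a list of dicts keyed by the header row."""
    with open(path, newline="", encoding="utf-8") as f:
        return list(csv.DictReader(f))

# Example: inspect the label distribution.
# rows = load_hatecheck_xr()
# print(Counter(row["label"] for row in rows))
```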
Project
├─ hs_generalization/
│ ├─ __init__.py
│ ├─ modes.py
│ ├─ train.py
│ ├─ test.py
│ └─ utils.py
├─ tools/
│ └─ run_many.py
├─ configs/
│ └─ example.json
├─ dataset/
│ ├─ davidson/
│ └─ hatecheck/hatecheck-xr.csv
├─ requirements.txt
└─ README.md
Set up the environment as follows:
# Create environment.
conda create -n hs-generalization python=3.9
conda activate hs-generalization
# Install packages.
python setup.py develop
pip install -r requirements.txt
To train, create a config file and run the following:
python -m hs_generalization.train -c configs/train/example.json
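A config might look like the sketch below; the field names and values are illustrative assumptions about the config schema, not the repository's exact keys:

```json
{
  "model_name": "roberta-base",
  "dataset": "davidson",
  "train_mode": "3class",
  "num_epochs": 5,
  "batch_size": 16,
  "learning_rate": 2e-5,
  "seed": 5,
  "output_dir": "outputs/davidson/RoBERTa-base/3class"
}
```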
To test, create a config file and run as in the example:
# Run with a single seed/checkpoint.
python -m hs_generalization.test -c configs/test/example.json --dataset davidson --eval-mode 3class --train-mode 3class --seed 5 --checkpoint "outputs/davidson/RoBERTa-base/3class/RoBERTa-base_0.pt"
# To run multiple checkpoints at once, use run_many.py, as in the following example:
python tools/run_many.py ^
-c configs/test/test.json ^
--dataset hatecheck_xr ^
--eval-mode 3class ^
--train-mode 3class ^
--seeds 7 222 550 999 3111 ^
--ckpt-pattern "outputs/davidson/RoBERTa-base/3class/*.pt" ^
--hatecheck-csv dataset/hatecheck/hatecheck-xr.csv
I am currently transforming this master's thesis project on hate speech detection into a production-ready content moderation platform. It extends the fine-tuned RoBERTa classifier with a multi-mode REST API (FastAPI), RAG-powered policy explanations using ChromaDB, containerization with Docker, experiment tracking via MLflow, and CI/CD automation with GitHub Actions. The system supports three classification modes: ternary (hateful/offensive/clean), binary hate detection, and toxicity filtering, making it adaptable to different moderation use cases. The repository will be shared here as soon as the work is complete.