Aniezka/hatespeech-russian

RuHateBe: A Benchmark Dataset for Hate Speech in Russian

Anna Scherbakova, Anna Sukhanova, Anna Palatkina, Elina Sigdel

Modern dialogue systems can converse fluently with users, but it has not yet been possible to completely eliminate situations in which conversational agents use inappropriate language. The generalisability of dialogue models is difficult to examine, because they are usually evaluated on held-out data. In this paper, we present RuHateBe, a novel Russian benchmark dataset for evaluating dialogue systems on hate speech. To illustrate RuHateBe’s usefulness, we test several state-of-the-art generative models in order to reveal their weaknesses. To the best of our knowledge, this is the first study that examines whether Russian dialogue models use hate speech towards specific social groups.

Link to the RuHateBe dataset: https://disk.yandex.ru/d/hi3PF0XuoyCRlg

Link to the paper: https://disk.yandex.ru/i/Divcpu7LaJwchw

About

Benchmark for hate speech detection in Russian dialogues
