Separating Hate Speech and Offensive Language Classes via Adversarial Debiasing

MCML Authors

Antonis Maronikolakis

Hinrich Schütze

Prof. Dr.

Principal Investigator

Abstract

Research to tackle hate speech plaguing online media has made strides in providing solutions, analyzing bias and curating data. A challenging problem is ambiguity between hate speech and offensive language, causing low performance both overall and specifically for the hate speech class. It can be argued that misclassifying actual hate speech content as merely offensive can lead to further harm against targeted groups. In our work, we mitigate this potentially harmful phenomenon by proposing an adversarial debiasing method to separate the two classes. We show that our method works for English, Arabic, German and Hindi, as well as in a multilingual setting, improving performance over baselines.

inproceedings


WOAH 2022

6th Workshop on Online Abuse and Harms. Seattle, WA, USA, Jul 14, 2022.

Authors

S. Yuan • A. Maronikolakis • H. Schütze

Research Area

 B2 | Natural Language Processing

BibTeXKey: YMS22
