
Uncovering Labeler Bias in Machine Learning Annotation Tasks

MCML Authors


Albrecht Schmidt

Prof. Dr.

Principal Investigator

Sven Mayer

Prof. Dr.

Associate


Abstract

As artificial intelligence becomes increasingly pervasive, it is essential that we understand the implications of bias in machine learning. Many developers rely on crowd workers to generate and annotate datasets for machine learning applications. However, this step risks embedding training data with labeler bias, leading to biased decision-making in systems trained on these datasets. To characterize labeler bias, we created a face dataset and conducted two studies where labelers of different ethnicity and sex completed annotation tasks. In the first study, labelers annotated subjective characteristics of faces. In the second, they annotated images using bounding boxes. Our results demonstrate that labeler demographics significantly impact both subjective and accuracy-based annotations, indicating that collecting a diverse set of labelers may not be enough to solve the problem. We discuss the consequences of these findings for current machine learning practices to create fair and unbiased systems.

Article

AI and Ethics 5.3 (Sep. 2024).

Authors

L. Haliburton • J. Leusmann • R. Welsch • S. Ghebremedhin • P. Isaakidis • A. Schmidt • S. Mayer

Links

DOI

Research Area

C5 | Humane AI

BibTeX Key: HLW+24
