
ACTOR: Active Learning With Annotator-Specific Classification Heads to Embrace Human Label Variation

Abstract

Label aggregation such as majority voting is commonly used to resolve annotator disagreement in dataset creation. However, this may disregard minority values and opinions. Recent studies indicate that learning from individual annotations outperforms learning from aggregated labels, though such approaches require a considerable amount of annotations. Active learning, as an annotation cost-saving strategy, has not been fully explored in the context of learning from disagreement. We show that in the active learning setting, a multi-head model performs significantly better than a single-head model in terms of uncertainty estimation. By designing and evaluating acquisition functions that exploit the annotator-specific heads, we show that group-level entropy works generally well on both of our evaluation datasets. Importantly, it achieves prediction and uncertainty-estimation performance comparable to full-scale training from disagreement while saving 70% of the annotation budget.
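The group-level entropy acquisition named in the abstract can be made concrete. The following is a minimal PyTorch sketch, not the paper's implementation: it assumes a multi-head model whose annotator-specific heads each emit class logits, averages the per-head distributions into a group-level prediction, and ranks unlabeled examples by the entropy of that prediction. The function names (group_level_entropy, select_batch) and tensor layout are illustrative assumptions.

    import torch
    import torch.nn.functional as F

    def group_level_entropy(head_logits: torch.Tensor) -> torch.Tensor:
        """Score unlabeled examples by the entropy of the group-level prediction.

        head_logits: (batch, num_heads, num_classes) raw logits, one set per
        annotator-specific head. The group prediction averages the per-head
        class distributions; higher entropy means higher model uncertainty.
        """
        probs = F.softmax(head_logits, dim=-1)   # per-head class distributions
        group_probs = probs.mean(dim=1)          # average over annotator heads
        return -(group_probs * group_probs.clamp_min(1e-12).log()).sum(dim=-1)

    # Hypothetical acquisition step: request labels for the k most uncertain examples.
    def select_batch(head_logits: torch.Tensor, k: int) -> torch.Tensor:
        scores = group_level_entropy(head_logits)
        return scores.topk(k).indices

Averaging the head distributions before taking the entropy treats the annotator heads as a group of voters, which is one plausible reading of "group-level" uncertainty; per-head entropies could instead be computed and aggregated, but that is a different acquisition function.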

inproceedings


EMNLP 2023

Conference on Empirical Methods in Natural Language Processing. Singapore, Dec 06-10, 2023.
A* Conference

Authors

X. Wang, B. Plank

Links

DOI

Research Area

 B2 | Natural Language Processing

BibTeXKey: WP23
