Interpreting Predictive Probabilities: Model Confidence or Human Label Variation?

MCML Authors

Barbara Plank

Prof. Dr.

Principal Investigator

Abstract

With the rise of increasingly powerful and user-facing NLP systems, there is growing interest in assessing whether they have a good representation of uncertainty by evaluating the quality of their predictive distribution over outcomes. We identify two main perspectives that drive starkly different evaluation protocols. The first treats predictive probability as an indication of model confidence; the second as an indication of human label variation. We discuss their merits and limitations, and take the position that both are crucial for trustworthy and fair NLP systems, but that exploiting a single predictive distribution is limiting. We recommend tools and highlight exciting directions towards models with disentangled representations of uncertainty about predictions and uncertainty about human labels.

inproceedings


EACL 2024

18th Conference of the European Chapter of the Association for Computational Linguistics. St. Julians, Malta, Mar 17-22, 2024.
A Conference

Authors

J. Baan • R. Fernández • B. Plank • W. Aziz

Research Area

 B2 | Natural Language Processing

BibTeXKey: BFP+24
