
Who Counts? Survey Data Quality in the Age of AI

MCML Authors

Dr. Leah von der Heyde

Abstract

Large language models (LLMs) have raised hopes of making survey research more efficient while also improving survey data quality. However, because they are based on Internet data, LLMs may share the pitfalls of other digital data sources when used to make inferences about human attitudes and behavior. As such, they have the potential not only to mitigate but also to amplify existing biases in our understanding of different populations and constructs of interest. In this dissertation, I investigate whether and under which conditions LLMs can be leveraged in survey research by providing empirical evidence of the potential and limits of two major applications: supplementing survey data with LLM-generated data, and coding open-ended survey responses with LLMs. I test these applications in previously unexamined contexts: European countries and languages. I conclude that LLMs cannot fully replace human-powered survey research but could augment it, given proper supervision and validation.

phdthesis Hey25a


Dissertation

Universität Mannheim. Jul. 2025

Authors

L. von der Heyde

Research Area

C4 | Computational Social Sciences

BibTeXKey: Hey25a
