Research Group Michael Hedderich

Michael Hedderich

Dr.

JRG Leader Human-Centered NLP

Artificial Intelligence and Computational Linguistics

Michael Hedderich leads the MCML Junior Research Group 'Human-Centered NLP' at LMU Munich.

His team's research covers the intersection of machine learning, natural language processing (NLP), and human-computer interaction. Human factors interact crucially with modern AI and NLP development, from how data is obtained (e.g., in low-resource scenarios) to the need to understand and control models (e.g., through global explainability methods). AI technology also does not exist in a vacuum: it must be validated together with the application experts and stakeholders it is meant to serve.

The group explores these questions through the lenses of machine learning, natural language processing, and human-computer interaction. By embracing these diverse perspectives, the researchers value how each viewpoint enriches the understanding of the same issues and how different skill sets complement one another.

Team members @MCML

Florian Eichin

Artificial Intelligence and Computational Linguistics

Publications @MCML

[3]
B. Ma, X. Wang, T. Hu, A.-C. Haensch, M. A. Hedderich, B. Plank and F. Kreuter.
The Potential and Challenges of Evaluating Attitudes, Opinions, and Values in Large Language Models.
Findings of the Conference on Empirical Methods in Natural Language Processing (EMNLP 2024). Miami, FL, USA, Nov 12-16, 2024. To be published. Preprint at arXiv. URL.
Abstract

Recent advances in Large Language Models (LLMs) have sparked wide interest in validating and comprehending the human-like cognitive-behavioral traits LLMs may capture and convey. These traits typically include Attitudes, Opinions, and Values (AOVs). However, measuring AOVs embedded in LLMs remains opaque, and different evaluation methods may yield different results. This has led to a lack of clarity about how different studies relate to each other and how they can be interpreted. This paper aims to bridge this gap by providing a comprehensive overview of recent works on the evaluation of AOVs in LLMs. Moreover, we survey related approaches at different stages of the evaluation pipeline in these works. In doing so, we address the potential and challenges with respect to understanding the model, human-AI alignment, and downstream applications in the social sciences. Finally, we provide practical insights into evaluation methods, model enhancement, and interdisciplinary collaboration, thereby contributing to the evolving landscape of evaluating AOVs in LLMs.

MCML Authors
Bolei Ma

Social Data Science and AI Lab

Xinpeng Wang

Artificial Intelligence and Computational Linguistics

Michael Hedderich

Dr.

Artificial Intelligence and Computational Linguistics

Barbara Plank

Prof. Dr.

Artificial Intelligence and Computational Linguistics

Frauke Kreuter

Prof. Dr.

Social Data Science and AI Lab


[2]
F. Eichin, C. Schuster, G. Groh and M. A. Hedderich.
Semantic Component Analysis: Discovering Patterns in Short Texts Beyond Topics.
Preprint at arXiv (Oct. 2024). arXiv.
Abstract

Topic modeling is a key method in text analysis, but existing approaches either assume one topic per document or fail to scale efficiently for large, noisy datasets of short texts. We introduce Semantic Component Analysis (SCA), a novel topic modeling technique that overcomes these limitations by discovering multiple, nuanced semantic components per short text, which we accomplish by introducing a decomposition step into the clustering-based topic modeling framework. Evaluated on multiple Twitter datasets, SCA matches the state-of-the-art method BERTopic in coherence and diversity while uncovering at least twice as many semantic components, maintaining a noise rate close to zero, and staying scalable and effective across languages, including an underrepresented one.
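The abstract's core idea, assigning multiple semantic components per short text rather than a single topic, can be sketched roughly as follows. The random embeddings, SVD-based decomposition, and activation threshold below are illustrative assumptions standing in for the paper's actual pipeline, not its method:

```python
import numpy as np

# Hypothetical sketch: assume each short text is already embedded as a vector
# (in practice a sentence-embedding model would produce these).
rng = np.random.default_rng(0)
embeddings = rng.normal(size=(100, 16))  # 100 texts, 16-dim embeddings

# Decomposition step: recover latent semantic directions via SVD, a simple
# stand-in for the decomposition SCA adds to clustering-based topic modeling.
centered = embeddings - embeddings.mean(axis=0)
U, S, Vt = np.linalg.svd(centered, full_matrices=False)
components = Vt[:5]  # keep 5 candidate semantic components

# Unlike one-topic-per-document models, each text may activate several
# components: assign every component whose activation clears a threshold.
activations = centered @ components.T               # (100, 5) activations
assignments = np.abs(activations) > np.abs(activations).std()
avg_per_text = assignments.sum(axis=1).mean()       # typically above 1
```

The key design point this illustrates is that the decomposition yields a soft, multi-label assignment matrix instead of forcing each text into exactly one cluster.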

MCML Authors
Florian Eichin

Artificial Intelligence and Computational Linguistics

Michael Hedderich

Dr.

Artificial Intelligence and Computational Linguistics


[1]
J. Shin, M. A. Hedderich, B. J. Rey, A. Lucero and A. Oulasvirta.
Understanding Human-AI Workflows for Generating Personas.
ACM Conference on Designing Interactive Systems (DIS 2024). Copenhagen, Denmark, Jul 01-05, 2024. DOI.
Abstract

One barrier to deeper adoption of user-research methods is the amount of labor required to create high-quality representations of collected data. Trained user researchers need to analyze datasets and produce informative summaries pertaining to the original data. While Large Language Models (LLMs) could assist in generating summaries, they are known to hallucinate and produce biased responses. In this paper, we study human–AI workflows that differently delegate subtasks in user research between human experts and LLMs. Studying persona generation as our case, we found that LLMs are not good at capturing key characteristics of user data on their own. Better results are achieved when we leverage human skill in grouping user data by their key characteristics and exploit LLMs for summarizing pre-grouped data into personas. Personas generated via this collaborative approach can be more representative and empathy-evoking than ones generated by human experts or LLMs alone. We also found that LLMs could mimic generated personas and enable interaction with personas, thereby helping user researchers empathize with them. We conclude that LLMs, by facilitating the analysis of user data, may promote widespread application of qualitative methods in user research.
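The delegation split the abstract reports as most effective, humans group the data, the LLM only summarizes each group, can be sketched minimally. Here `llm_summarize` is a hypothetical stand-in for a real LLM call, and the groups and snippets are invented examples:

```python
def llm_summarize(snippets):
    """Placeholder for an LLM call that condenses snippets into a persona."""
    return "Persona based on: " + "; ".join(snippets)

# Step 1 (human experts): interview snippets are manually grouped by their
# key characteristics, the subtask LLMs handled poorly on their own.
grouped_data = {
    "power users": ["uses shortcuts daily", "customizes every setting"],
    "casual users": ["opens the app weekly", "sticks to defaults"],
}

# Step 2 (LLM): each pre-grouped bundle is summarized into one persona.
personas = {label: llm_summarize(snips) for label, snips in grouped_data.items()}
```

The point of the split is that the human-made grouping constrains what the LLM summarizes, reducing the risk of hallucinated or biased personas.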

MCML Authors
Michael Hedderich

Dr.

Artificial Intelligence and Computational Linguistics