Research Group Barbara Plank


Barbara Plank

Prof. Dr.

Principal Investigator

Barbara Plank heads the Chair for AI and Computational Linguistics at LMU Munich.

Her lab carries out research in Natural Language Processing (NLP), an interdisciplinary subfield of Artificial Intelligence at the interface of computer science, linguistics and cognitive science. In broad terms, the aim is human-facing NLP: making NLP models more robust and inclusive, so that they cope better with shifts in data caused by language variation, are fairer, and embrace human label variation.

Team members @MCML

PostDocs

Silvia Casola

Dr.

Siyao Peng

Dr.

PhD Students

Verena Blaschke

Beiduo Chen

Jana Grimm

Felicia Körner

Robert Litschko

Philipp Mondorf

Monica Riedler

Andreas Säuberli

Soh-Eun Shim

Xinpeng Wang

Shijia Zhou

Recent News @MCML

02.11.2025

MCML at EMNLP 2025

37 Accepted Papers (17 Main, 13 Findings, and 7 Workshops)

26.10.2025

Barbara Plank Featured on ARD

The Segment Highlights Challenges AI Faces in Understanding Regional Language Variations

29.07.2025

Barbara Plank Awarded 2025 Imminent Research Grant for Work on Language Data

Supporting Innovative Research at the Intersection of Language and AI

25.07.2025

MCML at ACL 2025

37 Accepted Papers (17 Main, 8 Findings, and 12 Workshops)

Publications @MCML

2025


[105] A* Conference
X. Wang • M. Wang • Y. Liu • H. Schütze • B. Plank
Refusal Direction is Universal Across Safety-Aligned Languages.
NeurIPS 2025 - 39th Conference on Neural Information Processing Systems. San Diego, CA, USA, Nov 30-Dec 07, 2025. To be published. Preprint available. URL

[104]
P. Mondorf • M. Wang • S. Gerstner • A. D. Hakimi • Y. Liu • L. Veloso • S. Zhou • H. Schütze • B. Plank
BlackboxNLP-2025 MIB Shared Task: Exploring Ensemble Strategies for Circuit Localization Methods.
BlackboxNLP @EMNLP 2025 - 8th Workshop on Analyzing and Interpreting Neural Networks for NLP at the Conference on Empirical Methods in Natural Language Processing. Suzhou, China, Nov 04-09, 2025. To be published. Preprint available. arXiv

[103] A* Conference
L. Bertolazzi • P. Mondorf • B. Plank • R. Bernardi
The Validation Gap: A Mechanistic Analysis of How Language Models Compute Arithmetic but Fail to Validate It.
EMNLP 2025 - Conference on Empirical Methods in Natural Language Processing. Suzhou, China, Nov 04-09, 2025. To be published. Preprint available. arXiv

[102] A* Conference
B. Chen • Y. J. Liu • A. Korhonen • B. Plank
Threading the Needle: Reweaving Chain-of-Thought Reasoning to Explain Human Label Variation.
EMNLP 2025 - Conference on Empirical Methods in Natural Language Processing. Suzhou, China, Nov 04-09, 2025. To be published. Preprint available. arXiv

[101] A* Conference
Y. Du • P. Mondorf • S. Casola • Y. Yao • R. Litschko • B. Plank
Reason to Rote: Rethinking Memorization in Reasoning.
EMNLP 2025 - Conference on Empirical Methods in Natural Language Processing. Suzhou, China, Nov 04-09, 2025. To be published. Preprint available. arXiv

[100] A* Conference
P. Hong • B. Chen • S. Peng • M.-C. de Marneffe • B. Plank
LiTEx: A Linguistic Taxonomy of Explanations for Understanding Within-Label Variation in Natural Language Inference.
EMNLP 2025 - Conference on Empirical Methods in Natural Language Processing. Suzhou, China, Nov 04-09, 2025. To be published. Preprint available. arXiv

[99] A* Conference
S. M. Lo • S. Casola • E. Sezerer • V. Basile • F. Sansonetti • A. Uva • D. Bernardi
PERSEVAL: A Framework for Perspectivist Classification Evaluation.
EMNLP 2025 - Conference on Empirical Methods in Natural Language Processing. Suzhou, China, Nov 04-09, 2025. To be published. Preprint available. PDF

[98] A* Conference
A. Testoni • B. Plank • R. Fernández
RACQUET: Unveiling the Dangers of Overlooked Referential Ambiguity in Visual LLMs.
EMNLP 2025 - Conference on Empirical Methods in Natural Language Processing. Suzhou, China, Nov 04-09, 2025. To be published. Preprint available. arXiv

[97] A* Conference
C. Wu • B. Ma • Y. Liu • Z. Zhang • N. Deng • Y. Li • B. Chen • Y. Zhang • Y. Xue • B. Plank
M-ABSA: A Multilingual Dataset for Aspect-Based Sentiment Analysis.
EMNLP 2025 - Conference on Empirical Methods in Natural Language Processing. Suzhou, China, Nov 04-09, 2025. To be published. Preprint available. arXiv

[96]
R. Litschko • V. Blaschke • D. Burkhardt • B. Plank • D. Frassinelli
Make Every Letter Count: Building Dialect Variation Dictionaries from Monolingual Corpora.
Findings @EMNLP 2025 - Findings of the Conference on Empirical Methods in Natural Language Processing. Suzhou, China, Nov 04-09, 2025. To be published. Preprint available. arXiv

[95]
Y. Liu • M. Wang • A. H. Kargaran • F. Körner • E. Nie • B. Plank • F. Yvon • H. Schütze
Tracing Multilingual Factual Knowledge Acquisition in Pretraining.
Findings @EMNLP 2025 - Findings of the Conference on Empirical Methods in Natural Language Processing. Suzhou, China, Nov 04-09, 2025. To be published. Preprint available. arXiv GitHub

[94]
R. Zhao • B. Chen • B. Plank • M. A. Hedderich
MAKIEval: A Multilingual Automatic WiKidata-based Framework for Cultural Awareness Evaluation for LLMs.
Findings @EMNLP 2025 - Findings of the Conference on Empirical Methods in Natural Language Processing. Suzhou, China, Nov 04-09, 2025. To be published. Preprint available. arXiv

[93]
S. Zhou • S. Peng • S. Luebke • J. Haßler • M. Haim • S. M. Mohammad • B. Plank
What Media Frames Reveal About Stance: A Dataset and Study about Memes in Climate Change Discourse.
Findings @EMNLP 2025 - Findings of the Conference on Empirical Methods in Natural Language Processing. Suzhou, China, Nov 04-09, 2025. To be published. Preprint available. arXiv

[92]
L. Zuo • P. Hong • O. Kraus • B. Plank • R. Litschko
Evaluating Large Language Models for Cross-Lingual Retrieval.
Findings @EMNLP 2025 - Findings of the Conference on Empirical Methods in Natural Language Processing. Suzhou, China, Nov 04-09, 2025. To be published. Preprint available. arXiv

[91]
E. Leonardelli • S. Casola • S. Peng • G. Rizzi • V. Basile • E. Fersini • D. Frassinelli • H. Jang • M. Pavlovic • B. Plank • M. Poesio
LeWiDi-2025 at NLPerspectives: The Third Edition of the Learning with Disagreements Shared Task.
LeWiDi @EMNLP 2025 - Learning with Disagreements Track at the Conference on Empirical Methods in Natural Language Processing. Suzhou, China, Nov 04-09, 2025. To be published. Preprint available. arXiv

[90]
S. Eckman • B. Ma • C. Kern • R. Chew • B. Plank • F. Kreuter
Aligning NLP Models with Target Population Perspectives using PAIR: Population-Aligned Instance Replication.
NLPerspectives @EMNLP 2025 - 4th Workshop on Perspectivist Approaches to NLP at the Conference on Empirical Methods in Natural Language Processing. Suzhou, China, Nov 04-09, 2025. To be published. Preprint available. arXiv


[88]
P. Hong • B. Chen • S. Peng • M.-C. de Marneffe • B. Roth • B. Plank
Agree, Disagree, Explain: Decomposing Human Label Variation in NLI through the Lens of Explanations.
Preprint (Oct. 2025). arXiv

[87]
B. Ma • Y. Cao • I. Sen • A.-C. Haensch • F. Kreuter • B. Plank • D. Hershcovich
Too Open for Opinion? Embracing Open-Endedness in Large Language Models for Social Simulation.
Preprint (Oct. 2025). arXiv


[85]
T. Ruiz • S. Peng • B. Plank • C. Schwemmer
BoN Appetit Team at LeWiDi-2025: Best-of-N Test-time Scaling Can Not Stomach Annotation Disagreements (Yet).
Preprint (Oct. 2025). arXiv

[84]
X. Wang • N. Joshi • B. Plank • R. Angell • H. He
Is It Thinking or Cheating? Detecting Implicit Reward Hacking by Measuring Reasoning Effort.
Preprint (Oct. 2025). arXiv

[83]

[82] A* Conference
A. Bavaresco • R. Bernardi • L. Bertolazzi • D. Elliott • R. Fernández • A. Gatt • E. Ghaleb • M. Giulianelli • M. Hanna • A. Koller • A. F. T. Martins • P. Mondorf • V. Neplenbroek • S. Pezzelle • B. Plank • D. Schlangen • A. Suglia • A. K. Surikuchi • E. Takmaz • A. Testoni
LLMs instead of Human Judges? A Large Scale Empirical Study across 20 NLP Evaluation Tasks.
ACL 2025 - 63rd Annual Meeting of the Association for Computational Linguistics. Vienna, Austria, Jul 27-Aug 01, 2025. URL

[81] A* Conference
F. Eichin • Y. J. Liu • B. Plank • M. A. Hedderich
Probing LLMs for Multilingual Discourse Generalization Through a Unified Label Set.
ACL 2025 - 63rd Annual Meeting of the Association for Computational Linguistics. Vienna, Austria, Jul 27-Aug 01, 2025. URL

[80] A* Conference
M. A. Hedderich • A. Wang • R. Zhao • F. Eichin • J. Fischer • B. Plank
What's the Difference? Supporting Users in Identifying the Effects of Prompt and Model Changes Through Token Patterns.
ACL 2025 - 63rd Annual Meeting of the Association for Computational Linguistics. Vienna, Austria, Jul 27-Aug 01, 2025. URL

[79] A* Conference
B. Ma • Y. Li • W. Zhou • Z. Gong • Y. J. Liu • K. Jasinskaja • A. Friedrich • J. Hirschberg • F. Kreuter • B. Plank
Pragmatics in the Era of Large Language Models: A Survey on Datasets, Evaluation, Opportunities and Challenges.
ACL 2025 - 63rd Annual Meeting of the Association for Computational Linguistics. Vienna, Austria, Jul 27-Aug 01, 2025. URL

[78] A* Conference
B. Ma • B. Yoztyurk • A.-C. Haensch • X. Wang • M. Herklotz • F. Kreuter • B. Plank • M. Aßenmacher
Algorithmic Fidelity of Large Language Models in Generating Synthetic German Public Opinions: A Case Study.
ACL 2025 - 63rd Annual Meeting of the Association for Computational Linguistics. Vienna, Austria, Jul 27-Aug 01, 2025. URL

[77] A* Conference
P. Mondorf • S. Wold • B. Plank
Circuit Compositions: Exploring Modular Structures in Transformer-Based Language Models.
ACL 2025 - 63rd Annual Meeting of the Association for Computational Linguistics. Vienna, Austria, Jul 27-Aug 01, 2025. URL

[76]
A. Säuberli • D. Frassinelli • B. Plank
Do LLMs Give Psychometrically Plausible Responses in Educational Assessments?
BEA @ACL 2025 - 20th Workshop on Innovative Use of NLP for Building Educational Applications at the 63rd Annual Meeting of the Association for Computational Linguistics. Vienna, Austria, Jul 27-Aug 01, 2025. URL

[75]
V. Blaschke • M. Fedzechkina • M. Ter Hoeve
Analyzing the Effect of Linguistic Similarity on Cross-Lingual Transfer: Tasks and Experimental Setups Matter.
Findings @ACL 2025 - Findings at the 63rd Annual Meeting of the Association for Computational Linguistics. Vienna, Austria, Jul 27-Aug 01, 2025. URL

[74]
B. Chen • S. Peng • A. Korhonen • B. Plank
A Rose by Any Other Name: LLM-Generated Explanations Are Good Proxies for Human Explanations to Collect Label Distributions on NLI.
Findings @ACL 2025 - Findings at the 63rd Annual Meeting of the Association for Computational Linguistics. Vienna, Austria, Jul 27-Aug 01, 2025. URL

[73]
C. Gruber • H. Alber • B. Bischl • G. Kauermann • B. Plank • M. Aßenmacher
Revisiting Active Learning under (Human) Label Variation.
Preprint (Jul. 2025). arXiv

[72]
V. Blaschke • M. Winkler • C. Förster • G. Wenger-Glemser • B. Plank
A Multi-Dialectal Dataset for German Dialect ASR and Dialect-to-Standard Speech Translation.
Preprint (Jun. 2025). arXiv

[71]
S. Casola • Y. J. Liu • S. Peng • O. Kraus • A. Gatt • B. Plank
Evaluation Should Not Ignore Variation: On the Impact of Reference Set Choice on Summarization Metrics.
Preprint (Jun. 2025). arXiv

[70]
D. N. Jakobi • M. Stegenwallner-Schütz • N. Hollenstein • C. Ding • R. Kaspere • A. M. Škorić • E. Pavlinusic Vilus • S. Frank • M.-L. Müller • K. M. Jensen de López • N. Kharlamov • H. B. Søndergaard Knudsen • Y. Berzak • E. Lion • I. A. Sekerina • C. Acarturk • M. F. Ansari • K. Harezlak • P. Kasprowski • A. Bautista • L. Beinborn • A. Bondar • A. Boznou • L. Bradshaw • J. M. Hofmann • T. Krosness • N. B. Soliva • A. Çepani • K. Cergol • A. Došen • M. Palmovic • A. Çerpja • D. Chirino • J. Chromý • V. Demberg • I. Škrjanec • N. D. Deniz • I. Fajardo • M. Giménez-Salvador • X. Mínguez-López • M. Filip • Z. Freibergs • J. Gomes • A. Janeiro • P. Luegi • J. Veríssimo • S. Gramatikov • J. Hasenäcker • A. Haveriku • N. Kote • M. M. Kamal • H. Kędzierska • D. Klimek-Jankowska • S. Kosutar • D. G. Krakowczyk • I. Krejtz • M. Łockiewicz • K. Lõo • J. Motiejūnienė • J. A. Nasir • J. S. Krog Nedergård • A. Özkan • M. Preininger • L. Pungă • D. R. Reich • C. Tschirner • Š. Rot • A. Säuberli • J. Solé-Casals • E. Strati • I. Svoboda • E. Trandafili • S. Varlokosta • M. Vulchanova • L. A. 
MultiplEYE: Creating a multilingual eye-tracking-while-reading corpus.
ETRA 2025 - ACM Symposium on Eye Tracking Research and Applications. Tokyo, Japan, May 26-29, 2025. DOI

[69]
F. Eichin • Y. Du • P. Mondorf • B. Plank • M. A. Hedderich
Grokking ExPLAIND: Unifying Model, Data, and Training Attribution to Study Model Behavior.
Preprint (May 2025). arXiv GitHub

[68]
R. S.-E. Shim • D. De Cristofaro • C. M. Hu • A. Vietti • B. Plank
Languages in Multilingual Speech Foundation Models Align Both Phonetically and Semantically.
Preprint (May 2025). arXiv

[67]
R. Shim • B. Plank
Dialetto, ma Quanto Dialetto? Transcribing and Evaluating Dialects on a Continuum.
Findings @NAACL 2025 - Findings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics. Albuquerque, NM, USA, Apr 29-May 04, 2025. DOI

[66] A Conference
L. Madaan • D. Esiobu • P. Stenetorp • B. Plank • D. Hupkes
Lost in Inference: Rediscovering the Role of Natural Language Inference for Large Language Models.
NAACL 2025 - Annual Conference of the North American Chapter of the Association for Computational Linguistics. Albuquerque, NM, USA, Apr 29-May 04, 2025. DOI

[65]
V. Blaschke
Beyond 'noisy' text: How (and why) to process dialect data.
W-NUT @NAACL 2025 - 10th Workshop on Noisy and User-generated Text at the Annual Conference of the North American Chapter of the Association for Computational Linguistics. Albuquerque, NM, USA, Apr 29-May 04, 2025. Keynote Talk. PDF

[64] A* Conference
X. Wang • C. Hu • P. Röttger • B. Plank
Surgical, Cheap, and Flexible: Mitigating False Refusal in Language Models via Single Vector Ablation.
ICLR 2025 - 13th International Conference on Learning Representations. Singapore, Apr 24-28, 2025. URL

[63]
P. Mondorf • S. Zhou • M. Riedler • B. Plank
Enabling Systematic Generalization in Abstract Spatial Reasoning through Meta-Learning for Compositionality.
Preprint (Apr. 2025). arXiv

[62]
S. Si • X. Wang • G. Zhai • N. Navab • B. Plank
Think Before Refusal: Triggering Safety Reflection in LLMs to Mitigate False Refusal Behavior.
Preprint (Mar. 2025). arXiv

[61] A* Conference
J. Lan • D. Frassinelli • B. Plank
Mind the Uncertainty in Human Disagreement: Evaluating Discrepancies between Model Predictions and Human Responses in VQA.
AAAI 2025 - 39th Conference on Artificial Intelligence. Philadelphia, PA, USA, Feb 25-Mar 04, 2025. DOI

[60]
S. Feuerriegel • A. Maarouf • D. Bär • D. Geißler • J. Schweisthal • N. Pröllochs • C. E. Robertson • S. Rathje • J. Hartmann • S. M. Mohammad • O. Netzer • A. A. Siegel • B. Plank • J. J. Van Bavel
Using natural language processing to analyse text data in behavioural science.
Nature Reviews Psychology 4. Feb. 2025. DOI

[59]
S. Xu • T. Y. S. S. Santosh • Y. Elazar • Q. Vogel • B. Plank • M. Grabmair
Better Aligned with Survey Respondents or Training Data? Unveiling Political Leanings of LLMs on U.S. Supreme Court Cases.
Preprint (Feb. 2025). arXiv

[58]
R. Litschko • O. Kraus • V. Blaschke • B. Plank
Cross-Dialect Information Retrieval: Information Access in Low-Resource and High-Variance Languages.
COLING 2025 - The 31st International Conference on Computational Linguistics. Abu Dhabi, United Arab Emirates, Jan 19-24, 2025. URL

[57]
A. Muñoz-Ortiz • V. Blaschke • B. Plank
Evaluating Pixel Language Models on Non-Standardized Languages.
COLING 2025 - The 31st International Conference on Computational Linguistics. Abu Dhabi, United Arab Emirates, Jan 19-24, 2025. URL

[56]
V. Blaschke • F. Körner • B. Plank
Add Noise, Tasks, or Layers? MaiNLP at the VarDial 2025 Shared Task on Norwegian Dialectal Slot and Intent Detection.
VarDial @COLING 2025 - 12th Workshop on NLP for Similar Languages, Varieties and Dialects at the 31st International Conference on Computational Linguistics. Abu Dhabi, United Arab Emirates, Jan 19-24, 2025. URL

[55]
X. Krückl • V. Blaschke • B. Plank
Improving Dialectal Slot and Intent Detection with Auxiliary Tasks: A Multi-Dialectal Bavarian Case Study.
VarDial @COLING 2025 - 12th Workshop on NLP for Similar Languages, Varieties and Dialects at the 31st International Conference on Computational Linguistics. Abu Dhabi, United Arab Emirates, Jan 19-24, 2025. URL

[54]
A.-M. Lutgen • A. Plum • C. Purschke • B. Plank
Neural Text Normalization for Luxembourgish Using Real-Life Variation Data.
VarDial @COLING 2025 - 12th Workshop on NLP for Similar Languages, Varieties and Dialects at the 31st International Conference on Computational Linguistics. Abu Dhabi, United Arab Emirates, Jan 19-24, 2025. URL

2024


[53]
Y. Zhang • Y. Li • X. Wang • Q. Shen • B. Plank • B. Bischl • M. Rezaei • K. Kawaguchi
FinerCut: Finer-grained Interpretable Layer Pruning for Large Language Models.
Compression Workshop @NeurIPS 2024 - Workshop on Machine Learning and Compression at the 38th Conference on Neural Information Processing Systems. Vancouver, Canada, Dec 10-15, 2024. URL

[52]
V. Basile • S. Casola • S. Frenda • S. M. Lo
PERSEID - Perspectivist Irony Detection: A CALAMITA Challenge.
CLiC-it 2024 - 10th Italian Conference on Computational Linguistics. Pisa, Italy, Dec 04-06, 2024. URL

[51]
T. Bourgeade • S. Casola • A. M. Wizani • C. Bosco
Data Augmentation through Back-Translation for Stereotypes and Irony Detection.
CLiC-it 2024 - 10th Italian Conference on Computational Linguistics. Pisa, Italy, Dec 04-06, 2024. URL

[50]
S. Frenda • A. Piergentili • B. Savoldi • M. Madeddu • M. Rosola • S. Casola • C. Ferrando • V. Patti • M. Negri • L. Bentivogli
GFG - Gender-Fair Generation: A CALAMITA Challenge.
CLiC-it 2024 - 10th Italian Conference on Computational Linguistics. Pisa, Italy, Dec 04-06, 2024. URL

[49] A* Conference
Y. J. Liu • T. Aoyama • W. Scivetti • Y. Zhu • S. Behzad • L. E. Levine • J. Lin • D. Tiwari • A. Zeldes
GDTB: Genre Diverse Data for English Shallow Discourse Parsing across Modalities, Text Types, and Domains.
EMNLP 2024 - Conference on Empirical Methods in Natural Language Processing. Miami, FL, USA, Nov 12-16, 2024. DOI

[48] A* Conference
P. Mondorf • B. Plank
Liar, Liar, Logical Mire: A Benchmark for Suppositional Reasoning in Large Language Models.
EMNLP 2024 - Conference on Empirical Methods in Natural Language Processing. Miami, FL, USA, Nov 12-16, 2024. DOI

[47]
P. F. Balestrucci • S. Casola • S. M. Lo • V. Basile • A. Mazzei
I’m sure you’re a real scholar yourself: Exploring Ironic Content Generation by Large Language Models.
Findings @EMNLP 2024 - Findings of the Conference on Empirical Methods in Natural Language Processing. Miami, FL, USA, Nov 12-16, 2024. DOI

[46]
B. Chen • X. Wang • S. Peng • R. Litschko • A. Korhonen • B. Plank
'Seeing the Big through the Small': Can LLMs Approximate Human Judgment Distributions on NLI from a Few Explanations?
Findings @EMNLP 2024 - Findings of the Conference on Empirical Methods in Natural Language Processing. Miami, FL, USA, Nov 12-16, 2024. DOI

[45]
B. Ma • X. Wang • T. Hu • A.-C. Haensch • M. A. Hedderich • B. Plank • F. Kreuter
The Potential and Challenges of Evaluating Attitudes, Opinions, and Values in Large Language Models.
Findings @EMNLP 2024 - Findings of the Conference on Empirical Methods in Natural Language Processing. Miami, FL, USA, Nov 12-16, 2024. DOI

[44]
A. Sedova • R. Litschko • D. Frassinelli • B. Roth • B. Plank
To Know or Not To Know? Analyzing Self-Consistency of Large Language Models under Ambiguity.
Findings @EMNLP 2024 - Findings of the Conference on Empirical Methods in Natural Language Processing. Miami, FL, USA, Nov 12-16, 2024. DOI

[43]
J. Wang • L. Zuo • S. Peng • B. Plank
MultiClimate: Multimodal Stance Detection on Climate Change Videos.
NLP4PI @EMNLP 2024 - 3rd Workshop on NLP for Positive Impact at the Conference on Empirical Methods in Natural Language Processing. Miami, FL, USA, Nov 12-16, 2024. DOI GitHub

[42]
P. Mondorf • B. Plank
Beyond Accuracy: Evaluating the Reasoning Behavior of Large Language Models--A Survey.
COLM 2024 - Conference on Language Modeling. Philadelphia, PA, USA, Oct 07-09, 2024. PDF

[41]
X. Wang • C. Hu • B. Ma • P. Röttger • B. Plank
Look at the Text: Instruction-Tuned Language Models are More Robust Multiple Choice Selectors than You Think.
COLM 2024 - Conference on Language Modeling. Philadelphia, PA, USA, Oct 07-09, 2024. PDF

[40]
V. Blaschke • B. Kovačić • S. Peng • B. Plank
MaiBaam Annotation Guidelines.
Preprint (Oct. 2024). arXiv

[39]
Q. Chen • X. Wang • P. Mondorf • M. A. Hedderich • B. Plank
Understanding When Tree of Thoughts Succeeds: Larger Models Excel in Generation, Not Discrimination.
Preprint (Oct. 2024). arXiv

[38] A* Conference
V. Blaschke • C. Purschke • H. Schütze • B. Plank
What Do Dialect Speakers Want? A Survey of Attitudes Towards Language Technology for German Dialects.
ACL 2024 - 62nd Annual Meeting of the Association for Computational Linguistics. Bangkok, Thailand, Aug 11-16, 2024. DOI

[37] A* Conference
P. Mondorf • B. Plank
Comparing Inferential Strategies of Humans and Large Language Models in Deductive Reasoning.
ACL 2024 - 62nd Annual Meeting of the Association for Computational Linguistics. Bangkok, Thailand, Aug 11-16, 2024. DOI

[36] A* Conference
L. Weber-Genzel • S. Peng • M.-C. de Marneffe • B. Plank
VariErr NLI: Separating Annotation Error from Human Label Variation.
ACL 2024 - 62nd Annual Meeting of the Association for Computational Linguistics. Bangkok, Thailand, Aug 11-16, 2024. DOI

[35] A* Conference
S. Xu • T. Y. S. S. Santosh • O. Ichim • B. Plank • M. Grabmair
Through the Lens of Split Vote: Exploring Disagreement, Difficulty and Calibration in Legal Case Outcome Classification.
ACL 2024 - 62nd Annual Meeting of the Association for Computational Linguistics. Bangkok, Thailand, Aug 11-16, 2024. DOI

[34]
S. Zhou • S. Peng • B. Plank
CLIMATELI: Evaluating Entity Linking on Climate Change Data.
ClimateNLP @ACL 2024 - 1st Workshop on Natural Language Processing Meets Climate Change at the 62nd Annual Meeting of the Association for Computational Linguistics. Bangkok, Thailand, Aug 11-16, 2024. DOI

[33]
X. Wang • B. Ma • C. Hu • L. Weber-Genzel • P. Röttger • F. Kreuter • D. Hovy • B. Plank
My Answer is C: First-Token Probabilities Do Not Match Text Answers in Instruction-Tuned Language Models.
Findings @ACL 2024 - Findings of the 62nd Annual Meeting of the Association for Computational Linguistics. Bangkok, Thailand, Aug 11-16, 2024. DOI

[32] A* Conference
S. Eckman • B. Plank • F. Kreuter
Position: Insights from Survey Methodology can Improve Training Data.
ICML 2024 - 41st International Conference on Machine Learning. Vienna, Austria, Jul 21-27, 2024. URL

[31]
S. Zhou • H. Shan • B. Plank • R. Litschko
MaiNLP at SemEval-2024 Task 1: Analyzing Source Language Selection in Cross-Lingual Textual Relatedness.
SemEval @NAACL 2024 - 18th International Workshop on Semantic Evaluation at the Annual Conference of the North American Chapter of the Association for Computational Linguistics. Mexico City, Mexico, Jun 16-21, 2024. URL

[30]
V. Blaschke • B. Kovačić • S. Peng • H. Schütze • B. Plank
MaiBaam: A Multi-Dialectal Bavarian Universal Dependency Treebank.
LREC-COLING 2024 - Joint International Conference on Computational Linguistics, Language Resources and Evaluation. Torino, Italy, May 20-25, 2024. URL

[29]
C. Müller • B. Plank
IndirectQA: Understanding Indirect Answers to Implicit Polar Questions in French and Spanish.
LREC-COLING 2024 - Joint International Conference on Computational Linguistics, Language Resources and Evaluation. Torino, Italy, May 20-25, 2024. URL

[28]
S. Peng • Z. Sun • H. Shan • M. Kolm • V. Blaschke • E. Artemova • B. Plank
Sebastian, Basti, Wastl?! Recognizing Named Entities in Bavarian Dialectal Data.
LREC-COLING 2024 - Joint International Conference on Computational Linguistics, Language Resources and Evaluation. Torino, Italy, May 20-25, 2024. URL

[27]
M. Winkler • V. Juozapaityte • R. van der Goot • B. Plank
Slot and Intent Detection Resources for Bavarian and Lithuanian: Assessing Translations vs Natural Queries to Digital Assistants.
LREC-COLING 2024 - Joint International Conference on Computational Linguistics, Language Resources and Evaluation. Torino, Italy, May 20-25, 2024. URL

[26]
S. Zhou • L. Weissweiler • T. He • H. Schütze • D. R. Mortensen • L. Levin
Constructions Are So Difficult That Even Large Language Models Get Them Right for the Wrong Reasons.
LREC-COLING 2024 - Joint International Conference on Computational Linguistics, Language Resources and Evaluation. Torino, Italy, May 20-25, 2024. URL

[25]
C. Gruber • K. Hechinger • M. Aßenmacher • G. Kauermann • B. Plank
More Labels or Cases? Assessing Label Variation in Natural Language Inference.
UnImplicit 2024 - 3rd Workshop on Understanding Implicit and Underspecified Language. Malta, Mar 21, 2024. URL

[24]
S. Peng • Z. Sun • S. Loftus • B. Plank
Different Tastes of Entities: Investigating Human Label Variation in Named Entity Annotations.
UnImplicit 2024 - 3rd Workshop on Understanding Implicit and Underspecified Language. Malta, Mar 21, 2024. URL

[23] A Conference
E. Artemova • V. Blaschke • B. Plank
Exploring the Robustness of Task-oriented Dialogue Systems for Colloquial German Varieties.
EACL 2024 - 18th Conference of the European Chapter of the Association for Computational Linguistics. St. Julians, Malta, Mar 17-22, 2024. URL

[22] A Conference
J. Baan • R. Fernández • B. Plank • W. Aziz
Interpreting Predictive Probabilities: Model Confidence or Human Label Variation?
EACL 2024 - 18th Conference of the European Chapter of the Association for Computational Linguistics. St. Julians, Malta, Mar 17-22, 2024. URL

[21] A Conference
M. Zhang • R. van der Goot • M.-Y. Kan • B. Plank
NNOSE: Nearest Neighbor Occupational Skill Extraction.
EACL 2024 - 18th Conference of the European Chapter of the Association for Computational Linguistics. St. Julians, Malta, Mar 17-22, 2024. URL

[20]
M. Zhang • R. van der Goot • B. Plank
Entity Linking in the Job Market Domain.
Findings @EACL 2024 - Findings of the 18th Conference of the European Chapter of the Association for Computational Linguistics. St. Julians, Malta, Mar 17-22, 2024. URL

[19]
A. Sorensen • S. Peng • B. Plank • R. van der Goot
EEVEE: An Easy Annotation Tool for Natural Language Processing.
LAW @EACL 2024 - 18th Linguistic Annotation Workshop at the 18th Conference of the European Chapter of the Association for Computational Linguistics. St. Julians, Malta, Mar 17-22, 2024. URL

[18]
L. Weber-Genzel • R. Litschko • E. Artemova • B. Plank
Donkii: Characterizing and Detecting Errors in Instruction-Tuning Datasets.
LAW @EACL 2024 - 18th Linguistic Annotation Workshop at the 18th Conference of the European Chapter of the Association for Computational Linguistics. St. Julians, Malta, Mar 17-22, 2024. URL

2023


[17]
S. Zhang • P. Wicke • L. K. Senel • L. Figueredo • A. Naceri • S. Haddadin • B. Plank • H. Schütze
LoHoRavens: A Long-Horizon Language-Conditioned Benchmark for Robotic Tabletop Manipulation.
Robot Learning @NeurIPS 2023 - 6th Robot Learning Workshop: Pretraining, Fine-Tuning, and Generalization with Large Scale Models at the 37th Conference on Neural Information Processing Systems. New Orleans, LA, USA, Dec 10-16, 2023. URL

[16] A* Conference
M. Giulianelli • J. Baan • W. Aziz • R. Fernández • B. Plank
What Comes Next? Evaluating Uncertainty in Neural Text Generators Against Human Production Variability.
EMNLP 2023 - Conference on Empirical Methods in Natural Language Processing. Singapore, Dec 06-10, 2023. DOI

[15] A* Conference
R. Litschko • M. Müller-Eberstein • R. van der Goot • L. Weber-Genzel • B. Plank
Establishing Trustworthiness: Rethinking Tasks and Model Evaluation.
EMNLP 2023 - Conference on Empirical Methods in Natural Language Processing. Singapore, Dec 06-10, 2023. DOI

[14] A* Conference
X. Wang • B. Plank
ACTOR: Active Learning with Annotator-specific Classification Heads to Embrace Human Label Variation.
EMNLP 2023 - Conference on Empirical Methods in Natural Language Processing. Singapore, Dec 06-10, 2023. DOI

[13] A* Conference
S. Xu • T. Y. S. S. Santosh • O. Ichim • I. Risini • B. Plank • M. Grabmair
From Dissonance to Insights: Dissecting Disagreements in Rationale Construction for Case Outcome Classification.
EMNLP 2023 - Conference on Empirical Methods in Natural Language Processing. Singapore, Dec 06-10, 2023. DOI

[12]
M. Müller-Eberstein • R. van der Goot • B. Plank • I. Titov
Subspace Chronicles: How Linguistic Information Emerges, Shifts and Interacts during Language Model Training.
Findings @EMNLP 2023 - Findings of the Conference on Empirical Methods in Natural Language Processing. Singapore, Dec 06-10, 2023. DOI

[11]
L. Weber • B. Plank
ActiveAED: A Human in the Loop Improves Annotation Error Detection.
Findings @ACL 2023 - Findings of the 61st Annual Meeting of the Association for Computational Linguistics. Toronto, Canada, Jul 09-14, 2023. DOI

[10]
J. Baan • N. Daheim • E. Ilia • D. Ulmer • H.-S. Li • R. Fernández • B. Plank • R. Sennrich • C. Zerva • W. Aziz
Uncertainty in Natural Language Generation: From Theory to Applications.
Preprint (Jul. 2023). arXiv

[9]
V. Blaschke • H. Schütze • B. Plank
A Survey of Corpora for Germanic Low-Resource Languages and Dialects.
NoDaLiDa 2023 - 24th Nordic Conference on Computational Linguistics. Tórshavn, Faroe Islands, May 22-24, 2023. URL

[8] A Conference
X. Wang • L. Weissweiler • H. Schütze • B. Plank
How to Distill your BERT: An Empirical Study on the Impact of Weight Initialisation and Distillation Objectives.
EACL 2023 - 17th Conference of the European Chapter of the Association for Computational Linguistics. Dubrovnik, Croatia, May 02-06, 2023. DOI

[7]
V. Blaschke • H. Schütze • B. Plank
Does Manipulating Tokenization Aid Cross-Lingual Transfer? A Study on POS Tagging for Non-Standardized Languages.
VarDial @EACL 2023 - 10th Workshop on NLP for Similar Languages, Varieties and Dialects at the 17th Conference of the European Chapter of the Association for Computational Linguistics. Dubrovnik, Croatia, May 02-06, 2023. DOI

2022


[6] A* Conference
J. Baan • W. Aziz • B. Plank • R. Fernández
Stop Measuring Calibration When Humans Disagree.
EMNLP 2022 - Conference on Empirical Methods in Natural Language Processing. Abu Dhabi, United Arab Emirates, Nov 07-11, 2022. DOI

[5] A* Conference
E. Bassignana • M. Müller-Eberstein • M. Zhang • B. Plank
Evidence > Intuition: Transferability Estimation for Encoder Selection.
EMNLP 2022 - Conference on Empirical Methods in Natural Language Processing. Abu Dhabi, United Arab Emirates, Nov 07-11, 2022. DOI

[4] A* Conference
M. Müller-Eberstein • R. van der Goot • B. Plank
Spectral Probing.
EMNLP 2022 - Conference on Empirical Methods in Natural Language Processing. Abu Dhabi, United Arab Emirates, Nov 07-11, 2022. DOI

[3] A* Conference
B. Plank
The 'Problem' of Human Label Variation: On Ground Truth in Data, Modeling and Evaluation.
EMNLP 2022 - Conference on Empirical Methods in Natural Language Processing. Abu Dhabi, United Arab Emirates, Nov 07-11, 2022. DOI

[2]
E. Bassignana • B. Plank
CrossRE: A Cross-Domain Dataset for Relation Extraction.
Findings @EMNLP 2022 - Findings of the Conference on Empirical Methods in Natural Language Processing. Abu Dhabi, United Arab Emirates, Nov 07-11, 2022. DOI

[1]
D. Ulmer • E. Bassignana • M. Müller-Eberstein • D. Varab • M. Zhang • R. van der Goot • C. Hardmeier • B. Plank
Experimental Standards for Deep Learning in Natural Language Processing Research.
Findings @EMNLP 2022 - Findings of the Conference on Empirical Methods in Natural Language Processing. Abu Dhabi, United Arab Emirates, Nov 07-11, 2022. DOI