
Research Group Alexander Fraser

Alexander Fraser

Prof. Dr.

Principal Investigator

Data Analytics & Statistics

Alexander Fraser holds the Chair for Data Analytics & Statistics at TU Munich.

He is renowned for his work in machine learning approaches to machine translation, language modeling, and multilingual natural language processing. He focuses on addressing data sparsity and integrating linguistic and world knowledge in AI systems. Additionally, he collaborates with language communities to develop technology for their languages. His contributions to natural language processing and machine learning emphasize both theoretical advancements and practical applications.

Team members @MCML

Lukas Edman

Dr.

Data Analytics & Statistics

Faeze Ghorbanpour

Data Analytics & Statistics

Katharina Hämmerl

Data Analytics & Statistics

Shu Okabe

Dr.

Data Analytics & Statistics

Publications @MCML

2024


[27]
M. Di Marco and A. Fraser.
Subword Segmentation in LLMs: Looking at Inflection and Consistency.
EMNLP 2024 - Conference on Empirical Methods in Natural Language Processing. Miami, FL, USA, Nov 12-16, 2024. DOI
Abstract

The role of subword segmentation in relation to capturing morphological patterns in LLMs is currently not well explored. Ideally, one would train models like GPT using various segmentations and evaluate how well word meanings are captured. Since this is not computationally feasible, we group words according to their segmentation properties and compare how well a model can solve a linguistic task for these groups. We study two criteria: (i) adherence to morpheme boundaries and (ii) the segmentation consistency of the different inflected forms of a lemma. We select word forms with high and low values for these criteria and carry out experiments on GPT-4o’s ability to capture verbal inflection for 10 languages. Our results indicate that in particular the criterion of segmentation consistency can help to predict the model’s ability to recognize and generate the lemma from an inflected form, providing evidence that subword segmentation is relevant.
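The segmentation-consistency idea can be made concrete with a toy score (an illustrative assumption, not the paper's exact criterion): given subword tokenizations of a lemma and its inflected forms, count how often each form begins with the lemma's leading subword.

```python
# Toy illustration (not the paper's metric): score how consistently a
# tokenizer segments the inflected forms of a lemma, by checking what
# fraction of forms share the lemma's leading subword.

def consistency_score(lemma_tokens, inflected_tokenizations):
    """lemma_tokens: subword list for the lemma, e.g. ['walk'].
    inflected_tokenizations: subword lists for its forms, e.g. [['walk', 'ed'], ...]."""
    head = lemma_tokens[0]
    matches = sum(1 for toks in inflected_tokenizations if toks and toks[0] == head)
    return matches / len(inflected_tokenizations)

# 'walk' segments consistently across its forms; 'run' mostly does not.
walk = consistency_score(["walk"], [["walk", "ed"], ["walk", "ing"], ["walk", "s"]])
run = consistency_score(["run"], [["ran"], ["runn", "ing"], ["run", "s"]])
print(walk, run)  # 1.0 vs 0.333...
```

Under the paper's hypothesis, words like the first group should be easier for a model to relate back to their lemma than words like the second.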

MCML Authors
Alexander Fraser

Prof. Dr.

Data Analytics & Statistics


[26]
L. Edman, H. Schmid and A. Fraser.
CUTE: Measuring LLMs’ Understanding of Their Tokens.
EMNLP 2024 - Conference on Empirical Methods in Natural Language Processing. Miami, FL, USA, Nov 12-16, 2024. DOI
Abstract

Large Language Models (LLMs) show remarkable performance on a wide variety of tasks. Most LLMs split text into multi-character tokens and process them as atomic units without direct access to individual characters. This raises the question: To what extent can LLMs learn orthographic information? To answer this, we propose a new benchmark, CUTE, which features a collection of tasks designed to test the orthographic knowledge of LLMs. We evaluate popular LLMs on CUTE, finding that most of them seem to know the spelling of their tokens, yet fail to use this information effectively to manipulate text, calling into question how much of this knowledge is generalizable.

MCML Authors
Lukas Edman

Dr.

Data Analytics & Statistics

Alexander Fraser

Prof. Dr.

Data Analytics & Statistics


[25]
W. Lai, V. Hangya and A. Fraser.
Style-Specific Neurons for Steering LLMs in Text Style Transfer.
EMNLP 2024 - Conference on Empirical Methods in Natural Language Processing. Miami, FL, USA, Nov 12-16, 2024. DOI
Abstract

Text style transfer (TST) aims to modify the style of a text without altering its original meaning. Large language models (LLMs) demonstrate superior performance across multiple tasks, including TST. However, in zero-shot setups, they tend to directly copy a significant portion of the input text to the output without effectively changing its style. To enhance the stylistic variety and fluency of the text, we present sNeuron-TST, a novel approach for steering LLMs using style-specific neurons in TST. Specifically, we identify neurons associated with the source and target styles and deactivate source-style-only neurons to give target-style words a higher probability, aiming to enhance the stylistic diversity of the generated text. However, we find that this deactivation negatively impacts the fluency of the generated text, which we address by proposing an improved contrastive decoding method that accounts for rapid token probability shifts across layers caused by deactivated source-style neurons. Empirical experiments demonstrate the effectiveness of the proposed method on six benchmarks, encompassing formality, toxicity, politics, politeness, authorship, and sentiment.
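The neuron-selection step can be sketched in NumPy (a minimal illustration, not the authors' sNeuron-TST code; the synthetic activations, the threshold of 1.0, and the `deactivate` helper are assumptions for the example): neurons whose mean activation is much higher on source-style inputs than on target-style inputs are flagged as source-style-only and zeroed.

```python
import numpy as np

# Hypothetical activations: 100 source-style and 100 target-style inputs
# over 8 neurons, with neuron 2 made artificially source-style-specific.
rng = np.random.default_rng(0)
n_neurons = 8
source_acts = rng.normal(0.0, 1.0, size=(100, n_neurons))
target_acts = rng.normal(0.0, 1.0, size=(100, n_neurons))
source_acts[:, 2] += 3.0

# Flag neurons firing mainly for the source style (threshold is illustrative).
diff = source_acts.mean(axis=0) - target_acts.mean(axis=0)
source_only = np.where(diff > 1.0)[0]

def deactivate(hidden, neuron_ids):
    """Zero out the given neurons in a hidden-state vector."""
    hidden = hidden.copy()
    hidden[neuron_ids] = 0.0
    return hidden

h = deactivate(source_acts[0], source_only)
print(source_only, h[2])  # neuron 2 is flagged and its activation zeroed
```

In the paper this deactivation is what raises the probability of target-style tokens, with the contrastive decoding step then compensating for the fluency loss it causes.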

MCML Authors
Alexander Fraser

Prof. Dr.

Data Analytics & Statistics


[24]
K. Hämmerl, A. Manea, G. Vico, J. Helcl and J. Libovický.
CUNI and LMU Submission to the MRL 2024 Shared Task on Multi-lingual Multi-task Information Retrieval.
MRL @EMNLP 2024 - 4th Multilingual Representation Learning Workshop at the Conference on Empirical Methods in Natural Language Processing (EMNLP 2024). Miami, FL, USA, Nov 12-16, 2024. DOI
Abstract

We present the joint CUNI and LMU submission to the MRL 2024 Shared Task on Multi-lingual Multi-task Information Retrieval. The shared task objective was to explore how we can deploy modern methods in NLP in multi-lingual low-resource settings, tested on two sub-tasks: Named-entity recognition and question answering. Our solutions to the subtasks are based on data acquisition and model adaptation. We compare the performance of our submitted systems with the translate-test approach which proved to be the most useful in the previous edition of the shared task. Our results show that using more data as well as fine-tuning recent multilingual pre-trained models leads to considerable improvements over the translate-test baseline.

MCML Authors
Katharina Hämmerl

Data Analytics & Statistics


[23]
L. Edman, L. Bylinina, F. Ghorbanpour and A. Fraser.
Are BabyLMs Second Language Learners?
Preprint (Oct. 2024). arXiv
Abstract

This paper describes a linguistically-motivated approach to the 2024 edition of the BabyLM Challenge (Warstadt et al. 2023). Rather than pursuing a first language learning (L1) paradigm, we approach the challenge from a second language (L2) learning perspective. In L2 learning, there is a stronger focus on learning explicit linguistic information, such as grammatical notions, definitions of words or different ways of expressing a meaning. This makes L2 learning potentially more efficient and concise. We approximate this using data from Wiktionary, grammar examples either generated by an LLM or sourced from grammar books, and paraphrase data. We find that explicit information about word meaning (in our case, Wiktionary) does not boost model performance, while grammatical information can give a small improvement. The most impactful data ingredient is sentence paraphrases, with our two best models being trained on 1) a mix of paraphrase data and data from the BabyLM pretraining dataset, and 2) exclusively paraphrase data.

MCML Authors
Lukas Edman

Dr.

Data Analytics & Statistics

Faeze Ghorbanpour

Data Analytics & Statistics

Alexander Fraser

Prof. Dr.

Data Analytics & Statistics


[22]
K. Hämmerl, J. Libovický and A. Fraser.
Understanding Cross-Lingual Alignment—A Survey.
ACL 2024 - Findings of the 62nd Annual Meeting of the Association for Computational Linguistics. Bangkok, Thailand, Aug 11-16, 2024. DOI
Abstract

Cross-lingual alignment, the meaningful similarity of representations across languages in multilingual language models, has been an active field of research in recent years. We survey the literature of techniques to improve cross-lingual alignment, providing a taxonomy of methods and summarising insights from throughout the field. We present different understandings of cross-lingual alignment and their limitations. We provide a qualitative summary of results from a number of surveyed papers. Finally, we discuss how these insights may be applied not only to encoder models, where this topic has been heavily studied, but also to encoder-decoder or even decoder-only models, and argue that an effective trade-off between language-neutral and language-specific information is key.

MCML Authors
Katharina Hämmerl

Data Analytics & Statistics

Alexander Fraser

Prof. Dr.

Data Analytics & Statistics


[21]
W. Lai, M. Mesgar and A. Fraser.
LLMs Beyond English: Scaling the Multilingual Capability of LLMs with Cross-Lingual Feedback.
ACL 2024 - Findings of the 62nd Annual Meeting of the Association for Computational Linguistics. Bangkok, Thailand, Aug 11-16, 2024. DOI
Abstract

To democratize large language models (LLMs) to most natural languages, it is imperative to make these models capable of understanding and generating texts in many languages, in particular low-resource ones. While recent multilingual LLMs demonstrate remarkable performance in such capabilities, these LLMs still support a limited number of human languages due to the lack of training data for low resource languages. Moreover, these LLMs are not yet aligned with human preference for downstream tasks, which is crucial for the success of LLMs in English. In this paper, we introduce xLLaMA-100 and xBLOOM-100 (collectively xLLMs-100), which scale the multilingual capabilities of LLaMA and BLOOM to 100 languages. To do so, we construct two datasets: a multilingual instruction dataset including 100 languages, which represents the largest language coverage to date, and a cross-lingual human feedback dataset encompassing 30 languages. We perform multilingual instruction tuning on the constructed instruction data and further align the LLMs with human feedback using the DPO algorithm on our cross-lingual human feedback dataset. We evaluate the multilingual understanding and generating capabilities of xLLMs-100 on five multilingual benchmarks. Experimental results show that xLLMs-100 consistently outperforms its peers across the benchmarks by considerable margins, defining a new state-of-the-art multilingual LLM that supports 100 languages.

MCML Authors
Alexander Fraser

Prof. Dr.

Data Analytics & Statistics


[20]
A. Dimmelmeier, H. Doll, M. Schierholz, E. Kormanyos, M. Fehr, B. Ma, J. Beck, A. Fraser and F. Kreuter.
Informing climate risk analysis using textual information - A research agenda.
ClimateNLP @ACL 2024 - 1st Workshop on Natural Language Processing Meets Climate Change at the 62nd Annual Meeting of the Association for Computational Linguistics (ACL 2024). Bangkok, Thailand, Aug 11-16, 2024. DOI
Abstract

We present a research agenda focused on efficiently extracting, assuring quality, and consolidating textual company sustainability information to address urgent climate change decision-making needs. Starting from the goal to create integrated FAIR (Findable, Accessible, Interoperable, Reusable) climate-related data, we identify research needs pertaining to the technical aspects of information extraction as well as to the design of the integrated sustainability datasets that we seek to compile. Regarding extraction, we leverage technological advancements, particularly in large language models (LLMs) and Retrieval-Augmented Generation (RAG) pipelines, to unlock the underutilized potential of unstructured textual information contained in corporate sustainability reports. In applying these techniques, we review key challenges, which include the retrieval and extraction of CO2 emission values from PDF documents, especially from unstructured tables and graphs therein, and the validation of automatically extracted data through comparisons with human-annotated values. We also review how existing use cases and practices in climate risk analytics relate to choices of what textual information should be extracted and how it could be linked to existing structured data.

MCML Authors
Malte Schierholz

Dr.

Social Data Science and AI Lab

Bolei Ma

Social Data Science and AI Lab

Jacob Beck

Social Data Science and AI Lab

Alexander Fraser

Prof. Dr.

Data Analytics & Statistics

Frauke Kreuter

Prof. Dr.

Social Data Science and AI Lab


[19]
P. Piccirilli, A. Fraser and S. Schulte im Walde.
VOLIMET: A Parallel Corpus of Literal and Metaphorical Verb-Object Pairs for English–German and English–French.
*SEM 2024 - 13th Joint Conference on Lexical and Computational Semantics co-located with NAACL 2024. Mexico City, Mexico, Jun 20-21, 2024. DOI
Abstract

The interplay of cultural and linguistic elements that characterizes metaphorical language poses a substantial challenge for both human comprehension and machine processing. This challenge goes beyond monolingual settings and becomes particularly complex in translation, even more so in automatic translation. We present VOLIMET, a corpus of 2,916 parallel sentences containing gold standard alignments of metaphorical verb-object pairs and their literal paraphrases, e.g., tackle/address question, from English to German and French. On the one hand, the parallel nature of our corpus enables us to explore monolingual patterns for metaphorical vs. literal uses in English. On the other hand, we investigate different aspects of cross-lingual translations into German and French and the extent to which metaphoricity and literalness in the source language are transferred to the target languages. Monolingually, our findings reveal clear preferences in using metaphorical or literal uses of verb-object pairs. Cross-lingually, we observe a rich variability in translations as well as different behaviors for our two target languages.

MCML Authors
Alexander Fraser

Prof. Dr.

Data Analytics & Statistics


[18]
Y. Zhang, V. Hangya and A. Fraser.
A Study of the Class Imbalance Problem in Abusive Language Detection.
WOAH @NAACL 2024 - 8th Workshop on Online Abuse and Harms at the Annual Conference of the North American Chapter of the Association for Computational Linguistics (NAACL 2024). Mexico City, Mexico, Jun 16-21, 2024. DOI
Abstract

Abusive language detection has drawn increasing interest in recent years. However, a less systematically explored obstacle is label imbalance, i.e., the amount of abusive data is much lower than non-abusive data, leading to performance issues. The aim of this work is to conduct a comprehensive comparative study of popular methods for addressing the class imbalance issue. We explore 10 well-known approaches on 8 datasets with distinct characteristics: binary or multi-class, moderately or largely imbalanced, focusing on various types of abuse, etc. Additionally, we propose two novel methods specialized for abuse detection: AbusiveLexiconAug and ExternalDataAug, which enrich the training data using abusive lexicons and external abusive datasets, respectively. We conclude that: 1) our AbusiveLexiconAug approach, random oversampling, and focal loss are the most versatile methods on various datasets; 2) focal loss tends to yield peak model performance; 3) oversampling and focal loss provide promising results for binary datasets and small multi-class sets, while undersampling and weighted cross-entropy are more suitable for large multi-class sets; 4) most methods are sensitive to hyperparameters, yet our suggested choice of hyperparameters provides a good starting point.
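Two of the compared techniques are simple enough to sketch directly (an illustrative rendering, not the paper's implementation; the `random_oversample` helper and its balancing target are assumptions): random oversampling duplicates minority-class examples until classes are balanced, and the focal loss -(1 - p)^gamma * log(p) down-weights easy examples relative to plain cross-entropy.

```python
import math
import random

def random_oversample(examples, labels):
    """Duplicate minority-class examples until every class matches the largest one."""
    by_label = {}
    for x, y in zip(examples, labels):
        by_label.setdefault(y, []).append(x)
    target = max(len(v) for v in by_label.values())
    rng = random.Random(0)  # fixed seed for a reproducible example
    out = []
    for y, xs in by_label.items():
        out += [(x, y) for x in xs]
        out += [(rng.choice(xs), y) for _ in range(target - len(xs))]
    return out

def focal_loss(p, gamma=2.0):
    """Focal loss for true-class probability p: -(1-p)^gamma * log(p)."""
    return -((1.0 - p) ** gamma) * math.log(p)

balanced = random_oversample(["a", "b", "c", "d"], [1, 0, 0, 0])
print(len(balanced))                      # 6: class 1 oversampled from 1 to 3
print(focal_loss(0.9) < -math.log(0.9))   # True: easy example down-weighted
```

The sensitivity to hyperparameters the authors note shows up here as the choice of gamma and of the oversampling target.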

MCML Authors
Alexander Fraser

Prof. Dr.

Data Analytics & Statistics


[17]
V. Hangya and A. Fraser.
How to Solve Few-Shot Abusive Content Detection Using the Data We Actually Have.
LREC-COLING 2024 - Joint International Conference on Computational Linguistics, Language Resources and Evaluation. Torino, Italy, May 20-25, 2024. URL
Abstract

Due to the broad range of social media platforms, the requirements of abusive language detection systems are varied and ever-changing. A large set of annotated corpora with different properties and label sets has already been created, such as for hate or misogyny detection, but the form and targets of abusive speech are constantly evolving. Since the annotation of new corpora is expensive, in this work we leverage datasets we already have, covering a wide range of tasks related to abusive language detection. Our goal is to build models cheaply for a new target label set and/or language, using only a few training examples of the target domain. We propose a two-step approach: first we train our model in a multitask fashion. We then carry out few-shot adaptation to the target requirements. Our experiments show that, using already existing datasets and only a few shots of the target task, model performance improves both monolingually and across languages. Our analysis also shows that our models acquire a general understanding of abusive language, since they improve the prediction of labels which are present only in the target dataset and can benefit from knowledge about labels which are not directly used for the target task.

MCML Authors
Alexander Fraser

Prof. Dr.

Data Analytics & Statistics


[16]
M. Di Marco and A. Fraser.
Analyzing the Understanding of Morphologically Complex Words in Large Language Models.
LREC-COLING 2024 - Joint International Conference on Computational Linguistics, Language Resources and Evaluation. Torino, Italy, May 20-25, 2024. URL
Abstract

We empirically study the ability of a Large Language Model (gpt-3.5-turbo-instruct) to understand morphologically complex words. In our experiments, we looked at a variety of tasks to analyse German compounds with regard to compositional word formation and derivation, such as identifying the head noun of existing and novel compounds, identifying the shared verb stem between two words, or recognizing words constructed with inappropriately used derivation morphemes as invalid. Our results show that the language model is generally capable of solving most tasks, except for the task of identifying ill-formed word forms. While the model demonstrated a good overall understanding of complex words and their word-internal structure, the results also suggest that there is no formal knowledge of derivational rules, but rather an interpretation of the observed word parts to derive the meaning of a word.

MCML Authors
Alexander Fraser

Prof. Dr.

Data Analytics & Statistics


[15]
F. Friedrich, K. Hämmerl, P. Schramowski, M. Brack, J. Libovický, K. Kersting and A. Fraser.
Multilingual Text-to-Image Generation Magnifies Gender Stereotypes and Prompt Engineering May Not Help You.
Preprint (Feb. 2024). arXiv
Abstract

Text-to-image generation models have recently achieved astonishing results in image quality, flexibility, and text alignment, and are consequently employed in a fast-growing number of applications. Through improvements in multilingual abilities, a larger community now has access to this technology. However, our results show that multilingual models suffer from significant gender biases just as monolingual models do. Furthermore, the natural expectation that multilingual models will provide similar results across languages does not hold up. Instead, there are important differences between languages. We propose a novel benchmark, MAGBIG, intended to foster research on gender bias in multilingual models. We use MAGBIG to investigate the effect of multilingualism on gender bias in T2I models. To this end, we construct multilingual prompts requesting portraits of people with a certain occupation or trait. Our results show that not only do models exhibit strong gender biases but they also behave differently across languages. Furthermore, we investigate prompt engineering strategies, such as indirect, neutral formulations, to mitigate these biases. Unfortunately, these approaches have limited success and result in worse text-to-image alignment. Consequently, we call for more research into diverse representations across languages in image generators, as well as into steerability to address biased model behavior.

MCML Authors
Katharina Hämmerl

Data Analytics & Statistics

Alexander Fraser

Prof. Dr.

Data Analytics & Statistics


2023


[14]
M. Di Marco, K. Hämmerl and A. Fraser.
A Study on Accessing Linguistic Information in Pre-Trained Language Models by Using Prompts.
EMNLP 2023 - Conference on Empirical Methods in Natural Language Processing. Singapore, Dec 06-10, 2023. DOI
Abstract

We study whether linguistic information in pre-trained multilingual language models can be accessed by human language: So far, there is no easy method to directly obtain linguistic information and gain insights into the linguistic principles encoded in such models. We use the technique of prompting and formulate linguistic tasks to test the LM’s access to explicit grammatical principles and study how effective this method is at providing access to linguistic features. Our experiments on German, Icelandic and Spanish show that some linguistic properties can in fact be accessed through prompting, whereas others are harder to capture.

MCML Authors
Katharina Hämmerl

Data Analytics & Statistics

Alexander Fraser

Prof. Dr.

Data Analytics & Statistics


[13]
W. Lai, A. Chronopoulou and A. Fraser.
Mitigating Data Imbalance and Representation Degeneration in Multilingual Machine Translation.
EMNLP 2023 - Findings of the Conference on Empirical Methods in Natural Language Processing. Singapore, Dec 06-10, 2023. DOI
Abstract

Despite advances in multilingual neural machine translation (MNMT), we argue that there are still two major challenges in this area: data imbalance and representation degeneration. The data imbalance problem refers to the imbalance in the amount of parallel corpora for all language pairs, especially for long-tail languages (i.e., very low-resource languages). The representation degeneration problem refers to the problem of encoded tokens tending to appear only in a small subspace of the full space available to the MNMT model. To solve these two issues, we propose Bi-ACL, a framework which only requires target-side monolingual data and a bilingual dictionary to improve the performance of the MNMT model. We define two modules, named bidirectional autoencoder and bidirectional contrastive learning, which we combine with an online constrained beam search and a curriculum learning sampling strategy. Extensive experiments show that our proposed method is more effective than strong baselines both in long-tail languages and in high-resource languages. We also demonstrate that our approach is capable of transferring knowledge between domains and languages in zero-shot scenarios.

MCML Authors
Alexandra Chronopoulou

Dr.

* Former member

Alexander Fraser

Prof. Dr.

Data Analytics & Statistics


[12]
V. Hangya, S. Severini, R. Ralev, A. Fraser and H. Schütze.
Multilingual Word Embeddings for Low-Resource Languages using Anchors and a Chain of Related Languages.
MRL @EMNLP 2023 - 3rd Workshop on Multi-lingual Representation Learning at the Conference on Empirical Methods in Natural Language Processing (EMNLP 2023). Singapore, Dec 06-10, 2023. DOI
Abstract

Very low-resource languages, having only a few million tokens worth of data, are not well-supported by multilingual NLP approaches due to poor quality cross-lingual word representations. Recent work showed that good crosslingual performance can be achieved if a source language is related to the low-resource target language. However, not all language pairs are related. In this paper, we propose to build multilingual word embeddings (MWEs) via a novel language chain-based approach, that incorporates intermediate related languages to bridge the gap between the distant source and target. We build MWEs one language at a time by starting from the resource rich source and sequentially adding each language in the chain till we reach the target. We extend a semi-joint bilingual approach to multiple languages in order to eliminate the main weakness of previous works, i.e., independently trained monolingual embeddings, by anchoring the target language around the multilingual space. We evaluate our method on bilingual lexicon induction for 4 language families, involving 4 very low-resource (≤ 5M tokens) and 4 moderately low-resource (≤ 50M) target languages, showing improved performance in both categories. Additionally, our analysis reveals the importance of good quality embeddings for intermediate languages as well as the importance of leveraging anchor points from all languages in the multilingual space.

MCML Authors
Alexander Fraser

Prof. Dr.

Data Analytics & Statistics

Hinrich Schütze

Prof. Dr.

Statistical NLP and Deep Learning


[11]
W. Lai, V. Hangya and A. Fraser.
Extending Multilingual Machine Translation through Imitation Learning.
Preprint (Nov. 2023). arXiv
Abstract

Despite the growing variety of languages supported by existing multilingual neural machine translation (MNMT) models, most of the world’s languages are still being left behind. We aim to extend large-scale MNMT models to a new language, allowing for translation between the newly added and all of the already supported languages in a challenging scenario: using only a parallel corpus between the new language and English. Previous approaches, such as continued training on parallel data including the new language, suffer from catastrophic forgetting (i.e., performance on other languages is reduced). Our novel approach Imit-MNMT treats the task as an imitation learning process, which mimics the behavior of an expert, a technique widely used in the computer vision area, but not well explored in NLP. More specifically, we construct a pseudo multi-parallel corpus of the new and the original languages by pivoting through English, and imitate the output distribution of the original MNMT model. Extensive experiments show that our approach significantly improves the translation performance between the new and the original languages, without severe catastrophic forgetting. We also demonstrate that our approach is capable of solving copy and off-target problems, which are two common issues in current large-scale MNMT models.

MCML Authors
Alexander Fraser

Prof. Dr.

Data Analytics & Statistics


[10]
V. Hangya and A. Fraser.
LMU at HaSpeeDe3: Multi-Dataset Training for Cross-Domain Hate Speech Detection.
EVALITA 2023 - Final Workshop of the 8th Evaluation Campaign. Parma, Italy, Sep 07-08, 2023. PDF
Abstract

We describe LMU Munich’s hate speech detection system for participating in the cross-domain track of the HaSpeeDe3 shared task at EVALITA 2023. The task focuses on the politics and religion domains, having no in-domain training data for the latter. Our submission combines multiple training sets from various domains in a multitask prompt-training system. We experimented with both Italian and English source datasets as well as monolingual Italian and multilingual pre-trained language models. We found that the Italian out-of-domain datasets are the most influential on the performance in the test domains and that combining both monolingual and multilingual language models using an ensemble gives the best results. Our system ranked second in both domains.

MCML Authors
Alexander Fraser

Prof. Dr.

Data Analytics & Statistics


[9]
K. Hämmerl, B. Deiseroth, P. Schramowski, J. Libovický, C. Rothkopf, A. Fraser and K. Kersting.
Speaking Multiple Languages Affects the Moral Bias of Language Models.
ACL 2023 - Findings of the 61st Annual Meeting of the Association for Computational Linguistics. Toronto, Canada, Jul 09-14, 2023. DOI
Abstract

Pre-trained multilingual language models (PMLMs) are commonly used when dealing with data from multiple languages and cross-lingual transfer. However, PMLMs are trained on varying amounts of data for each language. In practice this means their performance is often much better on English than many other languages. We explore to what extent this also applies to moral norms. Do the models capture moral norms from English and impose them on other languages? Do the models exhibit random and thus potentially harmful beliefs in certain languages? Both these issues could negatively impact cross-lingual transfer and potentially lead to harmful outcomes. In this paper, we (1) apply the MORALDIRECTION framework to multilingual models, comparing results in German, Czech, Arabic, Chinese, and English, (2) analyse model behaviour on filtered parallel subtitles corpora, and (3) apply the models to a Moral Foundations Questionnaire, comparing with human responses from different countries. Our experiments demonstrate that, indeed, PMLMs encode differing moral biases, but these do not necessarily correspond to cultural differences or commonalities in human opinions. We release our code and models.

MCML Authors
Katharina Hämmerl

Data Analytics & Statistics

Alexander Fraser

Prof. Dr.

Data Analytics & Statistics


[8]
K. Hämmerl, A. Fastowski, J. Libovický and A. Fraser.
Exploring Anisotropy and Outliers in Multilingual Language Models for Cross-Lingual Semantic Sentence Similarity.
ACL 2023 - Findings of the 61st Annual Meeting of the Association for Computational Linguistics. Toronto, Canada, Jul 09-14, 2023. DOI
Abstract

Previous work has shown that the representations output by contextual language models are more anisotropic than static type embeddings, and typically display outlier dimensions. This seems to be true for both monolingual and multilingual models, although much less work has been done on the multilingual context. Why these outliers occur and how they affect the representations is still an active area of research. We investigate outlier dimensions and their relationship to anisotropy in multiple pre-trained multilingual language models. We focus on cross-lingual semantic similarity tasks, as these are natural tasks for evaluating multilingual representations. Specifically, we examine sentence representations. Sentence transformers which are fine-tuned on parallel resources (that are not always available) perform better on this task, and we show that their representations are more isotropic. However, we aim to improve multilingual representations in general. We investigate how much of the performance difference can be made up by only transforming the embedding space without fine-tuning, and visualise the resulting spaces. We test different operations: Removing individual outlier dimensions, cluster-based isotropy enhancement, and ZCA whitening. We publish our code for reproducibility.
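Two of the tested operations can be sketched in NumPy (a hedged illustration, not the paper's code; the synthetic embeddings and the outlier threshold of 5x the median standard deviation are arbitrary choices for the example): flag and drop high-variance outlier dimensions, and ZCA-whiten the embedding space so its dimensions are decorrelated.

```python
import numpy as np

# Hypothetical sentence embeddings with dimension 3 made an outlier.
rng = np.random.default_rng(0)
emb = rng.normal(size=(200, 6))
emb[:, 3] *= 25.0

# Flag dimensions whose std is far above the median (threshold is illustrative).
stds = emb.std(axis=0)
outliers = np.where(stds > 5 * np.median(stds))[0]
trimmed = np.delete(emb, outliers, axis=1)

def zca_whiten(x, eps=1e-5):
    """Decorrelate dimensions: after whitening, the covariance is near identity."""
    x = x - x.mean(axis=0)
    cov = np.cov(x, rowvar=False)
    vals, vecs = np.linalg.eigh(cov)
    w = vecs @ np.diag(1.0 / np.sqrt(vals + eps)) @ vecs.T
    return x @ w

white = zca_whiten(emb)
print(outliers, trimmed.shape)  # dimension 3 flagged; trimmed to 5 dimensions
```

Both transformations are applied post hoc, matching the paper's question of how far the space can be improved without fine-tuning.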

MCML Authors
Katharina Hämmerl

Data Analytics & Statistics

Alexander Fraser

Prof. Dr.

Data Analytics & Statistics


[7]
Y. Liu, A. Chronopoulou, H. Schütze and A. Fraser.
On the Copying Problem of Unsupervised NMT: A Training Schedule with a Language Discriminator Loss.
IWSLT 2023 - 20th International Conference on Spoken Language Translation. Toronto, Canada, Jul 09-14, 2023. DOI
Abstract

Although unsupervised neural machine translation (UNMT) has achieved success in many language pairs, the copying problem, i.e., directly copying some parts of the input sentence as the translation, is common among distant language pairs, especially when low-resource languages are involved. We find this issue is closely related to an unexpected copying behavior during online back-translation (BT). In this work, we propose a simple but effective training schedule that incorporates a language discriminator loss. The loss imposes constraints on the intermediate translation so that the translation is in the desired language. By conducting extensive experiments on different language pairs, including similar and distant, high and low-resource languages, we find that our method alleviates the copying problem, thus improving the translation performance on low-resource languages.

MCML Authors
Link to Yihong Liu

Yihong Liu

Statistical NLP and Deep Learning

Link to Alexandra Chronopoulou

Alexandra Chronopoulou

Dr.

* Former member

Link to Hinrich Schütze

Hinrich Schütze

Prof. Dr.

Statistical NLP and Deep Learning

Link to Alexander Fraser

Alexander Fraser

Prof. Dr.

Data Analytics & Statistics


[6]
A. Chronopoulou, M. Peters, A. Fraser and J. Dodge.
AdapterSoup: Weight Averaging to Improve Generalization of Pretrained Language Models.
EACL 2023 - Findings of the 17th Conference of the European Chapter of the Association for Computational Linguistics. Dubrovnik, Croatia, May 02-06, 2023. DOI
Abstract

Pretrained language models (PLMs) are trained on massive corpora, but often need to specialize to specific domains. A parameter-efficient adaptation method suggests training an adapter for each domain on the task of language modeling. This leads to good in-domain scores but can be impractical for domain- or resource-restricted settings. A solution is to use a related-domain adapter for the novel domain at test time. In this paper, we introduce AdapterSoup, an approach that performs weight-space averaging of adapters trained on different domains. Our approach is embarrassingly parallel: first, we train a set of domain-specific adapters; then, for each novel domain, we determine which adapters should be averaged at test time. We present extensive experiments showing that AdapterSoup consistently improves performance on new domains without extra training. We also explore weight averaging of adapters trained on the same domain with different hyper-parameters, and show that it preserves the performance of a PLM on new domains while obtaining strong in-domain results. We explore various approaches for choosing which adapters to combine, such as text clustering and semantic similarity. We find that using clustering leads to the most competitive results on novel domains.
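The averaging step itself is simple: once the relevant adapters for a novel domain have been selected, their parameters are averaged in weight space. A minimal sketch, assuming each adapter is represented as a dict of named parameter arrays (the paper selects the adapters via text clustering or semantic similarity; here selection is left to the caller):

```python
import numpy as np

def adapter_soup(adapters):
    """Uniform weight-space average of domain-specific adapters.

    adapters: non-empty list of state dicts, each mapping a parameter
              name to an array of the same shape across adapters.
    Returns a single averaged state dict that can be used at test time
    for a novel domain, with no additional training.
    """
    return {
        name: np.mean([np.asarray(a[name]) for a in adapters], axis=0)
        for name in adapters[0]
    }
```

Because each adapter is trained independently, the whole pipeline is embarrassingly parallel; only this cheap averaging happens at test time.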

MCML Authors
Link to Alexandra Chronopoulou

Alexandra Chronopoulou

Dr.

* Former member

Link to Alexander Fraser

Alexander Fraser

Prof. Dr.

Data Analytics & Statistics


[5]
A. Chronopoulou, D. Stojanovski and A. Fraser.
Language-Family Adapters for Low-Resource Multilingual Neural Machine Translation.
LoResMT @EACL 2023 - 6th Workshop on Technologies for Machine Translation of Low-Resource Languages at the 17th Conference of the European Chapter of the Association for Computational Linguistics (EACL 2023). Dubrovnik, Croatia, May 02-06, 2023. DOI
Abstract

Large multilingual models trained with self-supervision achieve state-of-the-art results in a wide range of natural language processing tasks. Self-supervised pretrained models are often fine-tuned on parallel data from one or multiple language pairs for machine translation. Multilingual fine-tuning improves performance on low-resource languages but requires modifying the entire model and can be prohibitively expensive. Training a new adapter on each language pair or training a single adapter on all language pairs without updating the pretrained model has been proposed as a parameter-efficient alternative. However, the former does not permit any sharing between languages, while the latter shares parameters for all languages and is susceptible to negative interference. In this paper, we propose training language-family adapters on top of mBART-50 to facilitate cross-lingual transfer. Our approach outperforms related baselines, yielding higher translation scores on average when translating from English to 17 different low-resource languages. We also show that language-family adapters provide an effective method to translate to languages unseen during pretraining.

MCML Authors
Link to Alexandra Chronopoulou

Alexandra Chronopoulou

Dr.

* Former member

Link to Alexander Fraser

Alexander Fraser

Prof. Dr.

Data Analytics & Statistics


2022


[4]
V. Hangya, H. S. Saadi and A. Fraser.
Improving Low-Resource Languages in Pre-Trained Multilingual Language Models.
EMNLP 2022 - Conference on Empirical Methods in Natural Language Processing. Abu Dhabi, United Arab Emirates, Nov 07-11, 2022. DOI
Abstract

Pre-trained multilingual language models are the foundation of many NLP approaches, including cross-lingual transfer solutions. However, languages with small available monolingual corpora are often not well-supported by these models, leading to poor performance. We propose an unsupervised approach to improve the cross-lingual representations of low-resource languages by bootstrapping word translation pairs from monolingual corpora and using them to improve language alignment in pre-trained language models. We perform experiments on nine languages, using contextual word retrieval and zero-shot named entity recognition to measure both intrinsic cross-lingual word representation quality and downstream task performance, showing improvements on both tasks. Our results show that it is possible to improve pre-trained multilingual language models by relying only on non-parallel resources.

MCML Authors
Link to Alexander Fraser

Alexander Fraser

Prof. Dr.

Data Analytics & Statistics


[3]
W. Lai, A. Chronopoulou and A. Fraser.
m4 Adapter: Multilingual Multi-Domain Adaptation for Machine Translation with a Meta-Adapter.
EMNLP 2022 - Findings of the Conference on Empirical Methods in Natural Language Processing. Abu Dhabi, United Arab Emirates, Nov 07-11, 2022. DOI
Abstract

Multilingual neural machine translation models (MNMT) yield state-of-the-art performance when evaluated on data from a domain and language pair seen at training time. However, when an MNMT model is used to translate under domain shift or to a new language pair, performance drops dramatically. We consider a very challenging scenario: adapting the MNMT model both to a new domain and to a new language pair at the same time. In this paper, we propose m4Adapter (Multilingual Multi-Domain Adaptation for Machine Translation with a Meta-Adapter), which combines domain and language knowledge using meta-learning with adapters. We present results showing that our approach is a parameter-efficient solution which effectively adapts a model to both a new language pair and a new domain, while outperforming other adapter methods. An ablation study also shows that our approach more effectively transfers domain knowledge across different languages and language information across different domains.

MCML Authors
Link to Alexandra Chronopoulou

Alexandra Chronopoulou

Dr.

* Former member

Link to Alexander Fraser

Alexander Fraser

Prof. Dr.

Data Analytics & Statistics


[2]
H. S. Saadi, V. Hangya, T. Eder and A. Fraser.
Comparative Analysis of Cross-lingual Contextualized Word Embeddings.
MRL @EMNLP 2022 - 2nd Workshop on Multi-lingual Representation Learning at the Conference on Empirical Methods in Natural Language Processing (EMNLP 2022). Abu Dhabi, United Arab Emirates, Nov 07-11, 2022. DOI
Abstract

Contextualized word embeddings have emerged as the most important tool for performing NLP tasks in a large variety of languages. In order to improve the cross-lingual representation and transfer learning quality, contextualized embedding alignment techniques, such as mapping and model fine-tuning, are employed. Existing techniques, however, are time-, data- and computational-resource-intensive. In this paper we analyze these techniques by utilizing three tasks: bilingual lexicon induction (BLI), word retrieval and cross-lingual natural language inference (XNLI) for a high-resource (German-English) and a low-resource (Bengali-English) language pair. In contrast to previous works which focus only on a few popular models, we compare five multilingual and seven monolingual language models and investigate the effect of various aspects on their performance, such as vocabulary size, number of languages used for training and number of parameters. Additionally, we propose a parameter-, data- and runtime-efficient technique which can be trained with 10% of the data, in less than 10% of the time, and with less than 5% of the trainable parameters compared to model fine-tuning. We show that our proposed method is competitive with resource-heavy models, even outperforming them in some cases, even though it relies on fewer resources.

MCML Authors
Link to Alexander Fraser

Alexander Fraser

Prof. Dr.

Data Analytics & Statistics


[1]
S. Severini, V. Hangya, M. J. Sabet, A. Fraser and H. Schütze.
Don't Forget Cheap Training Signals Before Building Unsupervised Bilingual Word Embeddings.
BUCC @LREC 2022 - 15th Workshop on Building and Using Comparable Corpora at the 13th International Conference on Language Resources and Evaluation (LREC 2022). Marseille, France, Jun 21-23, 2022. URL
Abstract

Bilingual Word Embeddings (BWEs) are one of the cornerstones of cross-lingual transfer of NLP models. They can be built using only monolingual corpora without supervision, leading to numerous works focusing on unsupervised BWEs. However, most of the current approaches to build unsupervised BWEs do not compare their results with methods based on easy-to-access cross-lingual signals. In this paper, we argue that such signals should always be considered when developing unsupervised BWE methods. The two approaches we find most effective are: 1) using identical words as seed lexicons (which unsupervised approaches incorrectly assume are not available for orthographically distinct language pairs) and 2) combining such lexicons with pairs extracted by matching romanized versions of words with an edit distance threshold. We experiment on thirteen non-Latin languages (and English) and show that such cheap signals work well and that they outperform more complex unsupervised methods on distant language pairs such as Chinese, Japanese, Kannada, Tamil, and Thai. In addition, they are even competitive with the use of high-quality lexicons in supervised approaches. Our results show that these training signals should not be neglected when building BWEs, even for distant languages.
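Both cheap signals can be extracted from the two vocabularies alone. A minimal sketch, assuming a caller-supplied `romanize` function (the function names and the naive O(|V|²) matching loop are illustrative choices for this example, not the authors' implementation):

```python
def levenshtein(a, b):
    """Standard dynamic-programming edit distance between two strings."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, 1):
        cur = [i]
        for j, cb in enumerate(b, 1):
            cur.append(min(prev[j] + 1,          # deletion
                           cur[j - 1] + 1,       # insertion
                           prev[j - 1] + (ca != cb)))  # substitution
        prev = cur
    return prev[-1]

def cheap_seed_lexicon(src_vocab, tgt_vocab, romanize, max_dist=1):
    """Extract a seed lexicon from two monolingual vocabularies.

    Signal 1: identical surface forms shared by both vocabularies.
    Signal 2: word pairs whose romanized forms are within an edit
              distance threshold of each other.
    """
    pairs = [(w, w) for w in set(src_vocab) & set(tgt_vocab)]
    rom_tgt = {romanize(w): w for w in tgt_vocab}
    for s in src_vocab:
        rs = romanize(s)
        for rt, t in rom_tgt.items():
            if levenshtein(rs, rt) <= max_dist and (s, t) not in pairs:
                pairs.append((s, t))
    return pairs
```

For orthographically distinct pairs (e.g. Chinese-English), signal 1 still fires on numbers, named entities, and code-switched tokens, which is why the identical-word assumption made by unsupervised methods is too pessimistic.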

MCML Authors
Link to Masoud Jalili Sabet

Masoud Jalili Sabet

Dr.

* Former member

Link to Alexander Fraser

Alexander Fraser

Prof. Dr.

Data Analytics & Statistics

Link to Hinrich Schütze

Hinrich Schütze

Prof. Dr.

Statistical NLP and Deep Learning