
Evaluating Robustness of Large Language Models Against Multilingual Typographical Errors

Abstract

Large language models (LLMs) are increasingly deployed in multilingual, real-world applications with user inputs that naturally introduce typographical errors (typos). Yet most benchmarks assume clean input, leaving the robustness of LLMs to typos across languages largely underexplored. To address this gap, we introduce MulTypo, a multilingual typo generation algorithm that simulates human-like errors based on language-specific keyboard layouts and typing behavior. We evaluate 18 open-source LLMs across three model families and five downstream tasks spanning natural language inference, multiple-choice question answering, mathematical reasoning, and machine translation. Our results show that typos consistently degrade performance, particularly in generative tasks and those requiring reasoning, while natural language inference is comparatively robust. Instruction tuning improves clean-input performance but may increase brittleness under noise. We also observe language-dependent robustness: high-resource languages are generally more robust than low-resource ones, and translation from English is more robust than translation into English. Our findings underscore the need for noise-aware training and multilingual robustness evaluation. We release a Python package for MulTypo and make the source code publicly available.
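
The core idea of keyboard-layout-based typo simulation can be sketched in a few lines of Python. The snippet below is a minimal, hypothetical illustration of substitution-only noise from a toy QWERTY adjacency map; the names (inject_typos, QWERTY_NEIGHBORS), the adjacency table, and the single-error-type model are assumptions for illustration, not the released MulTypo API, which covers language-specific layouts and richer typing behavior.

    import random

    # Toy QWERTY adjacency map (illustrative assumption; MulTypo uses full
    # language-specific keyboard layouts rather than this small subset).
    QWERTY_NEIGHBORS = {
        "a": "qwsz", "s": "awedxz", "d": "serfcx",
        "e": "wsdr", "r": "edft", "t": "rfgy",
    }

    def inject_typos(text, rate=0.05, seed=None):
        """Replace each mapped character with an adjacent key with probability `rate`."""
        rng = random.Random(seed)
        out = []
        for ch in text:
            neighbors = QWERTY_NEIGHBORS.get(ch.lower())
            if neighbors and rng.random() < rate:
                sub = rng.choice(neighbors)
                # Keep the casing of the character being replaced.
                out.append(sub.upper() if ch.isupper() else sub)
            else:
                out.append(ch)
        return "".join(out)

    print(inject_typos("the red tree starts east", rate=0.3, seed=7))

Run on clean task inputs, such a perturbation yields occasional adjacent-key substitutions whose rate is controlled by a single parameter; for actual robustness evaluation, the released package linked below should be used.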

ACL 2026

64th Annual Meeting of the Association for Computational Linguistics. San Diego, CA, USA, Jul 02-07, 2026. To be published. Preprint available.
A* Conference

Authors

Y. Liu • R. Zhao • L. Altinger • H. Schütze • M. A. Hedderich

Links

arXiv • GitHub

Research Area

B2 | Natural Language Processing

BibTeX Key: LZA+26
