Standard-to-Dialect Transfer Trends Differ Across Text and Speech: A Case Study on Intent and Topic Classification in German Dialects

MCML Authors

Verena Blaschke

→ Group Barbara Plank
AI and Computational Linguistics

Barbara Plank

Prof. Dr.

Core PI

AI and Computational Linguistics

Abstract

Research on cross-dialectal transfer from a standard to a non-standard dialect variety has typically focused on text data. However, dialects are primarily spoken, and non-standard spellings cause issues in text processing. We compare standard-to-dialect transfer in three settings: text models, speech models, and cascaded systems where speech first gets automatically transcribed and then further processed by a text model. We focus on German dialects in the context of written and spoken intent classification – releasing the first dialectal audio intent classification dataset – with supporting experiments on topic classification. The speech-only setup provides the best results on the dialect data while the text-only setup works best on the standard data. While the cascaded systems lag behind the text-only models for German, they perform relatively well on the dialectal data if the transcription system generates normalized, standard-like output.
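The three setups compared in the abstract differ only in how an utterance reaches the classifier. The following minimal Python sketch illustrates the cascaded setup, in which speech is transcribed first and the transcript is then passed to a text model. It is not the authors' implementation: the function names, the toy stand-in models, and the intent label are all hypothetical and serve only to show the data flow.

```python
from typing import Callable, Sequence

# Hypothetical type aliases; the models used in the paper are not specified here.
Audio = Sequence[float]
Transcriber = Callable[[Audio], str]
TextClassifier = Callable[[str], str]


def cascaded_predict(
    audio: Audio, transcribe: Transcriber, classify_text: TextClassifier
) -> str:
    """Cascaded setup: transcribe the audio, then classify the transcript."""
    transcript = transcribe(audio)
    return classify_text(transcript)


def accuracy(predictions: Sequence[str], gold: Sequence[str]) -> float:
    """Fraction of predictions that match the gold labels."""
    if len(predictions) != len(gold):
        raise ValueError("predictions and gold must have the same length")
    if not gold:
        return 0.0
    return sum(p == g for p, g in zip(predictions, gold)) / len(gold)


# Toy stand-ins (assumptions for illustration only, not real models).
def toy_transcribe(audio: Audio) -> str:
    # A real system would run speech recognition; this returns a fixed,
    # standard-like transcript regardless of the input.
    return "stell einen wecker"


def toy_classify_text(text: str) -> str:
    # A real system would use a trained text classifier.
    return "alarm_set" if "wecker" in text else "other"


prediction = cascaded_predict([0.0, 0.1], toy_transcribe, toy_classify_text)
```

In a text-only setup, `classify_text` would be applied directly to written input, and in a speech-only setup a single model would map audio to a label without an intermediate transcript. The same `accuracy` helper could then score all three setups on the standard and dialect test sets.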

inproceedings BWP26

ACL 2026

64th Annual Meeting of the Association for Computational Linguistics. San Diego, CA, USA, Jul 02-07, 2026.

Authors

V. Blaschke • M. Winkler • B. Plank

Research Area

B2 | Natural Language Processing

BibTeXKey: BWP26
