Findings of the WMT 2025 Shared Task LLMs With Limited Resources for Slavic Languages: MT and QA
Abstract
We present the findings of the WMT 2025 Shared Task LLMs with Limited Resources for Slavic Languages. This shared task focuses on training LLMs using limited data and compute resources for three Slavic languages: Upper Sorbian (hsb), Lower Sorbian (dsb), and Ukrainian (uk), with the objective of developing and improving LLMs for these languages. We consider two tasks, which are evaluated jointly: Machine Translation (MT) and Multiple-Choice Question Answering (QA). In total, three teams participated in this shared task, with submissions from all three teams for the Sorbian languages and one submission for Ukrainian. All submissions improved on the baseline Qwen2.5-3B model through varying fine-tuning strategies. We note, however, that training purely on MT degrades the model's original QA capabilities. We also report further analyses of the submissions, including MT evaluation using advanced neural metrics for Ukrainian, as well as manual annotation and a comparison to the current Sorbian machine translator.
WMT @EMNLP 2025
10th Conference on Machine Translation at the Conference on Empirical Methods in Natural Language Processing. Suzhou, China, Nov 04-09, 2025.
Authors
S. Okabe • D. Dementieva • M. Di Marco • L. Edman • K. Hämmerl • M. Měškank • A. Hendrichowa • A. Fraser
BibTeXKey: ODD+25