Home  | Publications | KSS24

SilverAlign: MT-Based Silver Data Algorithm for Evaluating Word Alignment

MCML Authors

Link to Profile Hinrich Schütze PI Matchmaking

Hinrich Schütze

Prof. Dr.

Principal Investigator

Abstract

Word alignments are essential for a variety of NLP tasks. Therefore, choosing the best approaches for their creation is crucial. However, the scarce availability of gold evaluation data makes the choice difficult. We propose SilverAlign, a new method to automatically create silver data for the evaluation of word aligners by exploiting machine translation and minimal pairs. We show that performance on our silver data correlates well with gold benchmarks for 9 language pairs, making our approach a valid resource for evaluation of different domains and languages when gold data are not available. This addresses the important scenario of missing gold data alignments for low-resource languages.

inproceedings


LREC-COLING 2024

Joint International Conference on Computational Linguistics, Language Resources and Evalutaion. Torino, Italy, May 20-25, 2024.

Authors

A. Köksal • S. Severini • H. Schütze

Links

URL

Research Area

 B2 | Natural Language Processing

BibTeXKey: KSS24

Back to Top