XAMPLER: Learning to Retrieve Cross-Lingual In-Context Examples

MCML Authors

Hinrich Schütze

Prof. Dr.

Principal Investigator

Abstract

Recent studies indicate that leveraging off-the-shelf or fine-tuned retrievers, capable of retrieving in-context examples tailored to the input query, enhances few-shot in-context learning in English. However, adapting these methods to other languages, especially low-resource ones, is challenging due to the scarcity of cross-lingual retrievers and annotated data. We therefore introduce XAMPLER: Cross-Lingual Example Retrieval, a method that tackles cross-lingual in-context learning using only annotated English data. XAMPLER first trains a retriever based on Glot500, a small multilingual language model, using positive and negative English examples constructed from the predictions of a multilingual large language model, MaLA500. Leveraging the cross-lingual capacity of the retriever, XAMPLER can directly retrieve English examples as few-shot demonstrations for in-context learning in target languages. Experiments on the multilingual text classification benchmark SIB200, covering 176 languages, show that XAMPLER substantially improves in-context learning performance across languages.
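The retrieval-then-prompt step described above can be sketched as follows. This is a minimal illustration, not the paper's implementation: the toy vectors stand in for Glot500 sentence embeddings, and the example texts, labels, and function names are all hypothetical.

```python
import numpy as np

def retrieve_examples(query_vec, example_vecs, examples, k=2):
    """Return the k English examples whose embeddings are most
    cosine-similar to the (target-language) query embedding."""
    q = query_vec / np.linalg.norm(query_vec)
    E = example_vecs / np.linalg.norm(example_vecs, axis=1, keepdims=True)
    sims = E @ q                      # cosine similarity per candidate
    top = np.argsort(-sims)[:k]      # indices of the k best matches
    return [examples[i] for i in top]

def build_prompt(retrieved, query_text):
    """Assemble a few-shot classification prompt from the retrieved
    English examples plus the target-language query."""
    shots = "\n".join(f"Text: {t}\nLabel: {y}" for t, y in retrieved)
    return f"{shots}\nText: {query_text}\nLabel:"

# Hypothetical English training pool with 2-d toy embeddings.
examples = [("The match ended 2-1.", "sports"),
            ("Parliament passed the bill.", "politics"),
            ("The striker scored twice.", "sports")]
example_vecs = np.array([[0.9, 0.1], [0.1, 0.9], [0.8, 0.2]])

# Toy embedding of a Spanish query; a real system would encode it
# with the same multilingual encoder used for the English pool.
query_vec = np.array([0.85, 0.15])
few_shot = retrieve_examples(query_vec, example_vecs, examples, k=2)
prompt = build_prompt(few_shot, "El delantero marcó un gol.")
```

Because the encoder is shared across languages, a semantically similar English example lands near the target-language query in embedding space, which is what makes cross-lingual retrieval with English-only annotations possible.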

inproceedings


Findings @NAACL 2025

Findings of the Annual Conference of the North American Chapter of the Association for Computational Linguistics. Albuquerque, NM, USA, Apr 29-May 04, 2025.
Authors

P. Lin • A. F. T. Martins • H. Schütze

Links

DOI GitHub

Research Area

 B2 | Natural Language Processing

BibTeX Key: LMS25

Back to Top