Optical Character Recognition for the International Phonetic Alphabet

MCML Authors

Alexander Fraser

Prof. Dr.

Principal Investigator

Abstract

Grammar books are increasingly used as additional reference resources, particularly for very low-resource languages. A significant portion of them comes from scans, so their usability depends on the quality of the Optical Character Recognition (OCR) tool. We focus here on a particular script used in linguistics to transcribe sounds: the International Phonetic Alphabet (IPA). We consider two data sources: actual grammar book PDFs for two languages under documentation, Japhug and Kagayanen, and a synthetically generated dataset based on Wiktionary. We compare two neural OCR frameworks, Tesseract and Calamari, and a recent large vision-language model, Qwen2.5-VL-7B, each both off the shelf and after fine-tuning. While their zero-shot performance on IPA characters is generally poor because of character set mismatch, fine-tuning on the synthetic dataset leads to notable improvements.
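
To make the notion of character set mismatch concrete, the following minimal sketch lists the characters of an IPA string that a recognizer cannot output at all. It is an illustration only, not the authors' evaluation code: `MODEL_CHARSET`, `unsupported_chars`, and `coverage` are hypothetical names, and the toy alphabet (lowercase ASCII plus space) is an assumption standing in for whatever character set a given OCR model actually ships with.

```python
import unicodedata

# Hypothetical recognizer alphabet (assumption): lowercase ASCII letters and space.
# Real OCR models have larger character sets; this toy set only illustrates the idea.
MODEL_CHARSET = set("abcdefghijklmnopqrstuvwxyz ")


def unsupported_chars(text, charset=MODEL_CHARSET):
    """Return characters of `text` missing from `charset`, in order of first appearance."""
    # Normalize so that precomposed and decomposed forms are compared consistently.
    text = unicodedata.normalize("NFC", text)
    missing = []
    for ch in text:
        if ch not in charset and ch not in missing:
            missing.append(ch)
    return missing


def coverage(text, charset=MODEL_CHARSET):
    """Return the fraction of characters in `text` that `charset` can represent."""
    text = unicodedata.normalize("NFC", text)
    if not text:
        return 1.0
    return sum(ch in charset for ch in text) / len(text)


if __name__ == "__main__":
    sample = "f\u0259\u02c8n\u025bt\u026ak"  # an IPA transcription with four non-ASCII symbols
    print(unsupported_chars(sample))
    print(coverage(sample))
```

A model whose output alphabet lacks a symbol can never produce it, whatever the image quality, which is one reason fine-tuning on data that contains the full IPA inventory can help.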

inproceedings OZF26


EACL 2026

19th Conference of the European Chapter of the Association for Computational Linguistics. Rabat, Morocco, Mar 24-29, 2026.
A Conference

Authors

S. Okabe • D. Zelo • A. Fraser

Links

DOI

Research Area

 B2 | Natural Language Processing

BibTeXKey: OZF26