A Practical Tool to Help Automate Interlinear Glossing: A Study on Mukrī Kurdish

MCML Authors

Alexander Fraser

Prof. Dr.

Principal Investigator

Abstract

Interlinear gloss generation aims to predict linguistic annotations (gloss) for a sentence in a language that is usually under ongoing documentation. Such output is a first draft for the linguist to work with and should reduce the manual workload. This article studies a simple glossing pipeline based on a Conditional Random Field and applies it to a small fieldwork corpus in Mukrī Kurdish, a variety of Central Kurdish. We mainly focus on making the tool as accessible as possible for field linguists, so it can run on standard computers without the need for GPUs. Our pipeline predicts common grammatical patterns robustly and, more generally, frequent combinations of morphemes and glosses. Although more advanced neural models do reach better results, our feature-based system still manages to be competitive and to provide interpretability. To foster further collaboration between field linguistics and NLP, we also provide some recommendations regarding documentation endeavours and release our pipeline code alongside.

inproceedings


Field Matters @ACL 2025

4th Workshop on NLP Applications to Field Linguistics at the 63rd Annual Meeting of the Association for Computational Linguistics. Vienna, Austria, Jul 27-Aug 01, 2025.

Authors

H. Asadpour • S. Okabe • A. Fraser

Research Area

 B2 | Natural Language Processing

BibTeXKey: AOF+25
