
Language Models Learn Universal Representations of Numbers and Here's Why You Should Care


Abstract

Prior work has shown that large language models (LLMs) often converge to accurate input embeddings for numbers, based on sinusoidal representations. In this work, we show that these representations are in fact strikingly systematic, to the point of being almost perfectly universal: different LLM families develop equivalent sinusoidal structures, and number representations are broadly interchangeable across a large swathe of experimental setups. We demonstrate that properly factoring in this characteristic is crucial when assessing how accurately LLMs encode numeric and other ordinal information, and that mechanistically enhancing this sinusoidality can also reduce LLMs' arithmetic errors.
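The abstract does not spell out what "sinusoidal representations" look like in practice. As a minimal illustration (a sketch, not the authors' method), the Python snippet below scores how sinusoidal each embedding dimension is as a function of the numeric value it encodes, via the fraction of variance captured by the strongest Fourier component. It fabricates positional-encoding-style embeddings so it runs standalone; the commented Hugging Face call shows where real number-token embeddings would plug in.

import numpy as np

# Hypothetical probe, not the paper's method: given embeddings for the
# numbers 0..255, measure how sinusoidal each dimension is in the number.
# Synthetic positional-encoding-style embeddings keep the sketch standalone;
# with a real model one would instead use something like
#   emb = model.get_input_embeddings().weight[number_token_ids].detach().numpy()
N, D = 256, 64
nums = np.arange(N)
freqs = 1.0 / (10000.0 ** (np.arange(D // 2) / (D // 2)))
emb = np.concatenate(
    [np.sin(nums[:, None] * freqs), np.cos(nums[:, None] * freqs)], axis=1
)

def sinusoidality(col):
    """Fraction of a dimension's variance captured by its single strongest
    nonzero Fourier component; 1.0 would be a perfect sinusoid over the range."""
    spec = np.abs(np.fft.rfft(col - col.mean())) ** 2
    return spec[1:].max() / spec[1:].sum()

scores = [sinusoidality(emb[:, d]) for d in range(emb.shape[1])]
print(f"mean sinusoidality across dimensions: {np.mean(scores):.3f}")

The Fourier-peak ratio here is only one convenient proxy for sinusoidal structure; the metrics used in the paper itself may differ.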

inproceedings SMK+26


ACL 2026

64th Annual Meeting of the Association for Computational Linguistics. San Diego, CA, USA, Jul 02-07, 2026. To be published. Preprint available.
A* Conference

Authors

M. Štefánik • T. Mickus • M. Kadlčík • B. Højer • M. Spiegel • R. Vázquez • A. Sinha • J. Kuchař • P. Mondorf • P. Stenetorp

Links

URL GitHub

Research Area

B2 | Natural Language Processing

BibTeX Key: SMK+26
