
Positional Overload: Positional Debiasing and Context Window Extension for Large Language Models Using Set Encoding

MCML Authors


Alexander Fraser

Prof. Dr.

Principal Investigator

Abstract

Large Language Models (LLMs) typically track the order of tokens using positional encoding, which causes two problems: positional bias, where the model's output is influenced by the order of information within the prompt, and a fixed context window, as models struggle to generalize to positions beyond those encountered during training. To address these limitations, we developed a novel method called set encoding. This method allows multiple pieces of text to be encoded at the same positions, thereby eliminating positional bias entirely. Another promising use case for set encoding is increasing the size of the input an LLM can handle. Our experiments demonstrate that set encoding allows an LLM to solve tasks with far more tokens than it could without set encoding. To our knowledge, set encoding is the first technique to effectively extend an LLM’s context window without requiring any additional training.
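
The abstract only outlines the mechanism, so the snippet below is a minimal sketch of what "encoding multiple pieces of text at the same positions" could look like when building position IDs for a prompt whose middle part is an unordered set of segments. It is not the paper's implementation: the helper name, the example segment lengths, and the choice to continue the suffix after the longest item are assumptions made for illustration, and the full method would presumably also require a matching attention mask and a model that accepts explicit position IDs (e.g., the `position_ids` argument of Hugging Face transformers models).

```python
# Minimal sketch of the idea from the abstract: several text segments share the
# same positional indices, so the model receives no ordering signal for them.
# Function and variable names are illustrative, not taken from the paper's code.

from typing import List


def build_set_position_ids(prefix_len: int,
                           item_lens: List[int],
                           suffix_len: int) -> List[int]:
    """Position IDs for [prefix] + [set items] + [suffix].

    Each set item restarts at position `prefix_len`, so all items occupy the
    same positions; the suffix continues after the longest item (an assumption
    made for this sketch).
    """
    positions = list(range(prefix_len))                       # prefix: 0 .. prefix_len-1
    for n in item_lens:
        positions.extend(range(prefix_len, prefix_len + n))   # every item reuses the same range
    longest = max(item_lens, default=0)
    suffix_start = prefix_len + longest
    positions.extend(range(suffix_start, suffix_start + suffix_len))
    return positions


if __name__ == "__main__":
    # 4 prefix tokens, three set items of lengths 5, 3 and 5, and a 6-token question.
    print(build_set_position_ids(prefix_len=4, item_lens=[5, 3, 5], suffix_len=6))
    # [0, 1, 2, 3,  4, 5, 6, 7, 8,  4, 5, 6,  4, 5, 6, 7, 8,  9, 10, 11, 12, 13, 14]
```

Because every set item maps to the same position range, permuting the items leaves the position IDs unchanged, which is why the ordering bias over those items disappears, and the highest position used grows with the longest item rather than with the total prompt length.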

inproceedings


ACL 2025

63rd Annual Meeting of the Association for Computational Linguistics. Vienna, Austria, Jul 27-Aug 01, 2025.
A* Conference

Authors

L. Kinder • L. Edman • A. Fraser • T. Käfer

Links

URL

Research Area

 B2 | Natural Language Processing

BibTeX Key: KEF+25
