When a text generation system is applied under different task constraints, the distribution of its outputs may change even when model parameters remain fixed. Characterizing such task-induced drift is important for understanding and monitoring generative systems in deployment. In this work, we study how within-input production variability shifts when humans and four large language models move from open-ended storytelling to constrained translation. Rather than relying solely on scalar variability summaries, we represent each input's pairwise distance profile as a probability vector on the simplex and analyze the resulting distributions geometrically using credal sets, convex regions that capture the shape and spread of variability distributions. Across semantic, lexical, and syntactic distance measures, both human and model output distributions exhibit systematic variability contraction under translation constraints. At the population level, the human task-conditioned distributions show the largest cross-task displacement observed in this study while approximately preserving overall variability-distribution shape; under the present operationalization, model distributions tend to redistribute variability unevenly across dimensions, with lexical contraction accompanied by syntactic expansion. Drift directions are moderately aligned at the semantic level but less consistent for surface-level variability. These findings suggest that geometric variability signatures provide an informative, reference-free signal for characterizing how generative systems respond to changing task conditions.
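As a concrete illustration of the simplex representation described above, the sketch below shows one way to map a set of generations for a single input to a probability vector on the simplex: compute all pairwise distances under some measure, then normalize the profile to sum to one. This is a minimal reconstruction under stated assumptions, not the paper's implementation; the normalization scheme, the toy Jaccard lexical distance, and all names (distance_profile, to_simplex, jaccard_distance) are illustrative.

    import itertools
    import numpy as np

    def distance_profile(outputs, dist):
        """Pairwise distances between all n outputs for one input
        (a vector of length n*(n-1)/2)."""
        return np.array([dist(a, b) for a, b in itertools.combinations(outputs, 2)])

    def to_simplex(profile, eps=1e-12):
        """Normalize a nonnegative distance profile to a probability vector."""
        total = profile.sum()
        if total < eps:  # all outputs (near-)identical: fall back to uniform
            return np.full_like(profile, 1.0 / len(profile))
        return profile / total

    # Toy lexical distance: Jaccard distance over token sets.
    def jaccard_distance(a, b):
        sa, sb = set(a.split()), set(b.split())
        return 1.0 - len(sa & sb) / len(sa | sb)

    outputs = ["the cat sat", "a cat sat down", "the dog ran"]
    p = to_simplex(distance_profile(outputs, jaccard_distance))
    print(p, p.sum())  # a point on the simplex; components sum to 1

Each input then contributes one such point per distance measure, and a population of inputs yields a cloud of points whose geometry (for example, a credal set enclosing it) can be compared across task conditions.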