When a text generation system is applied under different task constraints, the distribution of its outputs may change even when model parameters remain fixed. Characterizing such task-induced drift is important for understanding and monitoring generative systems in deployment. In this work, we study how within-input production variability shifts when humans and four large language models move from open-ended storytelling to constrained translation. Rather than relying solely on scalar variability summaries, we represent each input's pairwise distance profile as a probability vector on the simplex and analyze the resulting distributions geometrically using credal sets, convex regions that capture the shape and spread of variability distributions. Across semantic, lexical, and syntactic distance measures, both human and model output distributions exhibit systematic variability contraction under translation constraints. At the population level, the human task-conditioned distributions show the largest cross-task displacement observed in this study while approximately preserving overall variability-distribution shape; under the present operationalization, model distributions tend to redistribute variability unevenly across dimensions, with lexical contraction accompanied by syntactic expansion. Drift directions are moderately aligned at the semantic level but less consistent for surface-level variability. These findings suggest that geometric variability signatures provide an informative, reference-free signal for characterizing how generative systems respond to changing task conditions.
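As a concrete illustration of the simplex representation described above, the sketch below shows one way to map a set of generations for a single input to a probability vector on the simplex: compute all pairwise distances under some measure, then normalize the profile to sum to one. This is a minimal reconstruction under stated assumptions, not the paper's implementation; the normalization scheme, the toy Jaccard lexical distance, and all names (distance_profile, to_simplex, jaccard_distance) are illustrative.

    import itertools
    import numpy as np

    def distance_profile(outputs, dist):
        """Pairwise distances between all n outputs for one input
        (a vector of length n*(n-1)/2)."""
        return np.array([dist(a, b) for a, b in itertools.combinations(outputs, 2)])

    def to_simplex(profile, eps=1e-12):
        """Normalize a nonnegative distance profile to a probability vector."""
        total = profile.sum()
        if total < eps:  # all outputs (near-)identical: fall back to uniform
            return np.full_like(profile, 1.0 / len(profile))
        return profile / total

    # Toy lexical distance: Jaccard distance over token sets.
    def jaccard_distance(a, b):
        sa, sb = set(a.split()), set(b.split())
        return 1.0 - len(sa & sb) / len(sa | sb)

    outputs = ["the cat sat", "a cat sat down", "the dog ran"]
    p = to_simplex(distance_profile(outputs, jaccard_distance))
    print(p, p.sum())  # a point on the simplex; components sum to 1

Each input then contributes one such point per distance measure, and a population of inputs yields a cloud of points whose geometry (for example, a credal set enclosing it) can be compared across task conditions.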