TimeSAE: Sparse Decoding for Faithful Explanations of Black-Box Time Series Models
MCML Authors
Quentin Bouniot
Dr.
* Former Member
Abstract
Post-training adaptation of large language models is commonly achieved through parameter updates or input-based methods such as fine-tuning, parameter-efficient adaptation, and prompting. In parallel, a growing body of work modifies internal activations at inference time to influence model behavior, an approach known as *steering*. Despite its increasing use, steering is rarely analyzed within the same conceptual framework as established adaptation methods. In this work, we argue that steering should be regarded as a form of model adaptation. We introduce a set of functional criteria for adaptation methods and use them to compare steering approaches with classical alternatives. This analysis positions steering as a distinct adaptation paradigm based on targeted interventions in activation space, enabling local and reversible behavioral change without parameter updates. The resulting framing clarifies how steering relates to existing methods and motivates a unified taxonomy for model adaptation.
inproceedings OBG+26
ICML 2026
43rd International Conference on Machine Learning. Seoul, South Korea, Jul 06-11, 2026. To be published. Preprint available.
Authors
K. Oublal • Q. Bouniot • Q. Gan • S. Clémençon • Z. Akata
Links
URL GitHub
Research Area
BibTeXKey: OBG+26