TimeSAE: Causal Sparse Decoding for Faithful Explanations of Black-Box Time Series Models

MCML Authors

Quentin Bouniot

Dr.

* Former Member

→ Group Zeynep Akata
Interpretable and Reliable Machine Learning

Zeynep Akata

Prof. Dr.

Core PI

Interpretable and Reliable Machine Learning

Abstract

Post-training adaptation of large language models is commonly achieved through parameter updates or input-based methods such as fine-tuning, parameter-efficient adaptation, and prompting. In parallel, a growing body of work modifies internal activations at inference time to influence model behavior, an approach known as *steering*. Despite increasing use, steering is rarely analyzed within the same conceptual framework as established adaptation methods. In this work, we argue that steering should be regarded as a form of model adaptation. We introduce a set of functional criteria for adaptation methods and use them to compare steering approaches with classical alternatives. This analysis positions steering as a distinct adaptation paradigm based on targeted interventions in activation space, enabling local and reversible behavioral change without parameter updates. The resulting framing clarifies how steering relates to existing methods, motivating a unified taxonomy for model adaptation.

inproceedings OBG+26

ICML 2026

43rd International Conference on Machine Learning. Seoul, South Korea, Jul 06-11, 2026. To be published. Preprint available.

Authors

K. Oublal • Q. Bouniot • Q. Gan • S. Clémençon • Z. Akata

Research Area

B1 | Computer Vision

BibTeXKey: OBG+26