
Cached Foundation Model Summaries for Memory-Efficient Clinical Time Series Inference

MCML Authors

Peter Schüffler

Prof. Dr.

Core PI

Abstract

Transformer-based models for clinical time series face a deployment bottleneck: patient histories can span thousands of irregularly spaced events, yet inference hardware imposes strict memory budgets. We study a simple decoupling strategy in which a pretrained foundation model compresses a patient's historical events into a fixed-size cached summary offline, and a lightweight prediction model processes only a short window of recent events conditioned on that summary at inference time. Through 252 experiments on MIMIC-IV we characterize when this strategy is worthwhile. The central finding is a clear pattern of diminishing returns: cached summaries yield a 6.5% relative AUROC gain when the recent window is limited to 8 events (p < 0.001), but the benefit shrinks to a statistically insignificant 0.1% once the window reaches 256 events. We further show that modulating event representations with the summary (FiLM) outperforms treating it as an additional input token (p < 0.001), and that summaries of recent history are more informative than those of distant history (p < 0.01). Together, these results provide actionable guidance for allocating context budgets when deploying sequence models on long, irregular time series under memory constraints.
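The two conditioning schemes compared in the abstract can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the dimensions, the random stand-in data, the projection matrices (`W_gamma`, `W_beta`, `W_token`), and the function names are all assumptions made for illustration. FiLM is shown in its usual form, a per-feature scale and shift derived from the conditioning vector; the token variant prepends the projected summary to the event sequence.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model = 16     # hypothetical event-embedding width
d_summary = 32   # hypothetical size of the cached summary vector
window = 8       # number of recent events seen at inference time

# Stand-ins: a cached per-patient summary and embedded recent events.
summary = rng.normal(size=d_summary)
events = rng.normal(size=(window, d_model))

# Hypothetical learned projections, randomly initialised for illustration.
W_gamma = rng.normal(scale=0.1, size=(d_summary, d_model))
W_beta = rng.normal(scale=0.1, size=(d_summary, d_model))
W_token = rng.normal(scale=0.1, size=(d_summary, d_model))


def film_condition(events, summary):
    """Scale and shift every event embedding using the summary (FiLM)."""
    gamma = 1.0 + summary @ W_gamma  # per-feature scale, centred at 1
    beta = summary @ W_beta          # per-feature shift
    return gamma * events + beta     # broadcasts over the window axis


def token_condition(events, summary):
    """Prepend the projected summary as one extra input token."""
    token = (summary @ W_token)[None, :]
    return np.concatenate([token, events], axis=0)


film_out = film_condition(events, summary)
token_out = token_condition(events, summary)
print(film_out.shape, token_out.shape)  # (8, 16) (9, 16)
```

In this sketch FiLM keeps the sequence length at the window size, while the token variant lengthens the sequence by one position. Whether that difference matters in practice depends on the prediction model, which the abstract does not specify.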

inproceedings AFR+26


TSALM @ICLR 2026

Workshop on Time Series in the Age of Large Models at the 14th International Conference on Learning Representations. Rio de Janeiro, Brazil, Apr 23-27, 2026. To be published. Preprint available.

Authors

R. Al Attrach • R. Fani • D. Restrepo • Y. Jia • L. A. Celi • P. J. Schüffler

Links

URL

Research Area

 C1 | Medicine

BibTeXKey: AFR+26
