
Laplace Sample Information: Data Informativeness Through a Bayesian Lens

MCML Authors

Dr. Georgios Kaissis, Associate

* Former Associate

Abstract

Accurately estimating the informativeness of individual samples in a dataset is an important objective in deep learning, as it can guide sample selection, which can improve model efficiency and accuracy by removing redundant or potentially harmful samples. We propose Laplace Sample Information (LSI), a measure of sample informativeness that is grounded in information theory and widely applicable across model architectures and learning settings. LSI leverages a Bayesian approximation to the weight posterior and the KL divergence to measure the change in the parameter distribution induced by a sample of interest from the dataset. We experimentally show that LSI is effective in ordering the data with respect to typicality, detecting mislabeled samples, measuring class-wise informativeness, and assessing dataset difficulty. We demonstrate these capabilities of LSI on image and text data in supervised and unsupervised settings. Moreover, we show that LSI can be computed efficiently through probes and transfers well to the training of large models.
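The idea of scoring a sample by the KL divergence between weight posteriors with and without that sample can be illustrated on a toy problem. The sketch below is not the authors' implementation: it assumes a Bayesian linear regression model, for which the posterior is exactly Gaussian and therefore coincides with its Laplace approximation. The helper names (`gaussian_kl`, `linear_posterior`, `sample_information`), the prior and noise settings, and the direction of the KL divergence are all illustrative assumptions.

```python
import numpy as np


def gaussian_kl(mu0, cov0, mu1, cov1):
    """KL( N(mu0, cov0) || N(mu1, cov1) ) for full-covariance Gaussians."""
    k = mu0.shape[0]
    cov1_inv = np.linalg.inv(cov1)
    diff = mu1 - mu0
    _, logdet0 = np.linalg.slogdet(cov0)
    _, logdet1 = np.linalg.slogdet(cov1)
    return 0.5 * (
        np.trace(cov1_inv @ cov0) + diff @ cov1_inv @ diff - k + logdet1 - logdet0
    )


def linear_posterior(X, y, prior_precision=1.0, noise_var=1.0):
    """Gaussian weight posterior of Bayesian linear regression.

    Assumes a zero-mean isotropic Gaussian prior and Gaussian noise
    (hypothetical settings chosen for this toy example).
    """
    precision = prior_precision * np.eye(X.shape[1]) + X.T @ X / noise_var
    cov = np.linalg.inv(precision)
    mean = cov @ X.T @ y / noise_var
    return mean, cov


def sample_information(X, y, i):
    """Toy informativeness score for sample i.

    Computes the KL divergence between the posterior fitted on all data and
    the posterior fitted with sample i left out. The direction of the KL
    divergence is an assumption of this sketch.
    """
    mu_full, cov_full = linear_posterior(X, y)
    keep = np.arange(len(y)) != i
    mu_loo, cov_loo = linear_posterior(X[keep], y[keep])
    return gaussian_kl(mu_full, cov_full, mu_loo, cov_loo)


# Synthetic data: 50 samples, 3 features, one deliberately corrupted target.
rng = np.random.default_rng(0)
X = rng.normal(size=(50, 3))
X[7] = np.array([1.0, 1.0, 1.0])  # fix the corrupted sample's features
true_w = np.array([1.0, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=50)
y[7] += 5.0  # simulate a mislabeled sample

scores = np.array([sample_information(X, y, i) for i in range(len(y))])
ranking = np.argsort(scores)[::-1]  # most informative (largest KL) first
print("highest-scoring sample index:", ranking[0])
```

In this toy setting the corrupted sample shifts the posterior mean far more than any clean sample does, so it receives the largest score. Each score requires a leave-one-out refit, which is cheap here but costly for a neural network; the abstract's remark that LSI can be computed efficiently through probes addresses that cost.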

inproceedings


ICLR 2025

13th International Conference on Learning Representations. Singapore, Apr 24-28, 2025.
A* Conference

Authors

J. Kaiser • K. Schwethelm • D. Rückert • G. Kaissis


Research Area

 C1 | Medicine

BibTeXKey: KSR+25
