
Calibrating LLMs With Information-Theoretic Evidential Deep Learning


Abstract

Fine-tuned large language models (LLMs) often exhibit overconfidence, particularly when trained on small datasets, resulting in poor calibration and inaccurate uncertainty estimates. Evidential Deep Learning (EDL), an uncertainty-aware approach, enables uncertainty estimation in a single forward pass, making it a promising method for calibrating fine-tuned LLMs. However, despite its computational efficiency, EDL is prone to overfitting, as its training objective can result in overly concentrated probability distributions. To mitigate this, we propose regularizing EDL by incorporating an information bottleneck (IB). Our approach, IB-EDL, suppresses spurious information in the evidence generated by the model and encourages truly predictive information to influence both the predictions and uncertainty estimates. Extensive experiments across various fine-tuned LLMs and tasks demonstrate that IB-EDL outperforms both existing EDL and non-EDL approaches. By improving the trustworthiness of LLMs, IB-EDL facilitates their broader adoption in domains requiring high levels of confidence calibration.
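To make the pattern described in the abstract concrete, below is a minimal, illustrative PyTorch sketch of an "EDL loss plus an information-bottleneck-style regularizer" objective. The function names (edl_nll, ib_edl_loss), the softplus evidence head, the KL-to-uniform-Dirichlet surrogate penalty, and the weight beta are all assumptions for illustration; the paper's exact IB-EDL objective is not reproduced here.

```python
# Illustrative sketch only. Assumptions (not from the paper): PyTorch, a
# classification head whose logits are mapped to non-negative evidence via
# softplus, and a KL-to-uniform-Dirichlet penalty standing in for the
# paper's information-bottleneck term.
import torch
import torch.nn.functional as F
from torch.distributions import Dirichlet, kl_divergence


def edl_nll(alpha: torch.Tensor, targets: torch.Tensor) -> torch.Tensor:
    """Expected cross-entropy under Dir(alpha): digamma(S) - digamma(alpha_y)."""
    strength = alpha.sum(dim=-1, keepdim=True)  # Dirichlet strength S
    y = F.one_hot(targets, num_classes=alpha.shape[-1]).to(alpha.dtype)
    return (y * (torch.digamma(strength) - torch.digamma(alpha))).sum(-1).mean()


def ib_edl_loss(logits: torch.Tensor, targets: torch.Tensor,
                beta: float = 1e-2) -> torch.Tensor:
    """EDL loss plus a regularizer that discourages overly concentrated
    (high-evidence) Dirichlet outputs, mirroring the abstract's goal of
    suppressing spurious evidence."""
    evidence = F.softplus(logits)  # non-negative evidence per class
    alpha = evidence + 1.0         # Dirichlet concentration parameters
    nll = edl_nll(alpha, targets)
    # Surrogate bottleneck penalty: pull Dir(alpha) toward the uniform prior
    # Dir(1, ..., 1) so only strongly predictive evidence survives training.
    kl = kl_divergence(Dirichlet(alpha), Dirichlet(torch.ones_like(alpha))).mean()
    return nll + beta * kl
```

At inference, a single forward pass yields alpha = softplus(logits) + 1: the predictive distribution is alpha / alpha.sum(-1, keepdim=True), and a low Dirichlet strength alpha.sum(-1) signals high uncertainty, which is the one-pass uncertainty estimation the abstract credits to EDL.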



ICLR 2025

13th International Conference on Learning Representations. Singapore, Apr 24-28, 2025.
A* Conference

Authors

Y. Li, D. Rügamer, B. Bischl, M. Rezaei


Research Area

A1 | Statistical Foundations & Explainability

BibTeX Key: LRB+25
