
04.12.2025


When to Say "I’m Not Sure": Making Language Models More Self-Aware

MCML Research Insight - With Yawei Li, David Rügamer, Bernd Bischl, and Mina Rezaei

Large language models like ChatGPT or Gemini are now everywhere, from summarizing text to writing code or answering simple questions. But there’s one thing they still struggle with: admitting uncertainty. Ask a fine-tuned LLM a tricky question, and it might sound quite confident, even when it’s completely wrong. This "overconfidence" can be risky in areas such as healthcare, finance, and science, where trust and uncertainty matter. As these tools become daily information sources, many people treat them as unquestionable authorities, skipping fact-checking or deeper research. The danger isn’t just that AI might be wrong now, but that its confident mistakes can spread, become reinforced, and grow, teaching future models what "truth" sounds like.

«...fine-tuned LLMs often display overconfidence in their predictions, which compromises their reliability and limits their applicability in critical domains where trustworthiness is essential.»


Yawei Li et al.

MCML Junior Members

In their ICLR 2025 paper “Calibrating LLMs with Information-Theoretic Evidential Deep Learning”, MCML Junior Members Yawei Li and Mina Rezaei, together with MCML PIs David Rügamer and Bernd Bischl, offer a solution: using statistical methods, they help LLMs stay honest about what they know, what they don’t, and when to admit it.


How it Works

Normally, to check how confident an AI model is, you have to run it many times and compare the answers, a bit like asking ten friends the same question and seeing if they agree. That’s slow and expensive. This paper uses a more efficient approach called Evidential Deep Learning (EDL), which lets a model estimate its own confidence in a single, quick run. Along with an answer, the model gives a number showing how strong its “evidence” is (see the sketch after this list):

  • More evidence means higher confidence
  • Less evidence means more uncertainty
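
To make this concrete, here is a minimal sketch of the single-pass idea behind Evidential Deep Learning, written in PyTorch with hypothetical logits: the model’s raw scores are turned into non-negative “evidence”, which parameterizes a Dirichlet distribution whose total strength directly yields a confidence and uncertainty estimate, with no repeated sampling. It follows the common EDL recipe rather than the exact parameterization used in the paper.

  import torch
  import torch.nn.functional as F

  def edl_confidence(logits: torch.Tensor):
      # Generic EDL sketch (not the paper's exact parameterization):
      # map raw logits to non-negative evidence, then to Dirichlet parameters.
      evidence = F.softplus(logits)                     # evidence per answer option
      alpha = evidence + 1.0                            # Dirichlet concentration parameters
      strength = alpha.sum(dim=-1, keepdim=True)        # total evidence S
      probs = alpha / strength                          # expected answer probabilities
      num_classes = logits.shape[-1]
      uncertainty = num_classes / strength.squeeze(-1)  # K / S: large when evidence is scarce
      return probs, uncertainty

  # A single forward pass yields both an answer and its uncertainty.
  logits = torch.tensor([[2.0, 0.1, -1.0, 0.3]])        # hypothetical scores for 4 answer options
  probs, u = edl_confidence(logits)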

However, when fine-tuned on small datasets, models can still fake confidence, producing too much evidence even when they shouldn’t. That’s where the Information Bottleneck (IB) comes in.


The “Information Bottleneck” Idea

Imagine the model’s brain as a pipeline: data goes in, predictions come out. Along the way, it gathers a lot of information, some of it useful, some of it noisy or misleading.

The bottleneck acts like a filter, squeezing out unnecessary details and keeping only what truly helps the model decide. For instance, when answering a science question, a model might guess “gravity” just because the word “space” appears. The bottleneck stops this by focusing only on genuinely relevant clues.

In the combined IB-EDL approach, this filter also penalizes the model whenever it tries to inflate its confidence using irrelevant information, helping it stay accurate and humble about what it knows.

The expected result: predictions that are still strong, but realistically cautious.
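
Below is a minimal sketch of how such an evidence penalty can look in code, assuming a standard evidential data-fit term plus a regularizer that shrinks evidence not needed to explain the label. It illustrates the general recipe described above; the actual IB-EDL objective in the paper is derived from an information-theoretic bound and differs in its details.

  import torch
  import torch.nn.functional as F
  from torch.distributions import Dirichlet, kl_divergence

  def evidential_loss_with_penalty(logits, targets, beta=0.1):
      # Sketch: fit the correct answer, but pay a price for unjustified evidence.
      # Not the exact information-theoretic objective of the IB-EDL paper.
      evidence = F.softplus(logits)
      alpha = evidence + 1.0                        # Dirichlet parameters
      strength = alpha.sum(dim=-1, keepdim=True)

      # Data-fit term: expected cross-entropy under the Dirichlet (digamma form).
      one_hot = F.one_hot(targets, num_classes=logits.shape[-1]).float()
      fit = (one_hot * (torch.digamma(strength) - torch.digamma(alpha))).sum(-1)

      # Penalty term: KL to a flat Dirichlet, discouraging evidence the label does not justify.
      penalty = kl_divergence(Dirichlet(alpha), Dirichlet(torch.ones_like(alpha)))

      return (fit + beta * penalty).mean()

Here beta controls how strongly extra evidence is punished: larger values give more cautious, better-calibrated predictions, at the risk of becoming underconfident.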


«By improving the trustworthiness of LLMs, IB-EDL facilitates their broader adoption in domains requiring high levels of confidence calibration.»


Yawei Li et al.

MCML Junior Members

Results

The team tested IB-EDL on popular large language models like Llama-2, Llama-3, and Mistral-7B using commonsense reasoning and reading-comprehension tasks.

Compared to normal fine-tuning, IB-EDL:

  • Reduced overconfidence by about 70%, making the models much better at knowing when to be unsure (a standard way to quantify this is sketched after this list).
  • Kept accuracy just as high, or even slightly better.
  • Got better at spotting unfamiliar questions, recognizing when something was outside their training experience.
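
For reference, overconfidence is commonly quantified with the expected calibration error (ECE): predictions are grouped into confidence bins, and the gap between average confidence and actual accuracy is averaged across bins. The sketch below shows this standard metric for illustration; it is not necessarily the paper’s exact evaluation protocol.

  import numpy as np

  def expected_calibration_error(confidences, correct, n_bins=10):
      # Standard ECE: bin predictions by confidence and average the
      # |accuracy - confidence| gap, weighted by how full each bin is.
      confidences = np.asarray(confidences, dtype=float)
      correct = np.asarray(correct, dtype=float)
      edges = np.linspace(0.0, 1.0, n_bins + 1)
      ece = 0.0
      for lo, hi in zip(edges[:-1], edges[1:]):
          mask = (confidences > lo) & (confidences <= hi)
          if mask.any():
              gap = abs(correct[mask].mean() - confidences[mask].mean())
              ece += mask.mean() * gap
      return ece

A perfectly calibrated model that reports 80% confidence is right about 80% of the time; the larger the ECE, the further its stated confidence drifts from reality.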

Even when the researchers intentionally added mistakes to some of the training data, the models trained with IB-EDL remained stable and reliable, showing that this method helps them stay grounded even when there is noise in the data.


Why it Matters

Being smart isn’t enough for AI: it must also recognize when it is unsure and let us know. IB-EDL gives these large models a built-in sense of uncertainty, helping them communicate confidence realistically.

That makes them safer, fairer, and far more trustworthy, which is a key step toward truly honest AI.


Challenges

While this method works well for question-answering and classification, measuring uncertainty in open-ended text generation, like creative writing, remains tricky. Future work aims to extend this approach so models can also stay honest when generating longer, freer responses.


Further Reading & Reference

If you’d like to learn more about how the team measured and reduced overconfidence in large language models, or explore the method in your own experiments, check out the full paper, which appeared at ICLR 2025, one of the world’s leading machine learning conferences.

A* Conference
Y. Li, D. Rügamer, B. Bischl and M. Rezaei.
Calibrating LLMs with Information-Theoretic Evidential Deep Learning.
ICLR 2025 - 13th International Conference on Learning Representations. Singapore, Apr 24-28, 2025.
Paper on OpenReview | Code & Data on GitHub

