
16.04.2026


Do Language Models Reason Like Humans?

MCML Research Insight – With Philipp Mondorf and Barbara Plank

Imagine reading the sentence: “If it rains, the streets will be wet.” Most people would consider it perfectly reasonable. Now consider a different statement: “If the moon is made of cheese, the streets will be wet.” Even if the outcome might sometimes happen, the connection suddenly feels strange. Humans constantly evaluate such “if–then” statements, using both probability and meaning to judge whether they make sense.

 Key Insight


LLMs consider both probability and meaning when judging “if–then” statements, but their reasoning does not always align with human judgments.


In their paper “If Probable, Then Acceptable? Understanding Conditional Acceptability Judgments in Large Language Models,” MCML Junior Member Philipp Mondorf and MCML PI Barbara Plank, together with collaborator Jasmin Orth, investigate how large language models (LLMs) evaluate the plausibility of conditional statements, and whether their judgments resemble those of humans.


 Key Question


Do language models judge the acceptability of “if–then” statements using the same signals as humans?


The Idea

Conditional statements of the form “If A, then B” play a central role in communication and reasoning. People use them to make predictions, evaluate arguments, and reason about hypothetical situations. When humans judge whether such statements are acceptable, two factors are particularly important: the conditional probability of the outcome given the premise, and the semantic relevance between the two parts of the statement—whether the premise meaningfully supports the conclusion.

As LLMs become widely used in applications that generate explanations, arguments, or recommendations, it becomes important to understand how they interpret such logical structures. Do they evaluate conditional statements using the same cues that humans rely on?


Figure 1: Illustration of two conditionals with equally high conditional probabilities but differing evidential relevance. While both are probable, only the left one encodes a plausible causal or evidential link between antecedent and consequent, and therefore appears more acceptable.


 Core Idea


Compare LLM judgments with human reasoning by analyzing how probability and semantic relevance influence the perceived acceptability of conditional statements.


Research Approach

To explore this question, the researchers conducted a systematic study of how different large language models evaluate conditional statements. They tested multiple model families, model sizes, and prompting strategies, presenting models with a range of conditional statements and asking them to judge how acceptable they were.

The analysis focused on two key factors known to influence human reasoning: conditional probability and semantic relevance. Using statistical techniques such as linear mixed-effects models and ANOVA, the researchers examined how strongly these factors influenced the models’ judgments. The results were then compared with human judgment data from earlier cognitive studies.
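The core of such an analysis is a regression of acceptability ratings on the two cues. The following is a deliberately simplified fixed-effects sketch on synthetic data (ordinary least squares rather than the paper's linear mixed-effects models, and invented effect sizes), just to show how the influence of each cue can be estimated from ratings.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200

# Synthetic predictors: conditional probability P(B|A) and a relevance score.
prob = rng.uniform(0, 1, n)
relevance = rng.uniform(-1, 1, n)

# Toy "acceptability" ratings driven by both cues plus noise
# (the coefficients 1.5 and 0.8 are arbitrary, for illustration only).
accept = 1.5 * prob + 0.8 * relevance + rng.normal(0, 0.1, n)

# Design matrix with an intercept column; fit by ordinary least squares.
X = np.column_stack([np.ones(n), prob, relevance])
coef, *_ = np.linalg.lstsq(X, accept, rcond=None)
intercept, w_prob, w_rel = coef  # estimated influence of each cue
```

A mixed-effects model additionally includes random intercepts or slopes per item or per model, which accounts for repeated measurements; the fixed-effects part estimated here is the quantity being compared against human data.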


 Key Finding


Language models incorporate probabilistic and semantic cues when evaluating conditionals, but their reasoning patterns still differ from human judgments.


Key Findings

The study reveals several interesting patterns in how language models evaluate conditional statements:

  • Models respond to probability. Statements where the conclusion is likely given the premise tend to be rated as more acceptable.
  • Semantic relevance matters as well. When the premise meaningfully supports the conclusion, models judge the statement as more plausible.
  • Alignment with humans is incomplete. While LLMs rely on similar cues, they apply them less consistently than humans do.
  • Larger models are not necessarily more human-like. Increasing model size does not automatically lead to judgments that better match human reasoning.

 Takeaway


Studying how AI evaluates “if–then” statements helps reveal both the strengths and limitations of machine reasoning.


Why This Matters

As language models are increasingly used to assist with reasoning, explanation, and decision support, understanding how they interpret logical statements becomes crucial. If AI systems evaluate conditional statements differently from humans, this could influence how they generate arguments, explain conclusions, or guide decision-making.

Research like this helps uncover where AI reasoning aligns with human cognition, and where important differences remain. By better understanding these differences, researchers can develop models that reason more transparently and communicate more effectively with human users.


Further Reading & Reference

If you would like to learn more about how language models judge the acceptability of conditional statements and how these judgments compare with human reasoning, you can explore the full paper. The work will be presented at EACL 2026 (Conference of the European Chapter of the Association for Computational Linguistics), one of the leading international conferences in natural language processing research.

J. Orth, P. Mondorf and B. Plank.
If Probable, Then Acceptable? Understanding Conditional Acceptability Judgments in Large Language Models.
EACL 2026 - 19th Conference of the European Chapter of the Association for Computational Linguistics. Rabat, Morocco, Mar 24-29, 2026.

Share Your Research!


Get in touch with us!

Are you an MCML Junior Member and interested in showcasing your research on our blog?

We’re happy to feature your work. Get in touch with us to present your paper.

#blog #research #plank
