
Testing Spatial Intuitions of Humans and Large Language and Multimodal Models in Analogies

Abstract

Large Language Models (LLMs) and Vision-Language Models (VLMs) exhibit impressive language capabilities akin to human reasoning. However, unlike humans, who acquire language through embodied, interactive experience, these models learn from static datasets without real-world interaction. This difference raises questions about how they conceptualize abstract notions and whether their reasoning aligns with human cognition. We investigate the spatial conceptualizations of LLMs and VLMs by conducting analogy prompting studies with LLMs, VLMs, and human participants, assessing their ability to generate and interpret analogies for spatial concepts. We quantitatively compare the analogies produced by each group, examining the impact of multimodal inputs and reasoning mechanisms. Our findings indicate that generative models can produce and interpret analogies but differ significantly from human reasoning in their abstraction of spatial concepts; this variability is influenced by input modality, model size, and prompting method, and analogy-based prompts do not consistently enhance alignment. Our contributions include a methodology for probing generative models through analogies, a comparative analysis of analogical reasoning among models and humans, and insights into the effect of multimodal inputs on reasoning.

inproceedings


Analogy-Angle II @ACL 2025

2nd Workshop on Analogical Abstraction in Cognition, Perception, and Language at the 63rd Annual Meeting of the Association for Computational Linguistics. Vienna, Austria, Jul 27-Aug 01, 2025.

Authors

I. Bueno • A. Bavaresco • J. M. Cunha • P. Wicke

Research Areas

 B2 | Natural Language Processing

 B3 | Multimodal Perception

BibTeXKey: BBC+25