Author:
Ando Risako, Ozeki Kentaro, Morishita Takanobu, Abe Hirohiko, Mineshima Koji, Okada Mitsuhiro
Abstract
In recent years, research on large language models (LLMs) has advanced rapidly, making the evaluation of their reasoning abilities a crucial issue. Within cognitive science, there has been extensive research on human reasoning biases. It is widely observed that humans often use graphical representations as auxiliary tools during inference to avoid such biases. However, the evaluation of LLMs’ reasoning abilities has so far focused largely on linguistic inference, with insufficient attention given to inference using diagrams. In this study, we concentrate on syllogisms, a basic form of logical reasoning, and evaluate the reasoning abilities of LLMs supplemented by Euler diagrams. We systematically investigate how accurately LLMs can perform logical reasoning when given diagrams as auxiliary input and whether they exhibit reasoning biases similar to those of humans. Our findings indicate that, overall, providing diagrams as auxiliary input tends to improve models’ performance, including on problems that elicit reasoning biases, but the effect varies with the conditions, and the improvement in accuracy is smaller than that seen in humans. We present results from experiments conducted under multiple conditions, including a Chain-of-Thought setting, to highlight where there is room to improve the logical diagrammatic reasoning abilities of LLMs.
Publisher
Springer Nature Switzerland