Thought flow nets: From single predictions to trains of model thought

Published:2024-09-06 Page:1-32
ISSN:2977-0424
Container-title:Natural Language Processing
language:en
Short-container-title:Nat. lang. processing

Author:

Schuff Hendrik, Adel Heike, Vu Ngoc Thang

Abstract

When humans solve complex problems, they typically construct, reflect on, and revise sequences of ideas, hypotheses, and beliefs until a final decision or conclusion is reached. Contrary to this, current machine learning models are mostly trained to map an input to one single and fixed output. In this paper, we investigate how we can equip models with the ability to represent, construct, and evaluate a second, third, and $k$-th thought within their prediction process. Drawing inspiration from Hegel’s dialectics, we propose and evaluate the thought flow concept, which constructs a sequence of predictions. We present a self-correction mechanism which (a) is trained to estimate the model’s correctness and which (b) performs iterative prediction updates based on the gradient of the correctness prediction. We introduce our method focusing initially on question answering (QA) and carry out extensive experiments which demonstrate that (i) our method is able to correct its own predictions and that (ii) it can improve model performance by a large margin. In addition, we conduct a qualitative analysis of thought flow correction patterns and explore how thought flow predictions affect users’ human-AI collaboration in a crowdsourcing study. We find that (iii) thought flows improve user performance and are perceived as more natural, correct, and intelligent compared with single and/or top-3 predictions.
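The self-correction idea in the abstract (estimate correctness, then update the prediction along the gradient of that estimate) can be illustrated with a toy sketch. This is not the authors' implementation: the linear-plus-sigmoid correctness estimator, its `weights` and `bias`, the step size, and all function names are hypothetical, chosen only to keep the example self-contained. In the paper the estimator is trained; here it is fixed by hand.

```python
import math

def softmax(logits):
    # Numerically stable softmax over a list of floats.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]

def estimate_correctness(logits, weights, bias):
    # Hypothetical correctness estimator: sigmoid(weights . softmax(logits) + bias).
    probs = softmax(logits)
    score = sum(w * p for w, p in zip(weights, probs)) + bias
    return 1.0 / (1.0 + math.exp(-score))

def correctness_gradient(logits, weights, bias):
    # Analytic gradient of the estimator with respect to the logits:
    # d/dz_j = c * (1 - c) * p_j * (w_j - sum_i w_i * p_i)
    probs = softmax(logits)
    mean_w = sum(w * p for w, p in zip(weights, probs))
    c = estimate_correctness(logits, weights, bias)
    scale = c * (1.0 - c)
    return [scale * p * (w - mean_w) for w, p in zip(weights, probs)]

def thought_flow(logits, weights, bias, step_size=5.0, num_steps=10):
    # Gradient ascent on the estimated correctness; every intermediate
    # logit vector is kept, giving a sequence of predictions.
    current = list(logits)
    flow = [list(current)]
    for _ in range(num_steps):
        grad = correctness_gradient(current, weights, bias)
        current = [z + step_size * g for z, g in zip(current, grad)]
        flow.append(list(current))
    return flow

def argmax(values):
    return max(range(len(values)), key=lambda i: values[i])

# Toy setup (assumed values): the initial prediction slightly favors class 0,
# while the hand-set estimator rates probability mass on class 1 as more
# likely to be correct.
initial_logits = [1.0, 0.8, -0.5]
weights = [-2.0, 3.0, 0.0]
bias = 0.0

flow = thought_flow(initial_logits, weights, bias)
predictions = [argmax(step) for step in flow]
scores = [estimate_correctness(step, weights, bias) for step in flow]
print(predictions)
print([round(s, 3) for s in scores])
```

In this toy setting the first prediction is class 0 and later steps switch to class 1 as the estimated correctness rises, which mirrors the notion of a second, third, and $k$-th thought.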

Publisher

Cambridge University Press (CUP)

Reference: 59 articles.

1. Banino, A., Balaguer, J. and Blundell, C. (2021). PonderNet: Learning to ponder. CoRR, abs/2107.05407.

2. Gal, Y. and Ghahramani, Z. (2016). Dropout as a Bayesian approximation: Representing model uncertainty in deep learning. In Proceedings of the 33rd International Conference on Machine Learning, ICML 2016, June 19-24, 2016, New York City, NY, USA, vol. 48, pp. 1050–1059. JMLR Workshop and Conference Proceedings, JMLR.org.

3. Guo, C., Pleiss, G., Sun, Y. and Weinberger, K.Q. (2017). On calibration of modern neural networks. In Proceedings of the 34th International Conference on Machine Learning, ICML 2017, 6-11 August 2017, Sydney, NSW, Australia, vol. 70, pp. 1321–1330, Proceedings of Machine Learning Research, PMLR.

4. An iterative prediction and correction method for automatic stereocomparison