Abstract
The transformers that drive chatbots and other AI systems constitute large language models (LLMs). These are currently the focus of a lively discussion in both the scientific literature and the popular media. This discussion ranges from hyperbolic claims that attribute general intelligence and sentience to LLMs, to the skeptical view that these devices are no more than “stochastic parrots”. I present an overview of some of the weak arguments that have been made against LLMs, and I consider several of the more compelling criticisms of these devices. The former significantly underestimate the capacity of transformers to achieve subtle inductive inferences required for high levels of performance on complex, cognitively significant tasks. In some instances, these arguments misconstrue the nature of deep learning. The latter criticisms identify significant limitations in the way in which transformers learn and represent patterns in data. They also point out important differences between the procedures through which deep neural networks and humans acquire knowledge of natural language. It is necessary to look carefully at both sets of arguments in order to achieve a balanced assessment of the potential and the limitations of LLMs.
Publisher
Springer Science and Business Media LLC
Subject
Linguistics and Language, Philosophy, Computer Science (miscellaneous)
References
48 articles.
Cited by
6 articles.