Abstract
Departing from traditional linguistic models, advances in deep learning have resulted in a new family of predictive (autoregressive) deep language models (DLMs). Using a self-supervised next-word prediction task, these models are trained to generate appropriate linguistic responses in a given context. We provide empirical evidence that the human brain and autoregressive DLMs share three fundamental computational principles as they process natural language: 1) both are engaged in continuous next-word prediction before word onset; 2) both match their pre-onset predictions to the incoming word to calculate post-onset surprise (i.e., prediction error signals); 3) both represent words as a function of the previous context. In support of these three principles, our findings indicate that: a) neural activity before word onset contains context-dependent predictive information about forthcoming words, even hundreds of milliseconds before the words are perceived; b) neural activity after word onset reflects the surprise level and prediction error; and c) contextual embeddings from autoregressive DLMs capture the neural representation of context-specific word meaning better than arbitrary or static semantic embeddings. Together, our findings suggest that autoregressive DLMs provide a novel and biologically feasible computational framework for studying the neural basis of language.
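The three quantities described above can be illustrated with a minimal sketch. The snippet below is not the authors' code; it assumes GPT-2 (one widely used autoregressive DLM) accessed through the HuggingFace `transformers` library, and the example sentence is purely illustrative. It computes a pre-onset next-word prediction, the post-onset surprise (negative log-probability of the word that actually occurs), and a contextual embedding of the final word given its preceding context.

```python
# Sketch of the three computational principles, assuming GPT-2 via HuggingFace transformers.
import torch
from transformers import GPT2LMHeadModel, GPT2TokenizerFast

tokenizer = GPT2TokenizerFast.from_pretrained("gpt2")
model = GPT2LMHeadModel.from_pretrained("gpt2", output_hidden_states=True)
model.eval()

context = "The quick brown fox jumps over the lazy"   # hypothetical context
actual_next_word = " dog"                             # the word actually perceived at onset

with torch.no_grad():
    ids = tokenizer(context, return_tensors="pt").input_ids
    out = model(ids)

# 1) Pre-onset prediction: distribution over the next token given the context.
next_token_logits = out.logits[0, -1]
probs = torch.softmax(next_token_logits, dim=-1)
predicted_id = int(probs.argmax())
print("predicted next word:", tokenizer.decode([predicted_id]))

# 2) Post-onset surprise: negative log-probability of the word that actually occurred.
actual_id = tokenizer(actual_next_word).input_ids[0]
surprise = -torch.log(probs[actual_id])
print(f"surprise (nats): {surprise.item():.2f}")

# 3) Contextual embedding: the final hidden state for the last word is a function of the
#    entire preceding context, unlike a static (context-free) word embedding.
contextual_embedding = out.hidden_states[-1][0, -1]   # shape: (768,) for gpt2
print("contextual embedding dim:", contextual_embedding.shape[0])
```

In an analysis of this kind, the predicted distribution would be compared with pre-onset neural activity, the surprise value with post-onset activity, and the contextual embedding with the neural representation of the word itself; the specific alignment procedure is described in the paper, not in this sketch.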
Publisher
Cold Spring Harbor Laboratory