Strong Prediction: Language Model Surprisal Explains Multiple N400 Effects

Published: 2024-01-04
Page: 1-29
ISSN: 2641-4368
Container title: Neurobiology of Language
Language: en

Author:

Michaelov, James A.¹; Bardolph, Megan D.¹; Van Petten, Cyma K.²; Bergen, Benjamin K.¹; Coulson, Seana¹

Affiliation:

1. Department of Cognitive Science, University of California, San Diego, La Jolla, CA, USA

2. Department of Psychology, Binghamton University, State University of New York, Binghamton, NY, USA

Abstract

Theoretical accounts of the N400 are divided as to whether the amplitude of the N400 response to a stimulus reflects the extent to which the stimulus was predicted, the extent to which the stimulus is semantically similar to its preceding context, or both. We use state-of-the-art machine learning tools to investigate which of these three accounts is best supported by the evidence. GPT-3, a neural language model trained to compute the conditional probability of any word based on the words that precede it, was used to operationalize contextual predictability. In particular, we used an information-theoretic construct known as surprisal (the negative logarithm of the conditional probability). Contextual semantic similarity was operationalized by using two high-quality co-occurrence-derived vector-based meaning representations for words: GloVe and fastText. The cosine between the vector representation of the sentence frame and final word was used to derive contextual cosine similarity estimates. A series of regression models were constructed, where these variables, along with cloze probability and plausibility ratings, were used to predict single trial N400 amplitudes recorded from healthy adults as they read sentences whose final word varied in its predictability, plausibility, and semantic relationship to the likeliest sentence completion. Statistical model comparison indicated GPT-3 surprisal provided the best account of N400 amplitude and suggested that apparently disparate N400 effects of expectancy, plausibility, and contextual semantic similarity can be reduced to variation in the predictability of words. The results are argued to support predictive coding in the human language network.
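The two predictors named in the abstract can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the probabilities and three-dimensional vectors are made-up toy values (not GPT-3, GloVe, or fastText output), the base-2 logarithm is an assumption because the abstract does not state a base, and representing the sentence frame as the mean of its word vectors is likewise an assumption. The function names are hypothetical.

```python
import math


def surprisal_bits(prob):
    """Surprisal = -log2(p). Base 2 is an assumption; the abstract gives no base."""
    if not 0.0 < prob <= 1.0:
        raise ValueError("prob must be in (0, 1]")
    return -math.log2(prob)


def mean_vector(vectors):
    """Average equal-length vectors component-wise (assumed frame representation)."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def cosine(u, v):
    """Cosine of the angle between two non-zero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


# Toy conditional probabilities for two candidate final words (invented values).
expected_word_prob = 0.5
unexpected_word_prob = 0.001

# Toy word vectors for a three-word sentence frame and one final word.
frame_words = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [1.0, 1.0, 0.0]]
final_word = [1.0, 1.0, 0.0]

frame_vector = mean_vector(frame_words)

print("surprisal (expected):  ", surprisal_bits(expected_word_prob))
print("surprisal (unexpected):", surprisal_bits(unexpected_word_prob))
print("contextual cosine:     ", cosine(frame_vector, final_word))
```

A less probable word receives a higher surprisal, while cosine similarity depends only on the direction of the vectors, so the two measures can dissociate for the same final word.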

Funder

Center for Academic Research and Training in Anthropogeny

Publisher

MIT Press

Subject

Neurology, Linguistics and Language

Link

https://direct.mit.edu/nol/article-pdf/doi/10.1162/nol_a_00105/2203041/nol_a_00105.pdf

Cited by 11 articles.

1. Demystifying large language models in second language development research; Computer Speech & Language; 2025-01

2. Clinical efficacy of pre-trained large language models through the lens of aphasia; Scientific Reports; 2024-07-06

3. A Psycholinguistics-inspired Method to Counter IP Theft Using Fake Documents; ACM Transactions on Management Information Systems; 2024-06-12

4. Driving and suppressing the human language network using large language models; Nature Human Behaviour; 2024-01-03

5. On the Mathematical Relationship Between Contextual Probability and N400 Amplitude; Open Mind; 2024