Abstract
Accumulated evidence suggests that Large Language Models (LLMs) are useful for predicting neural signals related to narrative processing. However, the way LLMs integrate context over long timescales is fundamentally different from the way the brain does. In this study, we show that, unlike LLMs, which process large contextual windows in parallel, the context available to the brain is limited to short windows of a few tens of words. We hypothesize that whereas lower-level brain areas process short contextual windows, higher-order areas in the default-mode network (DMN) rely on an online incremental mechanism in which the incoming short context is summarized and integrated with information accumulated over long timescales. Accordingly, we introduce a novel LLM that, instead of processing the entire context at once, incrementally generates a concise summary of past information. As predicted, we found that neural activity in the DMN was better predicted by the incremental model, whereas lower-level areas were better predicted by a short-context-window LLM.
Publisher
Cold Spring Harbor Laboratory
References (36 articles)
1. Pereira, F. et al. Toward a universal decoder of linguistic meaning from brain activation. Nat. Commun. 9, 963 (2018).
2. Schwartz, D., Toneva, M. & Wehbe, L. Inducing brain-relevant bias in natural language processing models. Adv. Neural Inf. Process. Syst. 32, (2019).
3. Schrimpf, M. et al. Artificial neural networks accurately predict language processing in the brain. bioRxiv 2020.06.26.174482 (2020).
4. Caucheteux, C., Gramfort, A. & King, J.-R. Model-based analysis of brain activity reveals the hierarchy of language in 305 subjects. in EMNLP 2021 - Conference on Empirical Methods in Natural Language Processing (2021).
5. Caucheteux, C., Gramfort, A. & King, J.-R. Evidence of a predictive coding hierarchy in the human brain listening to speech. Nat. Hum. Behav. 7, 430–441 (2023).