The Natural Stories corpus: a reading-time corpus of English texts containing rare syntactic constructions
Published:2020-09-04
Issue:1
Volume:55
Page:63-77
ISSN:1574-020X
Container-title:Language Resources and Evaluation
language:en
Short-container-title:Lang Resources & Evaluation
Author:
Futrell Richard,Gibson Edward,Tily Harry J.,Blank Idan,Vishnevetsky Anastasia,Piantadosi Steven T.,Fedorenko Evelina
Abstract
It is now a common practice to compare models of human language processing by comparing how well they predict behavioral and neural measures of processing difficulty, such as reading times, on corpora of rich naturalistic linguistic materials. However, many of these corpora, which are based on naturally-occurring text, do not contain many of the low-frequency syntactic constructions that are often required to distinguish between processing theories. Here we describe a new corpus consisting of English texts edited to contain many low-frequency syntactic constructions while still sounding fluent to native speakers. The corpus is annotated with hand-corrected Penn Treebank-style parse trees and includes self-paced reading time data and aligned audio recordings. We give an overview of the content of the corpus, review recent work using the corpus, and release the data.
Funder
Division of Behavioral and Cognitive Sciences,National Institutes of Health,Division of Information and Intelligent Systems
Publisher
Springer Science and Business Media LLC
Subject
Library and Information Sciences,Linguistics and Language,Education,Language and Linguistics
Cited by
28 articles.