Author:
Räsänen Okko, Seshadri Shreyas, Lavechin Marvin, Cristia Alejandrina, Casillas Marisa
Abstract
Recordings captured by wearable microphones are a standard method for investigating young children’s language environments. A key measure to quantify from such data is the amount of speech present in children’s home environments. To this end, the LENA recorder and software, a popular system for measuring linguistic input, estimates the number of adult words that children may hear over the course of a recording. However, word count estimation is challenging to do in a language-independent manner; the relationship between observable acoustic patterns and language-specific lexical entities is far from uniform across human languages. In this paper, we ask whether some alternative linguistic units, namely phone(me)s or syllables, could be measured instead of, or in parallel with, words in order to achieve improved cross-linguistic applicability and comparability of an automated system for measuring child language input. We discuss the advantages and disadvantages of measuring different units from theoretical and technical points of view. We also investigate the practical applicability of measuring such units using a novel system called Automatic LInguistic unit Count Estimator (ALICE) together with audio from seven child-centered daylong audio corpora from diverse cultural and linguistic environments. We show that language-independent measurement of phoneme counts is somewhat more accurate than that of syllable or word counts, but all three are highly correlated with human annotations on the same data. We share an open-source implementation of ALICE for use by the language research community, enabling automatic phoneme, syllable, and word count estimation from child-centered audio recordings.
Funder
James S. McDonnell Foundation
Agence Nationale de la Recherche
Academy of Finland
Nederlandse Organisatie voor Wetenschappelijk Onderzoek
Publisher
Springer Science and Business Media LLC
Subject
General Psychology, Psychology (miscellaneous), Arts and Humanities (miscellaneous), Developmental and Educational Psychology, Experimental and Cognitive Psychology
References: 80 articles.
Cited by
23 articles.