The 385+ million word Corpus of Contemporary American English (1990–2008+)

Published:2009-06-10 Issue:2 Volume:14 Page:159-190
ISSN:1384-6655
Container-title:International Journal of Corpus Linguistics
language:en
Short-container-title:IJCL

Author:

Davies Mark¹

Affiliation:

1. Brigham Young University

Abstract

The Corpus of Contemporary American English (COCA), which was released online in early 2008, is the first large and diverse corpus of American English. In this paper, we first discuss the design of the corpus, which contains more than 385 million words from 1990–2008 (20 million words each year), balanced between spoken, fiction, popular magazines, newspapers, and academic journals. We also discuss the unique relational database architecture, which allows for a wide range of queries that are not available (or are quite difficult) with other architectures and interfaces. To conclude, we consider insights from the corpus on a number of cases of genre-based variation and recent linguistic variation, including an extended analysis of phrasal verbs in contemporary American English.

Publisher

John Benjamins Publishing Company

Subject

Linguistics and Language, Language and Linguistics

Link

http://www.jbe-platform.com/deliver/fulltext/ijcl.14.2.02dav.pdf

Cited by 357 articles.

1. Challenges and possibilities in compiling Aeronautical English corpora: The case of the Aerocorpus;Research Methods in Applied Linguistics;2024-12

2. Perceptual inference corrects function word errors in reading: Errors that are not noticed do not disrupt eye movements;Cognitive Psychology;2024-11

3. The red dress is cute: why subjective adjectives are more often predicative;Corpus Linguistics and Linguistic Theory;2024-09-09

4. Adversarial Machine Learning for Social Good: Reframing the Adversary as an Ally;IEEE Transactions on Artificial Intelligence;2024-09

5. Cultural framing of giftedness in recent US fictional texts;PLOS ONE;2024-08-29