Affiliation
1. University of California, Santa Barbara
Abstract
The most frequent statistics in corpus linguistics are frequencies of occurrence and frequencies of co-occurrence of two or more linguistic variables. However, such frequencies in isolation may sometimes be misleading since they do not take into consideration the degree of dispersion of the relevant linguistic variable. Many dispersion measures and adjusted frequency measures have been suggested but are neither widely known nor widely applied. Another unfortunate aspect of such measures is that many also come with a variety of problems. I pursue three objectives with this article. First, I want to raise awareness of this issue and make the available measures more widely known, so I present an overview of many measures of dispersion and adjusted frequencies. Second, I propose a conceptually simple alternative measure, DP, explain and exemplify it, and compare it to previously discussed measures. Third and most importantly, I urge corpus linguists to explore the notion of dispersion in more detail and outline a few proposals for which steps to take next.
Publisher
John Benjamins Publishing Company
Subject
Linguistics and Language, Language and Linguistics
Cited by
240 articles.