An algorithm to identify periods of establishment and obsolescence of linguistic items in a diachronic corpus
Authors
Cunha, Evandro L.T.P.; Wichmann, Søren
Abstract
When exploring diachronic corpora, it is often beneficial for linguists to pinpoint not only the first or last attestation dates of certain linguistic items, but also the moments at which they become more strongly established in the corpus or, conversely, the moments at which they become obsolete despite still being part of the language. In this paper, we propose an algorithm to assist in the identification of such periods based on the frequency of items in a corpus. Our simple and generalisable algorithm can be used to investigate any linguistic item in any corpus that is divided into time-frames. We also demonstrate the applicability of our method using lexical data from the Corpus of Historical American English (COHA), providing case studies on the statistics and characteristics of words that appear in or disappear from this corpus in different periods.
Publisher
Edinburgh University Press
Subject
Linguistics and Language, Language and Linguistics