Abstract
This paper reports on the construction of the Sydney Corpus of Television Dialogue (SydTV). SydTV comprises approximately 275,000 words of dialogue from sixty-six episodes of recent US American fictional television series. The paper first provides a brief overview of existing TV dialogue corpora and then outlines the basic corpus composition, the corpus design principles, and the processes of data collection and storage. SydTV is a small, specialised corpus designed to be representative of fictional US TV dialogue. TV dialogue is defined as the dialogue uttered by actors on screen as they perform characters in fictional TV series. The corpus is fairly balanced: it contains 116,295 words from drama genres and 158,779 words from comedy genres, and 135,887 words from ‘quality’ and 139,187 words from ‘mainstream’ TV series. It also includes a healthy mix of episode types in terms of textual time (pilot episodes, final episodes, and episodes occurring at the beginning, middle or end of a season). The corpus is available for educational (teaching and research) purposes through an online interface and has a companion website where frequency lists are provided.
Publisher
Edinburgh University Press
Subject
Linguistics and Language, Language and Linguistics
Cited by
14 articles.