Abstract
This article explores the challenges of corpus linguistic analysis of Shakespeare’s language, and of Early Modern English more generally, with a particular focus on possible solutions and the benefits they bring. It gives an account of work carried out within the Encyclopedia of Shakespeare’s Language Project (2016–2019) to develop the project’s data resources, specifically the Enhanced Shakespearean Corpus. Topics covered include the composition of the corpus and its subcomponents; the structure of the XML markup; the design of the extensive character metadata; and the word-level corpus annotation, including spelling regularisation, part-of-speech tagging, lemmatisation and semantic tagging. The challenges that arise from each of these undertakings are not exclusive to a corpus-based treatment of Shakespeare’s plays, but it is in the context of Shakespeare’s language that they are so severe as to seem almost insurmountable. The solutions developed for the Enhanced Shakespearean Corpus, which often combine automated manipulation with manual intervention and are always principled, offer a way through.