Author:
Rennick Stephanie,Roberts Seán
Abstract
This paper presents the Video Game Dialogue Corpus, the first large-scale, consistently coded, open source corpus of dialogue from video games. It contains over 6.2 million words of English dialogue from fifty games in the Role Playing Game (rpg) genre. This includes games produced between 1985 and 2020, rated for children, teenagers and adults, and in both ‘Western’ and ‘Japanese’ sub-genres. The corpus design is described, including custom data formats for representing branching dialogue. We demonstrate the use of the corpus by comparing the dialogue of female and male characters, where we find reflections of gendered language in other media as well as patterns that seem specific to video games. We provide the source code for a ‘self-inflating corpus’ – a pipeline that obtains the data then processes and parses it into a standard format. This makes the corpus available for teaching and research purposes, providing the first such resource for empirical analysis of video game dialogue.
Publisher
Edinburgh University Press
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献