Affiliation
1. University of Helsinki
Abstract
The availability of large digital archives has great potential for corpus linguistic research, but their use is not without problems. These problems can often be traced to fundamentally different ideas of what might constitute “good data” in Digital Humanities and in corpus linguistics, leading to different expectations regarding how the data is made available to researchers. This chapter discusses the specific challenges involved in using the British Library Newspapers database for corpus linguistics and considers potential solutions for them. It is argued that taking full advantage of the database requires a flexible approach that enables critical reflection on the digital materials and on how they have been collected, processed, and made available.
Publisher
John Benjamins Publishing Company