From Keyness to Distinctiveness – Triangulation and Evaluation in Computational Literary Studies

Author:

Schröter Julian1,Du Keli2,Dudar Julia2,Rok Cora2,Schöch Christof2

Affiliation:

1. University of Würzburg Institute of German Philology Würzburg Germany

2. Trier University Computational Linguistics and Digital Humanities Trier Germany

Abstract

Abstract There is a set of statistical measures developed mostly in corpus and computational linguistics and information retrieval, known as keyness measures, which are generally expected to detect textual features that account for differences between two texts or groups of texts. These measures are based on the frequency, distribution, or dispersion of words (or other features). Searching for relevant differences or similarities between two text groups is also an activity that is characteristic of traditional literary studies, whenever two authors, two periods in the work of one author, two historical periods or two literary genres are to be compared. Therefore, applying quantitative procedures in order to search for differences seems to be promising in the field of computational literary studies as it allows to analyze large corpora and to base historical hypotheses on differences between authors, genres and periods on larger empirical evidence. However, applying quantitative procedures in order to answer questions relevant to literary studies in many cases raises methodological problems, which have been discussed on a more general level in the context of integrating or triangulating quantitative and qualitative methods in mixed methods research of the social sciences. This paper aims to solve these methodological issues concretely for the concept of distinctiveness and thus to lay the methodological foundation permitting to operationalize quantitative procedures in order to use them not only as rough exploratory tools, but in a hermeneutically meaningful way for research in literary studies. Based on a structural definition of potential candidate measures for analyzing distinctiveness in the first section, we offer a systematic description of the issue of integrating quantitative procedures into a hermeneutically meaningful understanding of distinctiveness by distinguishing its epistemological from the methodological perspective. The second section develops a systematic strategy to solve the methodological side of this issue based on a critical reconstruction of the widespread non-integrative strategy in research on keyness measures that can be traced back to Rudolf Carnap’s model of explication. We demonstrate that it is, in the first instance, mandatory to gain a comprehensive qualitative understanding of the actual task. We show that Carnap’s model of explication suffers from a shortcoming that consists in ignoring the need for a systematic comparison of what he calls the explicatum and the explicandum. Only if there is a method of systematic comparison, the next task, namely that of evaluation can be addressed, which verifies whether the output of a quantitative procedure corresponds to the qualitative expectation that must be clarified in advance. We claim that evaluation is necessary for integrating quantitative procedures to a qualitative understanding of distinctiveness. Our reconstruction shows that both steps are usually skipped in empirical research on keyness measures that are the most important point of reference for the development of a measure of distinctiveness. Evaluation, which in turn requires thorough explication and conceptual clarification, needs to be employed to verify this relation. In the third section we offer a qualitative clarification of the concept of distinctiveness by spanning a three-dimensional conceptual space. This flexible framework takes into account that there is no single and proper concept of distinctiveness but rather a field of possible meanings depending on research interest, theoretical framework, and access to the perceptibility or salience of textual features. Therefore, we shall, instead of stipulating any narrow and strict definition, take into account that each of these aspects – interest, theoretical framework, and access to perceptibility – represents one dimension of the heuristic space of possible uses of the concept of distinctiveness. The fourth section discusses two possible strategies of operationalization and evaluation that we consider to be complementary to the previously provided clarification, and that complete the task of establishing a candidate measure successfully as a measure of distinctiveness in a qualitatively ambitious sense. We demonstrate that two different general strategies are worth considering, depending on the respective notion of distinctiveness and the interest as elaborated in the third section. If the interest is merely taxonomic, classification tasks based on multi-class supervised machine learning are sufficient. If the interest is aesthetic, more complex and intricate evaluation strategies are required, which have to rely on a thorough conceptual clarification of the concept of distinctiveness, in particular on the idea of salience or perceptibility. The challenge here is to correlate perceivable complex features of texts such as plot, theme (aboutness), style, form, or roles and constellation of fictional characters with the unperceived frequency and distribution of word features that are calculated by candidate measures of distinctiveness. Existing research did not clarify, so far, how to correlate such complex features with individual word features. The paper concludes with a general reflection on the possibility of mixed methods research for computational literary studies in terms of explanatory power and exploratory use. As our strategy of combining explication and evaluation shows, integration should be understood as a strategy of combining two different perspectives on the object area: in our evaluation scenarios, that of empirical reader response and that of a specific quantitative procedure. This does not imply that measures of distinctiveness, which proved to reach explanatory power in one qualitative aspect, should be supposed to be successful in all fields of research. As long as evaluation is omitted, candidate measures of distinctiveness lack explanatory power and are limited to exploratory use. In contrast with a skepticism that has sometimes been expressed from literary scholars with regard to the relevance of computational literary studies on proper issues of the humanities, we believe that integrating computational methods into hermeneutic literary studies can be achieved in a way that reaches higher explanatory power than the usual exploratory use of keyness measures, but it can only be achieved individually for concrete tasks and not once and for all based on a general theoretical demonstration.

Publisher

Walter de Gruyter GmbH

Subject

Pharmaceutical Science

Cited by 1 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3