Affiliation:
1. South Westphalia University of Applied Sciences, Hagen, Germany
Abstract
The large and constantly growing amounts of available text documents hold great potential for the exploration of knowledge. However, in light of the vast quantity and variety of available documents, one fact should not be forgotten: the results of knowledge discovery in texts are only as good as the underlying document collection. Analysts therefore have to ensure that document collections adequately represent the specific area under examination, thereby minimising bias and maximising the generalisability of the knowledge brought to light. Surprisingly, knowledge management research has paid barely any attention to the problems of such document quality assessment and rigorous document selection. This paper addresses that research gap and makes two contributions. In the first step, building on a cross-disciplinary exchange with social research, a framework for the quality assessment and collection of documents is developed. This artefact provides concrete guidance for compiling suitable, high-quality document collections and contributes to ensuring “document collection quality” within the context of knowledge discovery in texts. In the second step, the framework is evaluated in a practical demonstration, which also exemplifies how different document collections influence the results of knowledge discovery.
Publisher
World Scientific Pub Co Pte Lt
Subject
Library and Information Sciences, Computer Networks and Communications, Computer Science Applications
Cited by
4 articles.