Veracity Roadmap: Is Big Data Objective, Truthful and Credible?

Published: 2014-01-09 Issue: 1 Volume: 24 Page: 4
ISSN: 2324-9773
Container-title: Advances in Classification Research Online
Short-container-title: ACRO

Author:

Lukoianova Tatiana, Rubin Victoria L.

Abstract

This paper argues that big data can possess different characteristics, which affect its quality. Depending on its origin, data processing technologies, and methodologies used for data collection and scientific discoveries, big data can have biases, ambiguities, and inaccuracies, which need to be identified and accounted for to reduce inference errors and improve the accuracy of generated insights. Big data veracity is now being recognized as a necessary property for its utilization, complementing the three previously established quality dimensions (volume, variety, and velocity), but there has been little discussion of the concept of veracity thus far. This paper provides a roadmap for theoretical and empirical definitions of veracity along with its practical implications. We explore veracity across three main dimensions: 1) objectivity/subjectivity, 2) truthfulness/deception, and 3) credibility/implausibility, and propose to operationalize each of these dimensions with either existing computational tools or potential ones, relevant particularly to textual data analytics. We combine the measures of veracity dimensions into one composite index, the big data veracity index. This newly developed veracity index provides a useful way of assessing systematic variations in big data quality across datasets with textual information. The paper contributes to big data research by categorizing the range of existing tools to measure the suggested dimensions, and to Library and Information Science (LIS) by proposing to account for the heterogeneity of diverse big data and to identify the information quality dimensions important for each big data type.

Publisher

University of Washington Libraries

Cited by 109 articles.

1. Estudio comparativo de la capacidad de aprendizaje de ChatGPT en la resolución de preguntas de especialización médica;Open Respiratory Archives;2024-10

2. Assessing veracity of big data: An in-depth evaluation process from the comparison of Mobile phone traces and groundtruth data in traffic monitoring;Journal of Transport Geography;2024-06

3. Exploring the landscape of big data applications in librarianship: a bibliometric analysis of research trends and patterns;Library Hi Tech;2024-03-26

4. The power and potentials of Flexible Query Answering Systems: A critical and comprehensive analysis;Data & Knowledge Engineering;2024-01

5. Background and Technologies;Synthetic Data;2024