How to Inspect and Measure Data Quality about Scientific Publications: Use Case of Wikipedia and CRIS Databases

Published:2020-04-26 Issue:5 Volume:13 Page:107
ISSN:1999-4893
Container-title:Algorithms
language:en
Short-container-title:Algorithms

Author:

Azeroual Otmane, Lewoniewski Włodzimierz

Abstract

Quality assurance of publication data in collaborative knowledge bases and in current research information systems (CRIS) is becoming increasingly relevant as freely available spatial information is used in different application scenarios. When integrating such data into CRIS, it is necessary to be able to recognize and assess its quality. Only then is it possible to compile from the available data a result that fulfills its purpose for the user, namely to deliver reliable data and information. This paper discusses the quality problems of source metadata in Wikipedia and CRIS. Based on real data from over 40 million Wikipedia articles in various languages, we performed a preliminary quality analysis of the metadata of scientific publications using a data quality tool. So far, no data quality measurements have been programmed in Python to assess the quality of metadata from scientific publications in Wikipedia and CRIS. With this in mind, we implemented the methods and algorithms as code, but present them in this paper as pseudocode, to measure quality along objective data quality dimensions such as completeness, correctness, consistency, and timeliness. This was prepared as a macro service so that users can apply the measurement results, together with the program code, to assess their scientific publication metadata, and so that management can rely on high-quality data when making decisions.
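As an illustration of one of the dimensions named in the abstract, completeness can be read as the share of required metadata fields that are actually filled in. The sketch below is not the authors' code: the field list `REQUIRED_FIELDS`, the helper `is_filled`, and the record layout (one dictionary per publication) are assumptions made for this example only.

```python
from typing import Iterable, Mapping, Sequence

# Hypothetical set of required fields; the paper's own field list is not given here.
REQUIRED_FIELDS = ("title", "authors", "year", "doi")


def is_filled(value: object) -> bool:
    """Treat None, blank strings, and empty containers as missing values."""
    if value is None:
        return False
    if isinstance(value, str):
        return value.strip() != ""
    if isinstance(value, (list, tuple, set, dict)):
        return len(value) > 0
    return True


def completeness(
    records: Iterable[Mapping[str, object]],
    fields: Sequence[str] = REQUIRED_FIELDS,
) -> float:
    """Return the fraction of required field slots that are filled, in [0, 1]."""
    records = list(records)
    if not records or not fields:
        return 0.0
    filled = sum(is_filled(rec.get(f)) for rec in records for f in fields)
    return filled / (len(records) * len(fields))


if __name__ == "__main__":
    sample = [
        {"title": "A", "authors": ["X"], "year": 2020, "doi": ""},
        {"title": " ", "authors": [], "year": None, "doi": "10.1/x"},
    ]
    print(f"completeness = {completeness(sample):.2f}")
```

Correctness, consistency, and timeliness could be scored in the same ratio style, for example by counting values that pass a format check or fall within an accepted date range, though the concrete rules would depend on the source being measured.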

Publisher

MDPI AG

Subject

Computational Mathematics,Computational Theory and Mathematics,Numerical Analysis,Theoretical Computer Science

Link

https://www.mdpi.com/1999-4893/13/5/107/pdf

Reference: 37 articles.

1. The role of information and communication technologies in socioeconomic development: towards a multi-dimensional framework

2. International Data on Measuring Management Practices

3. Data measurement in research information systems: metrics for the evaluation of data quality

4. Data Integration under Integrity Constraints;Calì,2002

5. Analyzing data quality issues in research information systems via data profiling

Cited by 5 articles.

1. A novel data quality framework for assessment of scientific lecture video indexing;Library Hi Tech;2023-07-14

2. A Non-Iterative Constrained Measure of Research Impact;Information;2022-06-29

3. Thematic coverage of CRIS in WoS, Scopus and Dimensions (2000-2020);Procedia Computer Science;2022

4. Roles and education of information and data professionals;Research Data Management and Data Literacies;2022

5. Treatment of Bad Big Data in Research Data Management (RDM) Systems;Big Data and Cognitive Computing;2020-10-18