Methods for detecting and correcting contextual data quality problems

Published:2021-07-09 Issue:4 Volume:25 Page:763-787
ISSN:1088-467X
Container-title:Intelligent Data Analysis
Short-container-title:IDA

Author:

Ngueilbaye Alladoumbaye¹, Wang Hongzhi¹, Mahamat Daouda Ahmat², Elgendy Ibrahim A.¹, Junaidu Sahalu B.³

Affiliation:

1. School of Computer Science and Technology, Harbin Institute of Technology, Harbin, Heilongjiang, China

2. Department d’Informatique, Université de N’Djamena, N’Djamena, Tchad

3. Department of Computer Science, Ahmadu Bello University, Zaria, Nigeria

Abstract

Knowledge extraction, data mining, e-learning, and web application platforms use heterogeneous and distributed data. The proliferation of these multifaceted platforms brings many challenges, such as high scalability, the coexistence of complex similarity metrics, and the need for data quality evaluation. In this study, an extended complete formal taxonomy and a set of algorithms for detecting and correcting contextual data quality anomalies were developed and implemented on structured data. Our methods were effective in detecting and correcting more data anomalies than existing taxonomy techniques, and also highlighted the demerit of Support Vector Machine (SVM). These proposed techniques will therefore be relevant to the detection and correction of errors in large contextual data (big data).

Publisher

IOS Press

Subject

Artificial Intelligence, Computer Vision and Pattern Recognition, Theoretical Computer Science

References: 41 articles.

1. The impact of poor data quality on the typical enterprise;Redman;Communications of the ACM,1998

2. Data and information quality issues in ambient assisted living systems;McNaull;Journal of Data and Information Quality (JDIQ),2012

3. W. Li, J. Zhang and R. Bheemavaram, Efficient algorithms for grouping data to improve data quality, in: Proc. 2006 International Conference on Information and Knowledge Engineering, Las Vegas, 2006.

4. Open data: quality over quantity;Sadiq;International Journal of Information Management,2017

5. Data cleaning: problems and current approaches;Rahm;IEEE Data Eng. Bull.,2000

Cited by 1 article.

1. Data quality model for assessing public COVID-19 big datasets;The Journal of Supercomputing;2023-05-31