Authors:
Li Chuan, Hou Yunqi, Yu Zhang
Abstract
In the current era of data explosion, data cleaning has become an important part of data analysis and one of the principal means of improving data quality. Focusing on instance-level data cleaning technology, this paper introduces in detail the concepts, principles, process, detection methods, and related cleaning algorithms of structural data cleaning. To address the most prominent instance-level data quality problems, a related experiment is designed, and the operation and verification process of structural data cleaning is illustrated concretely using visual programming technology and machine learning algorithms. Finally, future research directions for data cleaning technology are discussed.
Subject
General Physics and Astronomy
Cited by
2 articles.
1. Data Redundancy Detection Algorithm based on Multidimensional Similarity;2023 International Conference on Frontiers of Robotics and Software Engineering (FRSE);2023-06
2. Study on Landslide Early Warning Based on Logistic Regression and Collaborative Filtering;2023 3rd Asia Conference on Information Engineering (ACIE);2023-01