Affiliation:
1. INRIA Rocquencourt, France
2. Courant Institute, NY
Abstract
@@@@
groups together matching pairs with a high similarity value by applying a given grouping criterion (e.g. by transitive closure). Finally, merging collapses each individual cluster into a tuple of the resulting data source. AJAX provides @@@@ for specifying data cleaning programs, which consists of SQL statements enriched with a set of specific primitives to express these transformations.
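The grouping and collapsing steps described above can be illustrated with a minimal Python sketch. It is not AJAX's implementation or its specification language: the function names, the similarity threshold, the sample records, and the "keep the longest value per field" merge rule are all assumptions chosen for illustration. Grouping by transitive closure is done here with a small union-find structure.

```python
from collections import defaultdict


def cluster_by_transitive_closure(pairs, threshold):
    """Group record ids linked by matching pairs with similarity >= threshold.

    `pairs` is an iterable of (id_a, id_b, similarity) triples (hypothetical
    input shape). Ids that only occur in low-similarity pairs end up in
    singleton clusters.
    """
    parent = {}

    def find(x):
        parent.setdefault(x, x)
        while parent[x] != x:
            parent[x] = parent[parent[x]]  # path halving
            x = parent[x]
        return x

    for a, b, sim in pairs:
        root_a, root_b = find(a), find(b)  # registers both ids
        if sim >= threshold:
            parent[root_a] = root_b  # union: a and b share a cluster

    clusters = defaultdict(set)
    for x in list(parent):
        clusters[find(x)].add(x)
    return sorted(sorted(members) for members in clusters.values())


def merge_cluster(records, ids):
    """Collapse one cluster into a single tuple.

    Assumed merge rule: for each field, keep the longest value found
    among the cluster's records.
    """
    fields = zip(*(records[i] for i in ids))
    return tuple(max(values, key=len) for values in fields)


# Hypothetical records: id -> (name, city)
records = {
    1: ("J. Smith", "Paris"),
    2: ("John Smith", "Paris"),
    3: ("John Smith", ""),
    4: ("A. Jones", "New York"),
}
# Hypothetical output of a matching step: (id_a, id_b, similarity)
pairs = [(1, 2, 0.9), (2, 3, 0.95), (1, 4, 0.2)]

clusters = cluster_by_transitive_closure(pairs, threshold=0.8)
merged = [merge_cluster(records, ids) for ids in clusters]
print(clusters)
print(merged)
```

Records 1 and 3 are never matched directly, yet they fall into the same cluster because each matches record 2; that is the effect of the transitive closure criterion. A different grouping criterion or merge rule would replace only the corresponding function.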
AJAX also @@@@. It allows the user to interact with an executing data cleaning program to handle exceptional cases and to inspect intermediate results. Finally, AJAX provides @@@@ @@@@ that permits users to determine the source and processing of data for debugging purposes.
We will present the AJAX system applied to two real-world problems: the consolidation of a telecommunication database, and the conversion of a dirty database of bibliographic references into a set of clean, normalized, and redundancy-free relational tables maintaining the same data.
Publisher
Association for Computing Machinery (ACM)
Subject
Information Systems, Software
Cited by
17 articles.
1. Debugging inputs;Proceedings of the ACM/IEEE 42nd International Conference on Software Engineering;2020-06-27
2. Network security analysis using big data technology and improved neural network;Journal of Ambient Intelligence and Humanized Computing;2020-05-20
3. Engineering complex data integration, harmonization and visualization systems;Journal of Industrial Information Integration;2019-12
4. On Detecting and Removing Superficial Redundancy in Vector Databases;Mathematical Problems in Engineering;2018
5. Human-in-the-Loop Challenges for Entity Matching;Proceedings of the 2nd Workshop on Human-In-the-Loop Data Analytics;2017-05-14