Affiliation:
1. Tsinghua University, Beijing, China
2. The Hong Kong University of Science and Technology, Kowloon, Hong Kong
3. The Chinese University of Hong Kong, N. T., Hong Kong
4. The Chinese University of Hong Kong, Shatin, N. T., Hong Kong
Abstract
Matching dependencies (MDs) have recently been proposed to make data dependencies tolerant of varying information representations, and have proved useful in data quality applications such as record matching. Instead of the strict equality function used in traditional dependency syntax (e.g., functional dependencies), MDs specify constraints based on similarity and identification. In practice, however, MDs may still be too strict and hold only in a subset of the tuples in a relation. We therefore study conditional matching dependencies (CMDs), which bind matching dependencies to a certain part of a table, i.e., MDs that are conditionally applicable in a subset of tuples. Compared to MDs, CMDs have more expressive power, which lets them meet a wider range of application needs. In this article, we study several important theoretical and practical issues of CMDs, including irreducible CMDs with respect to implication, discovery of CMDs from data, reliable CMDs that are agreed on most by a relation, approximate CMDs that are almost satisfied in a relation, and finally applications of CMDs in record matching and missing value repairing. Through an extensive experimental evaluation on real data sets, we demonstrate the efficiency of the proposed CMD discovery algorithms and the effectiveness of CMDs in real applications.
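The idea of a dependency that applies only conditionally can be illustrated with a minimal sketch. It is a simplified reading of the abstract, not the formal definition or the discovery algorithms of the article: the helper names (`similar`, `cmd_violations`), the sample table, the use of `difflib.SequenceMatcher` as the similarity function, and the 0.8 threshold are all assumptions made for illustration.

```python
from difflib import SequenceMatcher
from itertools import combinations


def similar(a: str, b: str, threshold: float) -> bool:
    """Assumed similarity test: case-insensitive SequenceMatcher ratio."""
    return SequenceMatcher(None, a.lower(), b.lower()).ratio() >= threshold


def cmd_violations(rows, condition, lhs_attr, rhs_attr, threshold=0.8):
    """Return pairs of tuples that violate a hypothetical conditional rule:
    among tuples satisfying `condition`, tuples that are similar on
    `lhs_attr` should be identical on `rhs_attr`."""
    scoped = [r for r in rows if condition(r)]  # the conditional part
    return [
        (r1, r2)
        for r1, r2 in combinations(scoped, 2)
        if similar(r1[lhs_attr], r2[lhs_attr], threshold)
        and r1[rhs_attr] != r2[rhs_attr]
    ]


# Hypothetical sample relation, invented for this sketch.
rows = [
    {"id": 1, "region": "HK", "address": "12 Nathan Road", "phone": "555-0101"},
    {"id": 2, "region": "HK", "address": "12 Nathan Rd", "phone": "555-0101"},
    {"id": 3, "region": "HK", "address": "12 Nathan Road.", "phone": "555-0199"},
    {"id": 4, "region": "BJ", "address": "12 Nathan Road", "phone": "555-0777"},
]

# Conditional rule: checked only on tuples with region == "HK".
conditional = cmd_violations(rows, lambda r: r["region"] == "HK", "address", "phone")
conditional_ids = sorted((a["id"], b["id"]) for a, b in conditional)

# Unconditional rule (plain MD style): checked on every tuple.
unconditional = cmd_violations(rows, lambda r: True, "address", "phone")
unconditional_ids = sorted((a["id"], b["id"]) for a, b in unconditional)

print(conditional_ids)
print(unconditional_ids)
```

In this toy table the unconditional rule also flags pairs involving tuple 4, whose address string coincides with one in a different region, while the conditional rule restricts the check to the "HK" tuples and reports only the pairs (1, 3) and (2, 3).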
Funder
NSFC
Tsinghua University Initiative Scientific Research Program
National Key Research Program of China
Publisher
Association for Computing Machinery (ACM)
Reference: 61 articles.
Cited by
21 articles.