Affiliation:
1. University of Wisconsin-Milwaukee, USA
Abstract
Identifying matching attributes across heterogeneous data sources is a critical and time-consuming step in integrating those sources. In this paper, the author proposes a method for matching the most frequently encountered types of attributes across overlapping heterogeneous data sources. The method uses mutual information as a unified measure of dependence that applies to various types of attributes. An example demonstrates the utility of the proposed method, which can support the development of practical attribute matching tools.
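To illustrate the measure the abstract refers to, the following minimal sketch computes a plug-in estimate of mutual information between two discrete attributes. It is not the author's procedure: the function name `mutual_information`, the toy columns, and the assumption that rows in both columns are already aligned to the same real-world records are all hypothetical choices made for this illustration.

```python
from collections import Counter
from math import log2


def mutual_information(xs, ys):
    """Plug-in estimate of mutual information (in bits) between two
    equal-length sequences of discrete values."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have equal length")
    n = len(xs)
    if n == 0:
        return 0.0
    joint = Counter(zip(xs, ys))  # counts of co-occurring value pairs
    px = Counter(xs)              # marginal counts for the first attribute
    py = Counter(ys)              # marginal counts for the second attribute
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written with raw counts
        mi += (c / n) * log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical toy columns; rows are assumed to describe the same records.
state_a = ["WI", "IL", "WI", "MN"]
state_b = ["Wisconsin", "Illinois", "Wisconsin", "Minnesota"]
parity = ["odd", "odd", "even", "even"]

print(mutual_information(state_a, state_b))  # 1.5 bits for these toy columns
print(mutual_information(state_a, parity))
```

In this toy case the two state columns use different encodings yet determine each other exactly, so their mutual information equals the entropy of either column, while an unrelated column scores lower. Because the estimate depends only on co-occurrence counts and not on the values themselves, the same measure can be applied to attributes with different representations.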