Matching Attributes across Overlapping Heterogeneous Data Sources Using Mutual Information

Published: 2010-10
Volume: 21
Issue: 4
Pages: 91-110
ISSN: 1063-8016
Container-title: Journal of Database Management
Language: en

Author:

Zhao Huimin¹

Affiliation:

1. University of Wisconsin-Milwaukee, USA

Abstract

Identifying matching attributes across heterogeneous data sources is a critical and time-consuming step in integrating the data sources. In this paper, the author proposes a method for matching the most frequently encountered types of attributes across overlapping heterogeneous data sources. The author uses mutual information as a unified measure of dependence on various types of attributes. An example is used to demonstrate the utility of the proposed method, which is useful in developing practical attribute matching tools.
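As a rough illustration of the measure named in the abstract, the sketch below estimates the mutual information between two discrete attributes from their empirical value counts. It is not the author's method: it assumes the overlapping records from the two sources have already been paired row by row, it handles only discrete values, and the function name `mutual_information` and the sample columns are hypothetical.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Empirical mutual information, in bits, between two equal-length
    sequences of discrete values paired record by record."""
    if len(xs) != len(ys):
        raise ValueError("sequences must have the same length")
    n = len(xs)
    if n == 0:
        return 0.0
    joint = Counter(zip(xs, ys))  # counts of (x, y) pairs
    px = Counter(xs)              # marginal counts for the first attribute
    py = Counter(ys)              # marginal counts for the second attribute
    mi = 0.0
    for (x, y), c in joint.items():
        # p(x, y) * log2(p(x, y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * math.log2(c * n / (px[x] * py[y]))
    return mi


# Hypothetical paired columns from two sources (illustrative data only).
gender_a = ["M", "F", "F", "M"]   # attribute in source A
sex_code_b = [1, 0, 0, 1]         # differently encoded attribute in source B
region_b = [1, 1, 0, 0]           # unrelated attribute in source B

print(mutual_information(gender_a, sex_code_b))
print(mutual_information(gender_a, region_b))
```

In this toy data the first pair is a one-to-one recoding, so it scores higher than the unrelated pair even though the two sources share no literal values. That property is what makes a dependence measure attractive for comparing differently encoded attributes.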

Publisher

IGI Global

Subject

Hardware and Architecture, Information Systems, Software

References: 53 articles.

1. Beirlant, J. (1997). Non-parametric entropy estimation: an overview. International Journal of Mathematical and Statistical Sciences.

2. Bernstein, P. A., Melnik, S., & Churchill, J. E. (2006). Incremental schema matching. In Proceedings of the 32nd International Conference on Very Large Data Bases (pp. 1167-1170).

3. Bilke, A., & Naumann, F. (2005). Schema Matching Using Duplicates. In Proceedings of the 21st International Conference on Data Engineering (pp. 69-80).

Cited by 4 articles.

1. Implicit Semantics Based Metadata Extraction and Matching of Scholarly Documents. Journal of Database Management, 2018-04.

2. A Scalable Algorithm for One-to-One, Onto, and Partial Schema Matching with Uninterpreted Column Names and Column Values. Journal of Database Management, 2014-10.

3. Language independent semantic kernels for short-text classification. Expert Systems with Applications, 2014-02.

4. An Algorithm for Matching Heterogeneous Financial Databases: A Case Study for COMPUSTAT/CRSP and I/B/E/S Databases. SSRN Electronic Journal, 2014.