Abstract
Schema matching is the problem of finding semantic correspondences between elements of different schemas. The problem is challenging because elements that look unrelated often represent the same concept. Traditionally, it has been studied for a pair of schemas. Recently, however, there has been increasing interest in matching several related schemas at once, a problem known as schema matching networks, where the goal is to identify the elements from several schemas that correspond to a single concept. We propose a family of methods for schema matching networks based on machine learning, which has proved to be a competitive alternative for the traditional matching problem in several domains. To avoid the need for a large amount of training data, we also propose a bootstrapping procedure that generates training data automatically, and we leverage constraints that arise in network scenarios to improve the quality of this data. We also study a strategy for collecting user feedback to assert some of the generated matchings and, relying on this feedback, improve the quality of the final result. Our experiments show that our methods can outperform baselines, reaching an F1-score of up to 0.83.
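For readers unfamiliar with the reported metric, the following is a minimal sketch of how an F1-score could be computed over a set of predicted correspondences in a small network of three schemas. The schema names (`S1`, `S2`, `S3`), the element names, and the helper functions are hypothetical and chosen only for illustration; they are not taken from the paper, and the paper's actual evaluation protocol may differ.

```python
def correspondence(a, b):
    """An unordered correspondence between two (schema, element) pairs."""
    return frozenset((a, b))


def precision_recall_f1(predicted, gold):
    """Return (precision, recall, F1) for two sets of correspondences."""
    true_positives = len(predicted & gold)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(gold) if gold else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1


# Hypothetical gold standard: three elements that denote one concept.
gold = {
    correspondence(("S1", "author"), ("S2", "writer")),
    correspondence(("S2", "writer"), ("S3", "creator")),
    correspondence(("S1", "author"), ("S3", "creator")),
}

# Hypothetical matcher output: two correct correspondences, one wrong.
predicted = {
    correspondence(("S1", "author"), ("S2", "writer")),
    correspondence(("S2", "writer"), ("S3", "creator")),
    correspondence(("S1", "title"), ("S3", "creator")),
}

precision, recall, f1 = precision_recall_f1(predicted, gold)
print(f"precision={precision:.2f} recall={recall:.2f} F1={f1:.2f}")
```

In this toy case two of the three predicted correspondences are correct and two of the three gold correspondences are found, so precision, recall, and F1 all equal 2/3.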
Publisher
Springer Science and Business Media LLC
Cited by
3 articles.