Affiliation:
1. Open University of Cyprus and Technical University of Crete
2. Technical University of Crete
Abstract
We introduce holistic in-database query processing over information extraction pipelines. This requires considering the joint conditional distribution over generic Conditional Random Fields that use factor graphs to encode extraction tasks. Our approach introduces Canopy Factor Graphs, a novel probabilistic model for effectively capturing the joint conditional distribution given a canopy clustering of the data, and special query operators for retrieving resolution information. Since inference on such models is intractable, we introduce an approximate technique for query processing, along with optimizations that cut across the integrated tasks to reduce the required processing time. Effectiveness and scalability are verified through an extensive experimental evaluation using real and synthetic data.
Subject
General Earth and Planetary Sciences; Water Science and Technology; Geography, Planning and Development
Cited by
2 articles.