Affiliation:
1. Computer Sciences Department, University of Wisconsin-Madison, USA
Abstract
Researchers have approached knowledge-base construction (KBC) with a wide range of data resources and techniques. The authors present Elementary, a prototype KBC system that combines diverse resources and different KBC techniques via machine learning and statistical inference to construct knowledge bases. Using Elementary, they have implemented a solution to the TAC-KBP challenge with quality comparable to the state of the art, as well as an end-to-end online demonstration that automatically and continuously enriches Wikipedia with structured data by reading millions of webpages daily. The authors describe several challenges and their solutions in designing, implementing, and deploying Elementary. In particular, they first describe the conceptual framework and architecture that Elementary uses to integrate different data resources and KBC techniques in a principled manner. They then discuss how they address scalability challenges to enable Web-scale deployment, and show empirically that their decomposition-based inference approach achieves higher performance than prior inference approaches. To validate the effectiveness of Elementary’s approach to KBC, they show experimentally that its ability to incorporate diverse signals improves KBC quality.
Subject
Computer Networks and Communications, Information Systems
Cited by
57 articles.