Affiliation:
1. CSD, UCLA, Los Angeles, CA
2. University of Cagliari, Cagliari, Italy
Abstract
Wikipedia's InfoBoxes play a crucial role in advanced applications and provide the main knowledge source for DBpedia and the powerful structured queries it supports. However, InfoBoxes, which were created by crowdsourcing for human rather than computer consumption, suffer from incompleteness, inconsistencies, and inaccuracies. To overcome these problems, we have developed (i) the IBminer system, which extracts InfoBox information by text-mining Wikipedia pages, (ii) the IKBStore system, which integrates the information derived by IBminer with that of DBpedia, YAGO2, WikiData, WordNet, and other sources, and (iii) SWiPE and InfoBox Editor (IBE), which provide user-friendly interfaces for querying and revising the knowledge base. IBminer uses a deep NLP-based approach to extract from text a semantic representation structure called TextGraph, from which the system detects patterns and derives subject-attribute-value relations, as well as domain-specific synonyms for the knowledge base. IKBStore and IBE complement the powerful, user-friendly, by-example structured queries of SWiPE by supporting validation and provenance history for the information contained in the knowledge base, along with the ability to upgrade its knowledge when it is found to be incomplete, incorrect, or outdated.
Publisher
Association for Computing Machinery (ACM)
Subject
Information Systems, Software