Affiliation:
1. Hefei University of Technology, Anhui Province, China
2. Amazon, Palo Alto, CA
3. Alibaba Group, Zhejiang Province, China
4. University of Louisiana at Lafayette, Lafayette, LA
Abstract
The huge amount of text available on the World Wide Web presents an unprecedented opportunity for information extraction (IE). One important assumption in IE is that frequent extractions are more likely to be correct. Sparse IE is hence a challenging task because, no matter how big a corpus is, some extractions are supported by only a small amount of evidence in the corpus. However, there is limited research on sparse IE, especially on assessing the validity of sparse extractions. Motivated by this, we introduce a lightweight, explicit semantic approach for assessing sparse IE.
We first use a large semantic network consisting of millions of concepts, entities, and attributes to explicitly model the context of any semantic relationship. Second, we learn from three semantic contexts using different base classifiers to select an optimal classification model for assessing sparse extractions. Finally, experiments show that, compared with several state-of-the-art approaches, our approach significantly improves the F-score in the assessment of sparse extractions while maintaining efficiency.
Funder
the Program for Changjiang Scholars and Innovative Research Team in University (PCSIRT) of the Ministry of Education
the Natural Science Foundation of China
the US National Science Foundation
National Key Research and Development Program of China
the Natural Science Foundation of Anhui province
Publisher
Association for Computing Machinery (ACM)
Cited by
10 articles.