Affiliation:
1. Norwegian University of Science and Technology, Gjøvik, Norway
Abstract
This paper presents a novel objective metric for concept enrichment that combines contextual and semantic information of terms extracted from domain documents. The proposed metric is called SEMCON, which stands for semantic and contextual objective metric. It employs a hybrid learning approach that draws on both statistical and linguistic ontology learning techniques. The metric also introduces, for the first time, two statistical features that have been shown to improve the overall score ranking of highly relevant terms for concept enrichment. Subjective and objective experiments are conducted in various domains. Experimental results (F1) from the computer domain show that SEMCON outperforms the tf*idf, χ², and LSA methods, with improvements of 12.2%, 21.8%, and 24.5% over them, respectively. Additionally, an investigation into how much each of the contextual and semantic components contributes to the overall task of concept enrichment is conducted, and the results suggest that a balanced weighting gives the best performance.
Subject
Computer Networks and Communications, Information Systems
Cited by
10 articles.