Author:
Nguyen Hoang-Minh, Nguyen Hong-Quang, Tran Khoi-Nguyen, Vo Xuan-Vinh
Abstract
Purpose
– This paper aims to improve the semantic-disambiguation capability of an information-retrieval system by taking advantage of a well-crafted classification tree. The unstructured nature and sheer volume of information accessible over networks have made it considerably harder for users to find relevant information. Many information-retrieval methods have been developed to address this problem, and the keyword-based approach is among the most common. Such an approach is often inadequate to cope with the conceptualization associated with user needs and content. This gives rise to the problem of semantic ambiguity, that is, disagreement between the parties to a communication about the meaning of terms due to polysemy, which increases complexity and reduces accuracy in information integration, migration, retrieval and other related activities.
Design/methodology/approach
– A novel ontology-based search approach, named GeTFIRST (short for Graph-embedded Tree Fostering Information Retrieval SysTem), is proposed to disambiguate keywords semantically. The contribution is twofold. First, a search strategy is proposed that prunes irrelevant concepts to improve accuracy, using our Graph-embedded Tree (GeT)-based ontology. Second, a path-based ranking algorithm is proposed to incorporate and reward content specificity.
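The abstract does not specify the GeT data structure or the ranking formula, so the following is only a minimal sketch of the general idea, not the authors' algorithm. It assumes a plain classification tree, a user-supplied context category, and a ranking score equal to the length of a document's concept path to the root (deeper meaning more specific). The tree, the keyword senses, the documents and the function names `path_to_root` and `search` are all hypothetical.

```python
# Toy classification tree, stored as child -> parent (None marks the root).
# Every concept, keyword and document here is invented for illustration.
PARENT = {
    "root": None,
    "computing": "root",
    "biology": "root",
    "hardware": "computing",
    "input_devices": "hardware",
    "optical_mice": "input_devices",
    "zoology": "biology",
    "rodents": "zoology",
}

# A polysemous keyword maps to several concepts in the tree.
SENSES = {"mouse": ["input_devices", "rodents"]}

# Each document is tagged with the concept it is about.
DOCS = {
    "doc_a": "input_devices",
    "doc_b": "rodents",
    "doc_c": "optical_mice",
}


def path_to_root(concept):
    """Return the concepts from `concept` up to and including the root."""
    path = []
    while concept is not None:
        path.append(concept)
        concept = PARENT[concept]
    return path


def search(keyword, context):
    """Prune senses outside `context`, then rank matches by path length."""
    # Pruning (assumed rule): keep a sense only if its path to the root
    # passes through the context category chosen by the user.
    kept = [s for s in SENSES.get(keyword, []) if context in path_to_root(s)]

    results = []
    for doc, concept in DOCS.items():
        doc_path = path_to_root(concept)
        # A document matches if it sits at, or below, a kept sense.
        if any(sense in doc_path for sense in kept):
            # Assumed score: longer path means a more specific concept.
            results.append((doc, len(doc_path)))

    # Most specific documents first.
    return sorted(results, key=lambda r: r[1], reverse=True)


if __name__ == "__main__":
    print(search("mouse", "computing"))
    print(search("mouse", "biology"))
```

With the context "computing", the "rodents" sense of "mouse" is pruned, and the document tagged with the deeper concept is ranked ahead of the one tagged with its parent.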
Findings
– An empirical evaluation was performed on United States Patent and Trademark Office (USPTO) patent datasets to compare our approach with full-text patent search approaches. The results showed that GeTFIRST handled ambiguous keywords with higher keyword-disambiguation accuracy than traditional search approaches.
Originality/value
– The search approach presented in this paper addresses semantic ambiguity by using our proposed GeT-based ontology and a path-based ranking algorithm.
Subject
Computer Networks and Communications, Information Systems
Cited by 5 articles.