Unsupervised similarity-based word sense disambiguation using context vectors and sentential word importance

Author:

Abdalgader Khaled1,Skabar Andrew1

Affiliation:

1. La Trobe University, Bundoora, Australia

Abstract

The process of identifying the actual meanings of words in a given text fragment has a long history in the field of computational linguistics. Due to its importance in understanding the semantics of natural language, it is considered one of the most challenging problems facing this field. In this article we propose a new unsupervised similarity-based word sense disambiguation (WSD) algorithm that operates by computing the semantic similarity between glosses of the target word and a context vector. The sense of the target word is determined as that for which the similarity between gloss and context vector is greatest. Thus, whereas conventional unsupervised WSD methods are based on measuring pairwise similarity between words, our approach is based on measuring semantic similarity between sentences. This enables it to utilize a higher degree of semantic information, and is more consistent with the way that human beings disambiguate; that is, by considering the greater context in which the word appears. We also show how performance can be further improved by incorporating a preliminary step in which the relative importance of words within the original text fragment is estimated, thereby providing an ordering that can be used to determine the sequence in which words should be disambiguated. We provide empirical results that show that our method performs favorably against the state-of-the-art unsupervised word sense disambiguation methods, as evaluated on several benchmark datasets through different models of evaluation.

Publisher

Association for Computing Machinery (ACM)

Subject

Computational Mathematics,Computer Science (miscellaneous)

Cited by 7 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

1. AcX;Proceedings of the VLDB Endowment;2022-07

2. Dynamics of topic formation and quantitative analysis of hot trends in physical science;Scientometrics;2020-07-13

3. Improving Semantic Graph Connectivity for Word Sense Identification;Proceedings of the 12th International Conference on Computer Modeling and Simulation;2020-06-22

4. Trends in Document Analysis;Data Management, Analytics and Innovation;2018-08-10

5. Centroid-Based Lexical Clustering;Recent Applications in Data Clustering;2018-08-01

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3