Affiliation:
1. La Trobe University, Bundoora, Australia
Abstract
The process of identifying the actual meanings of words in a given text fragment has a long history in the field of computational linguistics. Due to its importance in understanding the semantics of natural language, it is considered one of the most challenging problems facing this field. In this article we propose a new unsupervised similarity-based word sense disambiguation (WSD) algorithm that operates by computing the semantic similarity between glosses of the target word and a context vector. The sense of the target word is determined as that for which the similarity between gloss and context vector is greatest. Thus, whereas conventional unsupervised WSD methods are based on measuring pairwise similarity between words, our approach is based on measuring semantic similarity between sentences. This enables it to utilize a higher degree of semantic information, and is more consistent with the way that human beings disambiguate; that is, by considering the greater context in which the word appears. We also show how performance can be further improved by incorporating a preliminary step in which the relative importance of words within the original text fragment is estimated, thereby providing an ordering that can be used to determine the sequence in which words should be disambiguated. We provide empirical results that show that our method performs favorably against the state-of-the-art unsupervised word sense disambiguation methods, as evaluated on several benchmark datasets through different models of evaluation.
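As a rough illustration of the gloss-versus-context idea described in the abstract, the following minimal sketch scores each candidate sense by the similarity between its gloss and the surrounding words, then picks the highest-scoring sense. It is not the authors' method: the abstract does not specify the sentence similarity measure, so plain bag-of-words cosine similarity is used here as a stand-in. The sense inventory `GLOSSES`, the stopword list, and the function names are all hypothetical.

```python
import math
from collections import Counter

# Hypothetical toy sense inventory (sense label -> gloss); not taken from the paper.
GLOSSES = {
    "bank": {
        "bank#finance": "an institution that accepts deposits and lends money",
        "bank#river": "sloping land beside a river or lake",
    }
}

# Small illustrative stopword list (an assumption, not from the paper).
STOPWORDS = {"a", "an", "the", "and", "or", "that", "of", "to", "on", "in", "we", "by"}


def bag_of_words(text):
    """Lowercase, split on whitespace, strip punctuation, drop stopwords, count tokens."""
    tokens = [t.strip(".,;:!?").lower() for t in text.split()]
    return Counter(t for t in tokens if t and t not in STOPWORDS)


def cosine(u, v):
    """Cosine similarity between two sparse count vectors (stand-in similarity measure)."""
    dot = sum(u[t] * v[t] for t in u if t in v)
    norm_u = math.sqrt(sum(c * c for c in u.values()))
    norm_v = math.sqrt(sum(c * c for c in v.values()))
    if norm_u == 0 or norm_v == 0:
        return 0.0
    return dot / (norm_u * norm_v)


def disambiguate(target, sentence, glosses=GLOSSES):
    """Return (best_sense, scores) for `target` as used in `sentence`."""
    # Context vector: every content word in the sentence except the target itself.
    context = bag_of_words(sentence)
    context.pop(target, None)
    # Score each sense by the similarity between its gloss and the context.
    scores = {
        sense: cosine(bag_of_words(gloss), context)
        for sense, gloss in glosses[target].items()
    }
    return max(scores, key=scores.get), scores


best, scores = disambiguate("bank", "We walked along the river to the bank of the lake")
print(best, scores)
```

A real system would draw glosses from a lexical resource such as WordNet and replace `cosine` with a sentence-level semantic similarity measure. The preliminary word-importance ordering mentioned in the abstract is not modeled here.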
Publisher
Association for Computing Machinery (ACM)
Subject
Computational Mathematics, Computer Science (miscellaneous)
Cited by
7 articles.
1. AcX; Proceedings of the VLDB Endowment; 2022-07
2. Dynamics of topic formation and quantitative analysis of hot trends in physical science; Scientometrics; 2020-07-13
3. Improving Semantic Graph Connectivity for Word Sense Identification; Proceedings of the 12th International Conference on Computer Modeling and Simulation; 2020-06-22
4. Trends in Document Analysis; Data Management, Analytics and Innovation; 2018-08-10
5. Centroid-Based Lexical Clustering; Recent Applications in Data Clustering; 2018-08-01