Affiliation:
1. Computer Science and Engineering Department, Shiraz University, Shiraz, Iran
Abstract
Abstract
Word sense disambiguation (WSD) is the task of selecting correct sense for an ambiguous word in its context. Since WSD is one of the most challenging tasks in various text processing systems, improving its accuracy can be very beneficial. In this article, we propose a new unsupervised method based on co-occurrence graph created by monolingual corpus without any dependency on the structure and properties of the language itself. In the proposed method, the context of an ambiguous word is represented as a sub-graph extracted from a large word co-occurrence graph built based on a corpus. Most of the words are connected in this graph. To clarify the exact sense of an ambiguous word, its senses and relations are added to the context graph, and various similarity functions are employed based on the senses and context graph. In the disambiguation process, we select senses with highest similarity to the context graph. As opposite to other WSD methods, the proposed method does not use any language-dependent resources (e.g. WordNet) and it just uses a monolingual corpus. Therefore, the proposed method can be employed for other languages. Moreover, by increasing the size of corpus, it is possible to enhance the accuracy of WSD. Experimental results on English and Persian datasets show that the proposed method is competitive with existing supervised and unsupervised WSD approaches.
Publisher
Oxford University Press (OUP)
Subject
Computer Science Applications,Linguistics and Language,Language and Linguistics,Information Systems
Reference63 articles.
1. Random walks for knowledge-based word sense disambiguation;Agirre;Computational Linguistics,2014
2. Word sense disambiguation;Agirre;Text Speech and Language Technology,2007
Cited by
2 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献