Author:
Park Jeong Yeon, Shin Hyeong Jin, Lee Jae Sung
Abstract
Sequence labeling models for word sense disambiguation have proven highly effective when the sense vocabulary is compressed based on the thesaurus hierarchy. In this paper, we propose a method for compressing the sense vocabulary without using a thesaurus. Sense definitions in a dictionary are converted into sentence vectors and clustered into compressed senses. The very large set of sense vectors is first partitioned to reduce computational complexity and then clustered hierarchically with awareness of homographs. Experiments were conducted on the English Senseval and SemEval datasets and the Korean Sejong sense-annotated corpus. The results show that performance increased greatly compared to the uncompressed sense model and is comparable to that of the thesaurus-based model.
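The clustering idea in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes a toy bag-of-words `embed` function in place of a real sentence encoder, average-linkage agglomerative clustering with a hypothetical similarity `threshold`, and one possible reading of "awareness of homographs": two senses of the same lemma are never merged into one compressed sense. The partitioning step is omitted because the example is tiny. All names (`embed`, `cosine`, `compress_senses`) are illustrative.

```python
import math
from collections import Counter
from itertools import combinations


def embed(definition):
    """Toy stand-in for a sentence encoder: bag-of-words counts (assumption)."""
    return Counter(definition.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    na = math.sqrt(sum(v * v for v in a.values()))
    nb = math.sqrt(sum(v * v for v in b.values()))
    return dot / (na * nb) if na and nb else 0.0


def compress_senses(senses, threshold=0.3):
    """Cluster senses keyed by (lemma, sense_id) into compressed senses.

    Merges the most similar pair of clusters (average linkage) until no
    pair exceeds `threshold`. Clusters sharing a lemma are never merged,
    so senses of one word stay distinguishable (assumed constraint).
    """
    vecs = {key: embed(text) for key, text in senses.items()}
    clusters = [{key} for key in senses]

    def similarity(c1, c2):
        total = sum(cosine(vecs[a], vecs[b]) for a in c1 for b in c2)
        return total / (len(c1) * len(c2))

    def lemmas(cluster):
        return {lemma for lemma, _ in cluster}

    while True:
        best, pair = threshold, None
        for i, j in combinations(range(len(clusters)), 2):
            if lemmas(clusters[i]) & lemmas(clusters[j]):
                continue  # homograph constraint: skip clusters sharing a lemma
            s = similarity(clusters[i], clusters[j])
            if s > best:
                best, pair = s, (i, j)
        if pair is None:
            return clusters
        i, j = pair
        clusters[i] |= clusters[j]
        del clusters[j]


# Hypothetical dictionary definitions, for illustration only.
senses = {
    ("bank", 1): "a financial institution that accepts deposits",
    ("bank", 2): "sloping land beside a river",
    ("lender", 1): "a financial institution that lends money",
    ("shore", 1): "land beside a river or sea",
}
clusters = compress_senses(senses)
```

In this toy input the four senses collapse into two compressed senses, and the two senses of "bank" end up in different clusters, which is what lets a labeling model still distinguish them after compression.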
Funder
National Research Foundation of Korea
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science
Cited by
6 articles.