Abstract
Word sense disambiguation, a central research topic in natural language processing, supports many applications, including information retrieval, speech synthesis, machine translation, summarization and question answering. Previous approaches fall into three categories: supervised, unsupervised and knowledge-based. Supervised methods achieve the highest accuracy, but they suffer from the knowledge acquisition bottleneck. Unsupervised methods avoid this bottleneck, but their performance is not satisfactory. With the build-up of large-scale knowledge resources, knowledge-based approaches have attracted increasing attention. This paper introduces a new context weighting method and, based on it, proposes a novel semi-supervised approach to word sense disambiguation. The main contribution of our method is that it integrates a thesaurus with machine learning techniques for word sense disambiguation. Evaluated against the state of the art on the test data of the English all-words disambiguation task in Senseval-3, our method yields clear improvements over existing methods in the disambiguation of nouns, adjectives and verbs.
Publisher
Trans Tech Publications, Ltd.
Cited by
1 article.
1. Disentangled Representation for Long-tail Senses of Word Sense Disambiguation;Proceedings of the 31st ACM International Conference on Information & Knowledge Management;2022-10-17