Author:
Scarlini Bianca, Pasini Tommaso, Navigli Roberto
Abstract
Contextual representations of words derived by neural language models have proven to effectively encode the subtle distinctions that might occur between different meanings of the same word. However, these representations are not tied to a semantic network, hence they leave the word meanings implicit and thereby neglect the information that can be derived from the knowledge base itself. In this paper, we propose SensEmBERT, a knowledge-based approach that brings together the expressive power of language modelling and the vast amount of knowledge contained in a semantic network to produce high-quality latent semantic representations of word meanings in multiple languages. Our vectors lie in a space comparable with that of contextualized word embeddings, thus allowing a word occurrence to be easily linked to its meaning by applying a simple nearest neighbour approach. We show that, whilst not relying on manual semantic annotations, SensEmBERT is able to either achieve or surpass state-of-the-art results attained by most of the supervised neural approaches on the English Word Sense Disambiguation task. When scaling to other languages, our representations prove to be equally effective as their English counterpart and outperform the existing state of the art on all the Word Sense Disambiguation multilingual datasets. The embeddings are released in five different languages at http://sensembert.org.
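The nearest neighbour step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the sense labels, the three-dimensional toy vectors, the function names, and the choice of cosine similarity as the distance measure are all assumptions made for illustration. In practice the context vector would come from a contextualized language model and the sense vectors from the released embeddings.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)

def nearest_sense(context_vec, sense_vecs):
    """Return the sense label whose vector is most similar to context_vec.

    sense_vecs maps a (hypothetical) sense label to its embedding and should
    contain only the candidate senses of the target word.
    """
    return max(sense_vecs, key=lambda s: cosine(context_vec, sense_vecs[s]))

# Toy data (assumed, not taken from the paper): two senses of "bank".
sense_vecs = {
    "bank%financial": [0.9, 0.1, 0.0],
    "bank%river": [0.1, 0.9, 0.2],
}

# Toy contextual embedding of "bank" in a sentence about a riverside.
context_vec = [0.2, 0.8, 0.1]

predicted = nearest_sense(context_vec, sense_vecs)
print(predicted)
```

Restricting the candidates to the senses of the target word keeps the search small, so a linear scan is sufficient in this sketch.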
Publisher
Association for the Advancement of Artificial Intelligence (AAAI)
Cited by
29 articles.
1. GC-PCWR+ for Word Sense Disambiguation;2024 International Conference on Asian Language Processing (IALP);2024-08-04
2. A survey on semantic processing techniques;Information Fusion;2024-01
3. Models and Strategies for Russian Word Sense Disambiguation: A Comparative Analysis;Lecture Notes in Computer Science;2024
4. Challenges and Overcoming Methods for Word Sense Disambiguation;2023 International Conference on Intelligent Technologies for Sustainable Electric and Communications Systems (iTech SECOM);2023-12-18
5. Connecting AI: Merging Large Language Models and Knowledge Graph;Computer;2023-11