Affiliation:
1. The Faculty of Fundamental Sciences, Vilnius Gediminas Technical University, Saulėtekio al. 11, LT-10223 Vilnius, Lithuania
Abstract
Word sense disambiguation (WSD) remains a persistent challenge in natural language processing (NLP). While various NLP packages exist, the Lesk algorithm in the NLTK library achieves suboptimal accuracy. In this article, we propose a methodology and an open-source framework that address the challenges of WSD by optimizing memory usage without compromising accuracy. Our system integrates WSD into NLP tasks and offers functionality similar to that provided by the NLTK library. We also go beyond existing approaches by introducing a novel idea: we leverage deep neural networks and treat the language patterns learned by these models as the new gold standard. This approach suggests modifying existing semantic dictionaries, such as WordNet, to align with these patterns. A series of experiments confirmed the effectiveness of the proposed method, which achieved state-of-the-art performance across multiple WSD datasets. Notably, the system requires no additional software beyond well-known Python libraries. The classification model is saved in a readily usable text format, and the entire framework (model and data) is publicly available on GitHub for the NLP research community.
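For readers unfamiliar with the Lesk baseline mentioned in the abstract, the following is a minimal sketch of the simplified Lesk idea: choose the sense whose dictionary gloss shares the most words with the surrounding context. It is an illustration only, not the framework proposed in the article and not NLTK's implementation (which NLTK exposes as `nltk.wsd.lesk` and which draws its glosses from WordNet). The sense identifiers, glosses, stopword list, and function names below are hypothetical and chosen to keep the example self-contained.

```python
import re
from typing import Dict, Optional, Set

# Hypothetical toy sense inventory (assumption): a real system would read
# glosses from a semantic dictionary such as WordNet.
TOY_GLOSSES: Dict[str, Dict[str, str]] = {
    "bank": {
        "bank.n.01": "sloping land beside a body of water such as a river",
        "bank.n.02": "a financial institution that accepts deposits and lends money",
    }
}

# Small illustrative stopword list (assumption), not a standard resource.
STOPWORDS: Set[str] = {
    "a", "an", "the", "of", "to", "and", "that", "such", "as",
    "in", "on", "at", "by", "i", "my",
}


def tokenize(text: str) -> Set[str]:
    """Lowercase the text and return its set of non-stopword tokens."""
    return {w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOPWORDS}


def simplified_lesk(
    word: str,
    context: str,
    glosses: Dict[str, Dict[str, str]] = TOY_GLOSSES,
) -> Optional[str]:
    """Return the sense id whose gloss overlaps most with the context.

    Ties go to the sense listed first; unknown words return None.
    """
    context_words = tokenize(context) - {word}
    best_sense: Optional[str] = None
    best_overlap = -1
    for sense_id, gloss in glosses.get(word, {}).items():
        overlap = len(context_words & tokenize(gloss))
        if overlap > best_overlap:
            best_sense, best_overlap = sense_id, overlap
    return best_sense


print(simplified_lesk("bank", "I deposited money at the bank"))
print(simplified_lesk("bank", "We sat on the river bank near the water"))
```

Because the overlap is computed on exact word matches, "deposited" in the first sentence does not match "deposits" in the gloss, and the decision rests on the single shared word "money". This brittleness is one plausible reason gloss-overlap methods trail the neural approaches the abstract describes.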