Investigating the Feasibility of Deep Learning Methods for Urdu Word Sense Disambiguation

Published:2022-03-31 Issue:2 Volume:21 Page:1-16
ISSN:2375-4699
Container-title:ACM Transactions on Asian and Low-Resource Language Information Processing
language:en
Short-container-title:ACM Trans. Asian Low-Resour. Lang. Inf. Process.

Author:

Saeed Ali¹, Nawab Rao Muhammad Adeel², Stevenson Mark³

Affiliation:

1. Department of Software Engineering, The University of Lahore, and Department of Computer Sciences, COMSATS University Islamabad, Lahore, Punjab, Pakistan

2. Department of Computer Sciences, COMSATS University Islamabad, Lahore Campus, Lahore, Punjab, Pakistan

3. Department of Computer Sciences, University of Sheffield, Western Bank, Sheffield, UK

Abstract

Word Sense Disambiguation (WSD), the process of automatically identifying the correct meaning of a word used in a given context, is a significant challenge in Natural Language Processing. A range of approaches to the problem has been explored by the research community. Most of these efforts have focused on a relatively small set of languages, particularly English. Research on WSD for South Asian languages, particularly Urdu, is still in its infancy. In recent years, deep learning methods have proved to be extremely successful for a range of Natural Language Processing tasks. The main aim of this study is to apply, evaluate, and compare a range of deep learning approaches to Urdu WSD (both Lexical Sample and All-Words), including Simple Recurrent Neural Networks, Long Short-Term Memory, Gated Recurrent Units, Bidirectional Long Short-Term Memory, and Ensemble Learning. The evaluation was carried out on two benchmark corpora: (1) the ULS-WSD-18 corpus and (2) the UAW-WSD-18 corpus. Results (Accuracy = 63.25% and F1-Measure = 0.49) show that a deep learning approach outperforms previously reported results for the Urdu All-Words WSD task, whereas the performance of deep learning approaches (Accuracy = 72.63% and F1-Measure = 0.60) is lower than previously reported results for the Urdu Lexical Sample task.

Publisher

Association for Computing Machinery (ACM)

Subject

General Computer Science

Link

https://dl.acm.org/doi/pdf/10.1145/3477578

Cited by 1 article.

1. Comparison of Pre-trained vs Custom-trained Word Embedding Models for Word Sense Disambiguation;ADCAIJ: Advances in Distributed Computing and Artificial Intelligence Journal;2023-11-01