Part-of-speech Tagging for Low-resource Languages: Activation Function for Deep Learning Network to Work with Minimal Training Data

Published:2024-05-10 Issue:5 Volume:23 Page:1-31
ISSN:2375-4699
Container-title:ACM Transactions on Asian and Low-Resource Language Information Processing
language:en
Short-container-title:ACM Trans. Asian Low-Resour. Lang. Inf. Process.

Author:

Baishya Diganta¹, Baruah Rupam¹

Affiliation:

1. Computer Science and Engineering, Assam Science and Technology University, Guwahati, India and Computer Science and Engineering, Jorhat Engineering College, Jorhat, India

Abstract

Numerous natural language processing (NLP) applications exist today, especially for the most commonly spoken languages such as English, Chinese, and Spanish. Traditional methods such as rule-based methods, Naive Bayes classifiers, hidden Markov models, conditional random field-based classifiers, and other stochastic methods have contributed to this progress in the past. Recently, deep learning has led to exciting breakthroughs in several areas of artificial intelligence, including image processing and natural language processing. Labeling words with their parts of speech is an important first step in developing most NLP applications. A deep study in this area reveals that many popular approaches used for this purpose require massive training data. Therefore, these approaches have not been helpful for languages not rich in digital resources. Applying these methods with very little training data calls for innovative problem-solving. This article describes our research, which examines the strengths and weaknesses of well-known approaches, such as conditional random fields and state-of-the-art deep learning models, when applied to part-of-speech tagging using minimal training data for Assamese and English. We also examine the factors affecting them. We discuss our deep learning architecture and the proposed activation function, which shows promise with little training data. The activation function categorizes words belonging to different classes with more confidence by using the outcomes of statistical methods with SM-Taylor SoftMax in our deep learning model. With minimal training, our deep learning architecture using the proposed modification of SM-Taylor SoftMax improves accuracy by up to 4% on our small dataset. This technique is a combination of SM-Taylor SoftMax and the statistical probability distribution of words over tags.
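
The abstract does not give the exact formulation, so the following is only a rough sketch of the general idea of pairing a Taylor-style softmax with a word-over-tag probability distribution. Everything in it is an assumption made for illustration: the function names (`taylor_softmax`, `tag_prior`, `combine`), the second-order expansion, the add-alpha smoothing, and the linear interpolation with `weight` are hypothetical choices, not the authors' method. It also uses a plain Taylor softmax rather than the SM-Taylor variant the article names.

```python
from collections import Counter, defaultdict


def taylor_softmax(logits, order=2):
    """Normalize a truncated Taylor expansion of exp(z) instead of exp(z).

    An even order keeps every score strictly positive.
    """
    scores = []
    for z in logits:
        term, total = 1.0, 1.0
        for k in range(1, order + 1):
            term *= z / k  # running value of z**k / k!
            total += term
        scores.append(total)
    norm = sum(scores)
    return [s / norm for s in scores]


def tag_prior(tagged_corpus, tags, alpha=1.0):
    """Build P(tag | word) from (word, tag) pairs with add-alpha smoothing.

    Assumption: a simple count-based estimate stands in for the
    "statistical probability distribution of words over tags".
    """
    counts = defaultdict(Counter)
    for word, tag in tagged_corpus:
        counts[word][tag] += 1

    def prior(word):
        c = counts.get(word, Counter())
        total = sum(c.values()) + alpha * len(tags)
        return [(c[t] + alpha) / total for t in tags]

    return prior


def combine(network_probs, prior_probs, weight=0.5):
    """Hypothetical mixing rule: linear interpolation, then renormalize."""
    mixed = [(1 - weight) * p + weight * q
             for p, q in zip(network_probs, prior_probs)]
    norm = sum(mixed)
    return [m / norm for m in mixed]


tags = ["NOUN", "VERB", "ADJ"]
corpus = [("run", "VERB"), ("run", "VERB"), ("run", "NOUN")]
prior = tag_prior(corpus, tags)

network_probs = taylor_softmax([1.0, 0.0, -1.0])  # made-up logits for "run"
final_probs = combine(network_probs, prior("run"))
print(dict(zip(tags, final_probs)))
```

In this toy case the network alone favors NOUN, while the corpus counts favor VERB; the interpolation weight controls how far the statistical distribution pulls the final prediction, which is the kind of effect that could matter when training data is scarce.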

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3655023
