Deep learning-based idiomatic expression recognition for the Amharic language

Published:2023-12-14 Issue:12 Volume:18 Page:e0295339
ISSN:1932-6203
Container-title:PLOS ONE
language:en
Short-container-title:PLoS ONE

Author:

Endalie Demeke, Haile Getamesay, Taye Wondmagegn

Abstract

Idiomatic expressions occur in all languages and are common in ordinary conversation. Idioms are difficult to understand because their meaning cannot be deduced directly from their constituent words. Previous studies have reported that idiomatic expressions affect many natural language processing tasks in the Amharic language. However, most natural language processing models used with Amharic, such as machine translation, semantic analysis, sentiment analysis, information retrieval, question answering, and next-word prediction, do not consider idiomatic expressions. In this paper, we therefore propose a convolutional neural network (CNN) with a FastText embedding model for detecting idioms in Amharic text. We collected 1700 idiomatic and 1600 non-idiomatic expressions from Amharic books and evaluated the proposed model on this dataset, using an 80:10:10 split for training, validation, and testing. The proposed model reaches 98% accuracy on the training set and 80% accuracy on the test set. We compared the proposed model with machine learning models such as K-Nearest Neighbor (KNN), Support Vector Machine (SVM), and Random Forest classifiers. According to the experimental results, the proposed model produces promising results.
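To make the data partition concrete, the following is a minimal sketch of an 80:10:10 split applied to a dataset of the stated size (1700 idiomatic and 1600 non-idiomatic expressions). The function name `split_80_10_10`, the fixed seed, the shuffling step, and the placeholder records are assumptions for illustration; the abstract does not say how the authors shuffled or whether they stratified the split.

```python
import random


def split_80_10_10(items, seed=0):
    """Shuffle items and split them into train/validation/test (80/10/10).

    Hypothetical helper: the seed and the plain (non-stratified) shuffle
    are assumptions, not details taken from the paper.
    """
    items = list(items)
    random.Random(seed).shuffle(items)
    # Integer arithmetic avoids floating-point rounding in the split sizes.
    n_train = len(items) * 80 // 100
    n_val = len(items) * 10 // 100
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]  # remainder goes to the test set
    return train, val, test


# Placeholder records standing in for the real Amharic expressions:
# label 1 = idiomatic, label 0 = non-idiomatic.
dataset = [(f"idiom_{i}", 1) for i in range(1700)]
dataset += [(f"plain_{i}", 0) for i in range(1600)]

train, val, test = split_80_10_10(dataset)
print(len(train), len(val), len(test))  # 2640 330 330
```

Under these assumptions, the 3300 expressions yield 2640 training, 330 validation, and 330 test examples.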

Publisher

Public Library of Science (PLoS)

Subject

Multidisciplinary
