Deep Learning Methods for Arabic Autoencoder Speech Recognition System for Electro-Larynx Device

Published: 2023-02-28 Issue: Volume: 2023 Page: 1-11
ISSN: 1687-5907
Container-title: Advances in Human-Computer Interaction
Language: en
Short-container-title: Advances in Human-Computer Interaction

Author:

Mohammed Ameen Zinah J.¹, Abdulrahman Kadhim Abdulkareem¹

Affiliation:

1. College of Information Engineering, Al-Nahrain University, Baghdad, Iraq

Abstract

Recent advances in speech recognition have achieved performance comparable to that of human transcribers. This performance, however, does not extend to all spoken languages, and Arabic is one of the languages that lags behind. Arabic speech recognition is limited by the lack of suitable datasets, although artificial intelligence algorithms have shown promising capabilities for the task. Arabic is the official language of 22 countries, and an estimated 400 million people speak it worldwide. Speech disabilities have become an increasingly common problem in recent decades, including among children. Some devices can generate speech for people with such disabilities; one of them is the Servox Digital Electro-Larynx (EL). In this research, we developed an autoencoder that combines long short-term memory (LSTM) and gated recurrent unit (GRU) models to recognize signals recorded from the Servox Digital EL. The proposed framework consists of three steps: denoising, feature extraction, and Arabic speech recognition. We evaluated different combinations of LSTM and GRU to construct the best autoencoder, and a rigorous evaluation indicates that using GRU in both the encoder and the decoder structures performs best. The proposed model achieved 95.31% accuracy for Arabic speech recognition and a 4.69% word error rate (WER). The experimental results confirm that the proposed model can be used to develop a real-time app that recognizes common Arabic spoken words.
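The WER figure in the abstract can be illustrated with a minimal sketch of the standard metric: the word-level edit distance between a reference transcript and a recognized transcript, divided by the number of reference words. This is not the authors' evaluation code; the function name `word_error_rate` and the whitespace tokenization are assumptions made for illustration. The reported 95.31% accuracy and 4.69% WER sum to 100%, which is consistent with accuracy being taken as 1 − WER, although the record does not state how the figures were computed.

```python
def word_error_rate(reference: str, hypothesis: str) -> float:
    """Word-level Levenshtein distance divided by the reference length.

    Hypothetical helper for illustration; tokenization is plain
    whitespace splitting, which is an assumption.
    """
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference must contain at least one word")

    # prev[j] holds the edit distance between the reference prefix
    # processed so far and the first j words of the hypothesis.
    prev = list(range(len(hyp) + 1))
    for i, ref_word in enumerate(ref, start=1):
        curr = [i] + [0] * len(hyp)
        for j, hyp_word in enumerate(hyp, start=1):
            substitution = prev[j - 1] + (0 if ref_word == hyp_word else 1)
            deletion = prev[j] + 1       # reference word missing from hypothesis
            insertion = curr[j - 1] + 1  # extra word in hypothesis
            curr[j] = min(substitution, deletion, insertion)
        prev = curr

    return prev[-1] / len(ref)


if __name__ == "__main__":
    # One substituted word out of four reference words.
    print(word_error_rate("a b c d", "a x c d"))  # 0.25
```

For isolated-word recognition, where each utterance is a single word, every error is a substitution, so the WER reduces to the fraction of misrecognized words.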

Publisher

Hindawi Limited

Subject

Human-Computer Interaction

Link

http://downloads.hindawi.com/journals/ahci/2023/7398538.pdf

References (28 articles)

1. Arabic natural language processing: an overview;I. Guellil;Journal of King Saud University-Computer and Information Sciences,2021

2. Natural language processing for dialectical Arabic: A survey;A. Shoufan

3. Surface Electromyography–Based Recognition, Synthesis, and Perception of Prosodic Subvocal Speech

4. Arabizi Detection and Conversion to Arabic

5. Sentiment analysis for Arabizi text

Cited by 1 article.

1. Speaker Recognition Using Convolutional Autoencoder in Mismatch Condition with Small Dataset in Noisy Background;Lecture Notes of the Institute for Computer Sciences, Social Informatics and Telecommunications Engineering;2024