Bidirectional deep architecture for Arabic speech recognition

Published:2019-04-20 Issue:1 Volume:9 Page:92-102
ISSN:2299-1093
Container-title:Open Computer Science

Author:

Zerari Naima¹, Abdelhamid Samir¹, Bouzgou Hassen², Raymond Christian³

Affiliation:

1. Laboratory of Automation and Manufacturing, Department of Industrial Engineering, University of Batna 2 Mostefa Ben Boulaid, Batna, 05000, Algeria

2. Department of Industrial Engineering, University of Batna 2 Mostefa Ben Boulaid, Batna, 05000, Algeria

3. INSA Rennes, IRISA/INRIA, Rennes, France

Abstract

Nowadays, real-life constraints necessitate controlling modern machines through human intervention by means of the sensory organs. The voice is one of the human faculties that can control or monitor modern interfaces. In this context, Automatic Speech Recognition is principally used to convert natural voice into computer text, as well as to perform an action based on the instructions given by the human. In this paper, we propose a general framework for Arabic speech recognition that uses a Long Short-Term Memory (LSTM) network and a neural network classifier (Multi-Layer Perceptron, MLP) to cope with the non-uniform sequence length of the speech utterances produced by two feature extraction techniques: (1) Mel Frequency Cepstral Coefficients (MFCC; static and dynamic features) and (2) Filter Bank (FB) coefficients. The neural architecture recognizes isolated Arabic speech via classification. The proposed system first extracts pertinent features from the natural speech signal using MFCC (static and dynamic features) and FB. Next, the extracted features are padded to deal with the non-uniform sequence lengths. Then, a deep architecture based on a recurrent LSTM or GRU (Gated Recurrent Unit) network encodes the sequences of MFCC/FB features as a fixed-size vector, which is fed to a Multi-Layer Perceptron (MLP) network to perform the classification (recognition). The proposed system is assessed on two different databases: the first concerns spoken digit recognition, where a comparison with related works in the literature is performed, whereas the second contains spoken TV commands. The obtained results show the superiority of the proposed approach.
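The pad, encode, and classify steps described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the weights are random and untrained, random arrays stand in for real MFCC/FB features, and the feature dimension (13), hidden size (16), MLP width (24), class count (10), and all function names are assumptions chosen only for illustration.

```python
import numpy as np

def pad_sequences(seqs, pad_value=0.0):
    """Zero-pad variable-length (T_i, D) feature matrices to (N, T_max, D)."""
    lengths = np.array([len(s) for s in seqs])
    n, t_max, d = len(seqs), lengths.max(), seqs[0].shape[1]
    batch = np.full((n, t_max, d), pad_value, dtype=np.float64)
    for i, s in enumerate(seqs):
        batch[i, :len(s)] = s
    return batch, lengths

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def init_lstm(d_in, d_hid, rng):
    """Random (untrained) LSTM weights; the four gates are stacked row-wise."""
    return {"W": rng.normal(0, 0.1, (4 * d_hid, d_in)),
            "U": rng.normal(0, 0.1, (4 * d_hid, d_hid)),
            "b": np.zeros(4 * d_hid)}

def lstm_last_state(x, params, reverse=False):
    """Run an LSTM over one unpadded sequence x of shape (T, D); return the final h."""
    d_hid = params["U"].shape[1]
    h, c = np.zeros(d_hid), np.zeros(d_hid)
    for x_t in (x[::-1] if reverse else x):
        z = params["W"] @ x_t + params["U"] @ h + params["b"]
        i, f, o, g = np.split(z, 4)  # input, forget, output gates and candidate
        c = sigmoid(f) * c + sigmoid(i) * np.tanh(g)
        h = sigmoid(o) * np.tanh(c)
    return h

def encode_batch(batch, lengths, fwd, bwd):
    """Concatenate final forward and backward states, skipping padded frames."""
    return np.stack([
        np.concatenate([lstm_last_state(batch[i, :n], fwd),
                        lstm_last_state(batch[i, :n], bwd, reverse=True)])
        for i, n in enumerate(lengths)])

def mlp_classify(v, w1, b1, w2, b2):
    """One hidden tanh layer followed by a softmax over the classes."""
    logits = np.tanh(v @ w1 + b1) @ w2 + b2
    logits = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    probs = np.exp(logits)
    return probs / probs.sum(axis=1, keepdims=True)

rng = np.random.default_rng(0)
# Stand-ins for three utterances of different lengths (frames x coefficients).
utterances = [rng.normal(size=(t, 13)) for t in (20, 35, 28)]
batch, lengths = pad_sequences(utterances)
fwd, bwd = init_lstm(13, 16, rng), init_lstm(13, 16, rng)
encoded = encode_batch(batch, lengths, fwd, bwd)   # fixed-size vector per utterance
w1, b1 = rng.normal(0, 0.1, (32, 24)), np.zeros(24)
w2, b2 = rng.normal(0, 0.1, (24, 10)), np.zeros(10)
probs = mlp_classify(encoded, w1, b1, w2, b2)
print(batch.shape, encoded.shape, probs.shape)     # (3, 35, 13) (3, 32) (3, 10)
```

The sketch slices each sequence back to its true length before encoding, so the padded frames do not influence the fixed-size vector. Whether the paper masks padding this way or lets the recurrent network read the padded frames is not stated in the abstract.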

Publisher

Walter de Gruyter GmbH

Subject

General Computer Science

Link

https://www.degruyter.com/downloadpdf/journals/comp/9/1/article-p92.xml


Cited by 33 articles.

1. Isolated word recognition based on a hyper-tuned cross-validated CNN-BiLSTM from Mel Frequency Cepstral Coefficients;Multimedia Tools and Applications;2024-07-04

2. A Study on Speech Recognition by a Neural Network Based on English Speech Feature Parameters;Journal of Advanced Computational Intelligence and Intelligent Informatics;2024-05-20

3. Amharic spoken digits recognition using convolutional neural network;Journal of Big Data;2024-05-04

4. Unsupervised phoneme segmentation of continuous Arabic speech;International Journal of Speech Technology;2024-05-02

5. Speech Recognition Utilizing Deep Learning: A Systematic Review of the Latest Developments;Human-centric Computing and Information Sciences;2024