Convolutional neural network acoustic model for robust Indonesian speech recognition in noisy environment

Published:2020-04-01 Issue:1 Volume:803 Page:012027
ISSN:1757-8981
Container-title:IOP Conference Series: Materials Science and Engineering
Short-container-title:IOP Conf. Ser.: Mater. Sci. Eng.

Author:

Budiman M J,Lestari D P

Abstract

Noise reduces the accuracy of automatic speech recognition (ASR). Several techniques have been proposed to overcome this problem. One is to use an artificial neural network (ANN) as the acoustic model; the convolutional neural network (CNN) is a variant of ANN that has been used for acoustic modeling. Another approach is to pre-process the speech signal, or the acoustic features extracted from it, for example with cepstral mean and variance normalization (CMVN). In this work, CNN acoustic models were trained on CMVN pre-processed acoustic features to build a noise-robust speech recognition system. Two groups of models were built to handle two kinds of noise (babble noise and street noise). The acoustic models were tested with noisy speech at different signal-to-noise ratio (SNR) values, and the results were compared with those of Gaussian Mixture Model-Hidden Markov Model (GMM-HMM) acoustic models. The results showed that accuracy improved when the models were trained on more varied training data. CNN acoustic models trained on FBANK features achieved higher accuracy scores than GMM-HMM models built using the same features.
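
The two signal-level operations named in the abstract, CMVN and testing at a chosen SNR, can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say whether CMVN was applied per utterance or per speaker, nor how noise was mixed into the speech, so per-utterance normalization and simple additive mixing are assumed here. The function names `cmvn` and `mix_at_snr` are hypothetical.

```python
import numpy as np

def cmvn(features, eps=1e-8):
    """Per-utterance CMVN (assumed variant).

    features: array of shape (frames, coefficients), e.g. FBANK features.
    Each coefficient is shifted to zero mean and scaled to unit variance
    over the frames of the utterance.
    """
    mean = features.mean(axis=0, keepdims=True)
    std = features.std(axis=0, keepdims=True)
    return (features - mean) / (std + eps)

def mix_at_snr(speech, noise, snr_db):
    """Add noise to speech so the result has the requested SNR in dB.

    Assumes plain additive mixing; the noise is repeated or trimmed
    to match the length of the speech signal.
    """
    noise = np.resize(noise, speech.shape)
    speech_power = np.mean(speech ** 2)
    noise_power = np.mean(noise ** 2)
    # SNR_dB = 10 * log10(speech_power / scaled_noise_power)
    target_noise_power = speech_power / (10 ** (snr_db / 10))
    scale = np.sqrt(target_noise_power / noise_power)
    return speech + scale * noise
```

Under these assumptions, a noisy test utterance would be produced with `mix_at_snr(speech, babble, snr_db)`, features would then be extracted from it, and `cmvn` would be applied to those features before they are passed to the acoustic model.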

Publisher

IOP Publishing

Subject

General Medicine

Link

https://iopscience.iop.org/article/10.1088/1757-899X/803/1/012027/pdf

References: 8 articles.

1. Overview of noise-robust automatic speech recognition;Li;IEEE/ACM Transactions on Audio, Speech, and Language Processing,2014

2. Convolutional neural networks for speech recognition;Abdel-Hamid;IEEE/ACM Transactions on Audio, Speech, and Language Processing,2014

Cited by 1 article.

1. Speech Recognition System for Writing Dentist Medical Records;2021 24th Conference of the Oriental COCOSDA International Committee for the Co-ordination and Standardisation of Speech Databases and Assessment Techniques (O-COCOSDA);2021-11-18