Optimizing Integrated Features for Hindi Automatic Speech Recognition System

Published:2018-10-01 Issue:1 Volume:29 Page:959-976
ISSN:2191-026X
Container-title:Journal of Intelligent Systems

Author:

Dua Mohit¹,Aggarwal Rajesh Kumar¹,Biswas Mantosh¹

Affiliation:

1. Department of Computer Engineering, National Institute of Technology, Kurukshetra 136119, India

Abstract

An automatic speech recognition (ASR) system translates spoken words or utterances (isolated, connected, continuous, and spontaneous) into text format. State-of-the-art ASR systems mainly use Mel frequency (MF) cepstral coefficient (MFCC), perceptual linear prediction (PLP), and Gammatone frequency (GF) cepstral coefficient (GFCC) for extracting features in the training phase of the ASR system. Initially, the paper proposes a sequential combination of all three feature extraction methods, taking two at a time. Six combinations, MF-PLP, PLP-MFCC, MF-GFCC, GF-MFCC, GF-PLP, and PLP-GFCC, are used, and the accuracy of the proposed system using all these combinations is tested. The results show that the GF-MFCC and MF-GFCC integrations outperform all other proposed integrations. Further, these two feature vector integrations are optimized using three different optimization methods: particle swarm optimization (PSO), PSO with crossover, and PSO with quadratic crossover (Q-PSO). The results demonstrate that the Q-PSO-optimized GF-MFCC integration shows significant improvement over all other optimized combinations.
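
For readers unfamiliar with the optimizer named in the abstract, the following is a minimal sketch of plain particle swarm optimization, not the authors' implementation. The abstract does not describe the fitness function, how feature vectors are encoded as particles, or the crossover variants, so everything here is an assumption for illustration: the function name `pso_minimize`, the parameter values, and the toy objective (squared distance to a fixed target vector) are all hypothetical.

```python
import random

def pso_minimize(fitness, dim, n_particles=20, n_iters=100,
                 inertia=0.7, c1=1.5, c2=1.5, bounds=(-1.0, 1.0), seed=0):
    """Plain PSO (no crossover); all defaults are illustrative assumptions."""
    rng = random.Random(seed)
    lo, hi = bounds
    # Random initial positions, zero initial velocities.
    pos = [[rng.uniform(lo, hi) for _ in range(dim)] for _ in range(n_particles)]
    vel = [[0.0] * dim for _ in range(n_particles)]
    # Each particle remembers its own best position and value.
    pbest = [p[:] for p in pos]
    pbest_val = [fitness(p) for p in pos]
    # The swarm shares one global best.
    g = min(range(n_particles), key=lambda i: pbest_val[i])
    gbest, gbest_val = pbest[g][:], pbest_val[g]
    for _ in range(n_iters):
        for i in range(n_particles):
            for d in range(dim):
                r1, r2 = rng.random(), rng.random()
                # Velocity update: inertia + pull toward personal and global bests.
                vel[i][d] = (inertia * vel[i][d]
                             + c1 * r1 * (pbest[i][d] - pos[i][d])
                             + c2 * r2 * (gbest[d] - pos[i][d]))
                # Move and clamp to the search bounds.
                pos[i][d] = min(hi, max(lo, pos[i][d] + vel[i][d]))
            val = fitness(pos[i])
            if val < pbest_val[i]:
                pbest[i], pbest_val[i] = pos[i][:], val
                if val < gbest_val:
                    gbest, gbest_val = pos[i][:], val
    return gbest, gbest_val

# Toy objective (hypothetical): squared distance to a fixed target vector.
TARGET = [0.3, -0.2, 0.5]

def toy_fitness(x):
    return sum((a - b) ** 2 for a, b in zip(x, TARGET))

best, best_val = pso_minimize(toy_fitness, dim=3)
```

In an ASR setting the fitness would presumably be tied to recognition error on held-out speech rather than a closed-form function, and the crossover variants would add a recombination step between particles; neither is shown here because the abstract does not specify them.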

Publisher

Walter de Gruyter GmbH

Subject

Artificial Intelligence,Information Systems,Software

Link

https://www.degruyter.com/document/doi/10.1515/jisys-2018-0057/pdf

Reference 70 articles.

1. Developments and directions in speech recognition and understanding, Part 1 [DSP Education];IEEE Signal Process. Mag.,2009

2. Filterbank optimization for robust ASR using GA and PSO;Int. J. Speech Technol.,2012

3. Discriminative training using noise robust integrated features and refined HMM modeling,2020

4. Speech recognition using ANN and predator-influenced civilized swarm optimization algorithm;Turk. J. Elect. Eng. Comput. Sci.,2016

Cited by 9 articles.

1. Feature extraction using GTCC spectrogram and ResNet50 based classification for audio spoof detection;International Journal of Speech Technology;2024-03

2. NRASV: Noise Robust ASV System for Audio Replay Attack Detection;Lecture Notes in Networks and Systems;2024

3. Noise robust automatic speech recognition: review and analysis;International Journal of Speech Technology;2023-06-24

4. Comparative Analysis of Different Parameters used for Optimization in the Process of Speaker and Speech Recognition using Deep Neural Network;2022 International Conference on Future Trends in Smart Communities (ICFTSC);2022-12-01

5. Developing a Speech Recognition System for Recognizing Tonal Speech Signals Using a Convolutional Neural Network;Applied Sciences;2022-06-19