Affiliation:
1. KIRŞEHİR AHİ EVRAN ÜNİVERSİTESİ
Abstract
One of the essential factors affecting recognition rates in speech recognition studies is environmental background noise. This study used a speech database containing different noise types to perform speaker-independent isolated word recognition, which makes it possible to examine how noisy speech signals affect the recognition performance of classifiers. The classifiers used were K-Nearest Neighbors (KNN), Fisher Linear Discriminant Analysis-KNN (FLDA-KNN), Discriminative Common Vector Approach (DCVA), Support Vector Machines (SVM), Convolutional Neural Network (CNN), and Recurrent Neural Network with Long Short-Term Memory (RNN-LSTM). Mel-frequency cepstral coefficients (MFCC) and perceptual linear prediction (PLP) coefficients were used as feature vectors. The DCVA classifier was tested extensively for isolated word recognition for the first time in the literature. For the KNN, FLDA-KNN, and DCVA classifiers, recognition was carried out using various distance measures. In addition, new (DCVA)PCA and (FLDA-KNN)PCA classifiers were designed as hybrid algorithms using Principal Component Analysis (PCA), and they yielded better recognition results than the DCVA and FLDA-KNN classifiers. In the experimental studies, the highest recognition rate of RNN-LSTM was 93.22%. For the other classifiers, the highest recognition rates of the CNN, KNN, DCVA, (DCVA)PCA, SVM, FLDA-KNN, and (FLDA-KNN)PCA were 87.56%, 86.51%, 74.23%, 79%, 77.78%, 71.37%, and 84.90%, respectively.
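
As a rough illustration of what recognition "using various distance measures" can look like for a KNN classifier, the following is a minimal sketch and not the implementation used in the study. The abstract does not name the distance measures, so the three shown here (Euclidean, Manhattan, cosine) are assumptions, as are the function names and the two-dimensional toy vectors, which merely stand in for real MFCC or PLP feature vectors.

```python
import math
from collections import Counter


def euclidean(a, b):
    """Straight-line (L2) distance between two feature vectors."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def manhattan(a, b):
    """City-block (L1) distance between two feature vectors."""
    return sum(abs(x - y) for x, y in zip(a, b))


def cosine_distance(a, b):
    """One minus the cosine similarity; 1.0 if either vector is all zeros."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 1.0
    return 1.0 - dot / (norm_a * norm_b)


# Hypothetical registry of distance measures (an assumption, not from the study).
DISTANCES = {
    "euclidean": euclidean,
    "manhattan": manhattan,
    "cosine": cosine_distance,
}


def knn_predict(train_vectors, train_labels, query, k=3, metric="euclidean"):
    """Return the majority label among the k training vectors nearest to query."""
    dist = DISTANCES[metric]
    # Sort training samples by their distance to the query, nearest first.
    ranked = sorted(
        zip(train_vectors, train_labels),
        key=lambda pair: dist(pair[0], query),
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Toy data: two "words", each with two made-up feature vectors.
train_vectors = [[1.0, 0.1], [0.9, 0.2], [0.1, 1.0], [0.2, 0.9]]
train_labels = ["yes", "yes", "no", "no"]

for name in DISTANCES:
    print(name, knn_predict(train_vectors, train_labels, [0.8, 0.3], k=3, metric=name))
```

Swapping the `metric` argument changes only how neighbors are ranked, which is why the same classifier can report different recognition rates under different distance measures.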
Publisher
Kirsehir Ahi Evran University