SPEAKER IDENTIFICATION BY AGGREGATING GAUSSIAN MIXTURE MODELS (GMMs) BASED ON UNCORRELATED MFCC-DERIVED FEATURES

Published: 2014-06 Issue: 04 Volume: 28 Page: 1456006
ISSN: 0218-0014
Container-title: International Journal of Pattern Recognition and Artificial Intelligence
Language: en
Short-container-title: Int. J. Patt. Recogn. Artif. Intell.

Author:

PAL AMITA¹, BOSE SMARAJIT¹, BASAK GOPAL K.², MUKHOPADHYAY AMITAVA³

Affiliation:

1. Applied Statistics Division, Indian Statistical Institute, Kolkata, India

2. Theoretical Statistics and Mathematics Division, Indian Statistical Institute, Kolkata, India

3. Interra Information Technologies, Kolkata, India

Abstract

For solving speaker identification problems, the approach proposed by Reynolds [IEEE Signal Process. Lett. 2 (1995) 46–48], using Gaussian Mixture Models (GMMs) based on Mel Frequency Cepstral Coefficients (MFCCs) as features, is one of the most effective available in the literature. The use of GMMs for modeling speaker identity is motivated by the interpretation that the Gaussian components represent some general speaker-dependent spectral shapes, and also by the capability of Gaussian mixtures to model arbitrary densities. In this work, we have initially illustrated, with the help of a new bilingual speech corpus, how the well-known principal component transformation, in conjunction with the principle of classifier combination, can be used to enhance the performance of MFCC-GMM speaker recognition systems significantly. Subsequently, we have emphatically and rigorously established the same using the benchmark speech corpus NTIMIT. A significant outcome of this work is that the proposed approach has the potential to enhance the performance of any speaker recognition system based on correlated features.

Publisher

World Scientific Pub Co Pte Lt

Subject

Artificial Intelligence, Computer Vision and Pattern Recognition, Software

Link

https://www.worldscientific.com/doi/pdf/10.1142/S0218001414560060

References: 21 articles.

1. Speaker identification by combining multiple classifiers using Dempster–Shafer theory of evidence

2. Subband architecture for automatic speaker recognition

3. Bagging predictors

4. Speaker recognition: a tutorial

Cited by 4 articles.

1. A Survey of Speaker Recognition: Fundamental Theories, Recognition Methods and Opportunities; IEEE Access; 2021

2. Speaker identification features extraction methods: A systematic review; Expert Systems with Applications; 2017-12

3. Feature Extraction Methods for Speaker Recognition: A Review; International Journal of Pattern Recognition and Artificial Intelligence; 2017-09-17

4. Speaker Identification Using Semi-supervised Learning; Speech and Computer; 2015