Speaker Verification Under Degraded Conditions Using Empirical Mode Decomposition Based Voice Activity Detection Algorithm

Author:

Rudramurthy M. S. (1), Prasad V. Kamakshi (2), Kumaraswamy R. (3)

Affiliation:

1. Department of Information Science and Engineering, S.I.T., Tumkur 572 103, Karnataka State, India

2. Department of Computer Science, JNTUH, Kukatpally, Hyderabad 500 085, A.P. State, India

3. Department of Electronics and Communication Engineering, S.I.T., Tumkur 572 103, Karnataka State, India

Abstract

The performance of most state-of-the-art speaker recognition (SR) systems deteriorates under degraded conditions because of the mismatch between training and testing sessions. This study focuses on the front end of the speaker verification (SV) system to reduce that mismatch. An adaptive voice activity detection (VAD) algorithm using zero-frequency filter assisted peaking resonator (ZFFPR) was integrated into the front end of the SV system. The performance of the proposed SV system was studied under degraded conditions with 50 selected speakers from the NIST 2003 database. The degraded conditions were simulated by adding different types of noise, chosen from the NOISEX-92 database, to the original speech utterances at signal-to-noise ratio (SNR) levels from 0 to 20 dB. The features were the widely used 39-dimensional Mel frequency cepstral coefficients (MFCCs), i.e., 13-dimensional MFCCs augmented with 13-dimensional velocity and 13-dimensional acceleration coefficients, and a Gaussian mixture model–universal background model (GMM-UBM) was used for speaker modeling. The proposed system's performance was compared against that of an SV system with energy-based VAD at its front end. The proposed SV system showed encouraging results when empirical mode decomposition (EMD)-based VAD was used at its front end.
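The noise-mixing step and the energy-based baseline mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the function names, the 8 kHz sampling rate, the 160-sample frame length, and the thresholding rule are all assumptions made for illustration, and a synthetic tone burst stands in for real speech and NOISEX-92 noise.

```python
import math
import random


def power(x):
    """Mean squared amplitude of a sequence of samples."""
    return sum(v * v for v in x) / len(x)


def mix_at_snr(speech, noise, snr_db):
    """Scale `noise` so the speech-to-noise power ratio equals `snr_db`, then add it."""
    if len(noise) < len(speech):
        raise ValueError("noise must be at least as long as speech")
    noise = noise[:len(speech)]
    # Required noise power: P_speech / 10^(SNR_dB / 10)
    target_noise_power = power(speech) / (10.0 ** (snr_db / 10.0))
    gain = math.sqrt(target_noise_power / power(noise))
    return [s + gain * n for s, n in zip(speech, noise)]


def energy_vad(signal, frame_len=160, threshold_ratio=0.1):
    """Baseline energy VAD (hypothetical rule): a frame counts as speech if its
    energy exceeds `threshold_ratio` times the mean frame energy of the utterance."""
    frames = [signal[i:i + frame_len]
              for i in range(0, len(signal) - frame_len + 1, frame_len)]
    energies = [power(f) for f in frames]
    threshold = threshold_ratio * (sum(energies) / len(energies))
    return [e > threshold for e in energies]


if __name__ == "__main__":
    rng = random.Random(0)
    # Synthetic stand-in for speech: silence, a 200 Hz tone burst, silence
    # (8 kHz sampling rate assumed).
    speech = ([0.0] * 800
              + [math.sin(2 * math.pi * 200 * t / 8000) for t in range(1600)]
              + [0.0] * 800)
    noise = [rng.gauss(0.0, 1.0) for _ in range(len(speech))]
    for snr_db in (0, 5, 10, 15, 20):
        noisy = mix_at_snr(speech, noise, snr_db)
        speech_frames = sum(energy_vad(noisy))
        print(f"SNR {snr_db:2d} dB: {speech_frames} frames marked as speech")
```

At low SNR a fixed energy threshold of this kind tends to mark noise-only frames as speech, which is the sort of front-end mismatch an adaptive VAD is meant to reduce.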

Publisher

Walter de Gruyter GmbH

Subject

Artificial Intelligence, Information Systems, Software

Cited by 4 articles.

1. Semantic speech analysis using machine learning and deep learning techniques: a comprehensive review;Multimedia Tools and Applications;2023-12-19

2. Online Signature Recognition: A Biologically Inspired Feature Vector Splitting Approach;Cognitive Computation;2023-09-15

3. A Review on Emotion Based Harmful Speech Detection Using Machine Learning;2022 IEEE 22nd International Symposium on Computational Intelligence and Informatics and 8th IEEE International Conference on Recent Achievements in Mechatronics, Automation, Computer Science and Robotics (CINTI-MACRo);2022-11-21

4. Role of Speech Separation in Verifying the Speaker Under Degraded Conditions Using EMD and Hilbert Transform;Algorithms for Intelligent Systems;2022
