Speaker Recognition Using Wavelet Cepstral Coefficient, I-Vector, and Cosine Distance Scoring and Its Application for Forensics

Published:2016 Issue: Volume:2016 Page:1-11
ISSN:2090-0147
Container-title:Journal of Electrical and Computer Engineering
language:en
Short-container-title:Journal of Electrical and Computer Engineering

Author:

Lei Lei¹, Kun She¹

Affiliation:

1. Laboratory of Cyberspace, School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China

Abstract

An important application of speaker recognition is forensics. However, the accuracy of speaker recognition in forensic cases often drops rapidly because of ambient noise, channel variability, differing durations of speech data, and similar factors. Finding a robust speaker recognition model is therefore very important for forensics. This paper builds a new speaker recognition model based on the wavelet cepstral coefficient (WCC), the i-vector, and cosine distance scoring (CDS). The model first uses the WCC to transform speech into spectral feature vectors and then uses those vectors to train i-vectors that represent utterances of different durations. CDS compares the i-vectors to produce the evidence. Moreover, linear discriminant analysis (LDA) and within-class covariance normalization (WCCN) are added to the CDS algorithm to deal with channel variability. Finally, the likelihood ratio estimates the strength of the evidence. We use the TIMIT database to evaluate the performance of the proposed model. The experimental results show that the proposed model can effectively address these problems of the forensic scenario, but the time cost of the method is high.
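The scoring step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes the i-vectors have already been extracted, omits the LDA stage, and uses hypothetical function names (`wccn_projection`, `cosine_score`) together with a small ridge term added here only for numerical stability. WCCN is written in its common form, where a matrix B with B Bᵀ = W⁻¹ is obtained from the average within-speaker covariance W and each i-vector is projected by Bᵀ before the cosine is taken.

```python
import numpy as np


def wccn_projection(ivectors, labels, ridge=1e-6):
    """Return B such that B @ B.T equals the inverse within-class covariance.

    ivectors: array of shape (n_utterances, dim); labels: speaker id per row.
    The ridge term is an assumption of this sketch, not part of the paper.
    """
    ivectors = np.asarray(ivectors, dtype=float)
    labels = np.asarray(labels)
    dim = ivectors.shape[1]
    speakers = np.unique(labels)

    # Average the per-speaker covariance matrices.
    W = np.zeros((dim, dim))
    for speaker in speakers:
        x = ivectors[labels == speaker]
        centered = x - x.mean(axis=0)
        W += centered.T @ centered / len(x)
    W /= len(speakers)
    W += ridge * np.eye(dim)

    # Cholesky factor of W^-1 gives the WCCN projection matrix.
    return np.linalg.cholesky(np.linalg.inv(W))


def cosine_score(w1, w2, B=None):
    """Cosine similarity between two i-vectors, optionally after WCCN."""
    w1 = np.asarray(w1, dtype=float)
    w2 = np.asarray(w2, dtype=float)
    if B is not None:
        w1, w2 = B.T @ w1, B.T @ w2
    return float(w1 @ w2 / (np.linalg.norm(w1) * np.linalg.norm(w2)))
```

In a verification setting, the score for a questioned and a reference recording would be compared against a threshold or passed to a separate calibration step to obtain a likelihood ratio; that step is not sketched here.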

Funder

Technology Support Program of Sichuan Province

Publisher

Hindawi Limited

Subject

Electrical and Electronic Engineering,General Computer Science,Signal Processing

Link

http://downloads.hindawi.com/journals/jece/2016/4908412.pdf

References: 23 articles.

1. Robust estimation, interpretation and assessment of likelihood ratios in forensic speaker recognition

2. An overview of text-independent speaker recognition: From features to supervectors

3. The inference of identity in forensic speaker recognition

Cited by 12 articles.

1. A New Facial Expression Recognition Algorithm Based on DWT Feature Extraction and Selection;The International Arab Journal of Information Technology;2024

2. Voiceprint Recognition under Cross-Scenario Conditions Using Perceptual Wavelet Packet Entropy-Guided Efficient-Channel-Attention–Res2Net–Time-Delay-Neural-Network Model;Mathematics;2023-10-09

3. Wavelet Packet Sub-band Cepstral Coefficient for Speaker Verification;2022 IEEE 6th Advanced Information Technology, Electronic and Automation Control Conference (IAEAC);2022-10-03

4. A Discrete Wavelet Transform-Based Voice Activity Detection and Noise Classification with Sub-Band Selection;2021 IEEE International Symposium on Circuits and Systems (ISCAS);2021-05

5. On Cluster-Aware Supervised Learning: Frameworks, Convergent Algorithms, and Applications;INFORMS Journal on Computing;2021-03-09