Abstract
Speaker verification is a biometric method for authenticating individuals. However, achieving high performance in short-utterance, text-independent conditions remains challenging, possibly because the speaker-specific features are weak. Recently, deep learning algorithms have been used extensively in speech processing. This manuscript uses a deep belief network (DBN) as a deep generative method for feature extraction in speaker verification systems. The study aims to show the impact of the proposed method on several challenging issues, including short utterances, text independence, language variation, and large-scale speaker verification. The proposed DBN takes MFCCs as input and attempts to extract more efficient features. This new representation of speaker information is evaluated in two popular speaker verification systems: GMM-UBM and i-vector-PLDA. The results show that, for the i-vector-PLDA system, the proposed feature decreases the EER considerably, from 15.24% to 10.97%. In another experiment, the DBN is used to reduce the feature dimension, which significantly decreases computational time and increases system response speed. As a case study, all evaluations are performed on 1270 speakers of the NIST SRE2008 dataset. We show that deep belief networks can be used with state-of-the-art acoustic modeling methods and on more challenging datasets.
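The equal error rate (EER) reported above is the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how an EER could be estimated from trial scores, assuming a simple threshold sweep over the observed scores; the function names `equal_error_rate` and `relative_reduction` and the toy scores are hypothetical illustrations, not the authors' evaluation code.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Approximate the EER by sweeping a threshold over all observed scores.

    Assumption: a trial is accepted when its score is >= the threshold.
    Returns the mean of FAR and FRR at the threshold where they are closest.
    """
    if not target_scores or not impostor_scores:
        raise ValueError("both score lists must be non-empty")
    best_gap, best_eer = None, None
    for threshold in sorted(set(target_scores) | set(impostor_scores)):
        # False acceptance rate: impostor trials that are accepted.
        far = sum(s >= threshold for s in impostor_scores) / len(impostor_scores)
        # False rejection rate: target trials that are rejected.
        frr = sum(s < threshold for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer


def relative_reduction(baseline, improved):
    """Relative decrease of an error rate with respect to its baseline."""
    return (baseline - improved) / baseline


if __name__ == "__main__":
    # Toy scores, invented purely for illustration.
    targets = [0.9, 0.8, 0.3, 0.7]
    impostors = [0.1, 0.2, 0.75, 0.4]
    print(f"EER on toy scores: {equal_error_rate(targets, impostors):.2%}")
    # EER figures quoted in the abstract for the i-vector-PLDA system.
    print(f"Relative EER reduction: {relative_reduction(15.24, 10.97):.1%}")
```

With the figures quoted in the abstract, the absolute drop from 15.24% to 10.97% corresponds to a relative EER reduction of roughly 28%.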
Publisher
Springer Science and Business Media LLC