Affiliation:
1. Electrical and Instrumentation Engineering Department, Thapar Institute of Engineering and Technology, Patiala 147004, India
Abstract
Speaker verification is a critically important application of speaker recognition, yet it is also the most challenging and least well understood. Robust feature extraction plays an integral role in enhancing the efficiency of forensic speaker verification. Although the speech signal is a continuous one-dimensional time series, most recent approaches rely on recurrent neural network (RNN) or convolutional neural network (CNN) models, which cannot fully represent human speech and are therefore vulnerable to speech forgery. A reliable technique is thus needed to model human speech accurately and to ensure speaker authenticity. This research article presents a Two-Tier Feature Extraction with Metaheuristics-Based Automated Forensic Speaker Verification (TTFEM-AFSV) model, which aims to overcome the limitations of previous models. The TTFEM-AFSV model verifies speakers in forensic applications by first applying the average median filtering (AMF) technique to remove noise from speech signals. Next, Mel-frequency cepstral coefficients (MFCCs) and spectrograms are fed as inputs to the deep convolutional neural network-based Inception v3 model, and the Ant Lion Optimizer (ALO) algorithm is used to fine-tune the hyperparameters of the Inception v3 model. Finally, a long short-term memory recurrent neural network (LSTM-RNN) is employed as the classifier for automated speaker recognition. The performance of the TTFEM-AFSV model was validated in a series of experiments, and a comparative study showed significantly improved performance over recent approaches.
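To make the first tier of the pipeline more concrete, the following is a minimal sketch of the denoise-then-extract front end (noise removal, spectrogram, MFCC) in plain Python. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not define AMF, so a plain sliding-window median filter stands in for it, and the function names, frame length, hop size, filter count, and the synthetic 440 Hz test tone are all hypothetical choices made for this example. The Inception v3, ALO, and LSTM-RNN stages are not covered.

```python
import cmath
import math
from statistics import median


def median_filter(signal, width=5):
    """Sliding-window median (a stand-in for AMF); windows are clipped at the edges."""
    half = width // 2
    n = len(signal)
    return [median(signal[max(0, i - half):min(n, i + half + 1)]) for i in range(n)]


def power_spectrogram(signal, frame_len=64, hop=32):
    """Hann-windowed frames -> one-sided power spectrum per frame (naive DFT)."""
    window = [0.5 - 0.5 * math.cos(2 * math.pi * k / (frame_len - 1))
              for k in range(frame_len)]
    frames = []
    for start in range(0, len(signal) - frame_len + 1, hop):
        frame = [signal[start + k] * window[k] for k in range(frame_len)]
        spectrum = []
        for b in range(frame_len // 2 + 1):
            acc = sum(frame[k] * cmath.exp(-2j * math.pi * b * k / frame_len)
                      for k in range(frame_len))
            spectrum.append(abs(acc) ** 2)
        frames.append(spectrum)
    return frames


def hz_to_mel(f):
    return 2595.0 * math.log10(1.0 + f / 700.0)


def mel_to_hz(m):
    return 700.0 * (10 ** (m / 2595.0) - 1.0)


def mel_filterbank(n_filters, n_bins, sample_rate):
    """Triangular filters with centres spaced evenly on the mel scale."""
    top = hz_to_mel(sample_rate / 2)
    edges = [mel_to_hz(top * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [b * (sample_rate / 2) / (n_bins - 1) for b in range(n_bins)]
    bank = []
    for j in range(1, n_filters + 1):
        lo, mid, hi = edges[j - 1], edges[j], edges[j + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))
            else:
                row.append(0.0)
        bank.append(row)
    return bank


def mfcc(power_frames, sample_rate, n_filters=12, n_coeffs=8):
    """Log mel-band energies followed by a DCT-II, one coefficient vector per frame."""
    bank = mel_filterbank(n_filters, len(power_frames[0]), sample_rate)
    out = []
    for spec in power_frames:
        log_e = [math.log(sum(w * p for w, p in zip(row, spec)) + 1e-10)
                 for row in bank]
        out.append([sum(log_e[m] * math.cos(math.pi * c * (m + 0.5) / n_filters)
                        for m in range(n_filters))
                    for c in range(n_coeffs)])
    return out


# Hypothetical input: a 440 Hz tone at 8 kHz with two injected impulse spikes.
sample_rate = 8000
clean = [math.sin(2 * math.pi * 440 * n / sample_rate) for n in range(512)]
noisy = list(clean)
noisy[100], noisy[300] = 5.0, -5.0

denoised = median_filter(noisy)       # impulse noise removal
spec = power_spectrogram(denoised)    # spectrogram feature
coeffs = mfcc(spec, sample_rate)      # MFCC feature
```

In a full system along the lines the abstract describes, the two feature maps (`spec` and `coeffs` here) would be passed to the CNN stage; the naive DFT above is chosen only for self-containment and would normally be replaced by an FFT routine.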
Subject
Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering