Replay Attack Detection Using Integrated Glottal Excitation Based Group Delay Function and Cepstral Features
Author:
Chaudhari Amol1, Shedge Dnyandeo1, Bairagi Vinayak1, Nanthaamornphong Aziz2
Affiliation:
1. Department of Electronics and Telecommunication Engineering, AISSMS Institute of Information Technology, Pune 411001, India
2. College of Computing, Prince of Songkla University, Phuket Campus, Phuket 83120, Thailand
Abstract
Automatic speaker verification (ASV) systems are susceptible to replay attacks. Recent literature on replay attack detection has focused on score-level integration of multiple features, phase-information-based features, high-frequency-based features, and glottal excitation. This work presents glottal excitation-based all-pole group delay function (GAPGDF) features for replay attack detection. The essence of a group delay function based on the all-pole model is to exploit information from the phase spectrum of the speech signal effectively. Further, the performance of high-frequency-based CQCC features integrated with cepstral features, subband spectral centroid-based features (SCFC and SCMC), APGDF, and LPC-based features is evaluated on the ASVspoof 2017 version 2.0 database. On the development set, an equal error rate (EER) of 3.08% is achieved, and on the evaluation set, an EER of 9.86% is achieved. The proposed GAPGDF features provide an EER of 10.5% on the evaluation set. Finally, integrated GAPGDF and GCQCC features provide an EER of 8.80% on the evaluation set. The computation time required for the ASV systems based on various integrated features is compared to ensure symmetry between the integrated features and the classifier.
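As a rough illustration of the all-pole group delay idea mentioned in the abstract, the following Python sketch fits an all-pole (LPC) model to one frame with the autocorrelation method and then evaluates the group delay of the resulting filter H(z) = 1/A(z). The function names, the model order, and the number of frequency points are assumptions made for this example. The abstract does not specify the paper's GAPGDF pipeline (glottal excitation estimation, frame settings, any post-processing), so none of that is reproduced here.

```python
import cmath
import math


def lpc_autocorrelation(frame, order):
    """Estimate all-pole coefficients [1, a1, ..., ap] via Levinson-Durbin.

    Hypothetical helper for illustration; not taken from the paper.
    """
    n = len(frame)
    # Autocorrelation of the frame for lags 0..order.
    r = [sum(frame[i] * frame[i + lag] for i in range(n - lag))
         for lag in range(order + 1)]
    a = [1.0] + [0.0] * order
    err = r[0]
    for m in range(1, order + 1):
        if err <= 0.0:
            break  # degenerate frame (e.g. all zeros): stop early
        acc = sum(a[k] * r[m - k] for k in range(m))
        k_m = -acc / err  # reflection coefficient for this order
        prev = a[:]
        for k in range(1, m + 1):
            a[k] = prev[k] + k_m * prev[m - k]
        err *= 1.0 - k_m * k_m
    return a


def all_pole_group_delay(a, n_freqs=256):
    """Group delay (in samples) of H(z) = 1/A(z) at n_freqs points in [0, pi).

    With A(w) = sum_k a_k e^{-jwk} and B(w) = sum_k k a_k e^{-jwk},
    the group delay of A is Re(B/A), so that of 1/A is -Re(B/A).
    """
    delays = []
    for i in range(n_freqs):
        w = math.pi * i / n_freqs
        A = sum(a_k * cmath.exp(-1j * w * k) for k, a_k in enumerate(a))
        B = sum(k * a_k * cmath.exp(-1j * w * k) for k, a_k in enumerate(a))
        delays.append(-(B / A).real)
    return delays


# Example: a decaying exponential is well modeled by a single pole near 0.5.
frame = [0.5 ** n for n in range(50)]
coeffs = lpc_autocorrelation(frame, order=1)
group_delay = all_pole_group_delay(coeffs, n_freqs=8)
```

Because the group delay is computed directly from the model coefficients, no phase unwrapping is needed, which is one common motivation for using an all-pole model to access phase information. A real front end would apply this per windowed frame and typically use a higher model order.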
Funder
The College of Computing at Prince of Songkla University, Thailand