Multi-task Learning-Based Spoofing-Robust Automatic Speaker Verification System

Published:2022-02-18 Issue:7 Volume:41 Page:4068-4089
ISSN:0278-081X
Container-title:Circuits, Systems, and Signal Processing
language:en
Short-container-title:Circuits Syst Signal Process

Author:

Zhao Yuanjun, Togneri Roberto, Sreeram Victor

Abstract

Spoofing attacks posed by generating artificial speech can severely degrade the performance of a speaker verification system. Recently, many anti-spoofing countermeasures have been proposed for detecting varying types of attacks from synthetic speech to replay presentations. While there are numerous effective defenses reported on standalone anti-spoofing solutions, the integration for speaker verification and spoofing detection systems has obvious benefits. In this paper, we propose a spoofing-robust automatic speaker verification system for diverse attacks based on a multi-task learning architecture. This deep learning-based model is jointly trained with time-frequency representations from utterances to provide recognition decisions for both tasks simultaneously. Compared with other state-of-the-art systems on the ASVspoof 2017 and 2019 corpora, a substantial improvement of the combined system under different spoofing conditions can be obtained.

Funder

University of Western Australia

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics, Signal Processing

Link

https://link.springer.com/content/pdf/10.1007/s00034-022-01974-z.pdf

References: 46 articles.

1. F. Alegre, A. Amehraye, N. Evans, Spoofing countermeasures to protect automatic speaker verification from voice conversion, in 2013 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP) (IEEE, 2013), pp. 3068–3072

2. F. Alegre, R. Vipperla, A. Amehraye, N. Evans, A new speaker verification spoofing countermeasure based on local binary patterns, in Interspeech (2013), pp. 940–944

3. C. Chen, A. Ross, A multi-task convolutional neural network for joint iris detection and presentation attack detection, in 2018 IEEE Winter Applications of Computer Vision Workshops (WACVW) (IEEE, 2018), pp. 44–51

4. D. Chen, B.K.W. Mak, Multitask learning of deep neural networks for low-resource speech recognition. IEEE/ACM Trans. Audio Speech Lang. Process. 23(7), 1172–1183 (2015)

5. P.L. De Leon, M. Pucher, J. Yamagishi, I. Hernaez, I. Saratxaga, Evaluation of speaker verification security and detection of HMM-based synthetic speech. IEEE/ACM Trans. Audio Speech Lang. Process. 20(8), 2280–2290 (2012)

Cited by 9 articles.

1. CPAUG: Refining Copy-Paste Augmentation for Speech Anti-Spoofing;ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP);2024-04-14

2. Analysis of Deep Generative Model Impact on Feature Extraction and Dimension Reduction for Short Utterance Text-Independent Speaker Verification;Circuits, Systems, and Signal Processing;2024-04-13

3. Synthetic Speech Detection Based on the Temporal Consistency of Speaker Features;IEEE Signal Processing Letters;2024

4. Employing Discrete Fractional Wavelet Transform for Text-Dependent Speaker Verification;2024

5. Text-dependent speaker verification using discrete wavelet transform based on linear prediction coding;Biomedical Signal Processing and Control;2023-09