Bidirectional Attention for Text-Dependent Speaker Verification

Published:2020-11-27 Issue:23 Volume:20 Page:6784
ISSN:1424-8220
Container-title:Sensors
language:en

Author:

Fang Xin, Gao Tian, Zou Liang, Ling Zhenhua

Abstract

Automatic speaker verification offers a flexible and effective means of biometric authentication. Previous deep learning-based methods have shown promising results, but several problems remain open. In prior work on speaker-discriminative neural networks, the representation of the target speaker is treated as fixed when it is compared with utterances from different speakers, and the joint information between enrollment and evaluation utterances is ignored. In this paper, we propose to combine CNN-based feature learning with a bidirectional attention mechanism to achieve better performance with only one enrollment utterance. The evaluation-enrollment joint information is exploited through bidirectional attention to provide interactive features. In addition, we introduce a separate cost function for identifying the phonetic content, which helps compute the attention scores more precisely. These interactive features are complementary to the constant ones, which are extracted from each speaker separately and do not vary with the evaluation utterance. The proposed method achieved a competitive equal error rate of 6.26% on the internal “DAN DAN NI HAO” benchmark dataset with 1250 utterances and outperformed various baseline methods, including the traditional i-vector/PLDA, d-vector, self-attention, and sequence-to-sequence attention models.
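The equal error rate reported above is the operating point at which the false-acceptance rate and the false-rejection rate coincide. As a minimal sketch of how such a figure is typically computed from trial scores, and not the authors' evaluation code, the following assumes that higher scores mean "same speaker"; the function name `equal_error_rate` and the example scores are hypothetical.

```python
def equal_error_rate(target_scores, impostor_scores):
    """Return (eer, threshold) at the point where FAR and FRR are closest.

    Assumption: a trial is accepted when its score is >= threshold.
    """
    # Every observed score is a candidate decision threshold.
    thresholds = sorted(set(target_scores) | set(impostor_scores))
    best_gap, eer, best_thr = float("inf"), 1.0, thresholds[0]
    for thr in thresholds:
        # False-acceptance rate: impostor trials wrongly accepted.
        far = sum(s >= thr for s in impostor_scores) / len(impostor_scores)
        # False-rejection rate: target trials wrongly rejected.
        frr = sum(s < thr for s in target_scores) / len(target_scores)
        gap = abs(far - frr)
        if gap < best_gap:
            # Report the midpoint, since FAR and FRR rarely match exactly
            # on a finite set of trials.
            best_gap, eer, best_thr = gap, (far + frr) / 2, thr
    return eer, best_thr


# Hypothetical scores: four same-speaker trials, four impostor trials.
eer, thr = equal_error_rate([0.9, 0.8, 0.7, 0.4], [0.6, 0.3, 0.2, 0.1])
print(eer, thr)  # 0.25 0.6
```

With a threshold of 0.6, one of the four impostor trials is accepted and one of the four target trials is rejected, so both error rates equal 25%.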

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry

Link

https://www.mdpi.com/1424-8220/20/23/6784/pdf

References: 33 articles.

1. HMM-Based Phrase-Independent i-Vector Extractor for Text-Dependent Speaker Verification

2. Adversarially Learned Total Variability Embedding for Speaker Recognition with Random Digit Strings

3. Optimization of the area under the ROC curve using neural network supervectors for text-dependent speaker verification

4. Forensic Speaker Verification Using Ordinary Least Squares

5. Text-dependent speaker verification: Classifiers, databases and RSR2015

Cited by 10 articles.

1. Fluid Inclusion Detection Based on Improved YOLOX;2023 4th International Conference on Computer, Big Data and Artificial Intelligence (ICCBD+AI);2023-12-15

2. Application of Split Residual Multilevel Attention Network in Speaker Recognition;IEEE Access;2023

3. Rapid Qualitative Analysis of Wool Content Based on Improved U-Net++ and Near-Infrared Spectroscopy;SPECTROSC SPECT ANAL;2023

4. A Comprehensive Review on Speaker Recognition;Advances in Speech and Music Technology;2022-09-23

5. Attention-Based Temporal-Frequency Aggregation for Speaker Verification;Sensors;2022-03-10