Authors:
Shen Qian, Guo Mengxi, Huang YiDa, Ma Jianfen
Publisher:
Springer Science and Business Media LLC