Synthetic speech detection through short-term and long-term prediction traces

Published:2021-04-06 Issue:1 Volume:2021
ISSN:2510-523X
Container-title:EURASIP Journal on Information Security
language:en
Short-container-title:EURASIP J. on Info. Security

Author:

Borrelli Clara,Bestagini Paolo,Antonacci Fabio,Sarti Augusto,Tubaro Stefano

Abstract

Several methods for synthetic audio speech generation have been developed in the literature through the years. With the great technological advances brought by deep learning, many novel synthetic speech techniques achieving incredibly realistic results have been recently proposed. As these methods generate convincing fake human voices, they can be used maliciously to negatively impact today’s society (e.g., impersonating people, spreading fake news, shaping opinion). For this reason, the ability to detect whether a speech recording is synthetic or pristine is becoming an urgent necessity. In this work, we develop a synthetic speech detector. It takes an audio recording as input, extracts a series of hand-crafted features motivated by the speech-processing literature, and classifies them in either a closed-set or an open-set scenario. The proposed detector is validated on a publicly available dataset consisting of 17 synthetic speech generation algorithms ranging from old-fashioned vocoders to modern deep learning solutions. Results show that the proposed method outperforms recently proposed detectors in the forensics literature.

Publisher

Springer Science and Business Media LLC

Subject

Computer Science Applications,Signal Processing

Link

http://link.springer.com/content/pdf/10.1186/s13635-021-00116-3.pdf

References: 49 articles.

1. B. Dolhansky, J. Bitton, B. Pflaum, R. Lu, R. Howes, M. Wang, C. C. Ferrer, The deepfake detection challenge dataset. CoRR http://arxiv.org/abs/2006.07397 (2020).

2. L. Verdoliva, Media forensics and deepfakes: an overview. CoRR http://arxiv.org/abs/2001.06564 (2020).

3. Deepfakes github. https://github.com/deepfakes/faceswap.

4. Y. Li, M. Chang, S. Lyu, in IEEE International Workshop on Information Forensics and Security (WIFS). In ictu oculi: exposing AI created fake videos by detecting eye blinking (IEEE, Hong Kong, 2018).

5. D. Güera, E. J. Delp, in IEEE International Conference on Advanced Video and Signal-Based Surveillance (AVSS). Deepfake video detection using recurrent neural networks (IEEE, Auckland, 2018).

Cited by 40 articles.

1. Training-Free Deepfake Voice Recognition by Leveraging Large-Scale Pre-Trained Models;Proceedings of the 2024 ACM Workshop on Information Hiding and Multimedia Security;2024-06-24

2. Derin Sahte Ses Manipülasyonu Tespit Sistemleri Üzerine Bir Derleme;Yüzüncü Yıl Üniversitesi Fen Bilimleri Enstitüsü Dergisi;2024-04-30

3. A Watermark Challenge: Synthetic Speech Detection;Multimedia Watermarking;2024

4. Detection of Fake Audio: A Deep Learning-Based Comprehensive Survey;Smart Innovation, Systems and Technologies;2024

5. Are you Really Alone? Detecting the use of Speech Separation Techniques on Audio Recordings;2023 IEEE International Workshop on Information Forensics and Security (WIFS);2023-12-04