Transformer-based cascade networks with spatial and channel reconstruction convolution for deepfake detection

Published: 2024 Issue: 3 Volume: 21 Page: 4142-4164
ISSN: 1551-0018
Container-title: Mathematical Biosciences and Engineering
Short-container-title: MBE

Author:

Li Xue, Zhou Huibo, Zhao Ming

Abstract

The threat posed by forged video technology has gradually grown to affect individuals, society, and the nation. The technology behind fake videos is becoming more advanced, and fake videos are appearing everywhere on the internet. Consequently, it is imperative to address the challenge of frequently updating deepfake detection models, and the substantial volume of data needed to train them adds to this urgency. For the deepfake detection problem, we propose a cascade network based on spatial and channel reconstruction convolution (SCConv) and a vision transformer. The front portion of the network combines SCConv with regular convolution and works in conjunction with the vision transformer to detect fake videos. We also enhance the feed-forward layer of the vision transformer, which can increase detection accuracy while lowering the model's computational burden. We processed the datasets by splitting videos into frames and extracting faces to obtain many images of real and fake faces. Experiments on the DFDC, FaceForensics++, and Celeb-DF datasets yielded accuracies of 87.92%, 99.23%, and 99.98%, respectively. Finally, videos were tested for authenticity and good results were obtained, including excellent visualization results. Numerous experiments also confirm the efficacy of the model presented in this study.

Publisher

American Institute of Mathematical Sciences (AIMS)

References (54 articles)

1. V. Kumar, V. Kansal, M. Gaur, Multiple forgery detection in video using convolution neural network, Comput. Mater. Continua, 73 (2022), 1347–1364. https://doi.org/10.32604/cmc.2022.023545

2. F. Ding, B. Fan, Z. Shen, K. Yu, G. Srivastava, K. Dev, et al., Securing facial bioinformation by eliminating adversarial perturbations, IEEE Trans. Ind. Inf., 19 (2023), 6682–6691. https://doi.org/10.1109/TII.2022.3201572

3. A. Ilderton, Coherent quantum enhancement of pair production in the null domain, Phys. Rev. D, 101 (2020), 016006. https://doi.org/10.1103/physrevd.101.016006

4. A. Haliassos, K. Vougioukas, S. Petridis, M. Pantic, Lips don't lie: A generalisable and robust approach to face forgery detection, in Proceedings of the IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), (2021), 5039–5049. https://doi.org/10.1109/CVPR46437.2021.00500

5. N. Yu, L. Davis, M. Fritz, Attributing fake images to GANs: Learning and analyzing GAN fingerprints, in Proceedings of the IEEE/CVF International Conference on Computer Vision (ICCV), (2019), 7556–7566. https://doi.org/10.1109/ICCV.2019.00765