Abstract
Advancements in facial manipulation technology have produced highly realistic face and expression swap videos that are difficult to distinguish from genuine footage. This has raised concerns about the security risks associated with deepfakes. In the field of multimedia forensics, the detection and precise localization of image forgery have become essential tasks. Current deepfake detectors perform well on high-quality faces within specific datasets, but often struggle to maintain their performance when evaluated across different datasets. To this end, we propose an attention-based multi-task approach to improve feature maps for classification and localization tasks. The encoder and the attention-based decoder of our network generate localized maps that highlight regions with information about the type of manipulation. These localized features are shared with the classification network, improving its performance. Instead of using encoded spatial features, attention-based localized features from the decoder’s first layer are combined with frequency domain features to create a discriminative representation for deepfake detection. Through extensive experiments on face and expression swap datasets, we demonstrate that our method achieves competitive performance in comparison to state-of-the-art deepfake detection approaches in both in-dataset and cross-dataset scenarios. Code is available at https://github.com/saimawaseem/Multi-Attention-Based-Approach-for-Deepfake-Face-and-Expression-Swap-Detection-and-Localization.
Publisher
Springer Science and Business Media LLC
Subject
Electrical and Electronic Engineering,Information Systems,Signal Processing
References
62 articles.
Cited by
6 articles.