Affiliation:
1. Department of Robotics, Beijing Union University, Beijing, China
2. Beijing Key Laboratory of Information Service Engineering, Beijing Union University, Beijing, China
Abstract
With the development of generative models, the cost of facial manipulation and forgery keeps falling. Fraudulent data poses numerous hidden threats to politics, privacy, and cybersecurity. Although many face forgery detection methods focus on learning high-frequency forgery traces and achieve promising performance, they usually learn features in the spatial and frequency domains independently. To combine the information of the two domains, a combined spatial and frequency dual-stream network is proposed for face forgery detection. Concretely, a cross self-attention (CSA) module is designed to improve frequency feature interaction and fusion at different scales. Moreover, to augment the semantic and contextual information, a frequency-guided spatial feature extraction module is proposed to extract and reconstruct the spatial information. These two modules deeply mine the forgery traces via a dual-stream collaborative network. Through comprehensive experiments on different datasets, we demonstrate the effectiveness of the proposed method in both within-dataset and cross-dataset evaluations.
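As a rough illustration of attention-based interaction between features at two scales, the following minimal sketch lets tokens from one scale attend over tokens from another. It is not the paper's CSA module: the function names, the toy feature vectors, and the omission of learned query/key/value projections are all assumptions made only for this example.

```python
import math

def softmax(row):
    # Numerically stable softmax over a list of scores.
    m = max(row)
    exps = [math.exp(x - m) for x in row]
    total = sum(exps)
    return [e / total for e in exps]

def cross_attention(queries, keys_values):
    """Scaled dot-product attention where each query token (one scale)
    attends over tokens from another scale.

    Hypothetical simplification: no learned projections, and the same
    vectors serve as both keys and values.
    """
    dim = len(queries[0])
    scale = 1.0 / math.sqrt(dim)
    fused = []
    for q in queries:
        scores = [scale * sum(a * b for a, b in zip(q, k)) for k in keys_values]
        weights = softmax(scores)
        # Weighted sum of the other scale's tokens.
        fused.append([
            sum(w * kv[d] for w, kv in zip(weights, keys_values))
            for d in range(dim)
        ])
    return fused

# Toy "frequency features" at a fine scale (3 tokens) and a coarse scale (2 tokens).
fine = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
coarse = [[2.0, 0.0], [0.0, 2.0]]
fused = cross_attention(fine, coarse)
```

Each row of `fused` is a convex combination of the coarse-scale tokens, weighted by similarity to the corresponding fine-scale token. A real module would add learned projections, multiple heads, and normalization.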
Funder
National Natural Science Foundation of China