Affiliation:
1. School of Electronic and Electrical Engineering, Shanghai University of Engineering Science, Shanghai 201620, China
2. School of Electronic Information and Electrical Engineering, Shanghai Jiao Tong University, Shanghai 200240, China
Abstract
In audio copy-move forgery forensics, traditional methods typically first segment the audio into voiced and silent segments, then compute the similarity between voiced segments to detect and locate forged segments. However, audio collected in noisy environments is difficult to segment, and manually set, heuristic similarity thresholds lack robustness. Existing deep learning methods extract features from the audio and then use neural networks for binary classification, so they cannot locate the forged segments. To locate audio copy-move forgery segments, we therefore improve on deep learning methods and propose a robust localization model based on CNN spectral analysis. In the localization model, the Feature Extraction Module extracts deep features from Mel-spectrograms, while the Correlation Detection Module automatically decides on the correlation between these deep features. Finally, the Mask Decoding Module visually locates the forged segments. Experimental results show that, compared to existing methods, the localization model improves the detection accuracy of audio copy-move forgery by 3.0–6.8% and improves the average detection accuracy for forged audio subjected to post-processing attacks such as noise, filtering, resampling, and MP3 compression by over 7.0%.
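As a rough illustration of the correlation idea only, the sketch below compares per-frame feature vectors with cosine similarity and marks frames that closely match a frame elsewhere in the signal. It is not the proposed model: the toy feature vectors, the function names `cosine` and `copy_move_mask`, and the parameters `min_gap` and `threshold` are all assumptions made for this example, and the hand-set threshold stands in for the learned Correlation Detection and Mask Decoding Modules, which the abstract introduces precisely to avoid such thresholds.

```python
import math

def cosine(u, v):
    """Cosine similarity between two equal-length feature vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0

def copy_move_mask(frames, min_gap=2, threshold=0.999):
    """Mark frames that closely match another frame at least min_gap away.

    Both min_gap and threshold are hypothetical, hand-picked values.
    """
    n = len(frames)
    mask = [0] * n
    for i in range(n):
        for j in range(i + min_gap, n):
            if cosine(frames[i], frames[j]) >= threshold:
                mask[i] = mask[j] = 1
    return mask

# Toy stand-in for a spectrogram: 8 frames of 3 made-up features each.
# Frames 1-2 are duplicated at positions 5-6 to mimic a copy-move forgery.
frames = [
    [1.0, 0.0, 0.0],
    [0.2, 0.9, 0.1],
    [0.1, 0.3, 0.9],
    [0.0, 1.0, 0.0],
    [0.7, 0.0, 0.7],
    [0.2, 0.9, 0.1],
    [0.1, 0.3, 0.9],
    [0.0, 0.0, 1.0],
]
mask = copy_move_mask(frames)
print(mask)  # 1 marks frames involved in a suspected copy-move pair
```

In this toy case the mask flags both the source frames and their copies. A fixed threshold like this would be fragile under noise or compression, which is the weakness the learned modules are meant to address.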
Funder
Natural Science Foundation of China
Natural Science Foundation of Shanghai
Opening Project of Shanghai Key Laboratory of Integrated Administration Technologies for Information Security
Innovation Fund for Industry-University-Research of Chinese Universities