An End-to-End Transfer Learning Framework of Source Recording Device Identification for Audio Sustainable Security

Published:2023-07-19 Issue:14 Volume:15 Page:11272
ISSN:2071-1050
Container-title:Sustainability
language:en
Short-container-title:Sustainability

Author:

Wang Zhifeng¹, Zhan Jian², Zhang Guozhong², Ouyang Daliang², Guo Huaiyong²

Affiliation:

1. Department of Digital Media Technology, Central China Normal University, Wuhan 430079, China

2. Aerospace Science & Industry Shenzhen (Group) Co., Ltd., Shenzhen 518048, China

Abstract

Source recording device identification poses a significant challenge in the field of Audio Sustainable Security (ASS). Most existing studies on end-to-end identification of digital audio sources follow a two-step process: extracting device-specific features and utilizing them in machine learning or deep learning models for decision-making. However, these approaches often rely on empirically set hyperparameters, limiting their generalization capabilities. To address this limitation, this paper leverages the self-learning ability of deep neural networks and the temporal characteristics of audio data. We propose a novel approach that utilizes the Sinc function for audio preprocessing and combines it with a Deep Neural Network (DNN) to establish a comprehensive end-to-end identification model for digital audio sources. By allowing the parameters of the preprocessing and feature extraction processes to be learned through gradient optimization, we enhance the model’s generalization. To overcome practical challenges such as limited timeliness, small sample sizes, and incremental expression, this paper explores the effectiveness of an end-to-end transfer learning model. Experimental verification demonstrates that the proposed end-to-end transfer learning model achieves both timely and accurate results, even with small sample sizes. Moreover, it avoids the need for retraining the model with a large number of samples due to incremental expression. Our experiments showcase the superiority of our method, achieving an impressive 97.7% accuracy when identifying 141 devices. This outperforms four state-of-the-art methods, demonstrating an absolute accuracy improvement of 4.1%. This research contributes to the field of ASS and provides valuable insights for future studies in audio source identification and related applications of information security, digital forensics, and copyright protection.
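
To make the Sinc-based preprocessing idea concrete, the following is a minimal sketch of a windowed-sinc band-pass kernel defined by only two cutoff frequencies. It assumes a SincNet-style parameterization, which the abstract does not confirm; the function names, the kernel size of 101, the 16 kHz sample rate, and the Hamming window are all illustrative assumptions, not details taken from the paper. In a trainable model the two cutoffs would be parameters updated by gradient descent; here they are plain numbers.

```python
import cmath
import math


def sinc(x):
    """Normalised sinc: sin(pi x) / (pi x), with sinc(0) = 1."""
    return 1.0 if x == 0 else math.sin(math.pi * x) / (math.pi * x)


def sinc_bandpass(f_low, f_high, kernel_size=101, sample_rate=16000):
    """Band-pass FIR kernel built from two cutoffs in Hz (hypothetical helper)."""
    if kernel_size % 2 == 0:
        raise ValueError("kernel_size must be odd so the kernel is symmetric")
    # Cutoffs in cycles per sample
    f1 = f_low / sample_rate
    f2 = f_high / sample_rate
    half = (kernel_size - 1) // 2
    kernel = []
    for i in range(kernel_size):
        n = i - half  # sample index centred on zero
        # Difference of two ideal low-pass responses gives a band-pass response
        ideal = 2 * f2 * sinc(2 * f2 * n) - 2 * f1 * sinc(2 * f1 * n)
        # Hamming window smooths the truncation of the infinite sinc
        window = 0.54 - 0.46 * math.cos(2 * math.pi * i / (kernel_size - 1))
        kernel.append(ideal * window)
    return kernel


def gain_at(kernel, freq_hz, sample_rate=16000):
    """Magnitude of the kernel's frequency response at one frequency."""
    w = 2 * math.pi * freq_hz / sample_rate
    return abs(sum(h * cmath.exp(-1j * w * i) for i, h in enumerate(kernel)))


# Example: a filter passing roughly 1-3 kHz
kernel = sinc_bandpass(f_low=1000.0, f_high=3000.0)
in_band = gain_at(kernel, 2000.0)      # frequency inside the pass band
out_of_band = gain_at(kernel, 6000.0)  # frequency well outside it
```

Because each filter is fully described by its two cutoffs, a bank of such filters has far fewer free parameters than an unconstrained convolutional layer of the same kernel size, which is one plausible reason such a front end can be trained end to end on raw audio.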

Funder

National Natural Science Foundation of China

Publisher

MDPI AG

Subject

Management, Monitoring, Policy and Law; Renewable Energy, Sustainability and the Environment; Geography, Planning and Development; Building and Construction

Link

https://www.mdpi.com/2071-1050/15/14/11272/pdf

Cited by 1 article.

1. Audio source recording device recognition based on representation learning of sequential Gaussian mean matrix; Forensic Science International: Digital Investigation; 2024-03