Discrepant Semantic Diffusion Boosts Transfer Learning Robustness
Published: 2023-12-16
Volume: 12, Issue: 24, Page: 5027
ISSN: 2079-9292
Container-title: Electronics
Short-container-title: Electronics
Language: en
Author:
Gao Yajun (1) (ORCID), Bai Shihao (2), Zhao Xiaowei (1), Gong Ruihao (1,2), Wu Yan (3), Ma Yuqing (1,4)
Affiliation:
1. State Key Lab of Software Development Environment, Beihang University, Beijing 100191, China
2. SenseTime Research, Beijing 100080, China
3. Beijing Academy of Science and Technology, Beijing 100089, China
4. Institute of Artificial Intelligence, Beihang University, Beijing 100191, China
Abstract
Transfer learning can improve the robustness and generalization of a model, reducing potential privacy and security risks. It works by fine-tuning a pre-trained model on downstream datasets. This process not only enhances the model’s capacity to acquire generalizable features but also aligns the upstream and downstream knowledge domains. Transfer learning can effectively speed up model convergence when adapting to novel tasks, thereby conserving both data and computational resources. However, existing methods often neglect the discrepant downstream–upstream connections: they rigidly preserve the upstream information without adequately regularizing the downstream semantic discrepancy. This results in weak generalization, collapsed classification, and overall inferior performance. The main cause is a collapsed downstream–upstream connection due to mismatched semantic granularity. We therefore propose a discrepant semantic diffusion method for transfer learning, which adjusts the mismatched semantic granularity and alleviates the collapsed-classification problem to improve transfer learning performance. Specifically, the proposed framework consists of a Prior-Guided Diffusion for pre-training and a discrepant diffusion for fine-tuning. First, the Prior-Guided Diffusion empowers the pre-trained model with a semantic-diffusion ability through a semantic prior, yielding a more robust pre-trained model for downstream classification. Second, the discrepant diffusion encourages semantic diffusion so as to avoid the unwanted semantic centralization that often causes collapsed classification; it is further constrained by the semantic discrepancy, which elevates the downstream discrimination capability.
Extensive experiments on eight prevalent downstream classification datasets confirm that our method outperforms a number of state-of-the-art approaches, especially on fine-grained datasets or datasets dissimilar to the upstream data (e.g., a 3.75% improvement on the Cars dataset and a 1.79% improvement on the SUN dataset under the few-shot setting with 15% of the data). Furthermore, experiments on data sparsity caused by privacy protection validate the proposed method’s effectiveness in the field of artificial intelligence security.
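The abstract describes transfer learning as fine-tuning a pre-trained model on a downstream dataset. The following is a minimal, generic sketch of that fine-tuning step only (a frozen "pre-trained" feature extractor feeding a small trainable head); it does not implement the paper's discrepant semantic diffusion, and all names (`extract_features`, `train_head`) are illustrative assumptions, not from the paper.

```python
# Generic transfer-learning sketch: freeze a pre-trained backbone,
# train only a lightweight classification head on downstream data.
import math
import random

random.seed(0)

def extract_features(x):
    # Stand-in for a frozen pre-trained backbone: a fixed nonlinear map.
    return [math.tanh(x[0] + x[1]), math.tanh(x[0] - x[1])]

def train_head(data, labels, epochs=200, lr=0.5):
    # Logistic-regression head trained by SGD on the frozen features;
    # only w and b are updated (the "fine-tuning" of the head).
    w = [0.0, 0.0]
    b = 0.0
    for _ in range(epochs):
        for x, y in zip(data, labels):
            f = extract_features(x)
            z = w[0] * f[0] + w[1] * f[1] + b
            p = 1.0 / (1.0 + math.exp(-z))
            g = p - y  # gradient of the log-loss with respect to z
            w[0] -= lr * g * f[0]
            w[1] -= lr * g * f[1]
            b -= lr * g
    return w, b

def predict(x, w, b):
    f = extract_features(x)
    return 1 if w[0] * f[0] + w[1] * f[1] + b > 0 else 0

# Toy downstream task: classify points by the sign of x0 + x1.
data = [(random.uniform(-1, 1), random.uniform(-1, 1)) for _ in range(100)]
labels = [1 if x0 + x1 > 0 else 0 for x0, x1 in data]
w, b = train_head(data, labels)
accuracy = sum(predict(x, w, b) == y for x, y in zip(data, labels)) / len(data)
```

Because the backbone stays frozen, only a handful of head parameters are trained, which is why fine-tuning converges quickly and conserves downstream data, as the abstract notes.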
Funder
National Natural Science Foundation of China
Subject
Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering
References: 72 articles.