Affiliation:
1. Department of Electrical, Computer and Software Engineering, Ontario Tech University, Oshawa, ON L1G 0C5, Canada
Abstract
Deep learning has been successfully applied in many domains, but it remains vulnerable to adversarial samples. To address this vulnerability, generative adversarial networks (GANs) have been used to train robust classifiers. This paper presents a novel GAN model, and its implementation, for defending against L∞- and L2-constrained gradient-based adversarial attacks. The proposed model is inspired by related work but introduces several new designs: a dual-generator architecture, four new generator input formulations, and two implementations that output L∞- and L2-norm-constrained perturbation vectors. New formulations and parameter settings for GAN adversarial training are proposed and evaluated to address the limitations of adversarial training and defensive GAN training strategies, such as gradient masking and training complexity. The effect of the number of training epochs on the overall training results is also evaluated. The experimental results indicate that the optimal formulation of GAN adversarial training must utilize more gradient information from the target classifier. The results also demonstrate that GANs can overcome gradient masking and produce effective perturbations to augment the data. The model defends against PGD L2 perturbations of norm 128/255 with over 60% accuracy and against PGD L∞ perturbations of norm 8/255 with around 45% accuracy. The results further reveal that robustness can be transferred between the two norm constraints of the proposed model. In addition, a robustness–accuracy tradeoff was observed, along with overfitting and limited generalization of the generator and classifier; these limitations, and ideas for future work, are discussed.
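For illustration, the sketch below shows how a generator's raw output could be projected onto the L∞ and L2 balls referred to in the abstract, using the 8/255 and 128/255 radii reported there. This is a minimal PyTorch sketch under stated assumptions, not the authors' published implementation; the function names project_linf and project_l2 and the CIFAR-10-sized tensors are introduced here for the example.

```python
# Illustrative only: projecting a perturbation onto an L-inf or L2 ball
# before adding it to the input, as in norm-constrained generator outputs.
import torch

def project_linf(delta: torch.Tensor, eps: float = 8 / 255) -> torch.Tensor:
    """Clamp each element so the perturbation stays inside the L-inf ball."""
    return delta.clamp(-eps, eps)

def project_l2(delta: torch.Tensor, eps: float = 128 / 255) -> torch.Tensor:
    """Rescale each sample so its flattened L2 norm is at most eps."""
    flat = delta.flatten(start_dim=1)
    norms = flat.norm(p=2, dim=1, keepdim=True).clamp(min=1e-12)
    factor = (eps / norms).clamp(max=1.0)  # shrink only samples that exceed eps
    return (flat * factor).view_as(delta)

# Example usage: constrain a batch of raw outputs, then build adversarial inputs.
x = torch.rand(4, 3, 32, 32)           # e.g., CIFAR-10-sized images in [0, 1]
raw = torch.randn_like(x) * 0.1        # stand-in for a generator's raw output
x_adv_linf = (x + project_linf(raw)).clamp(0.0, 1.0)
x_adv_l2 = (x + project_l2(raw)).clamp(0.0, 1.0)
```

Clamping the final adversarial input back to [0, 1] keeps it a valid image; the two projections correspond to the two norm constraints evaluated in the paper.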
Funder
Natural Sciences and Engineering Research Council of Canada (NSERC) Discovery Grant
Subject
Electrical and Electronic Engineering; Biochemistry; Instrumentation; Atomic and Molecular Physics, and Optics; Analytical Chemistry
References (25 articles)
1. Goodfellow, I.J., Shlens, J., and Szegedy, C. (2015). Explaining and Harnessing Adversarial Examples. arXiv.
2. Madry, A., Makelov, A., Schmidt, L., Tsipras, D., and Vladu, A. (2019). Towards Deep Learning Models Resistant to Adversarial Attacks. arXiv.
3. Carlini, N., and Wagner, D. (2017). Towards Evaluating the Robustness of Neural Networks. arXiv.
4. Chen et al. (2019). POBA-GA: Perturbation Optimized Black-Box Adversarial Attacks via Genetic Algorithm. Comput. Secur.
5. Zhao, W., Alwidian, S., and Mahmoud, Q.H. (2023, January 13–14). Evaluation of GAN Architectures for Adversarial Robustness of Convolution Classifier. Proceedings of the AAAI-23 Workshop on Artificial Intelligence Safety (SafeAI 2023), Washington, DC, USA.
Cited by (2 articles)