Fooling Examples: Another Intriguing Property of Neural Networks
Author:
Zhang Ming 1, Chen Yongkang 1, Qian Cheng 1
Affiliation:
1. National Key Laboratory of Science and Technology on Information System Security, Beijing 100101, China
Abstract
Neural networks have been proven to be vulnerable to adversarial examples: inputs that humans can still recognize correctly but that cause neural networks to give incorrect predictions. As an intriguing property of neural networks, adversarial examples pose a serious threat to the secure application of neural networks. In this article, we present another intriguing property: well-trained models classify certain examples as recognizable objects, often with high confidence, even though humans cannot recognize any object in them. We refer to these as “fooling examples”. Specifically, we take inspiration from the construction of adversarial examples and develop an iterative method for generating fooling examples. The experimental results show that fooling examples are not only easy to generate, with a success rate of nearly 100% in the white-box scenario, but also exhibit strong transferability across different models in the black-box scenario. Tests on the Google Cloud Vision API show that fooling examples are also recognized as objects by real-world computer vision systems. Our findings reveal a new cognitive deficit of neural networks, and we hope that these potential security threats will be addressed in future neural network applications.
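The abstract does not spell out the iterative procedure, but a minimal sketch in the spirit it describes might look like the following: start from random noise and repeatedly take signed-gradient steps that raise a pretrained classifier's confidence in a chosen target class, with no constraint that the result look like anything to a human. This is an illustrative assumption using PyTorch/torchvision and a hypothetical `generate_fooling_example` helper, not the authors' exact method; input normalization for the pretrained weights is omitted for brevity.

```python
import torch
import torch.nn.functional as F
from torchvision import models

# Pretrained ImageNet classifier, used in a white-box setting.
model = models.resnet50(weights=models.ResNet50_Weights.IMAGENET1K_V1)
model.eval()


def generate_fooling_example(model, target_class, steps=200, step_size=0.01):
    """Iteratively push a random-noise image toward high confidence for
    `target_class` (gradient ascent on the input); the result is typically
    unrecognizable to humans. Illustrative sketch, not the paper's method."""
    x = torch.rand(1, 3, 224, 224, requires_grad=True)  # start from noise
    for _ in range(steps):
        logits = model(x)
        # Maximizing -CE(target) is equivalent to maximizing the target-class probability.
        loss = -F.cross_entropy(logits, torch.tensor([target_class]))
        if x.grad is not None:
            x.grad.zero_()
        loss.backward()
        with torch.no_grad():
            x += step_size * x.grad.sign()  # FGSM-style signed-gradient step
            x.clamp_(0.0, 1.0)              # keep pixels in a valid range
    with torch.no_grad():
        confidence = F.softmax(model(x), dim=1)[0, target_class].item()
    return x.detach(), confidence


# Example: class 207 is "golden retriever" in ImageNet.
fooling_image, conf = generate_fooling_example(model, target_class=207)
print(f"model confidence in target class: {conf:.3f}")
```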
Funder
Foundation of National Key Laboratory of Science and Technology on Information System Security
Subject
Electrical and Electronic Engineering, Biochemistry, Instrumentation, Atomic and Molecular Physics and Optics, Analytical Chemistry
Cited by
6 articles.