Attack as Detection: Using Adversarial Attack Methods to Detect Abnormal Examples

Author:

Zhao Zhe (1), Chen Guangke (1), Liu Tong (1), Li Taishan (1), Song Fu (2), Wang Jingyi (3), Sun Jun (4)

Affiliation:

1. ShanghaiTech University, China

2. State Key Laboratory of Computer Science, Institute of Software, Chinese Academy of Sciences and University of Chinese Academy of Sciences, China

3. Zhejiang University, China

4. Singapore Management University, Singapore

Abstract

As a new programming paradigm, deep learning has achieved impressive performance in areas such as image processing and speech recognition, and has expanded its application to solve many real-world problems. However, neural networks and deep learning are normally black-box systems, and, worse still, deep-learning-based software is vulnerable to threats from abnormal examples, such as adversarial and backdoored examples constructed by attackers with malicious intentions, as well as unintentionally mislabeled samples. Therefore, it is important and urgent to detect such abnormal examples. While various detection approaches have been proposed, each addressing some specific type of abnormal example, they suffer from limitations, and the problem remains of considerable interest. In this work, we first propose a novel characterization to distinguish abnormal examples from normal ones, based on the observation that abnormal examples have significantly different (adversarial) robustness from normal ones. We systematically analyze these three types of abnormal samples in terms of robustness, and find that they have different characteristics from normal ones. As robustness measurement is computationally expensive and hence can be challenging to scale to large networks, we then propose to effectively and efficiently measure the robustness of an input sample using the cost of adversarially attacking the input, a technique originally proposed to test the robustness of neural networks against adversarial examples. Next, we propose a novel detection method, named "attack as detection" (A2D), which uses the cost of adversarially attacking an input, instead of robustness, to check whether it is abnormal. Our detection method is generic, and various adversarial attack methods could be leveraged. Extensive experiments show that A2D is more effective than recent promising approaches that were proposed to detect only one specific type of abnormal example. We also thoroughly discuss possible adaptive attack methods against our adversarial example detection method and show that A2D is still effective in defending against carefully designed adaptive adversarial attack methods, e.g., the attack success rate drops to 0% on CIFAR10.
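The core idea of the abstract, using the cost of an attack as a stand-in for robustness, can be illustrated with a minimal sketch. This is not the authors' implementation: the toy linear classifier, the sign-gradient attack, the step size, the threshold, and the assumption that abnormal inputs are the ones that are cheap to attack are all hypothetical choices made for illustration only.

```python
from typing import List, Sequence

Vector = Sequence[float]


def sign(v: float) -> int:
    """Return -1, 0 or 1 depending on the sign of v."""
    return (v > 0) - (v < 0)


def predict(w: Vector, b: float, x: Vector) -> int:
    """Toy linear classifier (hypothetical): class 1 if w.x + b >= 0, else 0."""
    score = sum(wi * xi for wi, xi in zip(w, x)) + b
    return 1 if score >= 0 else 0


def attack_cost(w: Vector, b: float, x: Vector,
                step: float = 0.25, max_steps: int = 100) -> int:
    """Count the sign-gradient steps needed to flip the predicted label.

    Returns max_steps if the label does not flip within the budget.
    """
    original = predict(w, b, x)
    # For a linear model the gradient of the score is w, so stepping
    # against sign(w) lowers the score and stepping along it raises it.
    direction = -1.0 if original == 1 else 1.0
    current: List[float] = list(x)
    for steps in range(1, max_steps + 1):
        current = [xi + direction * step * sign(wi)
                   for xi, wi in zip(current, w)]
        if predict(w, b, current) != original:
            return steps
    return max_steps


def is_abnormal(cost: int, threshold: int) -> bool:
    """Assumed decision rule: inputs that are cheap to attack are flagged."""
    return cost <= threshold


# Usage: one input far from the decision boundary, one close to it.
w, b = [1.0, -1.0], 0.0
far_cost = attack_cost(w, b, [2.0, -2.0])    # many steps needed
near_cost = attack_cost(w, b, [0.25, 0.0])   # flips almost immediately
print(far_cost, is_abnormal(far_cost, threshold=3))
print(near_cost, is_abnormal(near_cost, threshold=3))
```

In this sketch the step count plays the role of the attack cost. A real detector would replace the toy attack with an established adversarial attack on the actual network and would calibrate the threshold on known normal inputs.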

Publisher

Association for Computing Machinery (ACM)

Subject

Software


