Interpreting and Evaluating Neural Network Robustness

Published: 2019-08
Container-title: Proceedings of the Twenty-Eighth International Joint Conference on Artificial Intelligence

Author:

Yu Fuxun¹, Qin Zhuwei¹, Liu Chenchen², Zhao Liang¹, Wang Yanzhi³, Chen Xiang¹

Affiliation:

1. George Mason University

2. University of Maryland, Baltimore County

3. Northeastern University

Abstract

Recently, adversarial deception has become one of the most considerable threats to deep neural networks. However, compared with the extensive research on new designs of adversarial attacks and defenses, the intrinsic robustness of neural networks still lacks thorough investigation. This work aims to qualitatively interpret adversarial attack and defense mechanisms through loss visualization, and to establish a quantitative metric for evaluating a model's intrinsic robustness. The proposed robustness metric identifies the upper bound of a model's prediction divergence in the given domain and thus indicates whether the model can maintain a stable prediction. In extensive experiments, our metric demonstrates several advantages over conventional robustness estimation based on testing accuracy: (1) it provides a uniform evaluation of models with different structures and parameter scales; (2) it outperforms conventional accuracy-based robustness evaluation and provides a more reliable evaluation that is invariant to different test settings; (3) it can be generated quickly without considerable testing cost.
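
The idea of measuring how far a model's prediction can diverge within a given input domain can be illustrated with a small sketch. This is not the paper's method: it assumes KL divergence as the divergence measure, an L-infinity ball of a chosen radius as the domain, and plain random sampling as the search. Random sampling yields only an empirical lower estimate of the worst case, not a true upper bound. The names `max_prediction_divergence`, `toy_predict`, and the weight matrix `W` are hypothetical and exist only for illustration.

```python
import numpy as np

def softmax(z):
    # Numerically stable softmax over the last axis.
    z = z - z.max(axis=-1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=-1, keepdims=True)

def kl_divergence(p, q, eps=1e-12):
    # KL(p || q) along the last axis; clipping avoids log(0).
    p = np.clip(p, eps, 1.0)
    q = np.clip(q, eps, 1.0)
    return np.sum(p * (np.log(p) - np.log(q)), axis=-1)

def max_prediction_divergence(predict, x, radius, n_samples=1000, seed=0):
    # Assumption: the "domain" is an L-infinity ball of the given radius
    # around x, explored by uniform random sampling. The result is the
    # largest divergence found among the samples, which can only
    # underestimate the true worst case.
    rng = np.random.default_rng(seed)
    p_clean = predict(x[None, :])
    deltas = rng.uniform(-radius, radius, size=(n_samples, x.shape[0]))
    p_perturbed = predict(x[None, :] + deltas)
    return float(kl_divergence(p_clean, p_perturbed).max())

# Hypothetical toy classifier: 3 input features, 2 classes.
W = np.array([[1.0, -1.0],
              [0.5, 2.0],
              [-1.5, 0.3]])

def toy_predict(batch):
    return softmax(batch @ W)

x = np.array([0.2, -0.1, 0.4])
small = max_prediction_divergence(toy_predict, x, radius=0.01)
large = max_prediction_divergence(toy_predict, x, radius=0.5)
print(f"radius 0.01: {small:.6f}")
print(f"radius 0.50: {large:.6f}")
```

A smaller value for a fixed radius suggests a more stable prediction around `x`. A gradient-based search would give a tighter estimate of the worst case than sampling does.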

Publisher

International Joint Conferences on Artificial Intelligence Organization

Cited by 20 articles.

1. Adversarial robustness improvement for deep neural networks;Machine Vision and Applications;2024-03-14

2. Trustworthy Graph Neural Networks: Aspects, Methods, and Trends;Proceedings of the IEEE;2024-02

3. Expound: A Black-Box Approach for Generating Diversity-Driven Adversarial Examples;Search-Based Software Engineering;2023-12-04

4. Interpretability for reliable, efficient, and self-cognitive DNNs: From theories to applications;Neurocomputing;2023-08

5. Boosting Verified Training for Robust Image Classifications via Abstraction;2023 IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR);2023-06