Generative Counterfactuals for Neural Networks via Attribute-Informed Perturbation

Published: 2021-05-26 Issue: 1 Volume: 23 Page: 59-68
ISSN: 1931-0145
Container-title: ACM SIGKDD Explorations Newsletter
Language: en
Short-container-title: SIGKDD Explor. Newsl.

Author:

Yang Fan¹, Liu Ninghao¹, Du Mengnan¹, Hu Xia¹

Affiliation:

1. Texas A&M University, College Station, TX, USA

Abstract

With the wide use of deep neural networks (DNN), model interpretability has become a critical concern, since explainable decisions are preferred in high-stakes scenarios. Current interpretation techniques mainly focus on feature attribution, which is limited in indicating why and how particular explanations are related to the prediction. To this end, an intriguing class of explanations, named counterfactuals, has been developed to further explore the "what-if" circumstances for interpretation and to enable reasoning about black-box models. However, generating counterfactuals for raw data instances (i.e., text and images) is still at an early stage because of the challenges posed by high data dimensionality and non-semantic raw features. In this paper, we design a framework to generate counterfactuals specifically for raw data instances with the proposed Attribute-Informed Perturbation (AIP). By utilizing generative models conditioned on different attributes, counterfactuals with desired labels can be obtained effectively and efficiently. Instead of directly modifying instances in the data space, we iteratively optimize the constructed attribute-informed latent space, where features are more robust and semantic. Experimental results on real-world texts and images demonstrate the effectiveness, sample quality, and efficiency of our framework, and show its superiority over alternatives. We also introduce some practical applications based on our framework, indicating its potential beyond model interpretability.
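The idea of perturbing a latent representation rather than the raw instance can be illustrated with a toy sketch. This is not the paper's AIP implementation: the linear decoder `generate`, the logistic `classify`, the weights `W_GEN` and `W_CLF`, the function `counterfactual_latent`, and the step size and proximity weight are all hypothetical stand-ins chosen so the example stays self-contained.

```python
import math

# Hypothetical stand-ins (not from the paper): a linear "generator" that maps a
# 2-d latent/attribute vector to a 3-d data instance, and a logistic classifier.
W_GEN = [[1.0, 0.0], [0.0, 1.0], [1.0, -1.0]]
W_CLF = [1.0, -0.5, 0.5]


def generate(z):
    """Decode a latent vector into data space (toy linear decoder)."""
    return [sum(w * zi for w, zi in zip(row, z)) for row in W_GEN]


def classify(x):
    """Probability of class 1 under the toy classifier."""
    score = sum(w * xi for w, xi in zip(W_CLF, x))
    return 1.0 / (1.0 + math.exp(-score))


def counterfactual_latent(z0, target, lr=0.1, lam=0.1, steps=500):
    """Gradient descent on the latent vector rather than on the instance.

    Minimizes cross-entropy toward `target` plus lam * ||z - z0||^2 and stops
    as soon as the predicted label matches `target` (or after `steps` updates).
    """
    # d(score)/dz for the linear decoder and linear logit: W_GEN^T @ W_CLF.
    g = [sum(W_GEN[i][j] * W_CLF[i] for i in range(len(W_CLF)))
         for j in range(len(z0))]
    z = list(z0)
    for _ in range(steps):
        p = classify(generate(z))
        if (p > 0.5) == bool(target):
            break
        z = [zj - lr * ((p - target) * gj + 2.0 * lam * (zj - z0j))
             for zj, gj, z0j in zip(z, g, z0)]
    return z


z0 = [1.0, 0.5]                              # toy instance, predicted class 1
z_cf = counterfactual_latent(z0, target=0)   # perturbed latent vector
x_cf = generate(z_cf)                        # counterfactual in data space
```

The proximity term keeps the counterfactual close to the original latent vector; in a real setting the decoder and classifier would be trained networks and the gradients would come from automatic differentiation.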

Publisher

Association for Computing Machinery (ACM)

Link

https://dl.acm.org/doi/pdf/10.1145/3468507.3468517

References: 43 articles.

1. Very Deep Convolutional Networks for Text Classification

Cited by 7 articles.

1. SCF-Net: A sparse counterfactual generation network for interpretable fault diagnosis;Reliability Engineering & System Safety;2024-10

2. Counterfactual Explanations and Algorithmic Recourses for Machine Learning: A Review;ACM Computing Surveys;2024-07-09

3. SAFE: Saliency-Aware Counterfactual Explanations for DNN-based Automated Driving Systems;2023 IEEE 26th International Conference on Intelligent Transportation Systems (ITSC);2023-09-24

4. Counterfactual Functional Connectomes for Neurological Classifier Selection;2023 31st European Signal Processing Conference (EUSIPCO);2023-09-04

5. CouRGe: Counterfactual Reviews Generator for Sentiment Analysis;Communications in Computer and Information Science;2023