A Rate-Distortion Framework for Explaining Black-Box Model Decisions

Published: 2022 Page: 91-115
ISSN:0302-9743
Container-title:xxAI - Beyond Explainable AI

Author:

Kolek Stefan, Nguyen Duc Anh, Levie Ron, Bruna Joan, Kutyniok Gitta

Abstract

We present the Rate-Distortion Explanation (RDE) framework, a mathematically well-founded method for explaining black-box model decisions. The framework is based on perturbations of the target input signal and applies to any differentiable pre-trained model, such as a neural network. Our experiments demonstrate the framework's adaptability to diverse data modalities, particularly images, audio, and physical simulations of urban environments.

Publisher

Springer International Publishing

Link

https://link.springer.com/content/pdf/10.1007/978-3-031-04083-2_6

References: 32 articles.

1. Bach, S., Binder, A., Montavon, G., Klauschen, F., Müller, K.R., Samek, W.: On pixel-wise explanations for non-linear classifier decisions by layer-wise relevance propagation. PLoS ONE 10(7), e0130140 (2015)

2. Chang, C., Creager, E., Goldenberg, A., Duvenaud, D.: Explaining image classifiers by counterfactual generation. In: Proceedings of the 7th International Conference on Learning Representations, ICLR (2019)

3. Dabkowski, P., Gal, Y.: Real time image saliency for black box classifiers. In: Proceedings of the 31st International Conference on Neural Information Processing Systems, NeurIPS, pp. 6970–6979 (2017)

4. Deng, J., Dong, W., Socher, R., Li, L.J., Li, K., Fei-Fei, L.: ImageNet: a large-scale hierarchical image database. In: Proceedings of the 2009 IEEE Conference on Computer Vision and Pattern Recognition, CVPR, pp. 248–255 (2009)

5. DeVore, R.A.: Nonlinear approximation. Acta Numer. 7, 51–150 (1998)

Cited by 5 articles.

1. Limitations of Deep Learning for Inverse Problems on Digital Hardware;IEEE Transactions on Information Theory;2023-12

2. Interpretable by Design: Learning Predictors by Composing Interpretable Queries;IEEE Transactions on Pattern Analysis and Machine Intelligence;2023-06-01

3. This looks More Like that: Enhancing Self-Explaining Models by Prototypical Relevance Propagation;Pattern Recognition;2023-04

4. RELAX: Representation Learning Explainability;International Journal of Computer Vision;2023-03-11

5. Explainable AI Methods - A Brief Overview;xxAI - Beyond Explainable AI;2022