Contrasting Explanations for Understanding and Regularizing Model Adaptations
Published:2022-05-04
ISSN:1370-4621
Container-title:Neural Processing Letters
language:en
Short-container-title:Neural Process Lett
Author:
Artelt André, Hinder Fabian, Vaquet Valerie, Feldhans Robert, Hammer Barbara
Abstract
Many of today’s decision-making systems deployed in the real world are not static: they change and adapt over time, a phenomenon known as model adaptation. Because of their wide-reaching influence and potentially serious consequences, the need for transparency and interpretability of AI-based decision-making systems is widely accepted and has been worked on extensively. A prominent class of explanations is contrasting explanations, which try to mimic human explanations. However, explanation methods usually assume a static system that has to be explained. Explaining non-static systems is still an open research question, which poses the challenge of how to explain model differences, adaptations, and changes. In this contribution, we propose and empirically evaluate a general framework for explaining model adaptations and differences by contrasting explanations. We also propose a method for automatically finding regions in data space that are affected by a given model adaptation, i.e. regions where the internal reasoning of the other (e.g. adapted) model changed, and that thus should be explained. Finally, we propose a regularization for model adaptations to ensure that the internal reasoning of the adapted model does not change in an unwanted way.
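As an illustrative sketch only (not the authors' implementation), the idea of contrasting the explanations of an original and an adapted model can be shown with two linear classifiers, for which the closest counterfactual has a closed form. The function names, the `margin` parameter, and the toy weights below are assumptions made for this example; the sketch also assumes the sample does not lie exactly on a decision boundary.

```python
import numpy as np

def closest_counterfactual(w, b, x, margin=1e-3):
    """Closest point (Euclidean) to x that the linear classifier
    sign(w @ x + b) labels differently. Assumes w @ x + b != 0."""
    w = np.asarray(w, dtype=float)
    x = np.asarray(x, dtype=float)
    score = w @ x + b
    # Project x onto the hyperplane w @ x + b = 0, then step slightly
    # past it so that the predicted label actually flips.
    step = (score + np.sign(score) * margin) / (w @ w)
    return x - step * w

def compare_explanations(model_old, model_new, x):
    """Compute one counterfactual per model for the same sample and
    return both, plus their Euclidean distance as a simple change score."""
    cf_old = closest_counterfactual(*model_old, x)
    cf_new = closest_counterfactual(*model_new, x)
    return cf_old, cf_new, np.linalg.norm(cf_old - cf_new)

# Toy setup (hypothetical): each model is a (weights, bias) pair.
model_old = ((1.0, 0.0), 0.0)  # decides on the first feature only
model_new = ((0.0, 1.0), 0.0)  # adapted model: decides on the second feature
x = (2.0, 3.0)                 # both models assign x the positive class

cf_old, cf_new, change = compare_explanations(model_old, model_new, x)
print(cf_old, cf_new, change)
```

In this toy case both models agree on the prediction for `x`, yet their counterfactuals point in different directions. A large distance between the two counterfactuals is one simple signal that the sample lies in a region where the adaptation changed the reasoning and an explanation may be warranted.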
Funder
Volkswagen Foundation
Bundesministerium für Bildung und Forschung
Federal state government of North Rhine-Westphalia
Publisher
Springer Science and Business Media LLC
Subject
Artificial Intelligence, Computer Networks and Communications, General Neuroscience, Software
Cited by
2 articles.