From Human Explanation to Model Interpretability: A Framework Based on Weight of Evidence

Published: 2021-10-04 Volume: 9 Pages: 35-47
ISSN: 2769-1349
Container-title: Proceedings of the AAAI Conference on Human Computation and Crowdsourcing
Short-container-title: HCOMP

Author:

David Alvarez Melis, Harmanpreet Kaur, Hal Daumé III, Hanna Wallach, Jennifer Wortman Vaughan

Abstract

We take inspiration from the study of human explanation to inform the design and evaluation of interpretability methods in machine learning. First, we survey the literature on human explanation in philosophy, cognitive science, and the social sciences, and propose a list of design principles for machine-generated explanations that are meaningful to humans. Using the concept of weight of evidence from information theory, we develop a method for generating explanations that adhere to these principles. We show that this method can be adapted to handle high-dimensional, multi-class settings, yielding a flexible framework for generating explanations. We demonstrate that these explanations can be estimated accurately from finite samples and are robust to small perturbations of the inputs. We also evaluate our method through a qualitative user study with machine learning practitioners, where we observe that the resulting explanations are usable despite some participants struggling with background concepts like prior class probabilities. Finally, we conclude by surfacing design implications for interpretability tools in general.
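
For readers unfamiliar with the term, the sketch below illustrates the standard two-hypothesis definition of weight of evidence, woe(h : e) = log [P(e | h) / P(e | not h)], which equals the posterior log-odds of h minus its prior log-odds. It is not the multi-class, high-dimensional framework the abstract describes; the function names and the probabilities used are hypothetical and chosen only for illustration.

```python
import math


def weight_of_evidence(p_e_given_h, p_e_given_not_h):
    """Standard weight of evidence of e in favor of h: log P(e|h) / P(e|not h)."""
    return math.log(p_e_given_h / p_e_given_not_h)


def log_odds(p):
    """Log-odds of a probability strictly between 0 and 1."""
    return math.log(p / (1.0 - p))


def posterior_probability(prior_h, woe):
    """Posterior log-odds = prior log-odds + weight of evidence, mapped back to a probability."""
    posterior_log_odds = log_odds(prior_h) + woe
    return 1.0 / (1.0 + math.exp(-posterior_log_odds))


# Illustrative (made-up) numbers: the evidence is four times as likely under h.
woe = weight_of_evidence(0.8, 0.2)
prior = 0.1
posterior = posterior_probability(prior, woe)

# Cross-check against Bayes' rule computed directly.
bayes = (0.8 * prior) / (0.8 * prior + 0.2 * (1.0 - prior))
print(f"woe = {woe:.3f} nats, posterior = {posterior:.3f}, Bayes check = {bayes:.3f}")
```

The example also shows why prior class probabilities matter when reading such explanations: the same positive weight of evidence moves a low prior to a posterior that is still well below one half.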

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Cited by 11 articles.

1. Evaluating Explanations From AI Algorithms for Clinical Decision-Making: A Social Science-Based Approach; IEEE Journal of Biomedical and Health Informatics; 2024-07

2. Unraveling the Dilemma of AI Errors: Exploring the Effectiveness of Human and Machine Explanations for Large Language Models; Proceedings of the CHI Conference on Human Factors in Computing Systems; 2024-05-11

3. Interpretability Gone Bad: The Role of Bounded Rationality in How Practitioners Understand Machine Learning; Proceedings of the ACM on Human-Computer Interaction; 2024-04-17

4. Evaluating Explanations from AI Algorithms for Clinical Decision-Making: A Social Science-based Approach; 2024-02-27

5. A Learning Approach for Increasing AI Literacy via XAI in Informal Settings; Lecture Notes in Computer Science; 2024