The explanation game: a formal framework for interpretable machine learning

Published: 2020-04-03 Issue: 10 Volume: 198 Page: 9211-9242
ISSN: 0039-7857
Container-title: Synthese
Language: en
Short-container-title: Synthese

Author:

Watson David S., Floridi Luciano

Abstract

We propose a formal framework for interpretable machine learning. Combining elements from statistical learning, causal interventionism, and decision theory, we design an idealised explanation game in which players collaborate to find the best explanation(s) for a given algorithmic prediction. Through an iterative procedure of questions and answers, the players establish a three-dimensional Pareto frontier that describes the optimal trade-offs between explanatory accuracy, simplicity, and relevance. Multiple rounds are played at different levels of abstraction, allowing the players to explore overlapping causal patterns of variable granularity and scope. We characterise the conditions under which such a game is almost surely guaranteed to converge on a (conditionally) optimal explanation surface in polynomial time, and highlight obstacles that will tend to prevent the players from advancing beyond certain explanatory thresholds. The game serves a descriptive and a normative function, establishing a conceptual space in which to analyse and compare existing proposals, as well as design new and improved solutions.
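The notion of a three-dimensional Pareto frontier mentioned in the abstract can be illustrated with a minimal sketch. This is not the authors' game procedure; it only shows the standard dominance test over three criteria. The `Explanation` type, the candidate names, and all scores are hypothetical, and every criterion is assumed to be scaled so that higher is better.

```python
from typing import NamedTuple


class Explanation(NamedTuple):
    """A candidate explanation with hypothetical scores (higher is better)."""
    name: str
    accuracy: float
    simplicity: float
    relevance: float


def dominates(a: Explanation, b: Explanation) -> bool:
    """True if a is at least as good as b on every criterion and strictly better on one."""
    a_scores, b_scores = a[1:], b[1:]  # drop the name, keep the three scores
    return all(x >= y for x, y in zip(a_scores, b_scores)) and any(
        x > y for x, y in zip(a_scores, b_scores)
    )


def pareto_frontier(candidates: list[Explanation]) -> list[Explanation]:
    """Keep the candidates that no other candidate dominates."""
    return [
        c for c in candidates
        if not any(dominates(other, c) for other in candidates)
    ]


# Illustrative candidates only; the scores are made up for this sketch.
candidates = [
    Explanation("linear surrogate", 0.70, 0.90, 0.60),
    Explanation("deep rule list", 0.90, 0.40, 0.70),
    Explanation("single feature", 0.50, 0.95, 0.50),
    Explanation("noisy rule list", 0.85, 0.35, 0.65),  # dominated by "deep rule list"
]
frontier = pareto_frontier(candidates)
print([e.name for e in frontier])
```

Under these assumed scores, the frontier keeps every candidate that wins on at least one criterion against each rival and discards the one that is beaten on all three. The pairwise check is quadratic in the number of candidates, which is adequate for a small illustrative set.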

Publisher

Springer Science and Business Media LLC

Subject

General Social Sciences, Philosophy

Link

https://link.springer.com/content/pdf/10.1007/s11229-020-02629-9.pdf

References: 123 articles.

1. Angelino, E., Larus-Stone, N., Alabi, D., Seltzer, M., & Rudin, C. (2018). Learning certifiably optimal rule lists for categorical data. Journal of Machine Learning Research, 18(234), 1–78.

2. Angwin, J., Larson, J., Mattu, S., & Kirchner, L. (2016). Machine bias. Retrieved October 23, 2019 from https://www.propublica.org/article/machine-bias-risk-assessments-in-criminal-sentencing.

3. Baker, A. (2016). Simplicity. In E. N. Zalta (Ed.), The Stanford Encyclopedia of Philosophy. Stanford, CA: Metaphysics Research Lab, Stanford University.

4. Barocas, S., & Selbst, A. (2016). Big data’s disparate impact. California Law Review, 104(1), 671–729.

5. Bell, R. M., & Koren, Y. (2007). Lessons from the Netflix Prize Challenge. SIGKDD Explorations Newsletter, 9(2), 75–79.

Cited by 39 articles.

1. Reliability and Interpretability in Science and Deep Learning;Minds and Machines;2024-06-25

2. Consumers’ Financial Distress: Prediction and Prescription Using Interpretable Machine Learning;Information Systems Frontiers;2024-06-11

3. Ética(s) de la Inteligencia Artificial y Derecho. Consideraciones a propósito de los límites y la contención del desarrollo tecnológico;DERECHOS Y LIBERTADES: Revista de Filosofía del Derecho y derechos humanos;2024-04-24

4. The SAGE Framework for Explaining Context in Explainable Artificial Intelligence;Applied Artificial Intelligence;2024-02-22

5. ML interpretability: Simple isn't easy;Studies in History and Philosophy of Science;2024-02