
Disagreement amongst counterfactual explanations: how transparency can be misleading

Published: 2024-05-08
ISSN: 1134-5764
Container-title: TOP
Language: en
Short-container-title: TOP

Author:

Brughmans Dieter, Melis Lissa, Martens David

Abstract

Counterfactual explanations are increasingly used as an Explainable Artificial Intelligence (XAI) technique to provide stakeholders of complex machine learning algorithms with explanations for data-driven decisions. The popularity of counterfactual explanations resulted in a boom in the algorithms generating them. However, not every algorithm creates uniform explanations for the same instance. Even though in some contexts multiple possible explanations are beneficial, there are circumstances where diversity amongst counterfactual explanations results in a potential disagreement problem among stakeholders. Ethical issues arise when, for example, malicious agents use this diversity to fairwash an unfair machine learning model by hiding sensitive features. As legislators worldwide tend to start including the right to explanations for data-driven, high-stakes decisions in their policies, these ethical issues should be understood and addressed. Our literature review on the disagreement problem in XAI reveals that this problem has never been empirically assessed for counterfactual explanations. Therefore, in this work, we conduct a large-scale empirical analysis, on 40 data sets, using 12 explanation-generating methods, for two black-box models, yielding over 192,000 explanations. Our study finds alarmingly high disagreement levels between the methods tested. A malicious user is able to both exclude and include desired features when multiple counterfactual explanations are available. This disagreement seems to be driven mainly by the data set characteristics and the type of counterfactual algorithm. XAI centers on the transparency of algorithmic decision-making, but our analysis advocates for transparency about this self-proclaimed transparency.

Funder

Belgian American Educational Foundation

President's Postdoctoral Fellowship Program

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s11750-024-00670-2.pdf

References: 46 articles.

Cited by 2 articles.

1. EvaluateXAI: A framework to evaluate the reliability and consistency of rule-based XAI techniques for software analytics tasks;Journal of Systems and Software;2024-11

2. An Empirical Analysis of User Preferences Regarding XAI Metrics;Lecture Notes in Computer Science;2024