DeforestVis: Behaviour Analysis of Machine Learning Models with Surrogate Decision Stumps

Published:2024-02-27
ISSN:0167-7055
Container-title:Computer Graphics Forum
language:en
Short-container-title:Computer Graphics Forum

Author:

Chatzimparmpas Angelos¹, Martins Rafael M.², Telea Alexandru C.³, Kerren Andreas²,⁴

Affiliation:

1. Department of Computer Science Northwestern University Evanston USA

2. Department of Computer Science and Media Technology Linnaeus University Växjö Sweden

3. Department of Information and Computing Sciences Utrecht University Utrecht The Netherlands

4. Department of Science and Technology Linköping University Norrköping Sweden

Abstract

As the complexity of machine learning (ML) models increases and their application in different (and critical) domains grows, there is a strong demand for more interpretable and trustworthy ML. A direct, model‐agnostic way to interpret such models is to train surrogate models (such as rule sets and decision trees) that sufficiently approximate the original ones while being simpler and easier to explain. Yet, rule sets can become very lengthy, with many if–else statements, and decision tree depth grows rapidly when accurately emulating complex ML models. In such cases, both approaches can fail to meet their core goal: providing users with model interpretability. To tackle this, we propose DeforestVis, a visual analytics tool that summarizes the behaviour of complex ML models by providing surrogate decision stumps (one‐level decision trees) generated with the Adaptive Boosting (AdaBoost) technique. DeforestVis helps users to explore the complexity versus fidelity trade‐off by incrementally generating more stumps, creating attribute‐based explanations with weighted stumps to justify decision making, and analysing the impact of rule overriding on training instance allocation between one or more stumps. An independent test set allows users to monitor the effectiveness of manual rule changes and form hypotheses based on case‐by‐case analyses. We show the applicability and usefulness of DeforestVis with two use cases and expert interviews with data analysts and model developers.
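The surrogate idea described in the abstract can be illustrated with a minimal sketch. It is not the DeforestVis implementation: `black_box`, the grid of sample points, and all function names are hypothetical stand-ins chosen for illustration. The sketch fits AdaBoost decision stumps to the predictions of a model and reports fidelity (agreement with that model) as more stumps are included.

```python
import math

def black_box(x):
    # Hypothetical stand-in for a complex model: labels a point +1 or -1.
    return 1 if x[0] + 2 * x[1] > 1.5 else -1

def stump_predict(stump, x):
    # A stump is one rule: (feature index, threshold, polarity).
    feature, threshold, polarity = stump
    return polarity if x[feature] > threshold else -polarity

def fit_stump(X, y, w):
    # Exhaustive search for the stump with the lowest weighted error.
    best, best_err = None, float("inf")
    for feature in range(len(X[0])):
        values = sorted(set(x[feature] for x in X))
        for lo, hi in zip(values, values[1:]):
            threshold = (lo + hi) / 2
            for polarity in (1, -1):
                stump = (feature, threshold, polarity)
                err = sum(wi for xi, yi, wi in zip(X, y, w)
                          if stump_predict(stump, xi) != yi)
                if err < best_err:
                    best, best_err = stump, err
    return best, best_err

def fit_surrogate(X, y, rounds):
    # Discrete AdaBoost: returns a list of (weight alpha, stump) pairs.
    n = len(X)
    w = [1.0 / n] * n
    model = []
    for _ in range(rounds):
        stump, err = fit_stump(X, y, w)
        if err >= 0.5:          # no stump is better than chance; stop
            break
        err = max(err, 1e-12)   # avoid division by zero for a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        model.append((alpha, stump))
        # Increase the weight of instances this stump got wrong.
        w = [wi * math.exp(-alpha * yi * stump_predict(stump, xi))
             for xi, yi, wi in zip(X, y, w)]
        total = sum(w)
        w = [wi / total for wi in w]
    return model

def surrogate_predict(model, x):
    score = sum(alpha * stump_predict(stump, x) for alpha, stump in model)
    return 1 if score >= 0 else -1

def fidelity(model, X, y):
    # Fraction of instances where the surrogate agrees with the black box.
    agree = sum(surrogate_predict(model, x) == yi for x, yi in zip(X, y))
    return agree / len(X)

# Assumed toy data: a 20 x 20 grid over the unit square.
grid = [(i + 0.5) / 20 for i in range(20)]
X = [(a, b) for a in grid for b in grid]
y = [black_box(x) for x in X]   # the surrogate learns the model's outputs

model = fit_surrogate(X, y, rounds=20)
for k in (1, 5, 10, 20):
    print(k, "stumps -> fidelity", round(fidelity(model[:k], X, y), 3))
```

Because boosting adds stumps one at a time, any prefix `model[:k]` is itself a valid, smaller surrogate, which is one way to inspect the complexity versus fidelity trade‐off. Summing the `alpha` values per feature would give a rough attribute‐level weighting, again only as an illustration.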

Publisher

Wiley

Link

https://onlinelibrary.wiley.com/doi/pdf/10.1111/cgf.15004

Cited by 1 article.

1. Navigating the landscape of concept-supported XAI: Challenges, innovations, and future directions;Multimedia Tools and Applications;2024-01-22