Understanding the performance of machine learning models from data- to patient-level

Author:

Valeriano Maria (1), Matran-Fernandez Ana (2), Kiffer Carlos (3), Lorena Ana Carolina (4)

Affiliation:

1. Instituto Tecnológico de Aeronáutica, Sao Jose dos Campos, Brazil

2. University of Essex, Colchester, United Kingdom of Great Britain and Northern Ireland

3. Universidade Federal de São Paulo, Sao Paulo, Brazil

4. Instituto Tecnológico de Aeronáutica, Sao Jose dos Campos, Brazil

Abstract

Machine Learning (ML) models have the potential to support decision-making in healthcare by grasping complex patterns within data. However, decisions in this domain are sensitive and require the active involvement of domain specialists with deep knowledge of the data. To contribute effectively, clinicians need to understand how predictions are generated so they can provide feedback for model refinement. There is usually a gap in communication between data scientists and domain specialists that needs to be addressed. Specifically, many ML studies report only average accuracies over an entire dataset, losing valuable insights that can be obtained from a more fine-grained, patient-level analysis of classification performance. In this paper, we present a case study aimed at explaining the factors that contribute to specific predictions for individual patients. Our approach takes a data-centric perspective, focusing on the structure of the data and its correlation with ML model performance. We use the concept of Instance Hardness, which measures how difficult an instance is to classify correctly. By selecting the hardest- and easiest-to-classify instances, we analyze and contrast the distributions of specific input features and extract meta-features to describe each instance. Furthermore, we examine certain instances individually, offering insights into why they pose challenges for classification and enabling a better understanding of both the successes and failures of the ML models. This opens up the possibility for discussions between data scientists and domain specialists, supporting collaborative decision-making.
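
As an illustration of the idea of ranking instances by hardness, the sketch below uses k-Disagreeing Neighbors (kDN), one commonly used hardness estimate: the fraction of an instance's k nearest neighbors that carry a different label. This is an assumption made for illustration only; the abstract does not state which hardness estimator the authors use, and the function name, toy data, and choice of k are hypothetical.

```python
import math


def k_disagreeing_neighbors(X, y, k=3):
    """Return one hardness score per instance: the fraction of its
    k nearest neighbors (Euclidean distance) with a different label."""
    scores = []
    for i, (xi, yi) in enumerate(zip(X, y)):
        # Distance from instance i to every other instance (self excluded).
        dists = sorted(
            (math.dist(xi, xj), j) for j, xj in enumerate(X) if j != i
        )
        neighbors = [j for _, j in dists[:k]]
        disagree = sum(1 for j in neighbors if y[j] != yi)
        scores.append(disagree / len(neighbors))
    return scores


# Hypothetical toy data: two well-separated clusters, plus one instance
# (index 6) labeled as class 1 but located inside the class-0 cluster.
X = [(0.0, 0.0), (0.1, 0.0), (0.0, 0.1),
     (1.0, 1.0), (1.1, 1.0), (1.0, 1.1),
     (0.05, 0.05)]
y = [0, 0, 0, 1, 1, 1, 1]

hardness = k_disagreeing_neighbors(X, y, k=3)

# Rank instances from easiest (lowest score) to hardest (highest score).
ranked = sorted(range(len(X)), key=lambda i: hardness[i])
easiest, hardest = ranked[0], ranked[-1]
```

Here the instance at index 6 is ranked hardest because all of its nearest neighbors belong to the other class. In a patient-level analysis, instances at either end of such a ranking would be the ones singled out for closer inspection of their feature values.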

Publisher

Association for Computing Machinery (ACM)

