Using Generative AI to Improve the Performance and Interpretability of Rule-Based Diagnosis of Type 2 Diabetes Mellitus

Author:

Kopitar Leon12ORCID,Fister Iztok2ORCID,Stiglic Gregor123ORCID

Affiliation:

1. Faculty of Health Sciences, University of Maribor, Zitna Ulica 15, 2000 Maribor, Slovenia

2. Faculty of Electrical Engineering and Computer Science, University of Maribor, Koroska Cesta 46, 2000 Maribor, Slovenia

3. Usher Institute, University of Edinburgh, Teviot Place, Edinburgh 0131, UK

Abstract

Introduction: Type 2 diabetes mellitus is a major global health concern, but interpreting machine learning models for diagnosis remains challenging. This study investigates combining association rule mining with advanced natural language processing to improve both diagnostic accuracy and interpretability. This novel approach has not been explored before in using pretrained transformers for diabetes classification on tabular data. Methods: The study used the Pima Indians Diabetes dataset to investigate Type 2 diabetes mellitus. Python and Jupyter Notebook were employed for analysis, with the NiaARM framework for association rule mining. LightGBM and the dalex package were used for performance comparison and feature importance analysis, respectively. SHAP was used for local interpretability. OpenAI GPT version 3.5 was utilized for outcome prediction and interpretation. The source code is available on GitHub. Results: NiaARM generated 350 rules to predict diabetes. LightGBM performed better than the GPT-based model. A comparison of GPT and NiaARM rules showed disparities, prompting a similarity score analysis. LightGBM’s decision making leaned heavily on glucose, age, and BMI, as highlighted in feature importance rankings. Beeswarm plots demonstrated how feature values correlate with their influence on diagnosis outcomes. Discussion: Combining association rule mining with GPT for Type 2 diabetes mellitus classification yields limited effectiveness. Enhancements like preprocessing and hyperparameter tuning are required. Interpretation challenges and GPT’s dependency on provided rules indicate the necessity for prompt engineering and similarity score methods. Variations in feature importance rankings underscore the complexity of T2DM. Concerns regarding GPT’s reliability emphasize the importance of iterative approaches for improving prediction accuracy.

Funder

Slovenian Research Agency

Publisher

MDPI AG

Cited by 1 articles. 订阅此论文施引文献 订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3