Polycystic Ovary Syndrome Detection Machine Learning Model Based on Optimized Feature Selection and Explainable Artificial Intelligence

Author:

Elmannai Hela (1), El-Rashidy Nora (2), Mashal Ibrahim (3), Alohali Manal Abdullah (4), Farag Sara (5), El-Sappagh Shaker (6, 7), Saleh Hager (8)

Affiliation:

1. Department of Information Technology, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia

2. Machine Learning and Information Retrieval Department, Faculty of Artificial Intelligence, Kafrelsheiksh University, Kafrelsheiksh 13518, Egypt

3. Faculty of Information Technology, Applied Science Private University, Amman 11937, Jordan

4. Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, P.O. Box 84428, Riyadh 11671, Saudi Arabia

5. Faculty of Computers and Informations, South Valley University, Qena 83523, Egypt

6. Faculty of Computer Science and Engineering, Galala University, Suez 435611, Egypt

7. Information Systems Department, Faculty of Computers and Artificial Intelligence, Benha University, Banha 13518, Egypt

8. Faculty of Computers and Artificial Intelligence, South Valley University, Hurghada 84511, Egypt

Abstract

Polycystic ovary syndrome (PCOS) is a serious health problem that is common among women worldwide. Early detection and treatment of PCOS reduce the risk of long-term complications, such as type 2 diabetes and gestational diabetes. Effective early diagnosis therefore helps healthcare systems limit the disease's burden and complications. Machine learning (ML) and ensemble learning have recently shown promising results in medical diagnostics. The main goal of our research is to provide local and global model explanations that support efficiency, effectiveness, and trust in the developed model. Feature selection methods are combined with different types of ML models (logistic regression (LR), random forest (RF), decision tree (DT), naive Bayes (NB), support vector machine (SVM), k-nearest neighbor (KNN), XGBoost, and AdaBoost) to identify the optimal feature subset and the best model. Stacking ML models that combine the best base ML models with a meta-learner are proposed to improve performance. Bayesian optimization is used to tune the ML models. Class imbalance is addressed by combining SMOTE (Synthetic Minority Oversampling Technique) with ENN (Edited Nearest Neighbour). Experiments were conducted on a benchmark PCOS dataset using two train/test split ratios, 70:30 and 80:20. The stacking ML model with RFE (recursive feature elimination) feature selection achieved the highest accuracy, 100%, compared with the other models.
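The pipeline outlined in the abstract (feature selection followed by a stacked ensemble, evaluated on a 70:30 split) can be sketched roughly as follows. This is a minimal illustration with scikit-learn, not the authors' implementation: the data are synthetic stand-ins generated by `make_classification`, the choice of base learners, meta-learner, and `n_features_to_select=10` are assumptions, and the Bayesian optimization and SMOTE-ENN steps are left out. For the resampling step, one option is the `SMOTEENN` class from the imbalanced-learn package, applied to the training split only.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, StackingClassifier
from sklearn.feature_selection import RFE
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import Pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in data (NOT the PCOS dataset): 400 samples, 20 features,
# with a roughly 70/30 class imbalance.
X, y = make_classification(
    n_samples=400, n_features=20, n_informative=6,
    weights=[0.7, 0.3], random_state=0,
)

# 70:30 split, one of the two ratios mentioned in the abstract.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.30, stratify=y, random_state=0,
)

# Base learners and meta-learner are illustrative choices (assumptions).
base_learners = [
    ("rf", RandomForestClassifier(n_estimators=100, random_state=0)),
    ("svm", SVC()),
]
stack = StackingClassifier(
    estimators=base_learners,
    final_estimator=LogisticRegression(max_iter=1000),
    cv=5,  # out-of-fold predictions are used to train the meta-learner
)

# Scale, keep 10 features via recursive feature elimination, then stack.
model = Pipeline([
    ("scale", StandardScaler()),
    ("rfe", RFE(LogisticRegression(max_iter=1000), n_features_to_select=10)),
    ("stack", stack),
])
model.fit(X_train, y_train)

n_selected = int(model.named_steps["rfe"].support_.sum())
test_accuracy = model.score(X_test, y_test)
print(f"features kept: {n_selected}, test accuracy: {test_accuracy:.3f}")
```

Placing the scaler and RFE inside the `Pipeline` means both are fitted on the training split only, so no information from the test split leaks into feature selection.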

Funder

Princess Nourah bint Abdulrahman University, Riyadh, Saudi Arabia

Publisher

MDPI AG

Subject

Clinical Biochemistry

