Affiliation:
1. School of Computing, SASTRA Deemed University, Thirumalasaisamudram, Thanjavur, Tamil Nadu, India
Abstract
Polycystic Ovary Syndrome (PCOS) is a hormonal condition that typically affects women during their reproductive years. It is characterized by disruptions in hormonal balance, particularly elevated levels of androgens (male hormones) in the female body. PCOS can lead to various symptoms and health complications, including irregular menstrual cycles, ovarian cysts, fertility issues, insulin resistance, weight gain, acne, and excess hair growth. Detecting PCOS in practice is challenging because its specific cause is unknown and its symptoms are unclear. Accurate and timely diagnosis is therefore crucial for effective management and prevention of long-term complications. Machine learning based PCOS prediction models can support the diagnostic process by reducing potential errors and easing time constraints. Machine learning algorithms can analyze large sets of patient data, including medical history, hormonal profiles, and imaging results, to assist in the diagnosis of PCOS. In particular, ensemble feature selection strategies improve the performance of both the data analysis task and the prediction model. These methods select a subset of pertinent features from a broader feature set. A frequent issue in practical applications is that the output of a feature selection algorithm is unstable: it can change when the algorithm is applied multiple times to similar datasets or to data with slight modifications. Evaluating the robustness of a feature selection algorithm is therefore important. To address these issues and quantify robustness, this study uses the Jensen-Shannon divergence, an information-theoretic measure, together with an ensemble feature selection method to handle the different forms of output: full rankings, partial rankings, and top-k lists (without ranking).
Furthermore, this article proposes a hybrid machine learning classifier with SMOTE-SVM for the prompt detection of PCOS, and its performance is compared with a number of individual classifiers, including K-Nearest Neighbour (KNN), Support Vector Machine (SVM), AdaBoost, Logistic Regression (LR), Naive Bayes (NB), Random Forest (RF), and Decision Tree. The proposed SWISS-AdaBoost classifier surpassed the other models, with an accuracy of 97.81% and an AUC of 99.08%.
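To make the robustness measure mentioned above more concrete, the following is a minimal sketch of how a Jensen-Shannon divergence could be used to score the agreement between feature selection outputs from repeated runs. It is an illustration under stated assumptions, not the procedure used in this study: the function names (`ranking_to_distribution`, `js_divergence`, `stability`) and the inverse-rank weighting are hypothetical choices, and the score `1 - mean JSD` is only one plausible way to turn the divergence into a stability value.

```python
import math
from itertools import combinations


def ranking_to_distribution(ranking, n_features):
    """Map a ranking (feature indices, best first) to a probability vector.

    Assumption: weight 1/(position + 1) per listed feature. Features absent
    from a partial list get zero weight. For an unranked top-k list, equal
    weights would be the natural alternative.
    """
    weights = [0.0] * n_features
    for position, feature in enumerate(ranking):
        weights[feature] = 1.0 / (position + 1)
    total = sum(weights)
    return [w / total for w in weights]


def js_divergence(p, q):
    """Jensen-Shannon divergence with base-2 logs, so the value lies in [0, 1]."""
    m = [(a + b) / 2 for a, b in zip(p, q)]

    def kl(x, y):
        # Terms with x_i == 0 contribute nothing to the KL sum.
        return sum(a * math.log2(a / b) for a, b in zip(x, y) if a > 0)

    return 0.5 * kl(p, m) + 0.5 * kl(q, m)


def stability(rankings, n_features):
    """Return 1 minus the mean pairwise JS divergence over all runs."""
    dists = [ranking_to_distribution(r, n_features) for r in rankings]
    pairs = list(combinations(dists, 2))
    mean_jsd = sum(js_divergence(p, q) for p, q in pairs) / len(pairs)
    return 1.0 - mean_jsd


if __name__ == "__main__":
    # Three hypothetical runs of a selector over five features.
    runs = [[0, 1, 2, 3, 4], [0, 2, 1, 3, 4], [1, 0, 2, 4, 3]]
    print(stability(runs, n_features=5))
```

Under this sketch, identical outputs across runs give a score of 1, and outputs that share no features give a score of 0, so higher values indicate a more robust selector.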