Abstract
The aim of this study was to compare the effectiveness of various artificial intelligence classification algorithms in identifying patients with high-grade final histopathology of conisation, based on the last PAP smear result and risk factors for the development of uterine cervical dysplasia and cancer. Data from 1475 patients who underwent conisation surgery at the University Clinical Centre Maribor between 1993 and 2005 were analysed. The Synthetic Minority Oversampling Technique (SMOTE) algorithm was employed to correct for class imbalance in the data. Various classification algorithms were tested with the Weka open-source software. Ten-fold cross-validation was used to define the testing and hold-out sets for analysis. The Random Forest (RF) classification algorithm outperformed the other tested algorithms, achieving 89.42% correct classifications (baseline ZeroR classification 63.4%), with sensitivity 96.80%, specificity 76.60%, kappa 0.7632, area under the Receiver Operating Characteristic curve (AUC ROC) 0.911, Precision-Recall curve (PRC) area 0.916, and Matthews Correlation Coefficient (MCC) 0.771. The RF algorithm correctly identified the majority of patients with high-grade final histopathology of conisation in the patient dataset, based on the last PAP smear result and risk factors for developing high-grade dysplasia and carcinoma. Such algorithms may help clinicians identify high-risk patients in the future. Invitations could be sent to patients who did not participate in the organized screening program, which could help prevent serious disease. Further studies are required in this regard.
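The evaluation measures named above (correct classifications, ZeroR baseline, sensitivity, specificity, kappa, MCC) can all be derived from a binary confusion matrix. The following is a minimal Python sketch of those standard formulas; the function name `binary_metrics` and the example counts are hypothetical illustrations and are not the study's confusion matrix or its Weka output.

```python
import math


def binary_metrics(tp, fn, fp, tn):
    """Compute common binary classification metrics from confusion-matrix counts.

    tp/fn: positive cases classified correctly/incorrectly.
    fp/tn: negative cases classified incorrectly/correctly.
    """
    total = tp + fn + fp + tn
    accuracy = (tp + tn) / total           # share of correct classifications
    sensitivity = tp / (tp + fn)           # true positive rate (recall)
    specificity = tn / (tn + fp)           # true negative rate

    # ZeroR always predicts the majority class, so its accuracy
    # equals the share of the larger class.
    zero_r = max(tp + fn, tn + fp) / total

    # Cohen's kappa: observed agreement corrected for chance agreement.
    p_chance = (
        ((tp + fn) / total) * ((tp + fp) / total)
        + ((tn + fp) / total) * ((tn + fn) / total)
    )
    kappa = (accuracy - p_chance) / (1 - p_chance)

    # Matthews Correlation Coefficient; defined as 0 when a margin is empty.
    denom = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    mcc = (tp * tn - fp * fn) / denom if denom else 0.0

    return {
        "accuracy": accuracy,
        "zero_r": zero_r,
        "sensitivity": sensitivity,
        "specificity": specificity,
        "kappa": kappa,
        "mcc": mcc,
    }


# Hypothetical counts chosen only to illustrate the formulas.
metrics = binary_metrics(tp=90, fn=10, fp=20, tn=80)
for name, value in metrics.items():
    print(f"{name}: {value:.4f}")
```

Comparing accuracy against the ZeroR value shows how much a classifier improves on always predicting the majority class, while kappa and MCC summarise agreement beyond chance in a single number.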
Subject
Obstetrics and Gynecology, Oncology