Authors:
Zhang Hongjian, Fan Xiao, Zhang Junxia, Wei Zhiyuan, Feng Wei, Hu Yifang, Ni Jiaying, Yao Fushen, Zhou Gaoxin, Wan Cheng, Zhang Xin, Wang Junjie, Liu Yun, You Yongping, Yu Yun
Abstract
Objectives
In adult diffuse glioma, preoperative detection of isocitrate dehydrogenase (IDH) status helps clinicians develop surgical strategies and evaluate patient prognosis. Here, we aim to identify an optimal machine-learning model for predicting IDH genotype by combining deep-learning (DL) signatures and conventional radiomics (CR) features as model predictors.

Methods
A total of 486 patients with adult diffuse gliomas were retrospectively enrolled from our medical center (n=268) and a public database (TCGA, n=218). All included patients were randomly divided into training and validation sets using nested 10-fold cross-validation. A total of 6,736 CR features were extracted from four MRI modalities for each patient, namely T1WI, T1CE, T2WI, and FLAIR. The LASSO algorithm was applied for CR feature selection. For each MRI modality, we applied a CNN+LSTM–based neural network to extract DL features and integrated these features into a DL signature after the fully connected layer with sigmoid activation. Eight classic machine-learning models were compared in terms of prediction performance and stability in IDH genotyping, with the LASSO–selected CR features and integrated DL signatures as model predictors. In the validation sets, prediction performance was evaluated using accuracy and the area under the receiver operating characteristic curve (AUC), while model stability was analyzed using the relative standard deviation of the AUC (RSDAUC). Subgroup analyses of DL signatures and CR features were also conducted individually to explore their independent predictive value.

Results
Logistic regression (LR) achieved favorable prediction performance (AUC: 0.920 ± 0.043, accuracy: 0.843 ± 0.044), whereas the support vector machine with a linear kernel (l-SVM) displayed lower prediction performance (AUC: 0.812 ± 0.052, accuracy: 0.821 ± 0.050). With regard to stability, LR also showed high robustness against data perturbation (RSDAUC: 4.7%). Subgroup analyses showed that DL signatures outperformed CR features (DL, AUC: 0.915 ± 0.054, accuracy: 0.835 ± 0.061, RSDAUC: 5.9%; CR, AUC: 0.830 ± 0.066, accuracy: 0.771 ± 0.051, RSDAUC: 8.0%), while DL and DL+CR achieved similar prediction results.

Conclusion
In IDH genotyping, LR is a promising machine-learning classification model. Compared with CR features, DL signatures exhibit markedly superior predictive value and discriminative capability.
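The stability metric used above, the relative standard deviation of the AUC, can be illustrated with a minimal sketch. It assumes the metric is the standard deviation of the validation AUCs divided by their mean, expressed as a percentage; this reading is consistent with the reported figures (for example, 0.043 / 0.920 is about 4.7%). The function names `rsd_percent` and `rsd_from_summary` are hypothetical, and the use of the sample standard deviation is an assumption, since the abstract does not state which estimator was used.

```python
from statistics import mean, stdev


def rsd_percent(values):
    """Relative standard deviation of a list of values, in percent.

    Assumption: sample standard deviation (n - 1 denominator); the
    abstract does not say which estimator the authors used.
    """
    m = mean(values)
    if m == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return 100.0 * stdev(values) / m


def rsd_from_summary(mean_value, sd_value):
    """RSD in percent from an already-reported mean and standard deviation."""
    if mean_value == 0:
        raise ValueError("RSD is undefined when the mean is zero")
    return 100.0 * sd_value / mean_value


# AUC mean and SD pairs as reported in the Results above.
reported = {
    "LR (DL+CR)": (0.920, 0.043),
    "DL only": (0.915, 0.054),
    "CR only": (0.830, 0.066),
}

for name, (auc_mean, auc_sd) in reported.items():
    print(f"{name}: RSD of AUC = {rsd_from_summary(auc_mean, auc_sd):.1f}%")
```

Run on the reported means and standard deviations, this reproduces the stated values of 4.7%, 5.9%, and 8.0%. A lower percentage indicates that the AUC varies less across validation folds relative to its mean.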