Evaluating Machine Learning Classifiers in Breast Cancer: Non-Linear Contributions of MR Diffusion-Perfusion Features to Molecular-based Prognostic Stratification

Published: 2024-03-19

Author:

Amini Behnam, Ghasemi Moein, Farazandeh Dorreh, M. Mohammad H. Akbarizadeh, Farzaneh Hana, Torabi Sarah, Sedaghat Mona, Jafarimehrabady Niloofar, Hajiabbasi Mobasher, Azizi Ashkan, Gorjestani Omidreza, Naviafar Anahita, Hosseini Mohammad M., Karimi Nastaran, Parsaei Amirhossein, Rahmani Alireza, Doshmanziari Reza, Vajihinezhad Maryam, Rikhtehgar Masih¹, Nokiani Alireza Almasi¹

Affiliation:

1. Department of Radiology, Iran University of Medical Sciences

Abstract

Background: Diffusion-weighted imaging (DWI) maps the breast cancer (BC) microenvironment in terms of cellular density and membrane integrity, and captures the effects of capillary microcirculation and intracellular structures through multi b-value analyses. Amidst potential biases in the radiomics pipeline, we aim to discern clinically relevant features from artifacts, improving machine learning (ML) classifier applicability in BC diagnostics through informed feature selection.

Methods: We prospectively enrolled 148 BC patients for ML classifier training, with an additional 98 patients included retrospectively for validation, ensuring consistent imaging and post-processing standards. Tumor subtypes were classified based on hormone receptor (HR), Human Epidermal Growth Factor Receptor 2 (HER2), and Ki67 levels. Using a wide range of ML classifiers, we identified an optimal feature count range of 8–13 for maximal training efficacy and generalizability, given our training and validation cohort sizes. Specifically, 12 domain-specific multi b-value DWI features were selected, focusing on entropy and first-order statistics of the apparent diffusion coefficient (ADC), and higher-order statistical features (intravoxel incoherent motion (IVIM) parameters Dt, fp, Dp; diffusion kurtosis imaging (DKI) metrics MD, MK). Classifier stability was gauged by the interfold range of the 4-fold cross-validation area under the curve (AUC) on the training dataset, while performance was assessed by AUC on the validation dataset. Significant DWI features for molecular-based stratifications were identified by a combined criterion applied to the ML classifier with the highest validation AUC: a feature had to rank among the top three by importance and have a stability score over 0.7 in subsampling.

Results: Among linear classifiers, Stochastic Gradient Descent (SGD) stood out by showing distinct predictive power for HR status, contrasting with the generally limited effectiveness of other linear models. Non-linear classifiers significantly outperformed linear models across the other categories. Random Forest excelled for Ki67 status and the luminal A subtype, AdaBoost for triple-negative subtyping, and XGBoost for HER2 status and the HER2 subtype. SVM with a Radial Basis Function kernel and a Feed-Forward Neural Network both showed proficiency in classifying luminal HER2. Notably, XGBoost and Random Forest demonstrated stable feature selection processes. The entropy and first-order features of ADC were pivotal across molecular-based prognostic stratifications. IVIM features significantly influenced HR and Ki67 statuses, along with their attributed subtypes (luminal A, luminal B, and triple-negative). Conversely, DKI features were uniquely predictive in the HER2 domain, highlighting their distinctive contributions to the stratification of the luminal HER2 and HER2 subtypes.

Conclusions: Non-linear machine learning classifiers excel in BC stratification, leveraging complex DWI features to deepen insights into cancer subtypes and molecular characteristics, marking a strategic evolution towards precision diagnostics.
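
The stability measure and the combined feature criterion described in the Methods can be illustrated with a minimal Python sketch. It is not the authors' implementation. The abstract does not define the stability score or the importance measure, so the sketch assumes that stability is the fraction of subsamples in which a feature is selected and that importance is any per-feature score where higher means more important. All function names, feature names, and numbers are hypothetical.

```python
from typing import Dict, List, Sequence


def auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: chance that a positive case outscores a negative one (ties count 0.5)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative case")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def interfold_range(fold_aucs: Sequence[float]) -> float:
    """Spread between the best and worst per-fold AUC; smaller is read as more stable."""
    return max(fold_aucs) - min(fold_aucs)


def stability_scores(selections: List[List[str]], features: Sequence[str]) -> Dict[str, float]:
    """Assumed definition: fraction of subsamples in which each feature was selected."""
    n = len(selections)
    return {f: sum(f in chosen for chosen in selections) / n for f in features}


def significant_features(
    importance: Dict[str, float],
    stability: Dict[str, float],
    top_k: int = 3,
    min_stability: float = 0.7,
) -> List[str]:
    """Keep features that are in the top_k by importance AND exceed min_stability."""
    ranked = sorted(importance, key=importance.get, reverse=True)[:top_k]
    return [f for f in ranked if stability.get(f, 0.0) > min_stability]


# Hypothetical per-fold AUCs from a 4-fold cross-validation run.
fold_aucs = [0.81, 0.78, 0.84, 0.80]
print(round(interfold_range(fold_aucs), 2))  # 0.06

# Hypothetical feature selections from four subsamples of the training set.
features = ["ADC_entropy", "Dt", "fp", "MK"]
selections = [
    ["ADC_entropy", "Dt"],
    ["ADC_entropy", "MK"],
    ["ADC_entropy", "Dt"],
    ["ADC_entropy", "Dt", "fp"],
]
importance = {"ADC_entropy": 0.40, "Dt": 0.25, "fp": 0.20, "MK": 0.15}

stability = stability_scores(selections, features)
print(significant_features(importance, stability))  # ['ADC_entropy', 'Dt']
```

In this toy example, fp is among the top three features by importance but is selected in only one of four subsamples, so it fails the 0.7 stability threshold and is dropped.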

Publisher

Research Square Platform LLC
