Abstract
Purpose
Gene expression profiles are used for decision-making in the adjuvant setting of hormone receptor-positive, HER2-negative (HR+/HER2-) breast cancer. Previous studies have reported algorithms to optimize the use of RS/Oncotype Dx, but no such efforts have focused on ROR/Prosigna. We sought to improve pre-selection of patients before testing using machine learning.
Methods
Postmenopausal women with resected HR+/HER2- node-negative breast cancer tested with ROR/Prosigna in four Swedish regions were included (n = 348). We used the ROR/Prosigna assessment results to compare the performance of four risk classifications in terms of over- and undertreatment. We developed and validated a machine learning model based on simple prognostic factors (size, progesterone receptor expression, grade and Ki67) to predict the ROR/Prosigna outcome.
Results
Adherence to guidelines reached 66.3%, with non-tested patients being older and having more comorbidities (p < 0.001). Previous risk classifications led to excessive undertreatment (CTS5: 21.8%, MINDACT/TailorX risk definitions: 28.1%) or large intermediate groups that would need to be tested with gene expression profiling (Ki67 cut-offs according to Plan B: 86.5%). The model achieved an area under the ROC curve (AUC) for predicting the ROR/Prosigna result of 0.77 in the training cohort and 0.83 in the validation cohort. By setting and validating upper and lower cut-offs in the model, we could improve correct risk stratification and decrease the proportion of patients needing testing with ROR/Prosigna compared to current management.
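The use of upper and lower cut-offs described above can be illustrated with a minimal sketch. It assumes the model outputs, for each patient, a predicted probability of a high-risk ROR/Prosigna result; that interpretation, the function names `triage` and `proportion_needing_test`, and the cut-off values in the usage note are hypothetical and are not taken from the study.

```python
from typing import List, Sequence


def triage(probabilities: Sequence[float], lower: float, upper: float) -> List[str]:
    """Assign each patient to one of three groups using two cut-offs.

    Assumption: each value is a model-predicted probability of a
    high-risk ROR/Prosigna result (hypothetical interpretation).
    """
    if not 0.0 <= lower < upper <= 1.0:
        raise ValueError("cut-offs must satisfy 0 <= lower < upper <= 1")

    labels = []
    for p in probabilities:
        if p < lower:
            labels.append("low")   # confident low risk: testing could be omitted
        elif p >= upper:
            labels.append("high")  # confident high risk: testing could be omitted
        else:
            labels.append("test")  # uncertain: refer for gene expression profiling
    return labels


def proportion_needing_test(labels: Sequence[str]) -> float:
    """Fraction of patients who fall in the intermediate group."""
    return labels.count("test") / len(labels) if labels else 0.0
```

For example, with illustrative cut-offs of 0.2 and 0.8, `triage([0.1, 0.5, 0.9], lower=0.2, upper=0.8)` assigns the three patients to the low, intermediate and high groups respectively, so only the second would be referred for testing. Widening the gap between the cut-offs trades a larger tested group for fewer misclassified patients.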
Conclusion
We show the feasibility of machine learning algorithms to improve patient selection for gene expression profiling. Further validation in external cohorts is needed.