Abstract
Purpose
Serum microRNA (miRNA) holds great potential as a non-invasive biomarker for diagnosing breast cancer (BrC). However, most diagnostic models rely on the absolute expression levels of miRNAs, which are susceptible to batch effects and therefore difficult to translate into clinical use. Furthermore, current studies of liquid biopsy diagnostic biomarkers for BrC mainly focus on distinguishing BrC patients from healthy controls, so their specificity against other cancer types remains largely unassessed.
Methods
We collected miRNA expression data for 8465 samples from the Gene Expression Omnibus (GEO), covering 13 cancer types and non-cancer controls. Based on the relative expression orderings (REOs) of miRNAs within each sample, we applied the greedy, LASSO multiple linear regression, and random forest algorithms to identify a qualitative biomarker specific to BrC, comparing BrC samples against samples of other cancers as controls.
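The idea behind REO-based features can be illustrated with a minimal sketch. It assumes that each miRNA pair (a, b) is turned into a binary feature that is 1 when miRNA a is expressed above miRNA b within the same sample, and that a sample is called positive when more than half of the pairs vote for it. The miRNA names, the expression values, and the majority-vote rule are hypothetical and are not taken from the study.

```python
import numpy as np

def reo_features(expr, mirna_names, pairs):
    """Binary REO matrix: entry is 1 where miRNA a > miRNA b within a sample.

    expr has one row per sample and one column per miRNA in mirna_names.
    """
    index = {name: i for i, name in enumerate(mirna_names)}
    cols = [(index[a], index[b]) for a, b in pairs]
    return np.array([[int(row[i] > row[j]) for i, j in cols] for row in expr])

def vote_classify(features, threshold=0.5):
    """Hypothetical decision rule: positive if more than `threshold` of pairs vote 1."""
    return features.mean(axis=1) > threshold

# Toy data (hypothetical names and values), two samples by three miRNAs.
names = ["miR-A", "miR-B", "miR-C"]
pairs = [("miR-A", "miR-B"), ("miR-C", "miR-B")]
expr = np.array([[5.0, 2.0, 3.0],
                 [1.0, 4.0, 2.0]])

features = reo_features(expr, names, pairs)
calls = vote_classify(features)

# A monotone per-sample shift and scale, a crude stand-in for a batch effect,
# leaves the within-sample orderings and therefore the features unchanged.
shifted = reo_features(expr * 10.0 + 3.0, names, pairs)
```

Because only the within-sample ordering of each pair is used, any transformation that preserves that ordering gives the same features, which is the property that makes this kind of signature less sensitive to batch effects than absolute expression levels.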
Results
We developed a BrC-specific biomarker called 7-miRPairs, consisting of seven miRNA pairs. It achieved classification performance comparable to that of the other machine learning models we analyzed while requiring fewer miRNA pairs, and it accurately distinguished BrC from 12 other cancer types. The diagnostic performance of 7-miRPairs was favorable in the training set (accuracy = 98.47%, specificity = 98.14%, sensitivity = 99.25%), and similar results were obtained in the test set (accuracy = 97.22%, specificity = 96.87%, sensitivity = 98.02%). KEGG pathway enrichment analysis of the 11 miRNAs within the 7-miRPairs revealed significant enrichment of target mRNAs in pathways associated with BrC.
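For reference, the three reported metrics follow the standard confusion-matrix definitions, sketched below. The sketch assumes BrC is the positive class and other cancers are the negative class; the labels are toy values and do not reproduce the study's data.

```python
def diagnostic_metrics(y_true, y_pred):
    """Accuracy, sensitivity and specificity from binary labels (1 = BrC)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return {
        "accuracy": (tp + tn) / (tp + tn + fp + fn),
        "sensitivity": tp / (tp + fn),  # fraction of BrC samples detected
        "specificity": tn / (tn + fp),  # fraction of other cancers correctly rejected
    }

# Toy labels only: 3 BrC samples and 5 samples of other cancers.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = diagnostic_metrics(y_true, y_pred)
```

With other cancers as the negative class, specificity measures how rarely a non-BrC cancer is mistaken for BrC, which is the property the abstract emphasizes.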
Conclusion
Our study provides evidence that serum miRNA pairs can offer significant advantages for BrC-specific diagnosis in clinical practice, based on direct comparison of serum samples from BrC patients with those from patients with other cancer types.
Funder
National Natural Science Foundation of China
Thousand Talents Program of Jiangxi for High-level talents in innovation and entrepreneurship
Publisher
Springer Science and Business Media LLC