Abstract
Breast cancer remains the most prevalent cancer in women, yet its underlying molecular mechanisms have not been fully uncovered. Identifying gene factors is important for improving our understanding of breast cancer, because such factors can link specific gene expression to tumor staging. However, knowledge in this regard is still far from complete. This study therefore aimed to address these gaps by analyzing existing gene expression profile data from 3149 breast cancer samples, each represented by the expression of 19,644 genes and classified into one of three Nottingham histological grade (NHG) classes (Grade 1, 2, and 3). To this end, a machine learning-based framework was designed. First, the profile data were analyzed with seven feature ranking algorithms to evaluate the importance of the features (genes). This produced seven feature lists, each sorting the features by importance as evaluated from a particular perspective. Then, the incremental feature selection method was applied to each list to determine the essential features for classification and to build efficient classifiers. Genes identified as important by multiple feature ranking algorithms, such as AURKA, CBX2, and MYBL2, were deemed potentially related to breast cancer malignancy and prognosis. In addition, the study formulated classification rules that reflect distinct gene expression patterns for the three NHG classes. Some genes and rules were analyzed and found to be supported by recent literature, providing new references for studying breast cancer.
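The ranking-then-selection workflow summarized in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it runs on synthetic data from scikit-learn rather than the GSE202203 profiles, uses a single random forest ranking in place of the seven ranking algorithms, and assumes a decision tree classifier, 5-fold cross-validation, the Matthews correlation coefficient as the score, and a step size of 5. The function names `rank_features` and `incremental_feature_selection` are hypothetical.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import make_scorer, matthews_corrcoef
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.tree import DecisionTreeClassifier


def rank_features(X, y, seed=0):
    """Return feature indices sorted from most to least important.

    Assumption: random forest importance stands in for one ranking algorithm.
    """
    forest = RandomForestClassifier(n_estimators=100, random_state=seed)
    forest.fit(X, y)
    return np.argsort(forest.feature_importances_)[::-1]


def incremental_feature_selection(X, y, ranking, step=5, max_features=30, seed=0):
    """Score the top-k ranked features for k = step, 2*step, ... and keep the best k."""
    cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=seed)
    scorer = make_scorer(matthews_corrcoef)  # multiclass MCC, assumed metric
    results = []
    for k in range(step, max_features + 1, step):
        subset = ranking[:k]  # the k highest-ranked features
        clf = DecisionTreeClassifier(random_state=seed)
        score = cross_val_score(clf, X[:, subset], y, cv=cv, scoring=scorer).mean()
        results.append((k, score))
    best_k, best_score = max(results, key=lambda r: r[1])
    return best_k, best_score, results


# Synthetic three-class data standing in for samples x genes with NHG labels.
X, y = make_classification(
    n_samples=300, n_features=60, n_informative=8, n_redundant=0,
    n_classes=3, n_clusters_per_class=1, random_state=0,
)
ranking = rank_features(X, y)
best_k, best_score, results = incremental_feature_selection(X, y, ranking)
print(f"best subset size: {best_k}, mean MCC: {best_score:.3f}")
```

For brevity, the sketch ranks features on the full dataset before cross-validation, which can inflate the scores; a stricter evaluation would repeat the ranking inside each training fold. Running the same selection on several rankings and intersecting the resulting feature sets would mirror the overlap analysis described above.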
Data Availability
The datasets analysed during the current study are available in the Gene Expression Omnibus repository under accession GSE202203 (https://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE202203).
Funding
This work was supported by the National Key R&D Program of China (2022YFF1203202), the Strategic Priority Research Program of the Chinese Academy of Sciences (XDA26040304, XDB38050200), the Fund of the Key Laboratory of Tissue Microenvironment and Tumor of the Chinese Academy of Sciences (202002), and the Shandong Provincial Natural Science Foundation (ZR2022MC072).
Author information
Contributions
All authors contributed to the study conception and design. Material preparation, data collection and analysis were performed by QLM, LC, WG and TH. The first draft of the manuscript was written by QLM and LC and all authors commented on previous versions of the manuscript. All authors read and approved the final manuscript.
Ethics declarations
Conflict of interests
The authors declare that there is no conflict of interests regarding the publication of this paper. The authors have no relevant financial or non-financial interests to disclose.
Ethical Approval
This study did not involve experiments, so no ethics statement applies.
Additional information
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Below is the link to the electronic supplementary material.
10528_2024_10712_MOESM1_ESM.xlsx
Supplementary file1 (XLSX 1189 KB): Feature ranking results obtained using the LASSO, LightGBM, MCFS, mRMR, RF_ZL, CatBoost and XGBoost methods.
10528_2024_10712_MOESM2_ESM.xlsx
Supplementary file2 (XLSX 504 KB): Performance of IFS with four different classification algorithms on the LASSO, LightGBM, MCFS, mRMR, RF_ZL, CatBoost and XGBoost feature lists.
10528_2024_10712_MOESM3_ESM.xlsx
Supplementary file3 (XLSX 17 KB): Intersection of the seven critical gene sets extracted from the LASSO, LightGBM, MCFS, mRMR, RF_ZL, CatBoost and XGBoost feature lists.
10528_2024_10712_MOESM4_ESM.xlsx
Supplementary file4 (XLSX 634 KB): Classification rules generated by the decision tree using its optimal features on each of the seven feature lists.
Rights and permissions
Springer Nature or its licensor (e.g. a society or other partner) holds exclusive rights to this article under a publishing agreement with the author(s) or other rightsholder(s); author self-archiving of the accepted manuscript version of this article is solely governed by the terms of such publishing agreement and applicable law.
About this article
Cite this article
Ma, Q., Chen, L., Feng, K. et al. Exploring Prognostic Gene Factors in Breast Cancer via Machine Learning. Biochem Genet (2024). https://doi.org/10.1007/s10528-024-10712-w
DOI: https://doi.org/10.1007/s10528-024-10712-w