Identification of biomarkers for breast cancer early diagnosis based on the molecular classification using machine learning algorithms on transcriptomic data and factorial designs for analysis

Author:

Mayoral-Peña Kalaumari1, Peña Omar Israel González2, Artzi Natalie3, de Donato Marcos1

Affiliation:

1. Monterrey Institute of Technology and Higher Education

2. Hospital Infantil de México Federico Gómez

3. Brigham and Women's Hospital

Abstract

Background: Breast cancer is the second leading cause of global female mortality. Diagnosing and treating breast cancer at early stages is important for successful treatment and for increasing patient survival. New analytical methods for massive data from biological samples, such as Machine Learning Algorithms (MLAs), are needed to improve cancer diagnosis, especially for patients in low-income countries. A computational methodology for selecting a small number of biomarkers with strong diagnostic capabilities and an accessible cellular location could be useful for developing low-cost diagnostic devices. Hence, this study aimed to develop a computational methodology to find relevant genetic biomarkers and establish a discrete panel of genes capable of classifying breast cancer samples for diagnostic purposes with high accuracy.

Methods: Panels with a small number of genes (<10) were generated for the molecular classification of breast cancer cells using four MLAs on transcriptomic data. Five gene selection approaches were used to generate these panels: factor analysis genes, surfaceome genes, transmembrane genes, combined genes, and network analysis genes. The classification performance was analyzed and validated using seven factorial designs and non-parametric statistical tests.

Results: The accuracy of the MLAs was higher than 80% in both cell lines and patient samples for all selection approaches. The combined approach, which takes the best genes from three approaches (transmembrane, surfaceome, and factor analysis), had better classification performance than each approach alone. The genes of the combined approach (TMEM210, CD44, SPDEF, TENM4, KIRREL, BCAS1, TMEM86A, LRFN2, TFF3) also performed similarly to those selected by network analysis. The panel of genes identified by the combined approach was completely different from the genes previously described in the four commercial breast cancer panels that were analyzed.

Conclusions: In this study, the panels of selected genes were capable of classifying breast cancer cell lines and patient samples according to their molecular characteristics. Two genes of the combined approach (TFF3 and CD44) have been used in cancer biosensors, which supports the plausibility of the panel and suggests potential for developing new diagnostic devices; however, experimental studies are required to corroborate this type of implementation.
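As an illustration of what evaluating a small gene panel with a classifier can look like, the following is a minimal sketch only, not the authors' pipeline. The abstract does not name the four MLAs, the datasets, or the validation scheme, so everything here is an assumption: the expression values are synthetic random numbers, the two class labels are made up, a nearest-centroid classifier stands in for the unnamed algorithms, and leave-one-out cross-validation stands in for the unspecified validation. Only the nine gene symbols come from the abstract.

```python
import random
from statistics import mean

# Gene symbols of the combined panel, as listed in the abstract.
PANEL = ["TMEM210", "CD44", "SPDEF", "TENM4", "KIRREL",
         "BCAS1", "TMEM86A", "LRFN2", "TFF3"]


def make_synthetic_samples(n_per_class=20, seed=0):
    """Build SYNTHETIC expression vectors for two hypothetical classes.

    These numbers are random draws, not real transcriptomic data.
    """
    rng = random.Random(seed)
    samples, labels = [], []
    for label, shift in (("class_A", 0.0), ("class_B", 2.0)):
        for _ in range(n_per_class):
            samples.append([rng.gauss(5.0 + shift, 1.0) for _ in PANEL])
            labels.append(label)
    return samples, labels


def centroid(rows):
    """Per-gene mean of a list of expression vectors."""
    return [mean(column) for column in zip(*rows)]


def predict(centroids, sample):
    """Return the label whose centroid is closest (squared Euclidean)."""
    def sq_dist(center):
        return sum((a - b) ** 2 for a, b in zip(center, sample))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def loo_accuracy(samples, labels):
    """Leave-one-out accuracy of the nearest-centroid classifier."""
    correct = 0
    for i, (held_out, truth) in enumerate(zip(samples, labels)):
        # Group the training vectors by label, skipping the held-out sample.
        by_label = {}
        for j, (vec, lab) in enumerate(zip(samples, labels)):
            if j != i:
                by_label.setdefault(lab, []).append(vec)
        centroids = {lab: centroid(rows) for lab, rows in by_label.items()}
        correct += predict(centroids, held_out) == truth
    return correct / len(samples)


if __name__ == "__main__":
    X, y = make_synthetic_samples()
    print(f"Leave-one-out accuracy on synthetic data: {loo_accuracy(X, y):.2f}")
```

The accuracy printed by this sketch reflects only how far apart the synthetic classes were placed and says nothing about the reported results. With real data, each classifier's held-out accuracy would be recorded per gene selection approach and then compared, which is the kind of comparison the factorial designs and non-parametric tests in the abstract address.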

Publisher

Research Square Platform LLC
