An epistasis and heterogeneity analysis method based on maximum correlation and maximum consistence criteria
Published: 2021
Issue: 6
Volume: 18
Page: 7711-7726
ISSN: 1551-0018
Container-title: Mathematical Biosciences and Engineering
Short-container-title: MBE
Author: Chen Xia, Lin Yexiong, Qu Qiang, Ning Bin, Chen Haowen, Li Xiong
Abstract
<abstract>
<p>Tumor heterogeneity significantly increases the difficulty of tumor treatment: the same drugs and treatment methods have different effects on different tumor subtypes. Tumor heterogeneity is therefore one of the main sources of poor prognosis, recurrence and metastasis. Several computational methods already study tumor heterogeneity at the genomic, transcriptomic and histological levels, but these methods still have certain limitations. In this study, we propose an epistasis and heterogeneity analysis method based on genomic single nucleotide polymorphism (SNP) data. First, a maximum correlation and maximum consistence criterion is designed, based on the Bayesian network score <italic>K2</italic> and information entropy, for evaluating genomic epistasis. As the number of SNPs increases, the space of epistatic combinations grows sharply, resulting in a combinatorial explosion. We therefore use an improved genetic algorithm to search the SNP epistatic combination space and identify potentially feasible epistasis solutions. Multiple epistasis solutions represent different pathogenic gene combinations, which may lead to different tumor subtypes, that is, heterogeneity. Finally, an XGBoost classifier is trained on the feature SNPs that constitute the multiple sets of epistatic solutions, to verify that considering tumor heterogeneity improves the accuracy of tumor subtype prediction. To demonstrate the effectiveness of our method, we evaluate its power to recognize multiple epistatic solutions and its accuracy in tumor subtype classification. Extensive simulation results show that our method has better power and prediction accuracy than previous methods.</p>
</abstract>
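The abstract names two ingredients for scoring a candidate SNP combination: the Bayesian network score K2 and information entropy. It does not give the exact form of the combined criterion, so the sketch below only illustrates the two standard building blocks under stated assumptions: the log K2 score in its usual Cooper-Herskovits form (class label conditioned on the joint genotype), and the mutual information between the joint genotype and the class label. The function names `log_k2_score`, `entropy` and `mutual_information`, and the toy data, are hypothetical and not taken from the paper.

```python
import math
from collections import Counter


def log_k2_score(genotypes, labels, n_classes=2):
    """Log K2 score of the class label given a joint SNP genotype.

    genotypes: list of tuples, one tuple of SNP values per sample.
    labels:    list of integer class labels in range(n_classes).
    Standard form (an assumption about what the paper uses):
        sum_i [ log((J-1)!) - log((N_i + J - 1)!) + sum_j log(N_ij!) ]
    where i runs over observed genotype combinations and j over classes.
    A higher value means the genotype combination explains the label better.
    """
    combo_counts = Counter(genotypes)
    cell_counts = Counter(zip(genotypes, labels))
    score = 0.0
    for combo, n_i in combo_counts.items():
        # lgamma(n + 1) == log(n!)
        score += math.lgamma(n_classes) - math.lgamma(n_i + n_classes)
        for c in range(n_classes):
            score += math.lgamma(cell_counts[(combo, c)] + 1)
    return score


def entropy(items):
    """Shannon entropy (in bits) of the empirical distribution of items."""
    n = len(items)
    return -sum((c / n) * math.log2(c / n) for c in Counter(items).values())


def mutual_information(genotypes, labels):
    """I(G; Y) = H(G) + H(Y) - H(G, Y) for joint genotype G and label Y."""
    joint = list(zip(genotypes, labels))
    return entropy(genotypes) + entropy(labels) - entropy(joint)


# Toy data (hypothetical): two SNPs coded 0/1/2, binary case/control label.
associated = [(0, 0), (0, 0), (1, 2), (1, 2)]
unrelated = [(0, 0), (1, 2), (0, 0), (1, 2)]
labels = [0, 0, 1, 1]

print(log_k2_score(associated, labels), mutual_information(associated, labels))
print(log_k2_score(unrelated, labels), mutual_information(unrelated, labels))
```

On the toy data, the combination that tracks the label receives both a higher log K2 score and a higher mutual information than the combination that is independent of it. A search procedure such as the genetic algorithm mentioned in the abstract would evaluate many candidate SNP subsets with scores of this kind.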
Publisher
American Institute of Mathematical Sciences (AIMS)
Subject
Applied Mathematics, Computational Mathematics, General Agricultural and Biological Sciences, Modelling and Simulation, General Medicine