Biomarker interaction selection and disease detection based on multivariate gain ratio

Published:2022-05-12 Issue:1 Volume:23 Page:
ISSN:1471-2105
Container-title:BMC Bioinformatics
language:en
Short-container-title:BMC Bioinformatics

Author:

Chu Xiao,Jiang Mao,Liu Zhuo-Jun

Abstract

Background: Disease detection is an important aspect of biotherapy. With the development of biotechnology and computer technology, many methods now detect disease based on a single biomarker. In some cases, however, a biomarker does not influence disease on its own; it is the interaction between biomarkers that determines disease status. The existing influence measure, I-score, is used to evaluate the importance of an interaction in determining disease status, but its value is biased by the number of variables in the interaction. To solve this problem, we propose a new influence measure, Multivariate Gain Ratio (MGR), which extends the single-variable Gain Ratio (GR) to multivariate combinations of variables, called interactions.

Results: We propose a preprocessing verification algorithm, based on partial predictor variables, to select an appropriate preprocessing method. We also provide an algorithm that selects key biomarker interactions and uses them to construct a disease detection model. MGR is more credible than I-score when an interaction contains a small number of variables. On the Breast Cancer Wisconsin (Diagnostic) Dataset, our method achieves an average accuracy of 93.13%, compared with 91.73% for I-score. Compared with the classification result of 89.80% based on all predictor variables, MGR identifies the true main biomarkers and achieves dimension reduction. On the Leukemia Dataset, MGR reaches an accuracy of 97.32%, compared with 89.11% for I-score. These results are consistent with the properties of MGR and I-score noted above, because every key interaction in the Leukemia Dataset contains a small number of variables.

Conclusions: MGR is effective for selecting important biomarkers and biomarker interactions, even in high-dimensional feature spaces in which an interaction may contain more than two biomarkers. Interactions selected by MGR predict better than those selected by I-score when an interaction contains a small number of variables. MGR is generally applicable to various types of biomarker datasets, including cell nuclei, gene, SNP and protein datasets.
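The abstract does not state the formula for MGR. As a rough illustration only, the sketch below assumes a multivariate gain ratio can be read as the classic gain ratio (information gain divided by split information) applied to the joint variable formed by a set of discretized biomarkers. The function names `entropy` and `gain_ratio` and the toy data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(values):
    """Shannon entropy (in bits) of a sequence of hashable values."""
    n = len(values)
    return -sum((c / n) * log2(c / n) for c in Counter(values).values())


def gain_ratio(columns, y):
    """Gain ratio of the joint variable built from one or more discrete columns.

    ASSUMPTION: this is the textbook gain ratio on the joint variable,
    not necessarily the exact MGR definition used by the authors.

    columns: list of equal-length sequences, one per (discretized) biomarker.
    y: sequence of class labels, same length as each column.
    """
    joint = list(zip(*columns))  # each sample's combination of values
    n = len(y)

    # Group class labels by joint cell to compute H(Y | joint).
    groups = {}
    for cell, label in zip(joint, y):
        groups.setdefault(cell, []).append(label)
    h_cond = sum(len(g) / n * entropy(g) for g in groups.values())

    info_gain = entropy(y) - h_cond   # H(Y) - H(Y | joint)
    split_info = entropy(joint)       # H(joint)
    return info_gain / split_info if split_info > 0 else 0.0


# Toy XOR-style data: neither marker is informative alone,
# but the pair determines the label completely.
x1 = [0, 0, 1, 1]
x2 = [0, 1, 0, 1]
y = [0, 1, 1, 0]

single = gain_ratio([x1], y)
pair = gain_ratio([x1, x2], y)
```

In this toy case the single marker scores zero while the pair scores above zero, which is the kind of pure interaction effect the abstract describes. Dividing by the split information penalizes combinations that fragment the samples into many cells, which is one plausible way a measure could be made less sensitive to the number of variables in an interaction.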

Publisher

Springer Science and Business Media LLC

Subject

Applied Mathematics,Computer Science Applications,Molecular Biology,Biochemistry,Structural Biology

Link

https://link.springer.com/content/pdf/10.1186/s12859-022-04699-7.pdf

Cited by 1 article.

1. GOAT: Gene-level biomarker discovery from multi-Omics data using graph ATtention neural network for eosinophilic asthma subtype. Bioinformatics. 2023-09-22.