Affiliation:
1. Department of Molecular and Human Genetics, Baylor College of Medicine, Houston, TX, USA
2. Computational & Integrative Biomedical Research Center, Baylor College of Medicine, Houston, TX, USA
Abstract
Background
Coronary artery disease is a leading cause of death worldwide, with both genetic and environmental risk factors. Although genome‐wide association studies have linked more than 100 unique loci to its genetic basis, these loci explain only a fraction of disease heritability.
Methods and Results
To find additional gene drivers of coronary artery disease, we applied machine learning to quantitative evolutionary information on the impact of coding variants in whole exomes from the Myocardial Infarction Genetics Consortium. Using ensemble‐based supervised learning, the Evolutionary Action–Machine Learning framework ranked each gene's ability to classify case and control samples and identified 79 significant associations. These were connected to known risk loci; enriched in cardiovascular processes like lipid metabolism, blood clotting, and inflammation; and enriched for cardiovascular phenotypes in knockout mouse models. Among them, INPP5F and MST1R are examples of potentially novel coronary artery disease risk genes that modulate immune signaling in response to cardiac stress.
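The idea of ranking each gene by how well it separates cases from controls can be illustrated with a minimal sketch. This is not the published Evolutionary Action–Machine Learning implementation: the ensemble of bootstrapped threshold classifiers, the cross-validated Matthews correlation coefficient used as the ranking score, the gene labels GENE_A and GENE_B, and the per-sample impact scores are all assumptions chosen for illustration on synthetic data.

```python
import random
from math import sqrt


def mcc(y_true, y_pred):
    """Matthews correlation coefficient for binary labels (1 = case, 0 = control)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    return 0.0 if denom == 0 else (tp * tn - fp * fn) / denom


def fit_stump(xs, ys):
    """Fit a one-threshold classifier: return (threshold, sign) with best training accuracy."""
    vals = sorted(set(xs))
    # Candidate thresholds are midpoints between neighbouring observed scores.
    cands = [(a + b) / 2 for a, b in zip(vals, vals[1:])] or [vals[0]]
    best_acc, best_thr, best_sign = -1.0, cands[0], 1
    for thr in cands:
        for sign in (1, -1):
            preds = [1 if sign * (x - thr) > 0 else 0 for x in xs]
            acc = sum(p == y for p, y in zip(preds, ys)) / len(ys)
            if acc > best_acc:
                best_acc, best_thr, best_sign = acc, thr, sign
    return best_thr, best_sign


def ensemble_predict(stumps, x):
    """Majority vote over all stumps in the ensemble."""
    votes = sum(1 for thr, sign in stumps if sign * (x - thr) > 0)
    return 1 if 2 * votes > len(stumps) else 0


def gene_score(xs, ys, n_stumps=15, k=5, seed=0):
    """Cross-validated MCC of a bagged-stump ensemble trained on one gene's scores."""
    rng = random.Random(seed)
    idx = list(range(len(xs)))
    rng.shuffle(idx)
    preds = [0] * len(xs)
    for fold in (idx[i::k] for i in range(k)):
        held_out = set(fold)
        train = [i for i in idx if i not in held_out]
        stumps = []
        for _ in range(n_stumps):
            boot = [rng.choice(train) for _ in train]  # bootstrap resample
            stumps.append(fit_stump([xs[i] for i in boot], [ys[i] for i in boot]))
        for i in fold:
            preds[i] = ensemble_predict(stumps, xs[i])
    return mcc(ys, preds)


def rank_genes(scores_by_gene, labels):
    """Return (gene, score) pairs sorted from most to least discriminative."""
    scored = [(gene_score(xs, labels), gene) for gene, xs in scores_by_gene.items()]
    return [(gene, score) for score, gene in sorted(scored, reverse=True)]


# Synthetic example: 20 cases followed by 20 controls (hypothetical data).
labels = [1] * 20 + [0] * 20
scores_by_gene = {
    # GENE_A: cases carry higher-impact variants than controls.
    "GENE_A": [70 + i for i in range(20)] + [10 + i for i in range(20)],
    # GENE_B: scores are unrelated to case/control status.
    "GENE_B": [(37 * i) % 100 for i in range(40)],
}
ranking = rank_genes(scores_by_gene, labels)
for gene, score in ranking:
    print(f"{gene}: cross-validated MCC = {score:.2f}")
```

In this toy setup, a gene whose variant impact differs between cases and controls receives a high cross-validated score and rises to the top of the ranking, while an uninformative gene scores near zero. A real analysis would additionally need a significance test for each gene's score, which this sketch omits.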
Conclusions
We conclude that machine learning on the functional impact of coding variants, based on a massive amount of evolutionary information, has the power to suggest novel coronary artery disease risk genes for mechanistic and therapeutic discoveries in cardiovascular biology, and should also apply to other complex polygenic diseases.
Publisher
Ovid Technologies (Wolters Kluwer Health)
Subject
Cardiology and Cardiovascular Medicine