Machine Learning in Identifying Marker Genes for Congenital Heart Diseases of Different Cardiac Cell Types

Author:

Ma Qinglan (1), Zhang Yu-Hang (2), Guo Wei (3), Feng Kaiyan (4), Huang Tao (5, 6), Cai Yu-Dong (1)

Affiliation:

1. School of Life Sciences, Shanghai University, Shanghai 200444, China

2. Channing Division of Network Medicine, Brigham and Women’s Hospital, Harvard Medical School, Boston, MA 02115, USA

3. Key Laboratory of Stem Cell Biology, Shanghai Jiao Tong University School of Medicine (SJTUSM) & Shanghai Institutes for Biological Sciences (SIBS), Chinese Academy of Sciences (CAS), Shanghai 200030, China

4. Department of Computer Science, Guangdong AIB Polytechnic College, Guangzhou 510507, China

5. Bio-Med Big Data Center, CAS Key Laboratory of Computational Biology, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China

6. CAS Key Laboratory of Tissue Microenvironment and Tumor, Shanghai Institute of Nutrition and Health, University of Chinese Academy of Sciences, Chinese Academy of Sciences, Shanghai 200031, China

Abstract

Congenital heart disease (CHD) represents a spectrum of inborn heart defects influenced by genetic and environmental factors. This study analyzed gene expression profiles of 21,034 cardiac fibroblasts, 73,296 cardiomyocytes, and 35,673 endothelial cells at the single-cell level using machine learning techniques. Six conditions were investigated for each cardiac cell type: dilated cardiomyopathy (DCM), donor hearts (used as healthy controls), hypertrophic cardiomyopathy (HCM), heart failure with hypoplastic left heart syndrome (HF_HLHS), neonatal hypoplastic left heart syndrome (Neo_HLHS), and tetralogy of Fallot (TOF). Each cell sample was represented by 29,266 gene features. These features were first ranked by six feature-ranking algorithms (LASSO, LightGBM, CatBoost, MCFS, RF, and XGBoost), yielding one feature list per algorithm. The lists were then fed into incremental feature selection, which incorporated two classification algorithms, to extract essential gene features and classification rules and to build efficient classifiers. The identified essential genes are potential CHD markers in different cardiac cell types. For instance, LASSO identified key genes specific to various heart cell types in CHD subtypes. FOXO3 was found to be up-regulated in cardiac fibroblasts in both DCM and HCM. In cardiomyocytes, the distinct genes TMTC1, ART3, ARHGAP24, SHROOM3, and XIST were linked to DCM, Neo_HLHS, HCM, HF_HLHS, and TOF, respectively. Endothelial cell analysis further revealed COL25A1, NFIB, and KLF7 as significant genes for DCM, HCM, and TOF. LightGBM, CatBoost, MCFS, RF, and XGBoost further delineated key genes for specific CHD subtypes, demonstrating the efficacy of machine learning in identifying CHD-specific genes. Additionally, this study developed quantitative rules that represent the gene expression patterns related to CHD. This research underscores the potential of machine learning in unraveling the molecular complexities of CHD and establishes a foundation for future mechanism-based studies.
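The rank-then-select workflow described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not name the two classifiers or give any parameters, so the sketch substitutes a simple variance-ratio score for the six ranking algorithms and a nearest-centroid classifier for the classification step. The data are synthetic, and every function name (`make_toy_data`, `rank_features`, `centroid_accuracy`, `incremental_feature_selection`) is hypothetical.

```python
import random
from collections import defaultdict


def make_toy_data(seed=0, per_class=30, n_noise=6):
    """Synthetic stand-in for a cells-by-genes matrix (assumption, not study data).

    Features 0 and 1 carry class signal; the remaining features are pure noise.
    """
    rng = random.Random(seed)
    class_means = {0: (0.0, 0.0), 1: (5.0, -4.0), 2: (10.0, 4.0)}
    X, y = [], []
    for label, (m0, m1) in class_means.items():
        for _ in range(per_class):
            row = [rng.gauss(m0, 1.0), rng.gauss(m1, 1.0)]
            row += [rng.gauss(0.0, 1.0) for _ in range(n_noise)]
            X.append(row)
            y.append(label)
    return X, y


def rank_features(X, y):
    """Rank feature indices by between-class / within-class variance (stand-in ranker)."""
    scores = []
    for j in range(len(X[0])):
        by_class = defaultdict(list)
        for row, label in zip(X, y):
            by_class[label].append(row[j])
        overall = sum(row[j] for row in X) / len(X)
        between, within = 0.0, 0.0
        for values in by_class.values():
            mean = sum(values) / len(values)
            between += len(values) * (mean - overall) ** 2
            within += sum((v - mean) ** 2 for v in values)
        scores.append(between / (within + 1e-12))
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


def centroid_accuracy(X, y, features, folds=5, seed=0):
    """Cross-validated accuracy of a nearest-centroid classifier on chosen features."""
    order = list(range(len(X)))
    random.Random(seed).shuffle(order)
    correct = 0
    for f in range(folds):
        test_idx = set(order[f::folds])
        sums, counts = {}, defaultdict(int)
        for i in order:
            if i in test_idx:
                continue
            vec = [X[i][j] for j in features]
            prev = sums.get(y[i], [0.0] * len(features))
            sums[y[i]] = [a + b for a, b in zip(prev, vec)]
            counts[y[i]] += 1
        centroids = {c: [s / counts[c] for s in sums[c]] for c in sums}
        for i in test_idx:
            vec = [X[i][j] for j in features]
            pred = min(
                centroids,
                key=lambda c: sum((a - b) ** 2 for a, b in zip(vec, centroids[c])),
            )
            correct += int(pred == y[i])
    return correct / len(X)


def incremental_feature_selection(X, y, ranked, step=1):
    """Evaluate the top-k ranked features for growing k and keep the best subset."""
    curve = []
    for k in range(step, len(ranked) + 1, step):
        curve.append((k, centroid_accuracy(X, y, ranked[:k])))
    # Highest accuracy wins; ties are broken in favour of fewer features.
    best_k, best_acc = max(curve, key=lambda t: (t[1], -t[0]))
    return ranked[:best_k], best_acc, curve


X, y = make_toy_data()
ranked = rank_features(X, y)
selected, best_acc, curve = incremental_feature_selection(X, y, ranked)
print("ranked features:", ranked)
print("selected subset:", selected, "accuracy:", round(best_acc, 3))
```

In a setting like the one described, the ranking step would presumably be repeated once per ranking algorithm, producing one curve and one selected gene subset per list. The tie-break toward smaller subsets is a design choice of this sketch, not something stated in the abstract.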

Funder

Strategic Priority Research Program of Chinese Academy of Sciences

National Key R&D Program of China

Fund of the Key Laboratory of Tissue Microenvironment and Tumor of Chinese Academy of Sciences

Shandong Provincial Natural Science Foundation

Publisher

MDPI AG
