A machine learning-based SNP-set analysis approach for identifying disease-associated susceptibility loci

Author:

Silva Princess P., Gaudillo Joverlyn D., Vilela Julianne A., Roxas-Villanueva Ranzivelle Marianne L., Tiangco Beatrice J., Domingo Mario R., Albia Jason R.

Abstract

Introduction: Identifying disease-associated susceptibility loci is one of the most pressing challenges in modeling complex diseases. Existing approaches to biomarker discovery have several limitations, including underpowered detection, neglect of variant interactions, and restrictive dependence on prior biological knowledge. Addressing these challenges calls for new ways of approaching the “missing heritability” problem.

Objectives: This study aims to discover disease-associated susceptibility loci by augmenting a previous genome-wide association study (GWAS) with an integration of random forest and cluster analysis.

Methods: The proposed integrated framework is applied to GWAS data on hepatitis B virus surface antigen (HBsAg) seroclearance. Multiple cluster analyses were performed on (1) single nucleotide polymorphisms (SNPs) considered significant by the GWAS and (2) SNPs with the highest feature importance scores obtained using random forest. The resulting SNP-sets from the cluster analyses were subsequently tested for trait association.

Results: Three susceptibility loci possibly associated with HBsAg seroclearance were identified: (1) SNP rs2399971, (2) gene LINC00578, and (3) locus 11p15. SNP rs2399971 is a biomarker reported in the literature to be significantly associated with HBsAg seroclearance in patients who had received antiviral treatment. The latter two loci are linked with diseases influenced by the presence of hepatitis B virus infection.

Conclusion: These findings demonstrate the potential of the proposed integrated framework in identifying disease-associated susceptibility loci. With further validation, the results could aid in better understanding complex disease etiologies and provide inputs for more advanced disease risk assessment for patients.
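The first two steps of the framework described in the Methods (ranking SNPs by random forest feature importance, then grouping the top-ranked SNPs into SNP-sets) can be illustrated with a minimal, standard-library Python sketch. This is not the authors' implementation. It makes several assumptions the abstract does not state: the forest is reduced to bootstrapped one-split trees (stumps) scored by Gini impurity decrease, the clustering is a simple grouping by genomic position with a hypothetical `max_gap` threshold, and the genotypes, labels, and base-pair coordinates are simulated toy values. The final trait-association test on each SNP-set is not shown.

```python
import random

def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2.0 * p * (1.0 - p)

def best_stump(rows, labels, features):
    """Return (feature, impurity decrease) for the best 'genotype <= t' split."""
    n = len(rows)
    parent = gini(labels)
    best = (None, 0.0)
    for f in features:
        for t in (0, 1):  # genotypes are coded as 0/1/2 minor-allele counts
            left = [y for r, y in zip(rows, labels) if r[f] <= t]
            right = [y for r, y in zip(rows, labels) if r[f] > t]
            children = (len(left) * gini(left) + len(right) * gini(right)) / n
            gain = parent - children
            if gain > best[1]:
                best = (f, gain)
    return best

def stump_forest_importance(rows, labels, n_trees=200, seed=0):
    """Simplified random forest: each 'tree' is one stump fitted on a
    bootstrap sample with a random feature subset. Importance is the
    normalized total impurity decrease credited to each SNP."""
    rng = random.Random(seed)
    n, m = len(rows), len(rows[0])
    k = max(1, int(m ** 0.5))  # features tried per tree (assumed sqrt rule)
    importance = [0.0] * m
    for _ in range(n_trees):
        idx = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        feats = rng.sample(range(m), k)             # random feature subset
        f, gain = best_stump([rows[i] for i in idx],
                             [labels[i] for i in idx], feats)
        if f is not None:
            importance[f] += gain
    total = sum(importance)
    return [v / total for v in importance] if total else importance

def cluster_by_position(positions, max_gap):
    """Group SNPs whose sorted positions lie within max_gap of the previous
    SNP (an assumed stand-in for the paper's cluster analysis)."""
    clusters = []
    for snp in sorted(positions, key=positions.get):
        if clusters and positions[snp] - positions[clusters[-1][-1]] <= max_gap:
            clusters[-1].append(snp)
        else:
            clusters.append([snp])
    return clusters

# Toy data (simulated, not from the study): 300 individuals, 12 SNPs,
# with the trait driven by SNPs 3 and 4 only.
rng = random.Random(42)
n_people, n_snps = 300, 12

def draw_genotype(maf=0.3):
    """Minor-allele count (0, 1, or 2) for one individual at one SNP."""
    return int(rng.random() < maf) + int(rng.random() < maf)

rows = [[draw_genotype() for _ in range(n_snps)] for _ in range(n_people)]
labels = [1 if r[3] + r[4] >= 2 else 0 for r in rows]

importance = stump_forest_importance(rows, labels)
top = sorted(range(n_snps), key=lambda j: importance[j], reverse=True)[:2]
positions = {j: 1000 * j for j in top}  # hypothetical base-pair coordinates
snp_sets = cluster_by_position(positions, max_gap=1500)
print("top SNPs:", top, "SNP-sets:", snp_sets)
```

In practice a full random forest implementation and a clustering method suited to genotype data would replace these simplified stand-ins; the sketch only shows how importance ranking can feed the construction of SNP-sets.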

Publisher

Cold Spring Harbor Laboratory
