Statistical Study Design for Analyzing Multiple Gene Loci Correlation in DNA Sequences

Author:

Kamoljitprapa Pianpool (1), Baksh Fazil M. (2), De Gaetano Andrea (3, 4), Polsen Orathai (1), Leelasilapasart Piyachat (1)

Affiliation:

1. Department of Applied Statistics, Faculty of Applied Science, King Mongkut’s University of Technology North Bangkok, Bangkok 10800, Thailand

2. Department of Mathematics and Statistics, University of Reading, Reading RG6 6AH, UK

3. Consiglio Nazionale delle Ricerche, CNR-IASI Rome and CNR-IRIB Palermo, 90146 Palermo, Italy

4. Distinguished Professor Excellence Program, Department of Biomatics, Óbuda University, 1034 Budapest, Hungary

Abstract

This study presents a novel statistical and computational approach based on nonparametric regression that exploits correlation structure to handle the high-dimensional data often found in pharmacogenomics, for instance in studies of Crohn’s disease, an inflammatory bowel disease. The empirical correlation between the test statistics, investigated via simulation, can be used as an estimate of noise. The theoretical distribution of −log10(p-value) is used to support the estimation of the optimal bandwidth for the model, which adequately controls type I error rates while maintaining reasonable power. Two proposed approaches, using normal and Laplace-LD kernels, were evaluated in a case-control study using real data from a genome-wide association study on Crohn’s disease. The study successfully identified single nucleotide polymorphisms on the NOD2 gene associated with the disease. The proposed method reduces the computational burden by approximately 33% while retaining reasonable power, allowing for a more efficient and accurate analysis of genetic variants influencing drug responses. The study contributes to the advancement of statistical methodology for analyzing complex genetic data and is of practical advantage for the development of personalized medicine.
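As a rough illustration of the kind of kernel smoothing the abstract refers to, the sketch below applies a Nadaraya-Watson estimator with a normal kernel to −log10(p-value) scores along a set of loci. It is a minimal sketch under stated assumptions, not the authors' method: the function names, the toy positions and p-values, and the fixed bandwidth of 1.0 are all hypothetical, and neither the Laplace-LD kernel nor the paper's bandwidth selection procedure is reproduced here.

```python
import math


def neg_log10_p(p_values):
    """Transform p-values to the -log10 scale."""
    return [-math.log10(p) for p in p_values]


def normal_kernel(u):
    """Standard normal density, used here as the smoothing kernel."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def kernel_smooth(positions, values, bandwidth):
    """Nadaraya-Watson estimate at each position: a kernel-weighted mean."""
    smoothed = []
    for x0 in positions:
        weights = [normal_kernel((x0 - x) / bandwidth) for x in positions]
        total = sum(weights)
        smoothed.append(sum(w * v for w, v in zip(weights, values)) / total)
    return smoothed


# Hypothetical toy data (not from the study): locus positions and p-values.
positions = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
p_values = [0.5, 0.2, 0.01, 0.0001, 0.02, 0.3, 0.6]

scores = neg_log10_p(p_values)
# The bandwidth is fixed by hand here; the paper estimates an optimal value.
smoothed = kernel_smooth(positions, scores, bandwidth=1.0)

# Index of the locus with the largest smoothed signal.
peak_index = max(range(len(smoothed)), key=smoothed.__getitem__)
print(peak_index, [round(s, 3) for s in smoothed])
```

A larger bandwidth borrows more strength from neighbouring loci and flattens isolated peaks, while a smaller one tracks the raw scores more closely, which is why the choice of bandwidth affects both type I error and power.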

Funder

King Mongkut’s University of Technology North Bangkok

Publisher

MDPI AG

Subject

General Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)

Cited by 1 article.

1. Nonlinear Models for Influenza Patients for Different Age Groups in Thailand;Proceedings of the 2024 9th International Conference on Information and Education Innovations;2024-04-12
