Simulation of model overfit in variance explained with genetic data

Author:

Derringer JaimeORCID

Abstract

AbstractTwo recent papers, and an author response to prior commentary, addressing the genetic architecture of human temperament and character claimed that “The identified SNPs explained nearly all the heritability expected”. The authors’ method for estimating heritability may be summarized as: Step 1: Pre-select SNPs on the basis of GWAS p<0.01 in the target sample. Step 2: Enter target sample genotypes (the pre-selected SNPs from Step 1) and phenotypes into an unsupervised machine learning algorithm (Phenotype-Genotype Many-to-Many Relations Analysis, PGMRA) for further reduction of the set of SNPs. Step 3: Test the sum score of the SNPs identified from Step 2, weighted by the GWAS regression weights estimated in Step 1, within the same target sample. The authors interpreted the linear regression model R2 obtained from Step 3 as a measure of successfully identified heritability. Regardless of the method applied to select SNPs in Step 2, the combination of Steps 1 and 3, as described, causes inflation of the estimated effect size. The extent of this inflation is demonstrated here, where random SNP selection and polygenic scoring from simulated random data recovered effect sizes similar to those reported in the original empirical papers.

Publisher

Cold Spring Harbor Laboratory

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3