Extreme-value sampling design is cost-beneficial only with a valid statistical approach for exposure–secondary outcome association analyses

Author:

Zhang Hang (1, 2), Bi Wenjian (3), Cui Yuehua (4), Chen Honglei (5), Chen Jinbo (6), Zhao Yanlong (1, 2), Kang Guolian (3)

Affiliation:

1. Key Laboratory of Systems and Control, Academy of Mathematics and Systems Science, Chinese Academy of Sciences, Beijing, PR China

2. School of Mathematical Sciences, University of Chinese Academy of Sciences, Beijing, PR China

3. Department of Biostatistics, St. Jude Children's Research Hospital, Memphis, TN, USA

4. Department of Statistics and Probability, Michigan State University, East Lansing, MI, USA

5. Department of Epidemiology and Biostatistics, Michigan State University, East Lansing, MI, USA

6. Department of Biostatistics, Epidemiology and Informatics, University of Pennsylvania, Philadelphia, PA, USA

Abstract

In epidemiological cohort studies, exposure data are collected in sub-studies whose participants are selected on a primary outcome (PO) of interest, as in the extreme-value sampling design (EVSD), to investigate the correlation between the exposure and the PO. Secondary outcome (SO) data are also readily available, enabling researchers to assess the correlations between the exposure and the SOs. However, when the EVSD is used, the SO data are not a representative sample of the general population; thus, many commonly used statistical methods, such as the generalized linear model (GLM), are not valid. A prospective likelihood method has been developed to associate SOs with single-nucleotide polymorphisms under an extreme phenotype sequencing design. In this paper, we describe the application of the prospective likelihood method (STEVSD) to exposure–SO association analysis under an EVSD. We undertook extensive simulations to assess the performance of the STEVSD method in associating binary and continuous exposures with SOs, comparing it to the simple GLM method that ignores the EVSD. To demonstrate the cost-benefit of the STEVSD method, we also mimicked the design of two new retrospective studies, as would be done in actual practice, in which sampling was based on a PO of interest that was the same as the SO in the EVSD study. We then analyzed these data by using the GLM method and compared its power to that of the STEVSD method. We demonstrated the usefulness of the STEVSD method by applying it to a benign ethnic neutropenia dataset. Our results indicate that the STEVSD method controls type I error well, whereas the GLM method does not because it ignores the EVSD, and that the STEVSD method is cost-effective because its statistical power is similar to that of two new retrospective studies that require collecting new exposure data for selected individuals.
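
The claim that an analysis ignoring the EVSD fails to control type I error can be illustrated with a small Monte Carlo sketch. This is not the STEVSD method and does not reproduce the authors' simulation settings. It assumes a continuous exposure, a continuous PO and SO with correlated normal errors, and no true exposure effect on the SO. The parameter names and values (`beta_po`, `rho`, `tail_frac`, cohort size) are hypothetical, and ordinary least squares stands in for a GLM with an identity link.

```python
import math
import numpy as np


def slope_p_value(x, y):
    """Two-sided p-value for the slope of y on x (simple linear regression,
    normal approximation to the test statistic)."""
    n = len(x)
    xc = x - x.mean()
    yc = y - y.mean()
    sxx = float(np.sum(xc ** 2))
    beta = float(np.sum(xc * yc)) / sxx
    resid = yc - beta * xc
    se = math.sqrt(float(np.sum(resid ** 2)) / (n - 2) / sxx)
    return math.erfc(abs(beta / se) / math.sqrt(2.0))


def simulate_type1(n_cohort=5000, tail_frac=0.10, beta_po=0.5, rho=0.5,
                   n_rep=200, alpha=0.05, seed=0):
    """Empirical type I error of the naive regression of SO on exposure,
    (a) in an extreme-value sample selected on the PO and
    (b) in a simple random sample of the same size.
    All parameter values are illustrative assumptions."""
    rng = np.random.default_rng(seed)
    n_tail = int(n_cohort * tail_frac)
    reject_evsd = 0
    reject_random = 0
    for _ in range(n_rep):
        x = rng.standard_normal(n_cohort)            # exposure
        e1 = rng.standard_normal(n_cohort)           # PO error
        e2 = rho * e1 + math.sqrt(1.0 - rho ** 2) * rng.standard_normal(n_cohort)
        po = beta_po * x + e1                        # exposure affects the PO
        so = e2                                      # null: no exposure effect on the SO
        order = np.argsort(po)
        # EVSD: keep only the lower and upper tails of the PO distribution.
        evsd_idx = np.concatenate([order[:n_tail], order[-n_tail:]])
        rand_idx = rng.choice(n_cohort, size=2 * n_tail, replace=False)
        reject_evsd += slope_p_value(x[evsd_idx], so[evsd_idx]) < alpha
        reject_random += slope_p_value(x[rand_idx], so[rand_idx]) < alpha
    return reject_evsd / n_rep, reject_random / n_rep


if __name__ == "__main__":
    evsd_rate, random_rate = simulate_type1()
    print(f"naive type I error under EVSD:          {evsd_rate:.3f}")
    print(f"naive type I error under random sample: {random_rate:.3f}")
```

Under these assumptions the rejection rate in the random sample should stay near the nominal level, while the rate in the extreme-value sample should be far above it. Selecting on the PO induces an exposure–SO association whenever the exposure affects the PO and the two outcomes are correlated.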

Publisher

SAGE Publications

Subject

Health Information Management, Statistics and Probability, Epidemiology

Cited by 2 articles.
