Improve the model of disease subtype heterogeneity by leveraging external summary data
-
Published:2023-07-12
Issue:7
Volume:19
Page:e1011236
-
ISSN:1553-7358
-
Container-title:PLOS Computational Biology
-
language:en
-
Short-container-title:PLoS Comput Biol
Author:
Fu Sheng,
Purdue Mark P.,
Zhang Han,
Qin Jing,
Song Lei,
Berndt Sonja I.,
Yu KaiORCID
Abstract
Researchers are often interested in understanding the disease subtype heterogeneity by testing whether a risk exposure has the same level of effect on different disease subtypes. The polytomous logistic regression (PLR) model provides a flexible tool for such an evaluation. Disease subtype heterogeneity can also be investigated with a case-only study that uses a case-case comparison procedure to directly assess the difference between risk effects on two disease subtypes. Motivated by a large consortium project on the genetic basis of non-Hodgkin lymphoma (NHL) subtypes, we develop PolyGIM, a procedure to fit the PLR model by integrating individual-level data with summary data extracted from multiple studies under different designs. The summary data consist of coefficient estimates from working logistic regression models established by external studies. Examples of the working model include the case-case comparison model and the case-control comparison model, which compares the control group with a subtype group or a broad disease group formed by merging several subtypes. PolyGIM efficiently evaluates risk effects and provides a powerful test for disease subtype heterogeneity in situations when only summary data, instead of individual-level data, is available from external studies due to various informatics and privacy constraints. We investigate the theoretic properties of PolyGIM and use simulation studies to demonstrate its advantages. Using data from eight genome-wide association studies within the NHL consortium, we apply it to study the effect of the polygenic risk score defined by a lymphoid malignancy on the risks of four NHL subtypes. These results show that PolyGIM can be a valuable tool for pooling data from multiple sources for a more coherent evaluation of disease subtype heterogeneity.
Funder
National Institutes of Health
Publisher
Public Library of Science (PLoS)
Subject
Computational Theory and Mathematics,Cellular and Molecular Neuroscience,Genetics,Molecular Biology,Ecology,Modeling and Simulation,Ecology, Evolution, Behavior and Systematics
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献