Improving prediction of linear regression models by integrating external information from heterogeneous populations: James–Stein estimators

Author:

Han Peisong (1), Li Haoyue (2), Park Sung Kyun (3), Mukherjee Bhramar (2), Taylor Jeremy M G (2)

Affiliation:

1. Biostatistics Innovation Group, Gilead Sciences, 333 Lakeside Drive, Foster City, CA 94404, United States

2. Department of Biostatistics, University of Michigan, 1415 Washington Heights, Ann Arbor, MI 48109, United States

3. Department of Epidemiology, University of Michigan, 1415 Washington Heights, Ann Arbor, MI 48109, United States

Abstract

We consider the setting where (1) an internal study builds a linear regression model for prediction based on individual-level data, (2) some external studies have fitted similar linear regression models that use only subsets of the covariates and provide coefficient estimates for the reduced models without individual-level data, and (3) there is heterogeneity across these study populations. The goal is to integrate the external model summary information into fitting the internal model to improve prediction accuracy. We adapt the James–Stein shrinkage method to propose estimators that are no worse and are oftentimes better in the prediction mean squared error after information integration, regardless of the degree of study population heterogeneity. We conduct comprehensive simulation studies to investigate the numerical performance of the proposed estimators. We also apply the method to enhance a prediction model for patella bone lead level in terms of blood lead level and other covariates by integrating summary information from published literature.
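To make the shrinkage idea concrete, the following is a minimal sketch of the generic textbook positive-part James–Stein rule, not the estimators proposed in the paper. It assumes an internal coefficient estimate with a known covariance matrix and some target vector toward which to shrink. How the paper builds an externally informed target from reduced-model coefficients is not described in this record, so `beta_target`, the function name, and all numbers below are hypothetical.

```python
import numpy as np


def james_stein_shrink(beta_internal, beta_target, cov_internal):
    """Positive-part James-Stein shrinkage of beta_internal toward beta_target.

    Generic textbook rule (an illustration only, not the paper's estimator):
        beta_js = target + max(0, 1 - (p - 2) / d) * (beta_internal - target)
    where d is the squared Mahalanobis distance between the two vectors
    under cov_internal. Requires p >= 3 coefficients.
    """
    beta_internal = np.asarray(beta_internal, dtype=float)
    beta_target = np.asarray(beta_target, dtype=float)
    cov_internal = np.asarray(cov_internal, dtype=float)

    p = beta_internal.size
    if p < 3:
        raise ValueError("James-Stein shrinkage needs at least 3 coefficients")

    diff = beta_internal - beta_target
    # Squared Mahalanobis distance: diff' * inv(cov) * diff
    dist = float(diff @ np.linalg.solve(cov_internal, diff))
    if dist == 0.0:
        # The two estimates coincide, so there is nothing to shrink.
        return beta_target.copy(), 0.0

    # Weight on the internal estimate, truncated at zero (positive part).
    weight = max(0.0, 1.0 - (p - 2) / dist)
    return beta_target + weight * diff, weight


if __name__ == "__main__":
    # Hypothetical numbers, chosen only to exercise the function.
    internal = [1.0, 2.0, 3.0, 4.0]
    target = [0.0, 0.0, 0.0, 0.0]
    shrunk, w = james_stein_shrink(internal, target, np.eye(4))
    print("weight on internal estimate:", w)
    print("shrunken coefficients:", shrunk)
```

The weight adapts to how far apart the two estimates are relative to the internal estimate's uncertainty: when they are close, the result moves almost entirely to the target, and when they are far apart (as under strong population heterogeneity), the result stays near the internal estimate.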

Funder

National Institutes of Health

Publisher

Oxford University Press (OUP)

