Outcome risk model development for heterogeneity of treatment effect analyses: a comparison of non-parametric machine learning methods and semi-parametric statistical methods

Author:

Xu Edward,Vanghelof Joseph,Wang Yiyang,Patel Anisha,Furst Jacob,Raicu Daniela Stan,Neumann Johannes Tobias,Wolfe Rory,Gao Caroline X.,McNeil John J.,Shah Raj C.,Tchoua Roselyne

Abstract

Abstract Background In randomized clinical trials, treatment effects may vary, and this possibility is referred to as heterogeneity of treatment effect (HTE). One way to quantify HTE is to partition participants into subgroups based on individual’s risk of experiencing an outcome, then measuring treatment effect by subgroup. Given the limited availability of externally validated outcome risk prediction models, internal models (created using the same dataset in which heterogeneity of treatment analyses also will be performed) are commonly developed for subgroup identification. We aim to compare different methods for generating internally developed outcome risk prediction models for subject partitioning in HTE analysis. Methods Three approaches were selected for generating subgroups for the 2,441 participants from the United States enrolled in the ASPirin in Reducing Events in the Elderly (ASPREE) randomized controlled trial. An extant proportional hazards-based outcomes predictive risk model developed on the overall ASPREE cohort of 19,114 participants was identified and was used to partition United States’ participants by risk of experiencing a composite outcome of death, dementia, or persistent physical disability. Next, two supervised non-parametric machine learning outcome classifiers, decision trees and random forests, were used to develop multivariable risk prediction models and partition participants into subgroups with varied risks of experiencing the composite outcome. Then, we assessed how the partitioning from the proportional hazard model compared to those generated by the machine learning models in an HTE analysis of the 5-year absolute risk reduction (ARR) and hazard ratio for aspirin vs. placebo in each subgroup. Cochran’s Q test was used to detect if ARR varied significantly by subgroup. Results The proportional hazard model was used to generate 5 subgroups using the quintiles of the estimated risk scores; the decision tree model was used to generate 6 subgroups (6 automatically determined tree leaves); and the random forest model was used to generate 5 subgroups using the quintiles of the prediction probability as risk scores. Using the semi-parametric proportional hazards model, the ARR at 5 years was 15.1% (95% CI 4.0–26.3%) for participants with the highest 20% of predicted risk. Using the random forest model, the ARR at 5 years was 13.7% (95% CI 3.1–24.4%) for participants with the highest 20% of predicted risk. The highest outcome risk group in the decision tree model also exhibited a risk reduction, but the confidence interval was wider (5-year ARR = 17.0%, 95% CI= -5.4–39.4%). Cochran’s Q test indicated ARR varied significantly only by subgroups created using the proportional hazards model. The hazard ratio for aspirin vs. placebo therapy did not significantly vary by subgroup in any of the models. The highest risk groups for the proportional hazards model and random forest model contained 230 participants each, while the highest risk group in the decision tree model contained 41 participants. Conclusions The choice of technique for internally developed models for outcome risk subgroups influences HTE analyses. The rationale for the use of a particular subgroup determination model in HTE analyses needs to be explicitly defined based on desired levels of explainability (with features importance), uncertainty of prediction, chances of overfitting, and assumptions regarding the underlying data structure. Replication of these analyses using data from other mid-size clinical trials may help to establish guidance for selecting an outcomes risk prediction modelling technique for HTE analyses.

Funder

National Institutes of Health

Publisher

Springer Science and Business Media LLC

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3