Author:
Loftus Tyler J., Shickel Benjamin, Balch Jeremy A., Tighe Patrick J., Abbott Kenneth L., Fazzone Brian, Anderson Erik M., Rozowsky Jared, Ozrazgat-Baslanti Tezcan, Ren Yuanfang, Berceli Scott A., Hogan William R., Efron Philip A., Moorman J. Randall, Rashidi Parisa, Upchurch Gilbert R., Bihorac Azra
Abstract
Human pathophysiology is occasionally too complex for unaided hypothetico-deductive reasoning and the isolated application of additive or linear statistical methods. Clustering algorithms use input data patterns and distributions to form groups of similar patients or diseases that share distinct properties. Although clinicians frequently perform tasks that may be enhanced by clustering, few receive formal training, and clinician-centered literature on clustering is sparse. To add value to clinical care and research, optimal clustering practices require a thorough understanding of how to process and optimize data, select features, weigh strengths and weaknesses of different clustering methods, select the optimal clustering method, and apply clustering methods to solve problems. These concepts and our suggestions for implementing them are described in this narrative review of published literature. All clustering methods share the weakness of finding potential clusters even when natural clusters do not exist, underscoring the importance of applying data-driven techniques as well as clinical and statistical expertise to clustering analyses. When applied properly, patient and disease phenotype clustering can reveal obscured associations that can help clinicians understand disease pathophysiology, predict treatment response, and identify patients for clinical trial enrollment.
Funder
National Institute of General Medical Sciences
National Institute on Aging
National Science Foundation
National Institute of Biomedical Imaging and Bioengineering
Cited by
31 articles.