Abstract
AbstractIn the 20th century many advances in biological knowledge and evidence-based medicine were supported by p-values and accompanying methods. In the beginning 21st century, ambitions towards precision medicine put a premium on detailed predictions for single individuals. The shift causes tension between traditional methods used to infer statistically significant group differences and burgeoning machine-learning tools suited to forecast an individual’s future. This comparison applies the linear model for identifying significant contributing variables and for finding the most predictive variable sets. In systematic data simulations and common medical datasets, we explored how statistical inference and pattern recognition can agree and diverge. Across analysis scenarios, even small predictive performances typically coincided with finding underlying significant statistical relationships. However, even statistically strong findings with very low p-values shed little light on their value for achieving accurate prediction in the same dataset. More complete understanding of different ways to define ‘important’ associations is a prerequisite for reproducible research findings that can serve to personalize clinical care.
Publisher
Cold Spring Harbor Laboratory
Cited by
10 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献