Abstract
Given a large clinical database of longitudinal patient information including many covariates, it is computationally prohibitive to consider all types of interdependence between patient variables of interest. This challenge motivates the use of mutual information (MI), a statistical summary of data interdependence with appealing properties that make it a suitable alternative or addition to correlation for identifying relationships in data. MI: (i) captures all types of dependence, both linear and nonlinear, (ii) is zero only when random variables are independent, (iii) serves as a measure of relationship strength (similar to but more general than R²), and (iv) is interpreted the same way for numerical and categorical data. Unfortunately, MI typically receives little to no attention in introductory statistics courses and is more difficult than correlation to estimate from data. In this article, we motivate the use of MI in the analyses of epidemiologic data, while providing a general introduction to estimation and interpretation. We illustrate its utility through a retrospective study relating intraoperative heart rate (HR) and mean arterial pressure (MAP). We: (i) show postoperative mortality is associated with decreased MI between HR and MAP and (ii) improve existing postoperative mortality risk assessment by including MI and additional hemodynamic statistics.
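The claim that MI captures nonlinear dependence that correlation can miss can be illustrated with a small simulation. The sketch below is not the estimator used in the article. It assumes a simple plug-in estimate from a 2-D histogram, and the function name `mutual_information_hist`, the bin count of 16, and the synthetic data are all hypothetical choices made for illustration.

```python
import numpy as np


def mutual_information_hist(x, y, bins=16):
    """Plug-in MI estimate (in nats) from a 2-D histogram of x and y.

    Assumption: equal-width bins. The estimate is biased upward for
    small samples and depends on the chosen number of bins.
    """
    joint, _, _ = np.histogram2d(x, y, bins=bins)
    pxy = joint / joint.sum()               # joint cell probabilities
    px = pxy.sum(axis=1, keepdims=True)     # marginal of x, shape (bins, 1)
    py = pxy.sum(axis=0, keepdims=True)     # marginal of y, shape (1, bins)
    nz = pxy > 0                            # skip empty cells (0 * log 0 = 0)
    return float(np.sum(pxy[nz] * np.log(pxy[nz] / (px @ py)[nz])))


rng = np.random.default_rng(0)
x = rng.uniform(-1.0, 1.0, size=20_000)

# Strong but purely nonlinear dependence: y is a noisy function of x**2.
y_dep = x**2 + 0.05 * rng.normal(size=x.size)
# Independent of x by construction.
y_ind = rng.uniform(-1.0, 1.0, size=x.size)

r_dep = np.corrcoef(x, y_dep)[0, 1]
mi_dep = mutual_information_hist(x, y_dep)
mi_ind = mutual_information_hist(x, y_ind)

print(f"Pearson r (x, x^2 + noise): {r_dep:.3f}")
print(f"MI estimate, dependent pair: {mi_dep:.3f} nats")
print(f"MI estimate, independent pair: {mi_ind:.3f} nats")
```

In this setup the Pearson correlation is close to zero even though y is almost a function of x, while the MI estimate is clearly positive for the dependent pair and close to zero for the independent pair. The small positive value for the independent pair reflects the finite-sample bias of the histogram estimator, which is one reason MI is harder to estimate than correlation.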
Funder
National Science Foundation
National Institute of Environmental Health Sciences
Publisher
Public Library of Science (PLoS)