Abstract
Multimodal diseases are those in which affected individuals can be divided into subtypes (or ‘data modes’); for instance, ‘mild’ vs. ‘severe’, based on (unknown) modifiers of disease severity. Studies have shown that despite the inclusion of a large number of subjects, the causal role of the microbiome in human diseases remains uncertain. The role of the microbiome in multimodal diseases has been studied in animals; however, findings are often deemed irreproducible, or unreasonably biased, with pathogenic roles in 95% of reports. To improve repeatability, investigators have been told to seek funds to increase the number of human-microbiome donors (N) and thereby increase the reproducibility of animal studies (doi:10.1016/j.cell.2019.12.025). Herein, through simulations, we illustrate that increasing N will not uniformly/universally enable the identification of consistent statistical differences (patterns of analytical irreproducibility), due to random sampling from a population with ample variability in disease and the presence of ‘disease data subtypes’ (or modes). We also found that studies do not use cluster statistics when needed (97.4%, 37/38, 95%CI=86.5,99.5), and that scientists who increased N concurrently reduced the number of mice/donor (y=-0.21x, R²=0.24; and vice versa), indicating that, statistically, scientists replace the disease variance in mice with the variance of human disease. Instead of assuming that increasing N will solve reproducibility and identify clinically-predictive findings on causality, we propose the visualization of data distribution using kernel-density-violin plots (rarely used in rodent studies; 0%, 0/38, 95%CI=6.9e-18,9.1) to identify ‘disease data subtypes’ to self-correct, guide and promote the personalized investigation of disease subtype mechanisms.
Highlights
Multimodal diseases are those in which affected individuals can be divided into subtypes (or ‘data modes’); for instance, ‘mild’ vs. ‘severe’, based on (unknown) modifiers of disease severity.
The role of the microbiome in multimodal diseases has been studied in animals; however, findings are often deemed irreproducible, or unreasonably biased, with pathogenic roles in 95% of reports.
To improve repeatability, investigators have been told to seek funds to increase the number of human-microbiome donors (N) and thereby increase the reproducibility of animal studies.
Herein, we illustrate that although increasing N could help identify statistical effects (patterns of analytical irreproducibility), clinically-relevant information will not always be identified.
Depending on which diseases need to be compared, ‘random sampling’ alone leads to reproducible ‘patterns of analytical irreproducibility’ in multimodal disease simulations.
Instead of solely increasing N, we illustrate how disease multimodality could be understood, visualized and used to guide the study of diseases by selecting and focusing on ‘disease modes’.
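The random-sampling argument can be illustrated with a minimal sketch. It is not the authors' simulation, whose design is not described in this section. Everything below is an assumption chosen for illustration: the two-mode ('mild'/'severe') Gaussian mixture and its parameters, the use of a Welch t-statistic, the large-sample cutoff of 1.96, and the helper names `sample_bimodal`, `welch_t` and `fraction_significant`. The sketch repeatedly draws N donors per disease group from bimodal populations and reports how often a simulated study reaches 'significance'.

```python
import random
import statistics


def sample_bimodal(n, rng, p_severe, mild=(2.0, 1.0), severe=(8.0, 1.0)):
    """Draw n severity scores from a hypothetical two-mode mixture.

    Each subject falls in the 'severe' mode with probability p_severe,
    otherwise in the 'mild' mode; each mode is (mean, sd) of a Gaussian.
    """
    scores = []
    for _ in range(n):
        mu, sd = severe if rng.random() < p_severe else mild
        scores.append(rng.gauss(mu, sd))
    return scores


def welch_t(a, b):
    """Welch t-statistic for two independent samples (unequal variances)."""
    se = (statistics.variance(a) / len(a) + statistics.variance(b) / len(b)) ** 0.5
    return (statistics.mean(a) - statistics.mean(b)) / se


def fraction_significant(n, p_a, p_b, repeats=1000, seed=0, crit=1.96):
    """Fraction of simulated studies in which |t| exceeds crit.

    crit=1.96 is a large-sample approximation (an assumption); for small n
    the exact t critical value is larger, so this is slightly liberal.
    """
    rng = random.Random(seed)
    hits = 0
    for _ in range(repeats):
        a = sample_bimodal(n, rng, p_a)
        b = sample_bimodal(n, rng, p_b)
        if abs(welch_t(a, b)) > crit:
            hits += 1
    return hits / repeats


if __name__ == "__main__":
    # Compare two hypothetical diseases that differ only in how often the
    # 'severe' mode occurs, at several donor counts N per group.
    for n in (5, 10, 50):
        frac = fraction_significant(n, p_a=0.4, p_b=0.6)
        print(f"N={n:>2}: {frac:.2f} of simulated studies significant")
```

Under these assumed parameters, repeated studies with the same N disagree with one another purely because each draw captures a different mix of 'mild' and 'severe' subjects. Plotting the sampled scores as a kernel-density violin plot would expose the two modes that a comparison of group means hides.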
Publisher
Cold Spring Harbor Laboratory