Processing-bias correction with DEBIAS-M improves cross-study generalization of microbiome-based prediction models

Published: 2024-02-12

Author:

Austin George I., Kav Aya Brown, Park Heekuk, Biermann Jana, Uhlemann Anne-Catrin, Korem Tal

Abstract

Every step in common microbiome profiling protocols has variable efficiency for each microbe. For example, different DNA extraction kits may have different efficiency for Gram-positive and -negative bacteria. These variable efficiencies, combined with technical variation, create strong processing biases, which impede the identification of signals that are reproducible across studies and the development of generalizable and biologically interpretable prediction models. “Batch-correction” methods have been used to alleviate these issues computationally with some success. However, many make strong parametric assumptions which do not necessarily apply to microbiome data or processing biases, or require the use of an outcome variable, which risks overfitting. Lastly and importantly, existing transformations used to correct microbiome data are largely non-interpretable, and could, for example, introduce values to features that were initially mostly zeros. Altogether, processing bias currently compromises our ability to glean robust and generalizable biological insights from microbiome data. Here, we present DEBIAS-M (Domain adaptation with phenotype Estimation and Batch Integration Across Studies of the Microbiome), an interpretable framework for inference and correction of processing bias, which facilitates domain adaptation in microbiome studies. DEBIAS-M learns bias-correction factors for each microbe in each batch that simultaneously minimize batch effects and maximize cross-study associations with phenotypes. Using benchmarks of HIV and colorectal cancer classification from gut microbiome data, and cervical neoplasia prediction from cervical microbiome data, we demonstrate that DEBIAS-M outperforms batch-correction methods commonly used in the field. Notably, we show that the inferred bias-correction factors are stable, interpretable, and strongly associated with specific experimental protocols. Overall, we show that DEBIAS-M allows for better modeling of microbiome data and identification of interpretable signals that are reproducible across studies.
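
The idea of one correction factor per microbe per batch can be illustrated with a small sketch. It is not the authors' implementation: the multiplicative form (each count scaled by 2 raised to a per-batch, per-taxon weight, then renormalized to relative abundances), the function names `apply_bias_correction` and `batch_discrepancy`, and the toy counts are all assumptions made for illustration. The sketch covers only the batch-similarity side of the objective; the phenotype-association term and the optimization that would learn the factors are omitted.

```python
from collections import defaultdict


def apply_bias_correction(counts, batches, log2_factors):
    """Scale each taxon by a per-batch factor, then renormalize each sample.

    Assumed (hypothetical) form: corrected = count * 2**w[batch][taxon].
    """
    corrected = []
    for row, batch in zip(counts, batches):
        scaled = [c * 2.0 ** w for c, w in zip(row, log2_factors[batch])]
        total = sum(scaled)
        corrected.append([s / total if total > 0 else 0.0 for s in scaled])
    return corrected


def batch_discrepancy(profiles, batches):
    """Sum of squared distances between each batch mean and their centre."""
    grouped = defaultdict(list)
    for row, batch in zip(profiles, batches):
        grouped[batch].append(row)
    means = [[sum(col) / len(rows) for col in zip(*rows)]
             for rows in grouped.values()]
    centre = [sum(col) / len(means) for col in zip(*means)]
    return sum((m - c) ** 2 for mean in means for m, c in zip(mean, centre))


# Toy data (made up): batch B recovers taxon 0 at half the efficiency of A.
counts = [[50, 50, 0], [40, 60, 0], [25, 50, 0], [20, 60, 0]]
batches = ["A", "A", "B", "B"]

identity = {"A": [0.0, 0.0, 0.0], "B": [0.0, 0.0, 0.0]}   # no correction
candidate = {"A": [0.0, 0.0, 0.0], "B": [1.0, 0.0, 0.0]}  # double taxon 0 in B

uncorrected = apply_bias_correction(counts, batches, identity)
corrected = apply_bias_correction(counts, batches, candidate)

before = batch_discrepancy(uncorrected, batches)
after = batch_discrepancy(corrected, batches)
print(before, after)
```

Under this assumed multiplicative form, a taxon that was zero in a sample stays zero after correction, which is one way a correction can avoid introducing values into features that were initially mostly zeros.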

Publisher

Cold Spring Harbor Laboratory

Cited by 1 article.

1. Compositional transformations can reasonably introduce phenotype-associated values into sparse features; 2024-02-21