Abstract
How to design experiments that accelerate knowledge discovery on complex biological landscapes remains a tantalizing question. We present an optimal experimental design method (coined OPEX) to identify informative omics experiments using machine learning models for both experimental space exploration and model training. OPEX-guided exploration of Escherichia coli populations exposed to biocide and antibiotic combinations leads to more accurate predictive models of gene expression with 44% less data. Analysis of the proposed experiments shows that broad exploration of the experimental space followed by fine-tuning emerges as the optimal strategy. Additionally, analysis of the experimental data reveals 29 cases of cross-stress protection and 4 cases of cross-stress vulnerability. Further validation reveals the central role of chaperones, stress response proteins and transport pumps in cross-stress exposure. This work demonstrates how active learning can be used to guide omics data collection for training predictive models, making evidence-driven decisions and accelerating knowledge discovery in life sciences.
Funder
NSF | BIO | Division of Biological Infrastructure
NSF | Directorate for Computer & Information Science & Engineering | Division of Computing and Communication Foundations
Publisher
Springer Science and Business Media LLC
Subject
General Physics and Astronomy; General Biochemistry, Genetics and Molecular Biology; General Chemistry
Cited by
13 articles.