Abstract
Genome-scale metabolic models are powerful tools for understanding cellular physiology. Flux balance analysis (FBA), in particular, is a popular optimization approach for predicting metabolic phenotypes under genetic and environmental perturbations. In model microbes such as Escherichia coli, FBA has been successful at predicting essential genes, i.e. those genes that impair survival when deleted. A central assumption in this approach, however, is that both wild type and deletion strains optimize the same fitness objective. While the optimality assumption may hold for the wild type metabolic network, deletion strains are not subject to the same evolutionary pressures, and knock-out mutants may steer their metabolism to meet other objectives for survival. Here, we present FlowGAT, a hybrid FBA-machine learning strategy for predicting essentiality directly from wild type metabolic phenotypes. The approach is based on a graph-structured representation of metabolic fluxes predicted by FBA, where nodes correspond to enzymatic reactions and edges quantify the propagation of metabolite mass flow between a reaction and its neighbours. We integrate this information into a graph neural network that can be trained on knock-out fitness assay data. Comparisons across different model architectures reveal that FlowGAT predictions for E. coli are close to those of FBA for several growth conditions. This suggests that gene essentiality can be accurately predicted by exploiting the network structure of metabolism, without additional assumptions beyond optimality of the wild type. Our approach demonstrates the benefits of combining the mechanistic insights afforded by genome-scale models with the ability of deep learning models to extract patterns from complex data.
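To make the graph representation concrete, the sketch below builds a small reaction-to-reaction graph from a steady-state flux vector. It is an illustration only, not the FlowGAT implementation: the five-reaction network, the flux values, and the function name `mass_flow_graph` are all hypothetical, and the edge-weight rule (mass of each metabolite produced by one reaction, split among its consumers in proportion to their consumption rates) is one plausible reading of "propagation of metabolite mass flow" that the abstract itself does not spell out. In practice the flux vector would come from an FBA solution rather than being written by hand.

```python
# Hypothetical toy network: reaction -> {metabolite: stoichiometric coefficient}.
# Negative coefficients are consumed, positive coefficients are produced.
stoich = {
    "R1": {"A": 1},            # uptake:  -> A
    "R2": {"A": -1, "B": 1},   # A -> B
    "R3": {"A": -1, "C": 1},   # A -> C
    "R4": {"B": -1},           # export: B ->
    "R5": {"C": -1},           # export: C ->
}

# Assumed steady-state fluxes (stand-in for an FBA solution).
flux = {"R1": 10.0, "R2": 6.0, "R3": 4.0, "R4": 6.0, "R5": 4.0}


def mass_flow_graph(stoich, flux):
    """Return {(producer, consumer): mass flow} for a given flux vector."""
    produced = {}  # metabolite -> {reaction: production rate}
    consumed = {}  # metabolite -> {reaction: consumption rate}
    for rxn, coeffs in stoich.items():
        for met, coeff in coeffs.items():
            rate = coeff * flux[rxn]  # sign also handles reversed fluxes
            if rate > 0:
                produced.setdefault(met, {})[rxn] = rate
            elif rate < 0:
                consumed.setdefault(met, {})[rxn] = -rate

    edges = {}
    for met, producers in produced.items():
        consumers = consumed.get(met, {})
        total = sum(consumers.values())
        if total == 0:
            continue  # metabolite is not consumed by any active reaction
        for src, made in producers.items():
            for dst, used in consumers.items():
                # Share of src's output of `met` that flows on to dst.
                weight = made * used / total
                edges[(src, dst)] = edges.get((src, dst), 0.0) + weight
    return edges


edges = mass_flow_graph(stoich, flux)
for (src, dst), weight in sorted(edges.items()):
    print(f"{src} -> {dst}: {weight:.1f}")
```

In a graph neural network setting, the resulting weighted, directed edges would define the neighbourhood over which each reaction node aggregates information; node features and essentiality labels are outside the scope of this sketch.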
Publisher
Cold Spring Harbor Laboratory