NPOmix: A machine learning classifier to connect mass spectrometry fragmentation data to biosynthetic gene clusters

Author:

Leão Tiago F (1, 2), Wang Mingxun (1, 3), da Silva Ricardo (4), Gurevich Alexey (5), Bauermeister Anelize (1), Gomes Paulo Wender P (1), Brejnrod Asker (1), Glukhov Evgenia (6), Aron Allegra T (1, 7), Louwen Joris J R (8), Kim Hyun Woo (9), Reher Raphael (10), Fiore Marli F (2), van der Hooft Justin J J (8, 11), Gerwick Lena (6), Gerwick William H (1, 6), Bandeira Nuno (1, 3), Dorrestein Pieter C (1, 12, 13)

Affiliation:

1. Collaborative Mass Spectrometry Innovation Center, Skaggs School of Pharmacy and Pharmaceutical Sciences, University of California San Diego, La Jolla, CA 92093, USA

2. Center for Nuclear Energy in Agriculture, University of São Paulo, Piracicaba 13400-970, SP, Brazil

3. Center for Computational Mass Spectrometry, University of California San Diego, La Jolla, CA 92093, USA

4. NPPNS, Physics and Chemistry Department, School of Pharmaceutical Sciences of Ribeirão Preto, University of São Paulo, Ribeirão Preto 14040-900, Brazil

5. Center for Algorithmic Biotechnology, St. Petersburg State University, St Petersburg 199004, Russia

6. Center for Marine Biotechnology and Biomedicine, Scripps Institution of Oceanography, University of California San Diego, La Jolla, CA 92093, USA

7. Department of Chemistry and Biochemistry, University of Denver, Denver, CO 80210, USA

8. Bioinformatics Group, Wageningen University, 6708 PB Wageningen, The Netherlands

9. College of Pharmacy and Integrated Research Institute for Drug Development, Dongguk University, Gyeonggi-do 10326, Korea

10. Institute of Pharmaceutical Biology and Biotechnology, University of Marburg, 35043 Marburg, Germany

11. Department of Biochemistry, University of Johannesburg, Auckland Park, Johannesburg 2006, South Africa

12. Center for Microbiome Innovation, University of California San Diego, La Jolla, CA 92093, USA

13. Departments of Pharmacology and Pediatrics, University of California San Diego, La Jolla, CA 92093, USA

Abstract

Microbial specialized metabolites are an important source of, and inspiration for, many pharmaceuticals and biotechnological products, and they play key roles in ecological processes. Untargeted metabolomics using liquid chromatography coupled with tandem mass spectrometry is an efficient technique to access metabolites from fractions and even from crude environmental extracts. Nevertheless, metabolomics is limited in predicting structures or bioactivities for cryptic metabolites. Efficiently linking the biosynthetic potential inferred from (meta)genomics to the specialized metabolome would accelerate drug discovery programs by allowing metabolomics to make use of genetic predictions. Here, we present a k-nearest neighbor classifier that systematically connects mass spectrometry fragmentation spectra to their corresponding biosynthetic gene clusters, independent of chemical class. Our new pattern-based genome mining pipeline links biosynthetic genes to the metabolites they encode, as detected via mass spectrometry from bacterial cultures or environmental microbiomes. Using paired datasets that include validated gene-mass spectral links from the Paired Omics Data Platform, we demonstrate this approach by automatically linking 18 previously known mass spectra (17 for which the biosynthetic gene clusters can be found in the MIBiG database, plus palmyramide A) to their corresponding biosynthetic genes, which had previously been validated experimentally (e.g., via nuclear magnetic resonance or genetic engineering). We also provide a computational example, reproducible by users, of how to apply our Natural Products Mixed Omics (NPOmix) tool to siderophore mining. We conclude that NPOmix minimizes the need for culturing (it worked well on microbiomes) and facilitates specialized metabolite prioritization based on integrative omics mining.
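The abstract names a k-nearest neighbor classifier but does not specify the features or the distance measure. As a rough illustration of the general idea only, the sketch below assumes that each biosynthetic gene cluster and each spectrum is summarized as a presence/absence fingerprint across the same set of samples, and that neighbors are ranked by Jaccard similarity. The function names, the family labels (`GCF_A`, `GCF_B`) and the toy fingerprints are all hypothetical and are not taken from NPOmix.

```python
from collections import Counter


def jaccard_similarity(a, b):
    """Jaccard similarity between two equal-length 0/1 fingerprints."""
    both = sum(1 for x, y in zip(a, b) if x and y)
    either = sum(1 for x, y in zip(a, b) if x or y)
    return both / either if either else 0.0


def knn_predict(train, query, k=3):
    """Return the majority label among the k training fingerprints
    most similar to the query fingerprint."""
    ranked = sorted(
        train,
        key=lambda item: jaccard_similarity(item[0], query),
        reverse=True,
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy data (an assumption, not from the paper): each position is
# one sample; 1 means the gene cluster was detected in that sample's genome.
training = [
    ([1, 1, 0, 0, 1, 0], "GCF_A"),
    ([1, 1, 0, 0, 0, 0], "GCF_A"),
    ([1, 0, 0, 0, 1, 0], "GCF_A"),
    ([0, 0, 1, 1, 0, 1], "GCF_B"),
    ([0, 0, 1, 1, 0, 0], "GCF_B"),
    ([0, 1, 1, 0, 0, 1], "GCF_B"),
]

# Hypothetical query: 1 means the spectrum was observed in that sample.
query_spectrum = [1, 1, 0, 0, 1, 1]

predicted = knn_predict(training, query_spectrum, k=3)
print(predicted)
```

In this toy setting the query spectrum co-occurs mostly with the samples that carry the `GCF_A` clusters, so those fingerprints dominate the vote. A real pipeline would need many more samples and a principled choice of `k` and of the similarity measure.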

Funder

National Institutes of Health

University of California

Fundação de Amparo à Pesquisa do Estado de São Paulo

Publisher

Oxford University Press (OUP)

Cited by 9 articles.