An integrated deep learning framework for the interpretation of untargeted metabolomics data

Published:2023-06-27 Issue:4 Volume:24
ISSN:1467-5463
Container-title:Briefings in Bioinformatics
language:en

Author:

Tian Leqi¹², Yu Tianwei¹²³

Affiliation:

1. School of Data Science, The Chinese University of Hong Kong – Shenzhen, Guangdong, China

2. Shenzhen Research Institute of Big Data, Guangdong, China

3. Guangdong Provincial Key Laboratory of Big Data Computing, Guangdong, China

Abstract

Untargeted metabolomics is gaining widespread application. The key aspects of the data analysis include modeling the complex activities of the metabolic network, selecting metabolites associated with clinical outcomes and finding critical metabolic pathways to reveal biological mechanisms. One key roadblock in data analysis remains poorly addressed: the matching uncertainty between data features and known metabolites. Given the limitations of the experimental technology, the identities of data features cannot be read directly from the data. The predominant approach for mapping features to metabolites is to match the mass-to-charge ratio (m/z) of each data feature against theoretical values derived from known metabolites. The relationship between features and metabolites is not one-to-one, since some metabolites share a molecular composition and various adduct ions can be derived from the same metabolite. This matching uncertainty makes metabolite selection and functional analysis results unreliable. Here we introduce an integrated deep learning framework for metabolomics data that takes matching uncertainty into consideration. The model is built as a gradual sparsification neural network based on the known metabolic network and the annotation relationship between features and metabolites. This architecture characterizes metabolomics data and reflects the modular structure of biological systems. Three goals can be achieved simultaneously without requiring much complex inference or additional assumptions: (1) evaluating metabolite importance, (2) inferring feature-metabolite matching likelihood and (3) selecting disease sub-networks. When applied to a COVID metabolomics dataset and an aging mouse brain dataset, our method found metabolic sub-networks that were easily interpretable.
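
The matching uncertainty described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's annotation pipeline: the four-metabolite table, the three positive-mode adducts, the 10 ppm tolerance and the function name `candidate_matches` are all assumptions chosen for illustration. It shows only that one feature can match several metabolites (isomers) and that one metabolite can produce several features (adducts).

```python
# Illustrative sketch only: a toy metabolite table and adduct list,
# not the annotation procedure used in the paper.

# Monoisotopic neutral masses in Da. Isomers share a formula, hence a mass.
METABOLITES = {
    "glucose": 180.06339,     # C6H12O6
    "fructose": 180.06339,    # C6H12O6
    "leucine": 131.09463,     # C6H13NO2
    "isoleucine": 131.09463,  # C6H13NO2
}

# Mass shift (Da) added to the neutral mass for singly charged adducts.
ADDUCTS = {
    "[M+H]+": 1.007276,
    "[M+NH4]+": 18.033823,
    "[M+Na]+": 22.989218,
}


def candidate_matches(feature_mz, ppm_tol=10.0):
    """Return sorted (metabolite, adduct) pairs within ppm_tol of feature_mz."""
    hits = []
    for name, mass in METABOLITES.items():
        for adduct, shift in ADDUCTS.items():
            theoretical = mass + shift
            ppm_error = abs(feature_mz - theoretical) / theoretical * 1e6
            if ppm_error <= ppm_tol:
                hits.append((name, adduct))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical observed feature m/z values.
    for mz in (203.0526, 181.0707, 132.1019):
        print(mz, candidate_matches(mz))
```

In this toy setting, the features at m/z 203.0526 and 181.0707 are both compatible with glucose (as sodium and proton adducts), and each is equally compatible with fructose. The m/z value alone cannot resolve this ambiguity, which is why the framework infers a matching likelihood instead.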

Funder

National Key Research and Development Program of China

Guangdong Talent Program

Shenzhen Research Institute of Big Data

Publisher

Oxford University Press (OUP)

Subject

Molecular Biology, Information Systems

Link

https://academic.oup.com/bib/article-pdf/24/4/bbad244/50916636/bbad244.pdf
