Algorithmic Learning for Auto-deconvolution of GC-MS Data to Enable Molecular Networking within GNPS
Author:
Aksenov Alexander A., Laponogov Ivan, Zhang Zheng, Doran Sophie LF, Belluomo Ilaria, Veselkov Dennis, Bittremieux Wout, Nothias Louis Felix, Nothias-Esposito Mélissa, Maloney Katherine N., Misra Biswapriya B., Melnik Alexey V., Jones Kenneth L., Dorrestein Kathleen, Panitchpakdi Morgan, Ernst Madeleine, van der Hooft Justin J.J., Gonzalez Mabel, Carazzone Chiara, Amézquita Adolfo, Callewaert Chris, Morton James, Quinn Robert, Bouslimani Amina, Albarracín Orio Andrea, Petras Daniel, Smania Andrea M., Couvillion Sneha P., Burnet Meagan C., Nicora Carrie D., Zink Erika, Metz Thomas O., Artaev Viatcheslav, Humston-Fulmer Elizabeth, Gregor Rachel, Meijler Michael M., Mizrahi Itzhak, Eyal Stav, Anderson Brooke, Dutton Rachel, Lugan Raphaël, Boulch Pauline Le, Guitton Yann, Prevost Stephanie, Poirier Audrey, Dervilly Gaud, Bizec Bruno Le, Fait Aaron, Persi Noga Sikron, Song Chao, Gashu Kelem, Coras Roxana, Guma Monica, Manasson Julia, Scher Jose U., Barupal Dinesh, Alseekh Saleh, Fernie Alisdair, Mirnezami Reza, Vasiliou Vasilis, Schmid Robin, Borisov Roman S., Kulikova Larisa N., Knight Rob, Wang Mingxun, Hanna George B, Dorrestein Pieter C., Veselkov Kirill
Abstract
Gas chromatography-mass spectrometry (GC-MS) represents an analytical technique with significant practical societal impact. Spectral deconvolution is an essential step for interpreting GC-MS data. No public GC-MS repositories that also enable repository-scale analysis exist, in part because deconvolution requires significant user input. We therefore engineered a scalable machine learning workflow for the Global Natural Product Social Molecular Networking (GNPS) analysis platform to enable the mass spectrometry community to store, process, share, annotate, compare, and perform molecular networking of GC-MS data. The workflow performs auto-deconvolution of compound fragmentation patterns via unsupervised non-negative matrix factorization, using a Fast Fourier Transform-based strategy to overcome scalability limitations. We introduce a “balance score” that quantifies the reproducibility of fragmentation patterns across all samples. We demonstrate the utility of the platform with breathomics analysis applied to the early detection of oesophago-gastric cancer, and by creating the first molecular spatial map of the human volatilome.
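As a rough illustration of the deconvolution idea named in the abstract, the sketch below applies generic non-negative matrix factorization (standard multiplicative updates) to a small synthetic intensity matrix in which two compounds co-elute. It is not the authors' workflow: the Fast Fourier Transform-based scaling strategy and the balance score are not reproduced, and the function `nmf`, the synthetic elution profiles, and the toy spectra are all assumptions made for this example.

```python
import numpy as np


def nmf(X, rank, n_iter=2000, seed=0, eps=1e-9):
    """Factor a non-negative matrix X (scans x m/z bins) as W @ H.

    W (scans x rank) holds elution profiles and H (rank x m/z bins)
    holds fragmentation patterns. Hypothetical helper for illustration,
    using plain multiplicative updates for the Frobenius-norm objective.
    """
    rng = np.random.default_rng(seed)
    n_scans, n_mz = X.shape
    W = rng.random((n_scans, rank)) + eps
    H = rng.random((rank, n_mz)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep both factors non-negative.
        H *= (W.T @ X) / (W.T @ W @ H + eps)
        W *= (X @ H.T) / (W @ H @ H.T + eps)
    return W, H


# Synthetic data (assumed): two overlapping Gaussian elution peaks.
t = np.arange(60)
profiles = np.stack(
    [np.exp(-0.5 * ((t - 25) / 4.0) ** 2),
     np.exp(-0.5 * ((t - 32) / 4.0) ** 2)],
    axis=1,
)  # shape (60, 2)

# Toy fragmentation patterns over 8 m/z bins; bin 2 is shared.
spectra = np.zeros((2, 8))
spectra[0, [0, 2, 5]] = [1.0, 0.6, 0.3]
spectra[1, [1, 2, 7]] = [0.8, 1.0, 0.4]

X = profiles @ spectra            # mixed signal, shape (60, 8)
W, H = nmf(X, rank=2)

# Relative reconstruction error of the rank-2 factorization.
rel_err = np.linalg.norm(X - W @ H) / np.linalg.norm(X)
```

Each row of `H` is a candidate fragmentation pattern and each column of `W` is the matching elution profile. The factors are recovered only up to ordering and scale, so comparing them with the inputs requires normalizing and matching rows first.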
Publisher
Cold Spring Harbor Laboratory
Cited by
2 articles.