Abstract
Glycans constitute the most complicated post-translational modification, modulating protein activity in health and disease. However, structural annotation from tandem mass spectrometry data is a bottleneck in glycomics, preventing high-throughput endeavors and relegating glycomics to a few experts. We present CandyCrunch, a dilated residual neural network trained on a newly curated set of 300,000 annotated MS/MS spectra, which predicts glycan structure from raw LC-MS/MS data in seconds (top-1 accuracy: 87.7%). We developed an open-access Python-based workflow of raw data conversion and prediction, followed by automated curation and fragment annotation, with predictions recapitulating and extending expert annotation. We demonstrate that this workflow can be used for de novo annotation, diagnostic fragment identification, and high-throughput glycomics. For maximum impact, the entire pipeline is tightly integrated with our glycowork platform and can be easily tested at https://colab.research.google.com/github/BojarLab/CandyCrunch/blob/main/CandyCrunch.ipynb. We envision that CandyCrunch will democratize structural glycomics and the elucidation of the biological roles of glycans.
Publisher
Cold Spring Harbor Laboratory
Cited by
4 articles.