Abstract
Semi-supervised machine learning post-processors critically improve peptide identification in shotgun proteomics data. Such post-processors take the peptide-spectrum matches (PSMs) and feature vectors produced by a database search, train a machine learning classifier on them, and recalibrate the PSM scores using the trained parameters, often yielding significantly more identified peptides across q-value thresholds. However, current state-of-the-art post-processors rely on shallow machine learning methods, such as support vector machines. In contrast, deep learning models have outperformed shallow models in an ever-growing number of other fields. In this work, we show that deep models significantly improve the recalibration of PSMs compared to the most accurate and widely used post-processors, such as Percolator and PeptideProphet. Furthermore, we show that deep learning can adaptively analyze complex datasets and features for more accurate universal post-processing, leading to both improved Prosit analysis and markedly better recalibration of recently developed database-search functions.
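To make "identified peptides across q-value thresholds" concrete, the following is a minimal sketch of how identifications could be counted from PSM scores. It assumes a target-decoy setup, which the abstract does not state, and the function names `qvalues` and `count_identified` are hypothetical and not taken from Percolator, PeptideProphet, or the authors' software. Score ties are ignored for simplicity.

```python
def qvalues(scores, is_target):
    """Estimate a q-value for each PSM, returned in input order.

    Assumes higher scores are better and that decoy PSMs (is_target False)
    model incorrect matches. This is an illustrative target-decoy estimate,
    not the procedure of any specific post-processor.
    """
    # Rank PSMs from best to worst score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # FDR estimate at each rank: decoys accepted / targets accepted, capped at 1.
    fdrs = []
    targets = decoys = 0
    for i in order:
        if is_target[i]:
            targets += 1
        else:
            decoys += 1
        fdrs.append(min(1.0, decoys / max(targets, 1)))

    # q-value: the smallest FDR among all thresholds that still accept this PSM.
    q = [0.0] * len(scores)
    running_min = float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdrs[rank])
        q[order[rank]] = running_min
    return q


def count_identified(scores, is_target, threshold=0.01):
    """Count target PSMs whose q-value is at or below the threshold."""
    q = qvalues(scores, is_target)
    return sum(1 for qi, t in zip(q, is_target) if t and qi <= threshold)


# Toy data: five PSMs, two of them decoys.
scores = [10.0, 9.0, 8.0, 7.0, 6.0]
is_target = [True, True, False, True, False]
print(count_identified(scores, is_target, threshold=0.01))
```

Under these assumptions, a post-processor that recalibrates the scores can be compared to the raw search scores by calling `count_identified` on both score lists over a range of thresholds.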
Publisher
Cold Spring Harbor Laboratory