Authors:
Petrovskiy Denis V., Nikolsky Kirill S., Kulikova Liudmila I., Rudnev Vladimir R., Butkova Tatiana V., Malsagova Kristina A., Kopylov Arthur T., Kaysheva Anna L.
Abstract
The primary objective of analyzing data from a mass spectrometry-based proteomic experiment is peptide and protein identification, that is, the correct assignment of each tandem mass spectrum to a single amino acid sequence. The commonly accepted strategies for identifying proteins and determining their amino acid sequences are to compare empirical fragment spectra with theoretically predicted ones or to match them against a library of previously collected spectra. Although these approaches are widely used and reasonably efficient for well-characterized model organisms and previously measured proteins, they cannot detect novel peptide sequences that are rare or have not been annotated. This study presents PowerNovo, a tool for de novo sequencing of proteins from tandem mass spectra acquired on various types of mass analyzers and with different fragmentation techniques. PowerNovo uses an ensemble of models for peptide sequencing: a model that detects regularities in tandem mass spectra, precursors, and fragment ions, and a natural language processing model that assesses peptide sequence quality and helps reconstruct noisy sequences. Testing showed that the performance of PowerNovo is comparable to or better than that of the widely used PointNovo, DeepNovo, Casanovo, and Novor packages. PowerNovo also provides a complete processing pipeline for mass spectrometry data: in addition to predicting the peptide sequence, it includes peptide assembly and protein inference blocks.
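To make the conventional strategy mentioned in the abstract concrete (comparing an empirical fragment spectrum with a theoretically predicted one), the following is a minimal Python sketch. It is not PowerNovo's method or code. The function names `theoretical_ions` and `matched_fraction`, the 0.02 Da tolerance, and the restriction to singly charged b- and y-ions of unmodified peptides are all assumptions made for illustration.

```python
# Illustrative sketch only: not taken from PowerNovo.
# Standard monoisotopic residue masses (Da) for the 20 common amino acids.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
PROTON = 1.007276   # mass of a proton (Da)
WATER = 18.010565   # monoisotopic mass of H2O (Da)


def theoretical_ions(peptide):
    """Return (b_ions, y_ions): singly charged m/z values, unmodified peptide."""
    masses = [RESIDUE_MASS[aa] for aa in peptide]
    b_ions, y_ions = [], []

    # b-ions: N-terminal prefixes b1 .. b(n-1).
    running = 0.0
    for m in masses[:-1]:
        running += m
        b_ions.append(running + PROTON)

    # y-ions: C-terminal suffixes y1 .. y(n-1), which retain one water.
    running = 0.0
    for m in reversed(masses[1:]):
        running += m
        y_ions.append(running + WATER + PROTON)

    return b_ions, y_ions


def matched_fraction(peptide, observed_mz, tol=0.02):
    """Fraction of theoretical b/y ions found in observed_mz within tol (Da).

    The tolerance is a hypothetical default, not a recommended setting.
    """
    b_ions, y_ions = theoretical_ions(peptide)
    theoretical = b_ions + y_ions
    if not theoretical:
        return 0.0
    hits = sum(
        any(abs(t - o) <= tol for o in observed_mz) for t in theoretical
    )
    return hits / len(theoretical)
```

A database search engine would score every candidate peptide in a sequence database in roughly this way and keep the best match, which is why a sequence absent from the database cannot be reported. De novo tools such as those named in the abstract instead infer the sequence directly from the spectrum.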
Publisher
Springer Science and Business Media LLC
References (43 articles)
1. Ma, B. Novor: Real-time peptide de novo sequencing software. J. Am. Soc. Mass Spectrom. 26, 1885–1894 (2015).
2. Tran, N. H., Zhang, X., Xin, L., Shan, B. & Li, M. De novo peptide sequencing by deep learning. Proc. Natl. Acad. Sci. U. S. A. 114, 8247–8252 (2017).
3. Karunratanakul, K., Tang, H.-Y., Speicher, D. W., Chuangsuwanich, E. & Sriswasdi, S. Uncovering thousands of new peptides with sequence-mask-search hybrid de novo peptide sequencing framework. Mol. Cell. Proteomics MCP 18, 2478–2491 (2019).
4. Bioinformatics Solutions Inc. Computationally instrument-resolution-independent de novo peptide sequencing for high-resolution devices. https://www.bioinfor.com/computationally-instrument-resolution-independent-de-novo-peptide-sequencing-for-high-resolution-devices/ (2021).
5. Yilmaz, M., Fondrie, W., Bittremieux, W., Oh, S. & Noble, W. S. De novo mass spectrometry peptide sequencing with a transformer model. In Proceedings of the 39th International Conference on Machine Learning, Proceedings of Machine Learning Research, vol. 162, pp. 25514–25522. https://proceedings.mlr.press/v162/yilmaz22a.html (2022).