Pepid: a Highly Modifiable, Bioinformatics-Oriented Peptide Search Engine

Published: 2023-11-02

Author:

Zumer Jeremie, Lemieux Sébastien

Abstract

Motivation: Current peptide search engines are optimized for wet-lab workflows, i.e. they operate in an “end-to-end” manner to achieve good identification results, and are not designed to be modified or to provide algorithmic insight. This makes developing new software methods for peptide identification difficult, often requiring a full engine rewrite. Recently, many deep learning methods have been proposed as solutions to various parts of the peptide identification task, but virtually none of them have been integrated into an actual peptide search process. We believe that the lack of a reliable bioinformatics research platform for peptide identification that enables such integrations is slowing down proteomics research as a whole.

Results: We present pepid, a bioinformatics research-oriented peptide search engine. Unlike other search engines, pepid is specifically designed with ease of computational research in mind. Our design is highly flexible and allows easy modifications with little required software development expertise, allowing researchers to focus on analysing and improving peptide identification methods. It also takes recent computational trends into account, such as the recent slew of deep learning publications in proteomics, and features a multi-phased batched operations design that is more appropriate for those approaches than the spectrum batch “end-to-end” designs of existing search engines. We show that pepid is competitive with common engines in terms of both identification rates and runtime, forming a minimum required baseline to enable further identification research.

Availability and Implementation: Pepid is available as open source software under the MIT license at https://github.com/lemieux-lab/pepid. Other data referenced in the text is third-party. The selected yeast proteome can be found on SwissProt with accession ID UP000002311, while the human proteome’s accession ID is UP000005640. The ProteomeTools spectra can be found in the PRIDE archive under accession ID PXD004732, and the One Hour Yeast Proteome can be found at the ChorusProject at https://chorusproject.org/anonymous/download/experiment/-8823069691100997209 and https://chorusproject.org/anonymous/download/experiment/449795368199176159.
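
To make the “multi-phased batched operations” idea from the Results concrete, here is a minimal Python sketch of a search split into phases, each of which processes the whole batch before the next phase begins. It is an illustration only and is not taken from pepid's code base: every function name, the toy data types, the mass values and the length-based placeholder score are assumptions. The point it illustrates is that a dedicated scoring phase receives all spectrum-peptide pairs at once, which is the shape a batched deep learning scorer would typically expect.

```python
from typing import Callable, Dict, List, Sequence, Tuple

# Hypothetical toy types (not pepid's data model):
# a spectrum is (spectrum id, precursor mass); a peptide is (sequence, mass).
Spectrum = Tuple[str, float]
Peptide = Tuple[str, float]
Pair = Tuple[str, str]  # (spectrum id, peptide sequence)


def candidates_phase(spectra: Sequence[Spectrum],
                     peptides: Sequence[Peptide],
                     tol: float) -> Dict[str, List[str]]:
    """Phase 1: for every spectrum, keep peptides within the mass tolerance."""
    return {sid: [seq for seq, pmass in peptides if abs(pmass - mass) <= tol]
            for sid, mass in spectra}


def scoring_phase(cands: Dict[str, List[str]],
                  score_batch: Callable[[List[Pair]], List[float]]
                  ) -> List[Tuple[str, str, float]]:
    """Phase 2: score all (spectrum, peptide) pairs in a single batched call."""
    pairs = [(sid, seq) for sid, seqs in cands.items() for seq in seqs]
    scores = score_batch(pairs)  # one call for the whole batch
    return [(sid, seq, s) for (sid, seq), s in zip(pairs, scores)]


def best_hits_phase(scored: List[Tuple[str, str, float]]
                    ) -> Dict[str, Tuple[str, float]]:
    """Phase 3: keep the top-scoring peptide for each spectrum."""
    best: Dict[str, Tuple[str, float]] = {}
    for sid, seq, s in scored:
        if sid not in best or s > best[sid][1]:
            best[sid] = (seq, s)
    return best


def toy_score_batch(pairs: List[Pair]) -> List[float]:
    """Placeholder scorer (peptide length); stands in for a batched model."""
    return [float(len(seq)) for _, seq in pairs]


if __name__ == "__main__":
    # Made-up masses, chosen only so that the tolerance filter does something.
    spectra = [("s1", 500.0), ("s2", 800.0)]
    peptides = [("PEPTIDE", 500.2), ("PEP", 499.9), ("LONGERPEPTIDE", 800.1)]
    cands = candidates_phase(spectra, peptides, tol=0.5)
    hits = best_hits_phase(scoring_phase(cands, toy_score_batch))
    print(hits)
```

Because each phase only consumes the previous phase's output, a researcher could swap `toy_score_batch` for another scoring function without touching candidate generation or hit selection, which is the kind of modifiability the abstract describes.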

Publisher

Cold Spring Harbor Laboratory
