Evaluation of bottom-up and top-down mass spectrum identifications with different customized protein sequences databases
Author:
Li Ziwei1,
He Bo1,
Feng Weixing1
Affiliation:
1. College of Automation, Harbin Engineering University, Harbin, Heilongjiang, China
Abstract
Abstract
Motivation
Generally, bottom-up and top-down are two complementary approaches for proteoforms identification. The inference of proteoforms relies on searching mass spectra against an accurate proteoform sequence database. A customized protein sequence database derived by RNA-Seq data can be used to better identify the proteoform existed in a studied species. However, the quality of sequences in customized databases which constructed by different strategies affect the performances of mass spectrometry (MS) identification. Additionally, performances of identifications between bottom-up and top-down using customized databases are also needed to be evaluated
Results
Three customized databases were constructed with different strategies separately. Two of them were based on translating assembled transcripts with or without genomic annotation, and the third one is a variant-extending protein database. By testing with bottom-up and top-down MS data separately, a variant-extending protein database could identify not only the most number of spectra but also the alleles expressed at the same time in diploid cells. An assembled database could identify the spectrum missed in reference database and amino acid (AA) alterations existed in studied species.
Availability and implementation
Experimental results demonstrated that the proteoform sequences in an annotated database are more suitable for identifying AA alterations and peptide sequences missed in reference database. An unannotated database instead of a reference proteome database gets an enough high sensitivity of identifying mass spectra. The variant-extending reference database is the most sensitive to identify mass spectra and single AA variants
Supplementary information
Supplementary data are available at Bioinformatics online.
Funder
China National Natural Science Foundation
Natural Science Foundation of Heilongjiang Province
HEU Fundamental Research Funds for the Central University
Publisher
Oxford University Press (OUP)
Subject
Computational Mathematics,Computational Theory and Mathematics,Computer Science Applications,Molecular Biology,Biochemistry,Statistics and Probability
Cited by
5 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献