Abstract
Mass spectrometry (MS)-based proteomics can identify and quantify thousands of proteins, but measuring human antibodies remains challenging because of their vast diversity. The widely used bottom-up proteomics approaches rely on database searches that compare experimental values of peptides and their fragments to theoretical values derived from protein sequences in a database. While the human body can produce millions of distinct antibodies, current databases of human antibodies such as UniProtKB/Swiss-Prot are limited to only 1095 sequences (as of January 2024). This limitation may hinder the identification of new antibodies by mass spectrometry, so extending the search database is an important step toward discovering them. Recent genomic studies have compiled millions of human antibody sequences, which are publicly accessible through the Observed Antibody Space (OAS) database. However, these data have yet to be exploited to confirm the presence of the corresponding antibodies at the protein level. In this study, we used this extensive collection of antibody sequences to conduct efficient database searches in publicly available proteomics data, with a focus on SARS-CoV-2 infection. Thirty million heavy-chain antibody sequences from 146 SARS-CoV-2 patients in the OAS database were digested in silico to obtain 18 million unique peptides. These peptides were then used to create new databases for bottom-up proteomics. We used those databases to search for new antibody peptides in publicly available SARS-CoV-2 human plasma samples in the Proteomics Identification Database (PRIDE). This approach avoids false positives in antibody peptide identification, as confirmed by searching against negative controls (brain samples) and by employing different database sizes. 
We show that the identified sequences provide valuable information for distinguishing diseased from healthy individuals, and we expect that the newly discovered antibody peptides can be further employed to develop therapeutic antibodies. The method will be broadly applicable to finding characteristic antibodies for other diseases.
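
To illustrate the in silico digestion step described above, the following is a minimal sketch, not the authors' pipeline. It assumes trypsin-like cleavage (after K or R, except before P), which is a common choice in bottom-up proteomics but is not stated in the abstract. The function names `tryptic_digest` and `unique_peptides` and the parameters `missed_cleavages` and `min_length` are hypothetical and chosen for illustration only.

```python
def tryptic_digest(sequence, missed_cleavages=0, min_length=6):
    """Split a protein sequence into peptides.

    ASSUMPTION: trypsin-like rule (cleave after K or R unless the next
    residue is P). The enzyme used in the study is not given in the abstract.
    """
    # Collect cleavage positions, including both sequence termini.
    sites = [0]
    for i in range(len(sequence) - 1):
        if sequence[i] in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(sequence))

    peptides = []
    for start in range(len(sites) - 1):
        # Join up to `missed_cleavages` additional adjacent fragments.
        for extra in range(missed_cleavages + 1):
            end = start + 1 + extra
            if end >= len(sites):
                break
            peptide = sequence[sites[start]:sites[end]]
            if len(peptide) >= min_length:
                peptides.append(peptide)
    return peptides


def unique_peptides(sequences, **digest_options):
    """Digest many sequences and keep each distinct peptide once."""
    unique = set()
    for seq in sequences:
        unique.update(tryptic_digest(seq, **digest_options))
    return unique


if __name__ == "__main__":
    # Toy sequences, not real antibody data.
    toy = ["AAAKGGGRPCCCKDDD", "GGGRPCCCKAAAK"]
    for pep in sorted(unique_peptides(toy, min_length=3)):
        print(pep)
```

Collapsing peptides into a set mirrors the reduction from many antibody sequences to a smaller pool of unique peptides: sequences that share framework regions yield identical peptides, which are stored only once in the search database.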
Publisher
Cold Spring Harbor Laboratory