AllerCatPro—prediction of protein allergenicity potential from the protein sequence

Author:

Maurer-Stroh Sebastian (1,2), Krutz Nora L (3), Kern Petra S (3), Gunalan Vithiagaran (1), Nguyen Minh N (1), Limviphuvadh Vachiranee (1), Eisenhaber Frank (1,2), Gerberick G Frank (4)

Affiliation:

1. Biomolecular Function Discovery Division, Bioinformatics Institute, Agency for Science, Technology and Research, Singapore

2. Department of Biological Sciences, National University of Singapore, Singapore

3. The Procter & Gamble Services Company, Strombeek-Bever, Belgium

4. The Procter and Gamble Company, Mason, OH, USA

Abstract

Motivation: Because proteins can induce an immediate Type I (IgE-mediated) allergic response, those intended for use in consumer products must be assessed for allergenic potential before they are introduced to the market. The FAO/WHO guidelines for computational assessment of the allergenic potential of proteins, which rely on short peptide hits and linear sequence window identity thresholds, misclassify many proteins as allergens.

Results: We developed AllerCatPro, which predicts the allergenic potential of proteins based on the similarity of their 3D protein structure as well as their amino acid sequence compared with a data set of known protein allergens. The data set comprises 4180 unique allergenic protein sequences derived from the union of the major databases: Food Allergy Research and Resource Program, Comprehensive Protein Allergen Resource, WHO/International Union of Immunological Societies, UniProtKB and Allergome. We extended the hexamer hit rule by removing peptides with a high probability of random occurrence, as measured by sequence entropy, and by requiring 3 or more hexamer hits consistent with natural linear epitope patterns in known allergens. This is complemented by gluten-like repeat pattern detection. We also switched from a linear sequence window similarity to a B-cell epitope-like 3D surface similarity window, which became possible through extensive 3D structure modeling covering the majority (74%) of allergens. If no structure similarity is found, the decision workflow reverts to the old linear sequence window rule. The overall accuracy of AllerCatPro is 84%, compared with 51 to 73% for other current methods. Both the FAO/WHO rules and AllerCatPro achieve the highest sensitivity, but AllerCatPro provides a 37-fold increase in specificity.

Availability and implementation: https://allercatpro.bii.a-star.edu.sg/

Supplementary information: Supplementary data are available at Bioinformatics online.
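The extended hexamer hit rule described in the abstract (drop low-entropy hexamers, then require 3 or more hits) can be illustrated with a minimal Python sketch. This is not the AllerCatPro implementation: the function names, the use of Shannon entropy over amino acid composition, and the `min_entropy=2.0` cutoff are all assumptions made for illustration, and the sketch simply counts distinct shared hexamers rather than checking for natural linear epitope patterns.

```python
import math
from collections import Counter


def shannon_entropy(peptide: str) -> float:
    """Shannon entropy (bits) of the amino acid composition of a peptide."""
    counts = Counter(peptide)
    n = len(peptide)
    return -sum((c / n) * math.log2(c / n) for c in counts.values())


def hexamers(sequence: str, k: int = 6) -> set[str]:
    """All distinct contiguous k-mers (hexamers by default) in a sequence."""
    return {sequence[i:i + k] for i in range(len(sequence) - k + 1)}


def informative_hexamer_hits(query: str, allergen: str,
                             min_entropy: float = 2.0) -> set[str]:
    """Hexamers shared by query and allergen, minus low-entropy ones.

    min_entropy is a hypothetical cutoff; the source gives no value.
    """
    shared = hexamers(query) & hexamers(allergen)
    return {h for h in shared if shannon_entropy(h) >= min_entropy}


def passes_hexamer_rule(query: str, allergen: str,
                        min_hits: int = 3,
                        min_entropy: float = 2.0) -> bool:
    """True if at least min_hits informative hexamers are shared."""
    hits = informative_hexamer_hits(query, allergen, min_entropy)
    return len(hits) >= min_hits


# Made-up sequences, for illustration only.
allergen = "MKTAYIAKQRQISFVKSHFSRQ"
query = "GGGGTAYIAKQRQGGGG"           # shares the stretch TAYIAKQRQ
repeat_query = "QQQQQQQQQ"            # low-complexity repeat
repeat_allergen = "PQQQQQQQQQP"

print(passes_hexamer_rule(query, allergen))
print(passes_hexamer_rule(repeat_query, repeat_allergen))
```

In this toy setup the first pair shares several informative hexamers, while the single shared hexamer in the repeat pair is discarded by the entropy filter, which is the kind of chance match the extended rule is meant to suppress.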

Funder

Agency for Science, Technology and Research

A*STAR

Procter & Gamble

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics,Computational Theory and Mathematics,Computer Science Applications,Molecular Biology,Biochemistry,Statistics and Probability
