Author:
Wang Connie Y., Chang Paul M., Ary Marie L., Allen Benjamin D., Chica Roberto A., Mayo Stephen L., Olafson Barry D.
Abstract
We present ProtaBank, a repository for storing, querying, analyzing, and sharing protein design and engineering data in an actively maintained and updated database. ProtaBank provides a format to describe and compare all types of protein mutational data, spanning a wide range of properties and techniques. It features a user-friendly web interface and a programming layer that streamlines data deposition and allows for batch input and queries. The database schema incorporates a standard format for reporting protein sequences and experimental data, which facilitates comparison of results across different data sets. A suite of analysis and visualization tools is provided to facilitate discovery, to guide future designs, and to benchmark and train new predictive tools and algorithms. ProtaBank will provide a valuable resource to the protein engineering community by storing and safeguarding newly generated data, allowing fast searching and identification of relevant data from the existing literature, and enabling exploration of correlations between disparate data sets. ProtaBank invites researchers to contribute data to the database to make it accessible for search and analysis. ProtaBank is available at https://protabank.org.
Impact: The ProtaBank database provides a central repository for researchers to store, query, analyze, and share all types of protein engineering data. This modern database will play a pivotal role in organizing protein engineering data and leveraging the increasingly large amounts of mutational data being generated. Together with the analysis tools, it will help scientists gain insights into sequence-function relationships, support the development of new predictive tools and algorithms, and facilitate future protein engineering efforts.
Abbreviations: 3D, three-dimensional; API, application programming interface; AWS, Amazon Web Services; BLAST, Basic Local Alignment Search Tool; Cm, concentration of denaturant at midpoint of unfolding transition; CSV, comma-separated values; ΔG, Gibbs free energy of folding/unfolding; Gβ1, β1 domain of Streptococcal protein G; GdmCl, guanidinium chloride; kcat, catalytic rate constant; Kd, dissociation constant; MIC, minimum inhibitory concentration; PDB, Protein Data Bank; PE, protein engineering; RDS, Relational Database Service; REST, Representational State Transfer; Tm, melting temperature.
Publisher
Cold Spring Harbor Laboratory