Random sampling of the Protein Data Bank: RaSPDB

Published:2021-12 Issue:1 Volume:11
ISSN:2045-2322
Container-title:Scientific Reports
language:en
Short-container-title:Sci Rep

Author:

Carugo Oliviero

Abstract

A novel and simple procedure (RaSPDB) for Protein Data Bank mining is described. 10 PDB subsets, each containing 7000 randomly selected protein chains, are built and used to make 10 estimations of the average value of a generic feature F (the length of the protein chain, the amino acid composition, the crystallographic resolution, and the secondary structure composition). These 10 estimations are then used to compute an average estimation of F together with its standard error. It is heuristically verified that the dimension of these 10 subsets (7000 protein chains) is sufficiently small to avoid redundancy within each subset and sufficiently large to guarantee stable estimations amongst different subsets. RaSPDB has two major advantages over classical procedures aimed to build a single, non-redundant PDB subset: a larger fraction of the information stored in the PDB is used and an estimation of the standard error of F is possible.
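The sampling scheme described in the abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the author's implementation: the function name `raspdb_estimate` and its parameters are hypothetical, the input is assumed to be a plain list with one numeric value of F per protein chain, and the standard error is assumed to be the sample standard deviation of the subset averages divided by the square root of the number of subsets.

```python
import random
import statistics


def raspdb_estimate(values, n_subsets=10, subset_size=7000, seed=0):
    """Estimate the average of a feature F and its standard error.

    `values` holds one numeric value of F per protein chain (assumed input).
    Returns (mean of subset averages, standard error, list of subset averages).
    """
    rng = random.Random(seed)
    # Draw each subset independently; chains are sampled without replacement
    # inside a subset, but the same chain may appear in different subsets.
    subset_means = [
        statistics.fmean(rng.sample(values, subset_size))
        for _ in range(n_subsets)
    ]
    mean_f = statistics.fmean(subset_means)
    # Assumption: standard error = sample standard deviation of the subset
    # averages divided by sqrt(number of subsets).
    std_err = statistics.stdev(subset_means) / n_subsets ** 0.5
    return mean_f, std_err, subset_means


if __name__ == "__main__":
    # Synthetic chain lengths for demonstration only; these are NOT PDB data.
    demo_rng = random.Random(42)
    chain_lengths = [demo_rng.randint(30, 1000) for _ in range(100_000)]
    mean_f, std_err, _ = raspdb_estimate(chain_lengths)
    print(f"estimated mean length: {mean_f:.1f} +/- {std_err:.1f}")
```

With real data, `values` would be filled from whatever per-chain feature is of interest (for example chain length or resolution); how those values are extracted from PDB entries is outside the scope of this sketch.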

Publisher

Springer Science and Business Media LLC

Subject

Multidisciplinary

Link

https://www.nature.com/articles/s41598-021-03615-y.pdf

Reference: 16 articles.

1. Protein Data Bank. Crystallography: Protein Data Bank. Nat. New Biol. 233, 223 (1971).

2. wwPDB Consortium. Protein Data Bank: The single global archive for 3D macromolecular structural data. Nucleic Acids Res. 47, D520–D528 (2019).

3. Drenth, J. Principles of Protein X-ray Crystallography (Springer, 1994).

4. Tramontano, A. Protein Structure Prediction: Concepts and Applications (Wiley, 2006).

5. Burley, S. K. Impact of structural biologists and the Protein Data Bank on small-molecule drug discovery and development. J. Biol. Chem. 296, 100559 (2021).

Cited by 4 articles.

1. Interplay between hydrogen and chalcogen bonds in cysteine;Proteins: Structure, Function, and Bioinformatics;2022-10-26

2. Survey of the Intermolecular Disulfide Bonds Observed in Protein Crystal Structures Deposited in the Protein Data Bank;Life;2022-06-30

3. Developing a bioinformatics pipeline for comparative protein classification analysis;BMC Genomic Data;2022-06-06

4. Network Pharmacology and Molecular Docking to Elucidate the Potential Mechanism of Ligusticum Chuanxiong Against Osteoarthritis;Frontiers in Pharmacology;2022-04-14