PLAS-5k: Dataset of Protein-Ligand Affinities from Molecular Dynamics for Machine Learning Applications

Published:2022-09-07 Issue:1 Volume:9 Page:
ISSN:2052-4463
Container-title:Scientific Data
language:en
Short-container-title:Sci Data

Author:

Korlepara Divya B., Vasavi C. S., Jeurkar Shruti, Pal Pradeep Kumar, Roy Subhajit, Mehta Sarvesh, Sharma Shubham, Kumar Vishal, Muvva Charuvaka, Sridharan Bhuvanesh, Garg Akshit, Modee Rohit, Bhati Agastya P., Nayar Divya, Priyakumar U. Deva

Abstract

Computational methods and, more recently, modern machine learning methods have played a key role in structure-based drug design. Though several benchmarking datasets are available for machine learning applications in virtual screening, accurate prediction of binding affinity for a protein-ligand complex remains a major challenge. New datasets that allow for the development of models that predict binding affinities better than state-of-the-art scoring functions are therefore important. For the first time, we have developed a dataset, PLAS-5k, comprising 5000 protein-ligand complexes chosen from the PDB database. The dataset consists of binding affinities along with energy components (electrostatic, van der Waals, polar solvation, and non-polar solvation energies) calculated from molecular dynamics simulations using the MMPBSA (Molecular Mechanics Poisson-Boltzmann Surface Area) method. The calculated binding affinities outperformed docking scores and showed a good correlation with the available experimental values. The availability of energy components may enable optimization of desired components during machine learning-based drug design. Further, the OnionNet model has been retrained on the PLAS-5k dataset and is provided as a baseline for the prediction of binding affinities.
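The relationship between the energy components and the binding affinity described in the abstract can be illustrated with a minimal Python sketch. It assumes that the reported affinity is the plain sum of the four listed components with no entropy term, which the abstract does not confirm. The class name, field names, and the `pearson_r` helper are hypothetical and are not taken from the PLAS-5k distribution; any numbers passed in are for illustration only.

```python
from dataclasses import dataclass
from math import sqrt
from typing import Sequence


@dataclass
class EnergyComponents:
    """MMPBSA terms for one complex, in kcal/mol (hypothetical field names)."""
    electrostatic: float
    van_der_waals: float
    polar_solvation: float
    nonpolar_solvation: float

    def binding_affinity(self) -> float:
        # Assumption: affinity is the sum of the four components,
        # with no entropy contribution included.
        return (self.electrostatic + self.van_der_waals
                + self.polar_solvation + self.nonpolar_solvation)


def pearson_r(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation, e.g. between calculated and experimental values."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two equal-length series with at least 2 points")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant series")
    return cov / sqrt(var_x * var_y)
```

Keeping the components as separate fields, rather than storing only their sum, mirrors the point made in the abstract that individual components may be optimization targets in their own right.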

Funder

Department of Science and Technology, Ministry of Science and Technology

DST | Science and Engineering Research Board

IHub-Data, IIIT Hyderabad

Kohli Center on Intelligent Systems, IIIT Hyderabad

Publisher

Springer Science and Business Media LLC

Subject

Library and Information Sciences; Statistics, Probability and Uncertainty; Computer Science Applications; Education; Information Systems; Statistics and Probability

Link

https://www.nature.com/articles/s41597-022-01631-9.pdf

Reference: 77 articles.

1. Kairys, V., Baranauskiene, L., Kazlauskiene, M., Matulis, D. & Kazlauskas, E. Binding affinity in drug design: experimental and computational techniques. Expert opinion on drug discovery 14, 755–768 (2019).

2. Srivastava, H. K. & Sastry, G. N. Molecular dynamics investigation on a series of HIV protease inhibitors: assessing the performance of MM-PBSA and MM-GBSA approaches. Journal of chemical information and modeling 52, 3088–3098 (2012).

3. Kimber, T. B., Chen, Y. & Volkamer, A. Deep learning in virtual screening: Recent applications and developments. International Journal of Molecular Sciences 22, 4435 (2021).

4. Mordalski, S., Kosciolek, T., Kristiansen, K., Sylte, I. & Bojarski, A. J. Protein binding site analysis by means of structural interaction fingerprint patterns. Bioorganic & medicinal chemistry letters 21, 6816–6819 (2011).

5. Da, C. & Kireev, D. Structural protein–ligand interaction fingerprints (SPLIF) for structure-based virtual screening: method and benchmark study. Journal of chemical information and modeling 54, 2555–2561 (2014).

Cited by 3 articles.

1. Machine learning small molecule properties in drug discovery;Artificial Intelligence Chemistry;2023-12

2. Synthesis and molecular docking studies of novel tricyclic and angular tetracyclic benzothiadiazines via sp³‐C‐H activation as potential colon cancer inhibitors;Journal of Heterocyclic Chemistry;2023-08-14

3. MISATO - Machine learning dataset of protein-ligand complexes for structure-based drug discovery;2023-05-24