A survey of algorithms for transforming molecular dynamics data into metadata for in situ analytics based on machine learning methods

Published:2020-01-20 Issue:2166 Volume:378 Page:20190063
ISSN:1364-503X
Container-title:Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences
language:en
Short-container-title:Phil. Trans. R. Soc. A.

Author:

Taufer Michela¹, Estrada Trilce², Johnston Travis³

Affiliation:

1. Electrical Engineering and Computer Science Department, The University of Tennessee Knoxville, 401 Min H. Kao Bldg., 1520 Middle Drive, Knoxville, TN 37996-2250, USA

2. Computer Science Department, University of New Mexico, MSC01 1130, Albuquerque, NM 87131-1070, USA

3. Oak Ridge National Laboratory, PO Box 2008, Oak Ridge, TN 37831, USA

Abstract

This paper presents a survey of three algorithms that transform atomic-level molecular snapshots from molecular dynamics (MD) simulations into metadata representations suitable for in situ analytics based on machine learning methods. MD simulations, which study the classical time evolution of a molecular system at atomic resolution, are widely recognized in the fields of chemistry, materials science, molecular biology and drug design; they are among the most common simulations run on supercomputers. Next-generation supercomputers will have dramatically higher performance than current systems and will generate more data to be analysed (e.g. in terms of number and length of MD trajectories). In the future, the coordination of data generation and analysis can no longer rely on manual, centralized analysis traditionally performed after the simulation is completed, or on current data representations that were defined for traditional visualization tools. Powerful data preparation phases (i.e. phases in which original raw data is transformed into concise yet still meaningful representations) will need to precede data analysis phases. Here, we discuss three algorithms for transforming traditionally used molecular representations into concise and meaningful metadata representations. The transformations can be performed locally. The new metadata can be fed into machine learning methods for runtime in situ analysis of larger MD trajectories supported by high-performance computing. In this paper, we provide an overview of the three algorithms and their use in three different applications: protein–ligand docking in drug design; protein folding simulations; and protein engineering based on analytics of protein functions that depend on proteins' three-dimensional structures. This article is part of a discussion meeting issue ‘Numerical algorithms for high-performance computational science’.

Funder

NSF SCI: Collaborative Research: DAPLDS - a Dynamically Adaptive Protein-Ligand Docking System based on Multi-Scale Modeling

NSF BIGDATA: IA: Collaborative Research: In Situ Data Analytics for Next Generation Molecular Dynamics Workflows

NSF SHF: Small: Collaborative Research: Modeling and Analyzing Big Data on Peta- and Exascale Distributed Systems supported by MapReduce Methodologies

Publisher

The Royal Society

Subject

General Physics and Astronomy, General Engineering, General Mathematics

Link

https://royalsocietypublishing.org/doi/pdf/10.1098/rsta.2019.0063

Reference: 27 articles.

1. Perilla JR Goh BC Cassidy CK Liu B Bernardi RC Rudack T Yu H Wu Z Schulten K. 2015 Molecular dynamics simulations of large macromolecular complexes. 31 64–74. (doi:10.1016/j.sbi.2015.03.007)

2. The Amber biomolecular simulation programs

3. CHARMM: The biomolecular simulation program

4. Scalable molecular dynamics with NAMD

5. Luu H Winslett M Gropp W Ross R Carns P Harms K Prabhat M Byna S Yao Y. 2015 A Multiplatform Study of I/O Behavior on Petascale Supercomputers. In Proc. 24th Int. Symp. on High-Performance Parallel and Distributed Computing pp. 33–44.

Cited by 6 articles.

1. Computational prediction of ω-transaminase selectivity by deep learning analysis of molecular dynamics trajectories;QRB Discovery;2022-12-12

2. A Methodology to Generate Efficient Neural Networks for Classification of Scientific Datasets;2022 IEEE 18th International Conference on e-Science (e-Science);2022-10

3. Machine learning, artificial intelligence, and chemistry: How smart algorithms are reshaping simulation and the laboratory;Pure and Applied Chemistry;2022-08-01

4. Performance assessment of ensembles of in situ workflows under resource constraints;Concurrency and Computation: Practice and Experience;2022-06-08

5. High frequency accuracy and loss data of random neural networks trained on image datasets;Data in Brief;2022-02