Parsers, Data Structures and Algorithms for Macromolecular Analysis Toolkit (MAT): Design and Implementation

Published: 2019-04-11

Author:

Kalyan Gazal, Junghare Vivek, S S John, Chattopadhyay Anupam, Mitra Pralay, Hazra Saugata

Abstract

Structural information on biological macromolecules is stored in .pdb, .mmcif and, more recently, .mmtf files, and working with it requires accurate and efficient software tools. Here, we describe the Macromolecular Analysis Toolkit (MAT), which parses .pdb, .mmcif and .mmtf files and builds data structures from the input. This original program is written in C++ to ensure efficiency and to organize structural information in a consistent, integrated way. The novelty of the program lies in the addition of new structure-based biological algorithms and applications. The package also stands out from similar libraries by being 1) faster and 2) more accurate. We also provide a detailed comparison of available parsers on the whole PDB database. The MAT parser is designed to allow quick extraction and organized loading of the core data structure. The same data structure is extended to accommodate information from the .mmcif and .mmtf file parsers. Tokenizing the data allows information to be extracted from disordered text, which supports accurate identification of the entities present in a .pdb file. Additionally, we introduce a performance optimization: a few derived data structures, namely kD-tree, octree and graphs, are created for applications that need spatial coordinate calculations. MAT provides an advanced, time-efficient data structure designed for reusability and consistency within a systematic framework. The MAT parser can be accessed online through Bitbucket at https://bitbucket.org/gazalk/pdb_parser/.

Publisher

Cold Spring Harbor Laboratory

Cited by 4 articles.

1. Split-bucket partition (SBP): a novel execution model for top-K and selection algorithms on GPUs;The Journal of Supercomputing;2024-03-29

2. Synthesis of Dihydrobenzofuro[3,2‐ b ]chromenes as Potential 3CLpro Inhibitors of SARS‐CoV‐2: A Molecular Docking and Molecular Dynamics Study;ChemMedChem;2022-02-17

3. Anti-hypertensive Peptide Predictor: A Machine Learning-Empowered Web Server for Prediction of Food-Derived Peptides with Potential Angiotensin-Converting Enzyme-I Inhibitory Activity;Journal of Agricultural and Food Chemistry;2021-12-02

4. Understanding structure-based dynamic interactions of antihypertensive peptides extracted from food sources;Journal of Biomolecular Structure and Dynamics;2020-02-12