Sequence alignment using machine learning for accurate template-based protein structure prediction

Published:2019-06-14 Issue:1 Volume:36 Page:104-111
ISSN:1367-4803
Container-title:Bioinformatics
language:en

Author:

Makigaki Shuichiro¹, Ishida Takashi¹

Affiliation:

1. Department of Computer Science, School of Computing, Tokyo Institute of Technology, Meguro-ku, Tokyo 152-8550, Japan

Abstract

Motivation: Template-based modeling, the process of predicting the tertiary structure of a protein by using homologous protein structures, is useful if good templates can be found. Although modern homology detection methods can find remote homologs with high sensitivity, the accuracy of template-based models generated from homology-detection-based alignments is often lower than that from ideal alignments.

Results: In this study, we propose a new method that generates pairwise sequence alignments for more accurate template-based modeling. The proposed method trains a machine learning model using the structural alignment of known homologs. It is difficult to directly predict sequence alignments using machine learning. Thus, when calculating sequence alignments, instead of a fixed substitution matrix, this method dynamically predicts a substitution score from the trained model. We evaluate our method by carefully splitting the training and test datasets and comparing the accuracy of the predicted structures with that of state-of-the-art methods. Our method generates more accurate tertiary structure models than those produced from alignments obtained by other methods.

Availability and implementation: https://github.com/shuichiro-makigaki/exmachina

Supplementary information: Supplementary data are available at Bioinformatics online.
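
The core idea in the abstract, replacing a fixed substitution matrix with a score predicted for each residue pair, can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not taken from the paper or its repository: the alignment is a plain global (Needleman-Wunsch) dynamic program with a linear gap penalty, and `make_toy_predictor` is a hypothetical stand-in for the trained model that only checks residue identity.

```python
from typing import Callable, List, Tuple


def make_toy_predictor(a: str, b: str) -> Callable[[int, int], float]:
    """Hypothetical placeholder for a trained model (assumption).

    A real model would score the pair (i, j) from learned features;
    this toy version only rewards identical residues.
    """
    def predict(i: int, j: int) -> float:
        return 2.0 if a[i] == b[j] else -1.0
    return predict


def align(a: str, b: str,
          predict: Callable[[int, int], float],
          gap: float = -1.0) -> Tuple[float, str, str]:
    """Global alignment where substitution scores come from `predict`."""
    n, m = len(a), len(b)
    # Query the predictor once per residue pair; this table plays the
    # role that a fixed substitution matrix plays in a classic aligner.
    s: List[List[float]] = [[predict(i, j) for j in range(m)] for i in range(n)]

    # dp[i][j] = best score for aligning a[:i] with b[:j]
    dp = [[0.0] * (m + 1) for _ in range(n + 1)]
    for i in range(1, n + 1):
        dp[i][0] = dp[i - 1][0] + gap
    for j in range(1, m + 1):
        dp[0][j] = dp[0][j - 1] + gap
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            dp[i][j] = max(dp[i - 1][j - 1] + s[i - 1][j - 1],  # pair residues
                           dp[i - 1][j] + gap,                  # gap in b
                           dp[i][j - 1] + gap)                  # gap in a

    # Trace back from the bottom-right corner to recover one optimal alignment.
    out_a: List[str] = []
    out_b: List[str] = []
    i, j = n, m
    while i > 0 or j > 0:
        if i > 0 and j > 0 and dp[i][j] == dp[i - 1][j - 1] + s[i - 1][j - 1]:
            out_a.append(a[i - 1]); out_b.append(b[j - 1])
            i -= 1; j -= 1
        elif i > 0 and (j == 0 or dp[i][j] == dp[i - 1][j] + gap):
            out_a.append(a[i - 1]); out_b.append("-")
            i -= 1
        else:
            out_a.append("-"); out_b.append(b[j - 1])
            j -= 1
    return dp[n][m], "".join(reversed(out_a)), "".join(reversed(out_b))


if __name__ == "__main__":
    query, template = "HEAGAWGHEE", "PAWHEAE"
    score, row_q, row_t = align(query, template, make_toy_predictor(query, template))
    print(score)
    print(row_q)
    print(row_t)
```

Because the aligner only sees a callable that maps a position pair to a score, the toy predictor could be swapped for any trained model without touching the dynamic program, which is the design point the abstract describes.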

Funder

JSPS KAKENHI

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics,Computational Theory and Mathematics,Computer Science Applications,Molecular Biology,Biochemistry,Statistics and Probability

Link

http://academic.oup.com/bioinformatics/advance-article-pdf/doi/10.1093/bioinformatics/btz483/28934934/btz483.pdf

References: 33 articles.

1. Basic local alignment search tool;Altschul;J. Mol. Biol,1990

2. Gapped BLAST and PSI-BLAST: a new generation of protein database search programs;Altschul;Nucleic Acids Res,1997

3. Protein Data Bank: the single global archive for 3D macromolecular structure data;Burley;Nucleic Acids Res,2018

4. Domain enhanced lookup time accelerated BLAST;Boratyn;Biol. Direct,2012

5. DeepQA: improving the estimation of single protein model quality with deep belief networks;Cao;BMC Bioinformatics,2016

Cited by 15 articles.

1. Protein subcellular localization prediction tools;Computational and Structural Biotechnology Journal;2024-12

2. Bioinspired Algorithms for Multiple Sequence Alignment: A Systematic Review and Roadmap;Applied Sciences;2024-03-13

3. Machine learning on alignment features for parent-of-origin classification of simulated hybrid RNA-seq;BMC Bioinformatics;2024-03-12

4. Bioinformatics-aided Protein Sequence Analysis and Engineering;Current Protein & Peptide Science;2023-07

5. Efficient mapping of RNA‐binding residues in RNA‐binding proteins using local sequence features of binding site residues in protein‐RNA complexes;Proteins: Structure, Function, and Bioinformatics;2023-05-31