PredPSD: A Gradient Tree Boosting Approach for Single-Stranded and Double-Stranded DNA Binding Protein Prediction

Author:

Tan Changgeng,Wang Tong,Yang Wenyi,Deng LeiORCID

Abstract

Interactions between proteins and DNAs play essential roles in many biological processes. DNA binding proteins can be classified into two categories. Double-stranded DNA-binding proteins (DSBs) bind to double-stranded DNA and are involved in a series of cell functions such as gene expression and regulation. Single-stranded DNA-binding proteins (SSBs) are necessary for DNA replication, recombination, and repair and are responsible for binding to the single-stranded DNA. Therefore, the effective classification of DNA-binding proteins is helpful for functional annotations of proteins. In this work, we propose PredPSD, a computational method based on sequence information that accurately predicts SSBs and DSBs. It introduces three novel feature extraction algorithms. In particular, we use the autocross-covariance (ACC) transformation to transform feature matrices into fixed-length vectors. Then, we put the optimal feature subset obtained by the minimal-redundancy-maximal-relevance criterion (mRMR) feature selection algorithm into the gradient tree boosting (GTB). In 10-fold cross-validation based on a benchmark dataset, PredPSD achieves promising performances with an AUC score of 0.956 and an accuracy of 0.912, which are better than those of existing methods. Moreover, our method has significantly improved the prediction accuracy in independent testing. The experimental results show that PredPSD can significantly recognize the binding specificity and differentiate DSBs and SSBs.

Funder

National Natural Science Foundation of China

Publisher

MDPI AG

Subject

Chemistry (miscellaneous),Analytical Chemistry,Organic Chemistry,Physical and Theoretical Chemistry,Molecular Medicine,Drug Discovery,Pharmaceutical Science

Reference71 articles.

1. Role of the single-stranded DNA-binding protein SsbB in pneumococcal transformation: Maintenance of a reservoir for genetic plasticity;Laetitia;PLoS Genet.,2011

2. Single- and double-stranded DNA binding proteins act in concert to conserve a telomeric DNA core sequence

3. Single-stranded DNA-binding protein hSSB1 is critical for genomic stability

4. Telomere DNA recognition in Saccharomycotina yeast: Potential lessons for the co-evolution of ssDNA and dsDNA-binding proteins and their target sites;Olga;Front. Genet.,2015

5. Human single-stranded DNA binding protein 1 (hSSB1, OBFC2B), a critical component of the DNA damage response

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3