Affiliation:
1. Department of Molecular Medicine, Computational Medicine Group, University of Padova, Padova (PD), 35131, Italy
Abstract
Predicting protein function is a major challenge for the scientific community, particularly
in the post-genomic era. Experimental methods of determining protein function are accurate
but resource-intensive and time-consuming. Next Generation Sequencing (NGS) techniques
have produced a large number of new protein sequences, widening the gap between available
raw sequences and sequences with verified annotations.
To address this gap, automated protein function prediction (AFP) techniques have been developed as
a faster and more cost-effective alternative that aims to maintain comparable accuracy.
Several computational methods for automated protein function prediction have recently been
proposed. This paper reviews the best-performing AFP methods presented in the last decade and
analyzes their improvements over time to identify the most promising strategies for future methods.
Identifying the most effective method for predicting protein function is still a challenge. The Critical
Assessment of Functional Annotation (CAFA) has established an international standard for evaluating
and comparing the performance of various protein function prediction methods. In this study, we analyze
the best-performing methods identified in recent editions of CAFA. These methods are divided into
five categories based on their principles of operation: sequence-based, structure-based, combined-based,
ML-based and embeddings-based.
After conducting a comprehensive analysis of the various protein function prediction methods, we observe
that there has been a steady improvement in the accuracy of predictions over time, mainly due to
the implementation of machine learning techniques. The present trend suggests that all the best-performing
methods will use machine learning to improve their accuracy in the future.
We highlight the positive impact that the use of machine learning (ML) has had on protein function prediction.
Most recent methods developed in this area use ML, demonstrating its importance in analyzing
biological information and making predictions. Despite these improvements in accuracy, there is still a
significant gap compared with experimental evidence. The use of new approaches based on Deep
Learning (DL) techniques will probably be necessary to close this gap, and while significant progress
has been made in this area, there is still more work to be done to fully realize the potential of DL.
Funder
Ministero dell’Istruzione, dell’Università e della Ricerca, PON
Università degli Studi di Padova, Italy
Publisher
Bentham Science Publishers Ltd.
Subject
Computational Mathematics, Genetics, Molecular Biology, Biochemistry
Cited by
1 article.