Author:
Li Hao, Zhang ShiQi, Chen Lei, Pan Xiaoyong, Li ZhanDong, Huang Tao, Cai Yu-Dong
Abstract
Determining the biological functions of proteins is a central task in current biology. Because some organisms have very large numbers of proteins, characterizing their functions one by one through traditional experiments is impractical, so quick and reliable methods for identifying protein functions are needed. The considerable accumulation of protein knowledge and recent developments in computer science offer an alternative way to complete this task: computational methods. Several efforts have been made in this field. Most previous methods adopted protein sequence features or directly used the linkage from a protein–protein interaction (PPI) network. In this study, we proposed novel multi-label classifiers that adopt new embedding features to represent proteins. These features were derived from functional domains via word embedding and from a PPI network via network embedding. The minimum redundancy maximum relevance (mRMR) method was used to assess the features and generate a ranked feature list. Incremental feature selection, incorporating RAndom k-labELsets (RAKEL) to construct multi-label classifiers, then used this list to build two optimum classifiers, one for each of two key measurements: accuracy and exact match. Both classifiers performed well and were superior to classifiers that used features extracted by traditional methods.
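The selection step described in the abstract can be illustrated with a minimal sketch. It assumes the commonly used multi-label definitions of accuracy (mean overlap between true and predicted label sets) and exact match (fraction of samples whose label set is predicted exactly); the abstract does not state the formulas, so these may differ from the paper's. The function names and the `evaluate` callback are hypothetical. In the study, the callback would correspond to training and validating a RAKEL classifier on the chosen features.

```python
from typing import Callable, List, Sequence, Set, Tuple


def multilabel_accuracy(true_sets: Sequence[Set[str]],
                        pred_sets: Sequence[Set[str]]) -> float:
    """Mean of |Y & Z| / |Y | Z| over samples (assumed definition)."""
    scores = []
    for y, z in zip(true_sets, pred_sets):
        union = y | z
        # Convention assumed here: two empty label sets agree perfectly.
        scores.append(len(y & z) / len(union) if union else 1.0)
    return sum(scores) / len(scores)


def exact_match(true_sets: Sequence[Set[str]],
                pred_sets: Sequence[Set[str]]) -> float:
    """Fraction of samples whose predicted label set equals the true one."""
    hits = sum(1 for y, z in zip(true_sets, pred_sets) if y == z)
    return hits / len(true_sets)


def incremental_feature_selection(
    ranked_features: Sequence[str],
    evaluate: Callable[[List[str]], float],
) -> Tuple[List[str], float]:
    """Score the top-1, top-2, ... prefixes of a ranked list; keep the best.

    `evaluate` is a hypothetical callback that returns one measurement
    (accuracy or exact match) for a classifier built on the given features.
    """
    best_subset: List[str] = []
    best_score = float("-inf")
    for k in range(1, len(ranked_features) + 1):
        subset = list(ranked_features[:k])
        score = evaluate(subset)
        if score > best_score:
            best_subset, best_score = subset, score
    return best_subset, best_score
```

Running the selection once with an accuracy-based callback and once with an exact-match-based callback would yield two feature subsets, which mirrors the two optimum classifiers mentioned in the abstract.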
Subject
Genetics (clinical), Genetics, Molecular Medicine
Cited by
8 articles.