iFeature: a Python package and web server for features extraction and selection from protein and peptide sequences

Author:

Chen Zhen (1), Zhao Pei (2), Li Fuyi (3), Leier André (4, 5), Marquez-Lago Tatiana T (4, 5), Wang Yanan (6), Webb Geoffrey I (7), Smith A Ian (3), Daly Roger J (3), Chou Kuo-Chen (8, 9), Song Jiangning (3, 7)

Affiliation:

1. School of Basic Medical Science, Qingdao University, 38 Dengzhou Road, Qingdao, China

2. State Key Laboratory of Cotton Biology, Institute of Cotton Research of Chinese Academy of Agricultural Sciences (CAAS), Anyang, China

3. Biomedicine Discovery Institute and Department of Biochemistry and Molecular Biology, Monash University, Melbourne, VIC, Australia

4. Department of Genetics, School of Medicine, University of Alabama at Birmingham, AL, USA

5. Department of Cell, Developmental and Integrative Biology, School of Medicine, University of Alabama at Birmingham, AL, USA

6. Institute of Image Processing and Pattern Recognition, Shanghai Jiao Tong University, Shanghai, China

7. Monash Centre for Data Science, Faculty of Information Technology, Monash University, Melbourne, VIC, Australia

8. Gordon Life Science Institute, Boston, MA, USA

9. Center for Informational Biology, School of Life Science and Technology, University of Electronic Science and Technology of China, Chengdu, China

Abstract

Summary: Structural and physicochemical descriptors extracted from sequence data have been widely used to represent sequences and predict structural, functional, expression and interaction profiles of proteins and peptides as well as DNAs/RNAs. Here, we present iFeature, a versatile Python-based toolkit for generating various numerical feature representation schemes for both protein and peptide sequences. iFeature is capable of calculating and extracting a comprehensive spectrum of 18 major sequence encoding schemes that encompass 53 different types of feature descriptors. It also allows users to extract specific amino acid properties from the AAindex database. Furthermore, iFeature integrates 12 different types of commonly used feature clustering, selection and dimensionality reduction algorithms, greatly facilitating training, analysis and benchmarking of machine-learning models. The functionality of iFeature is made freely available via an online web server and a stand-alone toolkit.

Availability and implementation: http://iFeature.erc.monash.edu/; https://github.com/Superzchen/iFeature/.

Supplementary information: Supplementary data are available at Bioinformatics online.
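To make the idea of a numerical feature representation concrete, the sketch below computes amino acid composition: the fraction of each of the 20 standard residues in a sequence. It is only an illustration of the kind of sequence-derived descriptor the summary refers to. It does not use iFeature's code or reproduce its interface, and the function name `amino_acid_composition` and the example sequence are assumptions made for this example.

```python
from collections import Counter

# The 20 standard amino acids, ordered by one-letter code.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return a 20-element list with the fraction of each standard residue.

    Hypothetical helper for illustration; not part of iFeature.
    Non-standard characters (e.g. 'X' or gaps) are ignored.
    """
    sequence = sequence.upper()
    counts = Counter(residue for residue in sequence if residue in AMINO_ACIDS)
    total = sum(counts.values())
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [counts[aa] / total for aa in AMINO_ACIDS]


# Example: a made-up 10-residue peptide becomes a fixed-length numeric vector.
features = amino_acid_composition("MKTAYIAKQR")
```

Whatever the input length, the output has 20 entries, which is what lets sequences of different lengths be passed to the same machine-learning model.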

Funder

Australian Research Council

National Natural Science Foundation of China

National Health and Medical Research Council of Australia

National Institute of Allergy and Infectious Diseases of the National Institutes of Health

Major Inter-Disciplinary Research

Monash University

UAB School of Medicine

Publisher

Oxford University Press (OUP)

Subject

Computational Mathematics, Computational Theory and Mathematics, Computer Science Applications, Molecular Biology, Biochemistry, Statistics and Probability
