PredCRG: A computational method for recognition of plant circadian genes by employing support vector machine with Laplace kernel

Published:2021-04-26 Issue:1 Volume:17 Page:
ISSN:1746-4811
Container-title:Plant Methods
language:en

Author:

Meher Prabina Kumar, Mohapatra Ansuman, Satpathy Subhrajit, Sharma Anuj, Saini Isha, Pradhan Sukanta Kumar, Rai Anil

Abstract

Background: Circadian rhythms regulate several physiological and developmental processes in plants, so identifying genes with underlying circadian rhythmic features is pivotal. Computational methods have been developed for identifying circadian genes, but all of them rely on gene expression datasets. We could not find any sequence-based model, which motivated us to develop the present computational method for identifying the proteins encoded by circadian genes.

Results: Support vector machines (SVMs) with seven kernels (linear, polynomial, radial, sigmoid, hyperbolic, Bessel and Laplace) were used for prediction with compositional, transitional and physico-chemical features. The highest accuracy, 62.48%, was achieved with the Laplace kernel under fivefold cross-validation. The developed model further achieved 62.96% accuracy on an independent dataset. The SVM also outperformed other state-of-the-art machine learning algorithms, namely Random Forest, Bagging, AdaBoost, XGBoost and LASSO. We also performed proteome-wide identification of circadian proteins in two cereal crops, Oryza sativa and Sorghum bicolor, followed by functional annotation of the predicted circadian proteins with Gene Ontology (GO) terms.

Conclusions: To the best of our knowledge, this is the first computational method to identify circadian genes from sequence data. Based on the proposed method, we have developed an R package, PredCRG (https://cran.r-project.org/web/packages/PredCRG/index.html), for proteome-wide identification of circadian genes by the scientific community. The present study supplements the existing computational methods as well as wet-lab experiments for the recognition of circadian genes.
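To illustrate two ingredients the abstract names, a compositional feature and the Laplace kernel, here is a minimal Python sketch. It is not the PredCRG implementation. The function names, the toy sequences and the value of `sigma` are hypothetical. The kernel form exp(-sigma * ||x - y||) with the Euclidean norm is one common convention, assumed here; some libraries use the L1 norm instead, and the abstract does not say which form the authors used.

```python
import math

# The 20 standard amino acids, in a fixed order so feature vectors are comparable.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(sequence):
    """Return the relative frequency of each standard amino acid (20 values)."""
    seq = sequence.upper()
    counts = [seq.count(aa) for aa in AMINO_ACIDS]
    total = sum(counts)
    if total == 0:
        raise ValueError("sequence contains no standard amino acids")
    return [c / total for c in counts]


def laplace_kernel(x, y, sigma=1.0):
    """Laplace kernel exp(-sigma * ||x - y||), Euclidean norm (assumed form)."""
    distance = math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))
    return math.exp(-sigma * distance)


def gram_matrix(vectors, sigma=1.0):
    """Pairwise kernel values; an SVM solver would consume a matrix like this."""
    return [[laplace_kernel(u, v, sigma) for v in vectors] for u in vectors]


if __name__ == "__main__":
    # Toy sequences, invented purely for illustration.
    toy_sequences = ["MKVLAAGIVG", "MKVLSAGIVG", "WWPPHHCCDD"]
    features = [amino_acid_composition(s) for s in toy_sequences]
    for row in gram_matrix(features, sigma=2.0):
        print(" ".join(f"{value:.3f}" for value in row))
```

Sequences with similar composition give kernel values close to 1, and dissimilar ones give values closer to 0. A kernel SVM uses this similarity structure to separate the two classes.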

Funder

Indian Council of Agricultural Research

Publisher

Springer Science and Business Media LLC

Subject

Plant Science,Genetics,Biotechnology

Link

https://link.springer.com/content/pdf/10.1186/s13007-021-00744-3.pdf
