An Effective Integrated Machine Learning Framework for Identifying Severity of Tomato Yellow Leaf Curl Virus and Their Experimental Validation

Author:

Bupi Nattanong1,Sangaraju Vinoth Kumar2,Phan Le Thi2,Lal Aamir1,Vo Thuy Thi Bich1,Ho Phuong Thi1,Qureshi Muhammad Amir1,Tabassum Marjia1,Lee Sukchan1,Manavalan Balachandran2

Affiliation:

1. Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon 16419, Gyeonggi-do, Republic of Korea.

2. Computational Biology and Bioinformatics Laboratory, Department of Integrative Biotechnology, College of Biotechnology and Bioengineering, Sungkyunkwan University, Suwon 16419, Gyeonggi-do, Republic of Korea.

Abstract

Tomato yellow leaf curl virus (TYLCV) dispersed across different countries, specifically to subtropical regions, associated with more severe symptoms. Since TYLCV was first isolated in 1931, it has been a menace to tomato industrial production worldwide over the past century. Three groups were newly isolated from TYLCV-resistant tomatoes in 2022; however, their functions are unknown. The development of machine learning (ML)-based models using characterized sequences and evaluating blind predictions is one of the major challenges in interdisciplinary research. The purpose of this study was to develop an integrated computational framework for the accurate identification of symptoms (mild or severe) based on TYLCV sequences (isolated in Korea). For the development of the framework, we first extracted 11 different feature encodings and hybrid features from the training data and then explored 8 different classifiers and developed their respective prediction models by using randomized 10-fold cross-validation. Subsequently, we carried out a systematic evaluation of these 96 developed models and selected the top 90 models, whose predicted class labels were combined and considered as reduced features. On the basis of these features, a multilayer perceptron was applied and developed the final prediction model (IML-TYLCVs). We conducted blind prediction on 3 groups using IML-TYLCVs, and the results indicated that 2 groups were severe and 1 group was mild. Furthermore, we confirmed the prediction with virus-challenging experiments of tomato plant phenotypes using infectious clones from 3 groups. Plant virologists and plant breeding professionals can access the user-friendly online IML-TYLCVs web server at https://balalab-skku.org/IML-TYLCVs , which can guide them in developing new protection strategies for newly emerging viruses.

Publisher

American Association for the Advancement of Science (AAAS)

Subject

Multidisciplinary

同舟云学术

1.学者识别学者识别

2.学术分析学术分析

3.人才评估人才评估

"同舟云学术"是以全球学者为主线,采集、加工和组织学术论文而形成的新型学术文献查询和分析系统,可以对全球学者进行文献检索和人才价值评估。用户可以通过关注某些学科领域的顶尖人物而持续追踪该领域的学科进展和研究前沿。经过近期的数据扩容,当前同舟云学术共收录了国内外主流学术期刊6万余种,收集的期刊论文及会议论文总量共计约1.5亿篇,并以每天添加12000余篇中外论文的速度递增。我们也可以为用户提供个性化、定制化的学者数据。欢迎来电咨询!咨询电话:010-8811{复制后删除}0370

www.globalauthorid.com

TOP

Copyright © 2019-2024 北京同舟云网络信息技术有限公司
京公网安备11010802033243号  京ICP备18003416号-3