A Multi-Modal Open Object Detection Model for Tomato Leaf Diseases with Strong Generalization Performance Using PDC-VLD

Author:

Li Jinyang 1, Zhao Fengting 1, Zhao Hongmin 1, Zhou Guoxiong 1, Xu Jiaxin 1, Gao Mingzhou 2, Li Xin 3, Dai Weisi 1, Zhou Honliang 1, Hu Yahui 4, He Mingfang 1

Affiliation:

1. Central South University of Forestry and Technology, Changsha 410004, Hunan, China.

2. Inner Mongolia Agriculture University, Hohhot 010010, Inner Mongolia Autonomous Region, China.

3. Inner Mongolia University, Hohhot 010021, Inner Mongolia Autonomous Region, China.

4. Plant Protection Institute, Hunan Academy of Agricultural Sciences, Changsha 410125, Hunan, China.

Abstract

Precise disease detection is crucial in modern precision agriculture, especially for ensuring the health of tomato crops and enhancing agricultural productivity and product quality. Although most existing disease detection methods have helped growers identify tomato leaf diseases to some extent, they typically target a fixed set of categories. When a new disease appears, extensive and costly manual annotation is required before the model can be retrained. To overcome these limitations, this study proposes a multimodal model, PDC-VLD, based on open-vocabulary object detection (OVD) technology within the VLDet framework, which can accurately identify new tomato leaf diseases without manual annotation by using only image–text pairs. First, we developed a progressive visual transformer-convolutional pyramid module (PVT-C) that effectively extracts tomato leaf disease features and optimizes anchor box positioning using the self-supervised learning algorithm DINO, suppressing interference from irrelevant backgrounds. Then, a context feature guided module (CFG) was adopted to address the model's low adaptability and recognition accuracy in data-scarce environments. To validate the model's effectiveness, we constructed a tomato leaf disease image dataset containing 4 base classes and 2 novel classes. Experimental results show that the PDC-VLD model achieved 61.2% on the main evaluation metric $\mathrm{mAP}^{\mathrm{novel}}_{50}$, 56.4% on $\mathrm{mAP}^{\mathrm{novel}}_{75}$, 87.7% on $\mathrm{mAP}^{\mathrm{base}}_{50}$, 81.0% on $\mathrm{mAP}^{\mathrm{all}}_{50}$, and 45.5% on average recall, outperforming existing OVD models. Our research provides an innovative solution for efficiently and accurately detecting new diseases, substantially reducing the need for manual annotation, and offering critical technical support and practical reference for agricultural workers.

Funder

Science and Technology Bureau, Changsha

National Natural Science Foundation in China

Key Projects of the Department of Education of Hunan Province

Hunan Key Laboratory of Intelligent Logistics Technology

National Natural Science Foundation of China

Publisher

American Association for the Advancement of Science (AAAS)
