Affiliation:
1. College of Computer and Information Engineering, Xinjiang Agricultural University, Urumqi 830052, China
2. Engineering Research Center of Intelligent Agriculture, Ministry of Education, Urumqi 830052, China
3. Xinjiang Agricultural Informatization Engineering Technology Research Center, Urumqi 830052, China
4. School of Information Engineering, Henan Institute of Science and Technology, Xinxiang 453003, China
5. College of Forestry and Landscape Architecture, Xinjiang Agricultural University, Urumqi 830052, China
6. Agricultural Information Institute, Chinese Academy of Agricultural Sciences, Beijing 100080, China
Abstract
Accurately recognizing apples in complex environments is essential for automating apple-picking operations, particularly under challenging natural conditions such as cloudy, snowy, foggy, and rainy weather, as well as low light. Detection accuracy drops when apples are occluded by branches, overlap one another, or appear at very different scales in the near and far field. To address these problems, we propose Rep-ViG-Apple, an improved YOLO-based algorithm designed to enhance apple detection in such conditions. To improve feature extraction for occluded and overlapping apples, we developed the inverted residual multi-scale structural reparameterized feature extraction block (RepIRD Block) within the backbone network. We also integrated the sparse vision graph attention (SVGA) mechanism to capture global feature information, concentrate attention on apples, and reduce interference from complex environmental features. Moreover, we designed a feature extraction network with a CNN-GCN architecture, termed Rep-Vision-GCN, which combines the local multi-scale feature extraction capabilities of a convolutional neural network (CNN) with the global modeling strengths of a graph convolutional network (GCN) to enhance the extraction of apple features. The RepConvsBlock module, embedded in the neck network, forms the Rep-FPN-PAN feature fusion network, which improves the recognition of apple targets at both near and far scales. Furthermore, we implemented a channel pruning algorithm based on LAMP scores to balance computational efficiency with model accuracy. Experimental results demonstrate that Rep-ViG-Apple achieves precision, recall, and average accuracy of 92.5%, 85.0%, and 93.3%, respectively, improvements of 1.5%, 1.5%, and 2.0% over YOLOv8n. Additionally, the Rep-ViG-Apple model benefits from a 22% reduction in size, making it better suited to deployment in resource-constrained environments while maintaining high accuracy.
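The abstract does not spell out the pruning procedure, so the following is only a minimal sketch of how a LAMP-style score could be applied at channel level. It assumes that each output channel's magnitude is the squared L2 norm of its filter and that channels are ranked globally across layers; the function names (`lamp_scores`, `channel_lamp_scores`, `global_keep_masks`) are hypothetical and not taken from the paper.

```python
import numpy as np


def lamp_scores(sq_magnitudes):
    """LAMP-normalise non-negative squared magnitudes within one layer.

    Each entry is divided by the sum of all entries in the same layer
    that are at least as large, so the largest entry always scores 1.0.
    """
    m = np.asarray(sq_magnitudes, dtype=np.float64)
    order = np.argsort(m, kind="stable")           # ascending order
    sorted_m = m[order]
    suffix = np.cumsum(sorted_m[::-1])[::-1]       # sum over ranks >= u
    scores = np.zeros_like(m)
    nonzero = suffix > 0                           # guard all-zero layers
    scores[order[nonzero]] = sorted_m[nonzero] / suffix[nonzero]
    return scores


def channel_lamp_scores(weight):
    """Score output channels of a conv weight shaped (out, in, kh, kw).

    Assumption: channel magnitude = squared L2 norm of the filter.
    """
    sq_norms = (weight.reshape(weight.shape[0], -1) ** 2).sum(axis=1)
    return lamp_scores(sq_norms)


def global_keep_masks(layer_weights, prune_ratio):
    """Return one boolean keep-mask per layer for a global prune ratio."""
    scores = [channel_lamp_scores(w) for w in layer_weights]
    ranked = np.sort(np.concatenate(scores))
    n_prune = int(prune_ratio * ranked.size)
    if n_prune == 0:
        return [np.ones(s.shape, dtype=bool) for s in scores]
    threshold = ranked[n_prune - 1]
    # Channels tied with the threshold are pruned as well.
    return [s > threshold for s in scores]
```

Because the strongest channel in every layer scores 1.0, a global threshold below 1.0 never removes an entire layer, which is one reason a layer-adaptive score of this kind is convenient when trading model size against accuracy.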
Funder
Natural Science Foundation of Xinjiang Uygur Autonomous Region
Autonomous Region Postgraduate Research Innovation Project
Science and Technology Innovation 2030—“New Generation Artificial Intelligence” Major Project
Xinjiang Uygur Autonomous Region Major Science and Technology Project “Research on Key Technologies for Farm Digitalization and Intelligentization”