Affiliation:
1. College of Big Data and Intelligent Engineering, Southwest Forestry University, Kunming 650224, China
2. Key Lab of State Forestry and Grassland Administration on Forestry Ecological Big Data, Southwest Forestry University, Kunming 650224, China
Abstract
Automated fruit-picking equipment has the potential to significantly enhance picking efficiency, and accurate detection and localization of fruits are crucial to it. However, current methods rely on expensive tools such as depth cameras and LiDAR. This study proposes a low-cost method based on monocular images to achieve target detection and depth estimation. To improve the detection accuracy of targets, especially small targets, an improved YOLOv8s detection algorithm is introduced. This approach uses the BiFormer block, an attention mechanism for dynamic query-aware sparsity, as the backbone feature extractor. It also adds a small-target-detection layer in the Neck and employs EIoU Loss as the loss function. Furthermore, a fused depth estimation method is proposed, which combines high-resolution, low-resolution, and local high-frequency depth estimation to obtain depth information with both high-frequency details and low-frequency structure. Finally, the spatial 3D coordinates of the fruit are obtained by fusing the planar coordinates and depth information. In experiments with citrus as the target, the improved YOLOv8s network achieved an mAP of 88.45% and a recognition accuracy of 94.7%. The recognition of citrus in a natural environment was improved by 2.7% compared to the original model. In the detection range of 30 cm to 60 cm, the depth-estimation results (MAE, RMSE) are 0.53 and 0.53. In the illumination intensity range of 1000 lx to 5000 lx, the average depth-estimation results (MAE, RMSE) are 0.49 and 0.64. In the simulated fruit-picking scenario, the success rates of grasping at 30 cm and 45 cm were 80.6% and 85.1%, respectively. The method offers high-resolution depth estimation without constraints on camera parameters or fruit size, an advantage that monocular geometric and binocular localization do not have, and it provides a feasible, low-cost localization method for fruit automation equipment.
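The abstract names EIoU Loss as the bounding-box regression loss but does not spell it out. As a minimal sketch, assuming the commonly published EIoU formulation (an IoU term plus penalties on center distance, width difference, and height difference, each normalized by the smallest enclosing box), the loss for a single pair of boxes could be computed as follows. The function name `eiou_loss` and the `(x1, y1, x2, y2)` box layout are illustrative assumptions, not taken from the paper's implementation.

```python
def eiou_loss(box, gt, eps=1e-9):
    """EIoU loss for two axis-aligned boxes given as (x1, y1, x2, y2).

    Hypothetical helper for illustration; assumes the standard EIoU form:
    1 - IoU + center_dist^2 / diag^2 + dw^2 / Cw^2 + dh^2 / Ch^2,
    where Cw, Ch and diag describe the smallest box enclosing both inputs.
    """
    x1, y1, x2, y2 = box
    gx1, gy1, gx2, gy2 = gt

    # Widths and heights of the predicted and ground-truth boxes.
    w, h = x2 - x1, y2 - y1
    gw, gh = gx2 - gx1, gy2 - gy1

    # Intersection over union.
    inter_w = max(0.0, min(x2, gx2) - max(x1, gx1))
    inter_h = max(0.0, min(y2, gy2) - max(y1, gy1))
    inter = inter_w * inter_h
    union = w * h + gw * gh - inter
    iou = inter / (union + eps)

    # Smallest enclosing box: width, height, squared diagonal.
    cw = max(x2, gx2) - min(x1, gx1)
    ch = max(y2, gy2) - min(y1, gy1)
    diag2 = cw ** 2 + ch ** 2

    # Squared distance between the two box centers.
    center_dist2 = ((x1 + x2) / 2 - (gx1 + gx2) / 2) ** 2 \
                 + ((y1 + y2) / 2 - (gy1 + gy2) / 2) ** 2

    return (1.0 - iou
            + center_dist2 / (diag2 + eps)
            + (w - gw) ** 2 / (cw ** 2 + eps)
            + (h - gh) ** 2 / (ch ** 2 + eps))


if __name__ == "__main__":
    # Identical boxes give a loss close to zero; disjoint boxes are penalized.
    print(eiou_loss((0, 0, 2, 2), (0, 0, 2, 2)))
    print(eiou_loss((0, 0, 1, 1), (2, 0, 3, 1)))
```

Unlike a plain IoU loss, the extra terms keep the loss informative even when the boxes do not overlap, which is one reason such losses are often chosen for small targets.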
Funder
Agricultural Joint Project of Yunnan Province
Key Laboratory of State Forestry and Grassland Administration on Forestry Ecological Big Data, Southwest Forestry University
Subject
Agronomy and Crop Science
Cited by
5 articles.