Authors:
Wang Yachao, Liu Yaju, Wang Dongxuan, Liu Yu
Abstract
As deep learning continues to advance, object detection shows strong promise for counting cylindrical objects in industries such as timber processing, construction, and pipeline engineering. Traditional manual counting is inefficient, error-prone, and labor-intensive. Object detection can address these issues, improving work efficiency and reducing labor costs. This paper therefore proposes YOLOv5-COC, a variant of the YOLOv5s algorithm designed specifically for counting cylindrical objects. The main contributions are as follows. First, data augmentation is used to expand the dataset and improve the generalization ability of the model. Second, the K-means++ algorithm replaces the conventional K-means algorithm to improve the initialization of anchor boxes. Third, the model is further refined by incorporating a coordinate attention mechanism, integrating the Bidirectional Feature Pyramid Network (BiFPN), and replacing the loss function, which improves recognition precision. Finally, ablation experiments are conducted to assess the effect of each of these optimizations. Experimental results show that the proposed YOLOv5-COC model achieves an mAP of 98.7%, a Precision of 98.3%, a Recall of 99.1%, and an mAP@0.5:0.95 of 72.4%, while running at 60 FPS. Compared with the original YOLOv5s model, mAP improves by 1.3%, FPS by 27.7%, Precision by 1%, Recall by 1%, and mAP@0.5:0.95 by 3.5%.
In summary, the YOLOv5-COC model achieves high accuracy in object detection, reducing both false negatives and false positives, and accomplishes the detection task efficiently.
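To illustrate the anchor-box initialization step mentioned above, the following is a minimal sketch of K-means++ seeding followed by a few clustering iterations over (width, height) pairs. It is not the paper's implementation. The distance d = 1 − IoU, the mean-based centroid update, the iteration count, and all function names (`iou_wh`, `kmeanspp_seeds`, `cluster_anchors`) are assumptions chosen for illustration.

```python
import random


def iou_wh(box, anchor):
    """IoU of two boxes given as (width, height), both placed at the origin."""
    inter = min(box[0], anchor[0]) * min(box[1], anchor[1])
    union = box[0] * box[1] + anchor[0] * anchor[1] - inter
    return inter / union


def kmeanspp_seeds(boxes, k, rng):
    """Choose k initial anchors with K-means++ seeding (assumed d = 1 - IoU)."""
    seeds = [rng.choice(boxes)]
    while len(seeds) < k:
        # Squared distance from each box to its nearest already-chosen seed.
        d2 = [min(1.0 - iou_wh(b, s) for s in seeds) ** 2 for b in boxes]
        if sum(d2) == 0:
            # All boxes coincide with existing seeds; fall back to uniform choice.
            seeds.append(rng.choice(boxes))
        else:
            # Far-away boxes are more likely to become the next seed.
            seeds.append(rng.choices(boxes, weights=d2, k=1)[0])
    return seeds


def cluster_anchors(boxes, k, iters=20, seed=0):
    """Refine K-means++ seeds with a fixed number of assignment/update rounds."""
    rng = random.Random(seed)
    anchors = kmeanspp_seeds(boxes, k, rng)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for b in boxes:
            # Assign each box to the anchor with the highest IoU.
            best = max(range(k), key=lambda i: iou_wh(b, anchors[i]))
            groups[best].append(b)
        # Move each anchor to the mean of its group; keep it if the group is empty.
        anchors = [
            (sum(w for w, _ in g) / len(g), sum(h for _, h in g) / len(g)) if g else a
            for g, a in zip(groups, anchors)
        ]
    # Return anchors ordered from smallest to largest area.
    return sorted(anchors, key=lambda a: a[0] * a[1])


if __name__ == "__main__":
    # Hypothetical box sizes in pixels, for illustration only.
    demo_boxes = [(12, 11), (10, 13), (48, 52), (55, 50), (190, 205), (210, 198)]
    print(cluster_anchors(demo_boxes, k=3))
```

Compared with picking all initial centers uniformly at random, this seeding spreads the initial anchors across the range of box sizes, which is the usual motivation for preferring K-means++ over plain K-means initialization.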