Research on the Visual Perception of Ship Engine Rooms Based on Deep Learning
Journal: Journal of Marine Science and Engineering (JMSE)
Published: 2023-07-20
Volume: 11, Issue: 7, Page: 1450
ISSN: 2077-1312
Language: en
Authors:
Wang Yongkang 1, Zhang Jundong 1, Zhu Jinting 1, Ge Yuequn 1, Zhai Guanyu 1
Affiliation:
1. College of Marine Engineering, Dalian Maritime University, Dalian 116026, China
Abstract
In an intelligent engine room, visual perception of engine room equipment is a prerequisite for defect identification and for replacing manual operation. This paper improves YOLOv5 to address the problems of mutual occlusion among engine room equipment, an unbalanced number of samples across categories, and a large proportion of small targets. First, a coordinate attention (CA) mechanism is introduced into the backbone feature-extraction network to improve its ability to extract salient features. Second, the neck network is improved so that it learns the relative importance of each resolution during feature fusion and enriches the semantic information exchanged between layers. In addition, the Swin Transformer is used as the prediction head (SPH), which allows the network to establish global connections in complex environments and thereby improves detection accuracy. To handle engine room equipment occluding one another, the original non-maximum suppression (NMS) is replaced with Soft-NMS. Finally, a K-means algorithm combined with a genetic algorithm is used to cluster new anchor boxes that better match the dataset. The method is evaluated on the laboratory’s engine room equipment dataset (EMER) and the public PASCAL VOC dataset. Compared with YOLOv5m, the mAP of CBS-YOLOv5m increases by 3.34% and 1.8%, respectively.
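The Soft-NMS step mentioned in the abstract can be illustrated with a minimal NumPy sketch of the Gaussian variant: instead of discarding every box whose overlap with the top-scoring box exceeds a threshold (which suppresses valid detections when equipment occludes each other), the score of an overlapping box is decayed by a Gaussian of its IoU. This is a generic sketch, not the authors' implementation; the `[x1, y1, x2, y2]` box format and the `sigma`/`score_thresh` values are assumptions.

```python
import numpy as np

def iou(box, boxes):
    """IoU between one box and an array of boxes, format [x1, y1, x2, y2]."""
    x1 = np.maximum(box[0], boxes[:, 0])
    y1 = np.maximum(box[1], boxes[:, 1])
    x2 = np.minimum(box[2], boxes[:, 2])
    y2 = np.minimum(box[3], boxes[:, 3])
    inter = np.clip(x2 - x1, 0, None) * np.clip(y2 - y1, 0, None)
    area_a = (box[2] - box[0]) * (box[3] - box[1])
    area_b = (boxes[:, 2] - boxes[:, 0]) * (boxes[:, 3] - boxes[:, 1])
    return inter / (area_a + area_b - inter)

def soft_nms(boxes, scores, sigma=0.5, score_thresh=0.001):
    """Gaussian Soft-NMS: decay scores of overlapping boxes instead of removing them."""
    scores = scores.astype(float).copy()  # decayed in place below
    keep = []
    idxs = np.arange(len(scores))
    while idxs.size > 0:
        best = idxs[np.argmax(scores[idxs])]   # highest remaining score
        keep.append(int(best))
        idxs = idxs[idxs != best]
        if idxs.size == 0:
            break
        overlaps = iou(boxes[best], boxes[idxs])
        scores[idxs] *= np.exp(-(overlaps ** 2) / sigma)  # Gaussian penalty
        idxs = idxs[scores[idxs] > score_thresh]          # prune near-zero scores
    return keep
```

With hard NMS, a heavily overlapped (occluded) detection would be dropped outright; here it survives with a reduced score, which is why the paper adopts Soft-NMS for mutually occluding cabin equipment.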
Funder
High-technology Ship Research Program
Subject
Ocean Engineering,Water Science and Technology,Civil and Structural Engineering