Affiliation:
1. College of Information and Communication Engineering, Harbin Engineering University, Harbin 150001, China
2. College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin 150001, China
Abstract
The underwater imaging environment is complex, and conventional target detection algorithms have yet to provide satisfactory results when applied to it. Underwater optical image target detection therefore remains one of the most challenging tasks involving neighborhood-based techniques in the field of computer vision. Small underwater targets, dispersion, and sources of distortion (such as sediment and particles) often render neighborhood-based techniques insufficient, while existing target detection algorithms primarily pursue higher detection accuracy at the cost of greater algorithm complexity and computing power. However, excessive extraction of deep-level features leads to the loss of small targets and a decrease in detection accuracy. Moreover, most underwater optical image target detection is performed on underwater unmanned platforms, whose mobile vision processing hardware has limited computing power and therefore demands lightweight algorithms. To meet these lightweight requirements without sacrificing detection accuracy, we propose an underwater target detection model based on the mobile vision transformer (MobileViT) and YOLOX, and we design a new coordinate attention (CA) mechanism, named the double CA (DCA) mechanism. The model uses MobileViT as its backbone network, which improves global feature extraction and reduces the number of parameters. The DCA mechanism improves the extraction of shallow features and the detection accuracy, even for difficult targets, using a minimal number of parameters. Validation on the Underwater Robot Professional Contest 2020 (URPC2020) dataset shows that the method achieves an average accuracy of 72.00%. In addition, the YOLOX-based model compresses its parameters by 49.6%, efficiently balancing underwater optical image detection accuracy against parameter count. Compared with existing algorithms, the proposed algorithm is better suited for deployment on underwater unmanned platforms.
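The abstract does not say how the DCA mechanism doubles coordinate attention, so the sketch below illustrates only a single, standard coordinate attention block as it is commonly described: pool along each spatial direction, mix channels, then gate the input by row and by column. It is a minimal pure-Python illustration under stated assumptions, not the authors' implementation. The function names, the reduced channel count `R`, the random weights, and the use of ReLU in place of batch normalization plus a nonlinearity are all hypothetical choices made for brevity.

```python
import math
import random


def sigmoid(v):
    return 1.0 / (1.0 + math.exp(-v))


def conv1x1(weights, feats):
    """1x1 convolution: weights is [out_c][in_c], feats is [in_c][length]."""
    length = len(feats[0])
    return [[sum(w * feats[c][i] for c, w in enumerate(row)) for i in range(length)]
            for row in weights]


def coordinate_attention(x, w_reduce, w_h, w_w):
    """x is a feature map as nested lists [C][H][W]; the output has the same shape."""
    C, H, W = len(x), len(x[0]), len(x[0][0])
    # Directional pooling: average over width (one value per row)
    # and over height (one value per column).
    pool_h = [[sum(x[c][i]) / W for i in range(H)] for c in range(C)]
    pool_w = [[sum(x[c][i][j] for i in range(H)) / H for j in range(W)]
              for c in range(C)]
    # Concatenate along the spatial axis and mix channels with a shared 1x1 conv.
    joint = [pool_h[c] + pool_w[c] for c in range(C)]
    # Assumption: ReLU stands in for the usual batch norm + nonlinearity.
    mid = [[max(0.0, v) for v in row] for row in conv1x1(w_reduce, joint)]
    # Split back into the two directions and map each to gates in (0, 1).
    mid_h = [row[:H] for row in mid]
    mid_w = [row[H:] for row in mid]
    gate_h = [[sigmoid(v) for v in row] for row in conv1x1(w_h, mid_h)]
    gate_w = [[sigmoid(v) for v in row] for row in conv1x1(w_w, mid_w)]
    # Reweight every position by its row gate and its column gate.
    return [[[x[c][i][j] * gate_h[c][i] * gate_w[c][j] for j in range(W)]
             for i in range(H)] for c in range(C)]


def random_matrix(rows, cols, rng):
    return [[rng.uniform(-1.0, 1.0) for _ in range(cols)] for _ in range(rows)]


rng = random.Random(0)
C, H, W, R = 4, 3, 5, 2  # R: reduced channel count (hypothetical)
x = [[[rng.uniform(0.0, 1.0) for _ in range(W)] for _ in range(H)] for _ in range(C)]
out = coordinate_attention(
    x, random_matrix(R, C, rng), random_matrix(C, R, rng), random_matrix(C, R, rng)
)
```

Because the gates are built from one-dimensional pooled vectors, the added weights scale with the channel counts rather than with the spatial size of the feature map, which is consistent with the abstract's emphasis on keeping the parameter count small.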
Funder
National Natural Science Foundation of China
Subject
Ocean Engineering, Water Science and Technology, Civil and Structural Engineering
Cited by
14 articles.