Affiliation:
1. College of Mechanical and Electrical Engineering, Harbin Engineering University, Harbin 150001, China
2. College of Shipbuilding Engineering, Harbin Engineering University, Harbin 150001, China
Abstract
The ability to detect underwater objects accurately is important in marine environmental engineering. Although many kinds of underwater object detection algorithms with relatively high accuracy have been proposed, they involve a large number of parameters and floating point operations (FLOPs), and often fail to yield satisfactory results in complex underwater environments. In light of the demand for an algorithm with the capability to extract high-quality features in complex underwater environments, we proposed a one-stage object detection algorithm called the enhanced feature-based underwater object detection algorithm (EF-UODA), which was based on the architecture of Next-ViT, the loss function of YOLOv8, and Ultralytics. First, we developed a highly efficient module for convolutions, called efficient multi-scale pointwise convolution (EMPC). Second, we proposed a feature pyramid architecture called the multipath fast fusion-feature pyramid network (M2F-FPN) based on different modes of feature fusion. Finally, we integrated the Next-ViT and the minimum point distance intersection over union loss functions in our proposed algorithm. Specifically, on the URPC2020 dataset, EF-UODA surpasses the state-of-the-art (SOTA) convolution-based object detection algorithm YOLOv8X by 2.9% mean average precision (mAP), and surpasses the SOTA ViT-based object detection algorithm real-time detection transformer (RT-DETR) by 2.1%. Meanwhile, it achieves the lowest FLOPs and parameters. The results of extensive experiments showed that EF-UODA had excellent feature extraction capability, and was adequately balanced in terms of the number of FLOPs and parameters.
Funder
National Key Research and Development Program of China