FSN-YOLO: Nearshore Vessel Detection via Fusing Receptive-Field Attention and Lightweight Network
Published: 2024-05-24
Volume: 12
Issue: 6
Page: 871
ISSN: 2077-1312
Container-title: Journal of Marine Science and Engineering
Short-container-title: JMSE
Language: en
Author:
Du Na 1, Feng Qing 2, Liu Qichuang 3, Li Hui 3, Guo Shikai 3,4
Affiliation:
1. Navigation College, Dalian Maritime University, Dalian 116026, China
2. School of Maritime Economics and Management, Dalian Maritime University, Dalian 116026, China
3. School of Information Science and Technology, Dalian Maritime University, Dalian 116026, China
4. Dalian Key Laboratory of Artificial Intelligence, Dalian 116024, China
Abstract
Vessel detection is critical to maritime transportation and navigational safety, creating a pressing need for detection methodologies in the maritime domain that are more efficient, precise, and intelligent. Nonetheless, accurately detecting vessels across multiple scales remains challenging due to the diversity of vessel types and locations, similarities in ship hull shapes, and disturbances from complex environmental conditions. To address these issues, we introduce FSN-YOLO, a framework that enhances YOLOv8 with multi-layer attention feature fusion. Specifically, FSN-YOLO adopts the backbone structure of FasterNet, enriching feature representations through super-resolution processing with a lightweight Convolutional Neural Network (CNN), thereby balancing processing speed and model size without compromising accuracy. Furthermore, FSN-YOLO uses the Receptive-Field Attention (RFA) mechanism to adaptively fine-tune the feature responses between channels, significantly boosting the network’s capacity to capture critical information, which in turn improves overall performance and enriches the discriminative feature representation of ships. Experimental validation on the Seaship7000 dataset showed that, compared to the baseline YOLOv8l approach, FSN-YOLO increased accuracy, recall, and mAP@0.5:0.95 by absolute margins of 0.82%, 1.54%, and 1.56%, respectively, positioning it at the forefront of current state-of-the-art models.
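To make the abstract's two architectural ideas concrete, the sketch below illustrates (a) a FasterNet-style partial convolution, which convolves only a fraction of the channels and passes the rest through untouched, and (b) a channel-reweighting attention step in the spirit of RFA. This is a minimal PyTorch illustration, not the authors' implementation: the module names (PConv, RFAChannelAttention), the 0.25 channel ratio, and the squeeze-and-excitation-style reweighting used as a stand-in for the paper's actual RFA mechanism are all assumptions.

```python
# Illustrative sketch of the two components described in the abstract.
# NOT the authors' released code: names and hyperparameters are assumptions.
import torch
import torch.nn as nn


class PConv(nn.Module):
    """FasterNet-style partial convolution: convolve only a fraction of the
    channels and concatenate the remainder unchanged, reducing FLOPs/memory."""

    def __init__(self, channels: int, conv_ratio: float = 0.25):
        super().__init__()
        self.conv_channels = int(channels * conv_ratio)
        self.conv = nn.Conv2d(self.conv_channels, self.conv_channels,
                              kernel_size=3, padding=1, bias=False)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        head, tail = torch.split(
            x, [self.conv_channels, x.size(1) - self.conv_channels], dim=1)
        return torch.cat([self.conv(head), tail], dim=1)


class RFAChannelAttention(nn.Module):
    """Stand-in for the paper's RFA: a squeeze-and-excitation-style block that
    'adaptively fine-tunes the feature responses between channels' by pooling
    each channel globally and learning per-channel reweighting factors."""

    def __init__(self, channels: int, reduction: int = 16):
        super().__init__()
        self.fc = nn.Sequential(
            nn.Linear(channels, channels // reduction),
            nn.ReLU(inplace=True),
            nn.Linear(channels // reduction, channels),
            nn.Sigmoid(),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        b, c, _, _ = x.shape
        # Global average pooling summarizes each channel's spatial response.
        weights = self.fc(x.mean(dim=(2, 3))).view(b, c, 1, 1)
        return x * weights  # channel-wise reweighting


if __name__ == "__main__":
    feat = torch.randn(1, 64, 80, 80)       # dummy backbone feature map
    feat = PConv(64)(feat)                   # partial convolution
    feat = RFAChannelAttention(64)(feat)     # attention-refined features
    print(feat.shape)                        # torch.Size([1, 64, 80, 80])
```

Per the abstract, blocks of this kind would replace parts of the YOLOv8 backbone and refine its fused features in FSN-YOLO; the real modules differ in detail.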
Funder
National Natural Science Foundation of China; Fundamental Research Funds for the Central Universities; Dalian Outstanding Young Talents Program