CIS: A Coral Instance Segmentation Network Model with Novel Upsampling, Downsampling, and Fusion Attention Mechanism

Published: 2024-08-28 Issue: 9 Volume: 12 Page: 1490
ISSN: 2077-1312
Container-title: Journal of Marine Science and Engineering
Language: en
Short-container-title: JMSE

Author:

Li Tianrun¹, Liang Zhengyou¹,², Zhao Shuqi³

Affiliation:

1. School of Computer, Electronics and Information, Guangxi University, Nanning 530004, China

2. Guangxi Key Laboratory of Multimedia Communications and Network Technology, Nanning 530004, China

3. School of Marine Sciences, Guangxi University, Nanning 530004, China

Abstract

Coral segmentation is challenging because corals have irregular morphology and camouflage-like characteristics. These factors often result in low precision, large parameter counts, and poor real-time performance. To address these issues, this paper proposes a novel coral instance segmentation (CIS) network model. First, we design a novel downsampling module, ADown_HWD, which extracts image features at multiple resolution levels and thereby preserves crucial information about coral edges and textures. Second, we integrate the bi-level routing attention (BRA) mechanism into the C2f module to form the C2f_BRA module within the neck network. This module removes redundant information, which improves the model's ability to distinguish coral features and reduces computational redundancy. Finally, we introduce dynamic upsampling, Dysample, into the CIS to better retain the rich semantic and key feature information of corals. Validation on our self-built dataset shows that the CIS network model outperforms the baseline YOLOv8n model, improving box and mask precision (PB and PM) by 6.3% and 10.5% and mAP50B and mAP50M by 2.3% and 2.4%, respectively. In addition, the model has 10.1% fewer parameters and runs at 178.6 frames per second (FPS), a 10.7% increase over the baseline, which meets real-time operational requirements.
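The abstract reports the CIS frame rate and its gain but not the baseline frame rate itself. As a rough check, the short sketch below back-computes the baseline FPS implied by those two figures. It assumes the 10.7% figure is a relative increase over the YOLOv8n baseline; the helper name `implied_baseline` is hypothetical and not taken from the paper.

```python
def implied_baseline(new_value: float, relative_gain_pct: float) -> float:
    """Return the baseline implied by a value reached after a relative gain."""
    return new_value / (1.0 + relative_gain_pct / 100.0)


cis_fps = 178.6       # CIS frame rate, from the abstract
fps_gain_pct = 10.7   # FPS gain, from the abstract (assumed relative to baseline)

baseline_fps = implied_baseline(cis_fps, fps_gain_pct)
print(f"Implied baseline FPS: {baseline_fps:.1f}")  # prints "Implied baseline FPS: 161.3"
```

Under this assumption the baseline would run at roughly 161 FPS. The precision and mAP gains are not inverted here because the abstract does not say whether they are relative or absolute (percentage-point) improvements.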

Funder

Undergraduate Innovation and Entrepreneurship Training Program of Guangxi University

Publisher

MDPI AG

Link

https://www.mdpi.com/2077-1312/12/9/1490/pdf
