A Marine Organism Detection Framework Based on Dataset Augmentation and CNN-ViT Fusion

Author:

Jiang Xiao (1), Zhang Yaxin (1), Pan Mian (1), Lv Shuaishuai (1), Yang Gang (2), Li Zhu (1), Liu Jingbiao (3), Yu Haibin (1, 4, 5)

Affiliation:

1. College of Electronics and Information, Hangzhou Dianzi University, Hangzhou 310018, China

2. First Institute of Oceanography, Ministry of Natural Resources, Qingdao 266061, China

3. Ocean Technology and Equipment Research Center, Hangzhou Dianzi University, Hangzhou 310018, China

4. Zhejiang Provincial Key Lab of Equipment Electronics, Hangzhou 310018, China

5. Ningbo Institute of Oceanography, Ningbo 315832, China

Abstract

Underwater vision-based detection plays an important role in marine resource exploration, marine ecological protection, and other fields. Because the movement of the carrier platform is restricted and some marine organisms cluster together, many organisms appear very small in underwater images and the samples in the dataset are highly imbalanced, both of which make vision-based detection of marine organisms more difficult. To address these problems, this study proposes a marine organism detection framework that combines a dataset augmentation strategy with a Convolutional Neural Network (CNN)-Vision Transformer (ViT) fusion model. The framework adopts two data augmentation methods, random expansion of small objects and non-overlapping filling of scarce samples, to significantly improve the data quality of the dataset. It takes YOLOv5 as the baseline model, introduces ViT, deformable convolution, and a trident block into the feature extraction network, and extracts richer features of marine organisms through multi-scale receptive fields by fusing CNN and ViT. Experimental results show that, compared with various one-stage detection models, the proposed framework improves the mean average precision (mAP) by up to 27%. It also balances accuracy and real-time performance, enabling high-precision real-time detection of marine organisms on underwater mobile platforms.
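The abstract names "non-overlapping filling of scarce samples" but does not give the procedure. The Python sketch below shows one plausible reading, assuming the method pastes cropped instances of rare classes into free regions of an image so that they do not cover existing annotations. The function names, the rejection-sampling loop, and the `max_tries` parameter are all hypothetical and are not taken from the paper; only the bounding-box bookkeeping is shown, not the pixel copy.

```python
import random


def boxes_overlap(a, b):
    """Return True if axis-aligned boxes (x1, y1, x2, y2) share any area."""
    return a[0] < b[2] and b[0] < a[2] and a[1] < b[3] and b[1] < a[3]


def place_without_overlap(img_w, img_h, patch_w, patch_h, existing, rng, max_tries=100):
    """Try random positions for a patch; return its box, or None if no free spot is found."""
    if patch_w > img_w or patch_h > img_h:
        return None  # the patch cannot fit inside the image at all
    for _ in range(max_tries):
        x1 = rng.randint(0, img_w - patch_w)
        y1 = rng.randint(0, img_h - patch_h)
        candidate = (x1, y1, x1 + patch_w, y1 + patch_h)
        if not any(boxes_overlap(candidate, box) for box in existing):
            return candidate
    return None


def fill_scarce_samples(img_w, img_h, existing, patches, seed=0):
    """Place each (width, height) patch so it overlaps no existing or newly added box."""
    rng = random.Random(seed)
    occupied = list(existing)
    added = []
    for patch_w, patch_h in patches:
        box = place_without_overlap(img_w, img_h, patch_w, patch_h, occupied, rng)
        if box is not None:  # skip patches that found no free position
            occupied.append(box)
            added.append(box)
    return added
```

In this sketch, a patch that finds no free position within `max_tries` attempts is simply skipped, so crowded images receive fewer pasted instances than sparse ones.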

Funder

National Key Research and Development Project of China

Publisher

MDPI AG

Subject

Ocean Engineering, Water Science and Technology, Civil and Structural Engineering

