Affiliation:
1. School of Software Engineering, Tongji University, China
2. Department of Computer and Information Science, University of Macau, China
Abstract
Long-term monitoring and recognition of underwater organisms are of great significance in marine ecology, fisheries science, and many other disciplines. Traditional techniques in this field, including those based on manual fishing and those based on sonar, have notable shortcomings. Specifically, manual fishing is time-consuming and ill-suited to scientific research, while sonar suffers from low acoustic image accuracy and large echo errors. In recent years, the rapid development of deep learning and its excellent performance in computer vision tasks have made vision-based solutions feasible. However, research in this area remains insufficient in two main respects. First, to our knowledge, there is still a lack of large-scale datasets of underwater organism images with accurate annotations. Second, given the limited hardware resources of underwater devices, an underwater organism detection algorithm that is both accurate and lightweight enough to run inference in real time is still lacking. As an attempt to fill these research gaps to some extent, we established the Multiple Kinds of Underwater Organisms (MKUO) dataset, which consists of 10,043 images annotated with accurate bounding boxes and taxonomic information and covers 84 underwater organism categories. Based on this benchmark dataset, we evaluated a series of existing object detection algorithms to obtain their accuracy and complexity indicators as baselines for future reference. In addition, we propose a novel lightweight module, the Sparse Ghost Module, designed especially for object detection networks. By substituting it for standard convolution, the network complexity can be significantly reduced and the inference speed greatly improved without obvious loss of detection accuracy. To make our results reproducible, the dataset and the source code are available online at https://cslinzhang.github.io/MKUO-and-Sparse-Ghost-Module/.
Funder
National Natural Science Foundation of China
Shanghai Science and Technology Innovation Plan
Shuguang Program of Shanghai Education Development Foundation and Shanghai Municipal Education Commission
Fundamental Research Funds for the Central Universities
Publisher
Association for Computing Machinery (ACM)