Author:
Zhang Zhiyong,Kaveti Pushyami,Singh Hanumant,Powell Abigail,Fruh Erica,Clarke M. Elizabeth
Abstract
This paper presents a labeling methodology for marine life data using a weakly supervised learning framework. The methodology iteratively trains a deep learning model using non-expert labels obtained from crowdsourcing. This approach enables us to converge on a labeled image dataset through multiple training and production loops that leverage crowdsourcing interfaces. We present our algorithm and its results on two separate sets of image data collected using the Seabed autonomous underwater vehicle. The first dataset consists of 10,505 images that were point annotated by NOAA biologists. This dataset allows us to validate the accuracy of our labeling process. We also apply our algorithm and methodology to a second dataset consisting of 3,968 completely unlabeled images. These image categories are challenging to label, such as sponges. Qualitatively, our results indicate that training with a tiny subset and iterating on those results allows us to converge to a large, highly annotated dataset with a small number of iterations. To demonstrate the effectiveness of our methodology quantitatively, we tabulate the mean average precision (mAP) of the model as the number of iterations increases.
Subject
Ocean Engineering,Water Science and Technology,Aquatic Science,Global and Planetary Change,Oceanography
Cited by
1 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献