On the Importance of Diversity When Training Deep Learning Segmentation Models with Error-Prone Pseudo-Labels

Author:

Yang Nana (1), Rongione Charles (2), Jacquemart Anne-Laure (2), Draye Xavier (2), De Vleeschouwer Christophe (1)

Affiliation:

1. ICTEAM Institute, UCLouvain, 1348 Louvain-la-Neuve, Belgium

2. ELI Institute, UCLouvain, 1348 Louvain-la-Neuve, Belgium

Abstract

The key to training deep learning (DL) segmentation models lies in the collection of annotated data. The annotation process, however, is generally costly in human effort. Our paper uses deep or traditional machine learning methods, trained on a small set of manually labeled data, to automatically generate pseudo-labels on large datasets. These pseudo-labels are then used to train so-called data-reinforced deep learning models. The relevance of the approach is demonstrated in two application scenarios that differ in both task and pseudo-label generation procedure, which broadens the scope of our findings. Our experiments reveal that (i) data reinforcement helps, even with error-prone pseudo-labels; (ii) convolutional neural networks are able to regularize their training with respect to labeling errors; and (iii) there is an advantage to increasing diversity when generating the pseudo-labels, either by enriching the manual annotation through accurate annotation of singular samples, or by considering soft pseudo-labels per sample when prior information is available about their certainty.
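To make the notion of soft pseudo-labels concrete, the following is a minimal sketch, not the authors' implementation. It assumes that each pixel carries a probability vector over classes (a soft pseudo-label) and that training minimizes a cross-entropy against that vector. The function names `soft_cross_entropy` and `harden` are hypothetical and chosen only for illustration.

```python
import numpy as np

def soft_cross_entropy(logits, soft_labels):
    """Mean cross-entropy between model outputs and soft pseudo-labels.

    logits:      array of shape (N, C), raw class scores for N pixels.
    soft_labels: array of shape (N, C), each row a probability vector
                 (assumed here to encode the pseudo-label's certainty).
    """
    # Numerically stable log-softmax over the class axis.
    shifted = logits - logits.max(axis=1, keepdims=True)
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    # Weight each class log-probability by the soft label, then average.
    return float(-(soft_labels * log_probs).sum(axis=1).mean())

def harden(soft_labels):
    """Collapse soft pseudo-labels to one-hot labels via argmax."""
    hard = np.zeros_like(soft_labels)
    hard[np.arange(len(soft_labels)), soft_labels.argmax(axis=1)] = 1.0
    return hard

# Illustrative values only: two pixels, two classes.
soft = np.array([[0.7, 0.3], [0.2, 0.8]])
logits = np.array([[2.0, 0.0], [0.0, 1.0]])
loss_soft = soft_cross_entropy(logits, soft)
loss_hard = soft_cross_entropy(logits, harden(soft))
```

With hardened labels the uncertainty attached to each pseudo-label is discarded, whereas the soft variant keeps it in the training target. How the certainty values are obtained is specific to each scenario in the paper and is not modeled here.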

Funder

China Scholarship Council

Belgian F.N.R.S

Publisher

MDPI AG

