Synthesized Image Training Techniques: On Improving Model Performance using Confusion.

Author:

Idris Azeez (1), Khaleel Mohammed (1), Tavanapong Wallapak (1), C. De Groen Piet (2)

Affiliation:

1. Department of Computer Science, Iowa State University

2. Department of Medicine, University of Minnesota

Abstract

The performance of supervised deep learning image classifiers has significantly improved with large, labeled datasets and increased computing power. However, obtaining large, labeled image datasets in areas like medicine is expensive. This study seeks to improve model performance on limited labeled datasets by reducing confusion. We observed that misclassification (or confusion) is usually concentrated between specific pairs of classes. Thus, we developed synthesized image training techniques (SIT2), a novel confusion-based training framework that identifies pairs of classes with high confusion and synthesizes not-sure images from these pairs. The not-sure images are utilized in three new training strategies as follows. (1) The not-sure training strategy pretrains a model using not-sure images and the original training images. (2) The sure-or-not strategy pretrains with synthesized sure or not-sure images. (3) The multi-label strategy pretrains with synthesized images but predicts the original class(es) of the synthesized images. Finally, the pretrained model is finetuned on the original dataset. An extensive evaluation was conducted on five medical and non-medical datasets. Several improvements are statistically significant, which shows the promising future of our confusion-based training framework.
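The first two steps named in the abstract, finding highly confused class pairs and synthesizing not-sure images from them, can be illustrated with a minimal NumPy sketch. The abstract does not say how confusion is measured or how images are synthesized, so everything here is an assumption rather than the authors' method: confusion is taken as the symmetric off-diagonal count of a validation confusion matrix, synthesis is a simple pixel-wise blend, and the function names (`confusion_matrix`, `top_confused_pairs`, `synthesize_not_sure`, `multi_label_target`) are hypothetical.

```python
import numpy as np


def confusion_matrix(y_true, y_pred, num_classes):
    """Count matrix: rows are true classes, columns are predicted classes."""
    cm = np.zeros((num_classes, num_classes), dtype=np.int64)
    np.add.at(cm, (np.asarray(y_true), np.asarray(y_pred)), 1)
    return cm


def top_confused_pairs(cm, k=1):
    """Return up to k class pairs (i, j), i < j, with the most mutual errors.

    Assumption: confusion of a pair is errors i->j plus errors j->i.
    """
    mutual = cm + cm.T            # new array, so cm itself is not modified
    np.fill_diagonal(mutual, 0)   # correct predictions are not confusion
    n = cm.shape[0]
    scored = [(mutual[i, j], i, j) for i in range(n) for j in range(i + 1, n)]
    scored.sort(key=lambda t: -t[0])
    return [(i, j) for count, i, j in scored[:k] if count > 0]


def synthesize_not_sure(img_a, img_b, alpha=0.5):
    """Assumed synthesis: pixel-wise blend of one image from each class."""
    return alpha * img_a + (1.0 - alpha) * img_b


def multi_label_target(pair, num_classes):
    """Multi-hot target marking both source classes of a blended image."""
    target = np.zeros(num_classes, dtype=np.float32)
    target[list(pair)] = 1.0
    return target


# Toy example: classes 0 and 1 are confused with each other, class 2 is not.
y_true = [0, 0, 0, 1, 1, 2, 2, 2]
y_pred = [0, 1, 1, 1, 0, 2, 2, 2]
cm = confusion_matrix(y_true, y_pred, num_classes=3)
pairs = top_confused_pairs(cm, k=1)

img_class0 = np.zeros((2, 2))     # stand-in for an image of class 0
img_class1 = np.ones((2, 2))      # stand-in for an image of class 1
not_sure = synthesize_not_sure(img_class0, img_class1)
target = multi_label_target(pairs[0], num_classes=3)
```

Under these assumptions, the blended image could be given an extra "not-sure" label for the not-sure strategy, or the multi-hot `target` for the multi-label strategy, before finetuning on the original dataset.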

Publisher

Association for Computing Machinery (ACM)
