ADQE: Obtain Better Deep Learning Models by Evaluating the Augmented Data Quality Using Information Entropy

Published:2023-09-28 Issue:19 Volume:12 Page:4077
ISSN:2079-9292
Container-title:Electronics
language:en
Short-container-title:Electronics

Author:

Cui Xiaohui¹², Li Yu¹², Xie Zheng¹², Liu Hanzhang¹, Yang Shijie¹, Mou Chao¹²

Affiliation:

1. School of Information Science and Technology, Beijing Forestry University, Beijing 100083, China

2. Engineering Research Center for Forestry-Oriented Intelligent Information Processing of National Forestry and Grassland Administration, Beijing 100083, China

Abstract

Data augmentation is a common technique in deep learning training, used primarily to mitigate overfitting, especially on small-scale datasets. However, it is difficult to evaluate whether an augmented dataset truly benefits model performance. Training a model each time to validate the quality of the data augmentation and the dataset costs considerable time and resources. This article proposes a simple and practical approach to evaluating the quality of data augmentation for image classification tasks, enriching the theoretical research on data augmentation quality evaluation. Based on information entropy, multi-dimensional metrics for data augmentation quality are established, including diversity, class balance, and task relevance. Additionally, a comprehensive data augmentation quality fusion metric is proposed. Experimental results on the CIFAR-10 and CUB-200 datasets show that our method maintains optimal performance in a variety of scenarios. The cosine similarity between the score of our method and the precision of the model is up to 99.9%. A rigorous evaluation of data augmentation quality is necessary to guide the improvement of DL model performance. The quality standards and evaluation defined in this article can be used by researchers to train high-performance DL models in situations where data are limited.
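
To make the entropy-based idea concrete, the following is a minimal sketch of how a class-balance score could be derived from Shannon entropy, and how a list of quality scores could be compared with model precision through cosine similarity. The abstract does not give the paper's actual formulas, so the function names `class_balance_entropy` and `cosine_similarity`, the normalization by log(k), and the sample labels are illustrative assumptions, not the authors' definitions.

```python
import math
from collections import Counter


def class_balance_entropy(labels):
    """Normalized Shannon entropy of the label distribution, in [0, 1].

    Assumption: this normalization is illustrative and is not taken
    from the paper. 1.0 means perfectly balanced classes.
    """
    counts = Counter(labels)
    n = len(labels)
    k = len(counts)
    if k <= 1:
        # Zero or one class present: no balance to measure.
        return 0.0
    entropy = -sum((c / n) * math.log(c / n) for c in counts.values())
    return entropy / math.log(k)


def cosine_similarity(a, b):
    """Cosine similarity between two equal-length numeric sequences."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical label lists for a balanced and a skewed augmented dataset.
balanced = ["cat", "dog", "cat", "dog"]
skewed = ["cat", "cat", "cat", "dog"]

balanced_score = class_balance_entropy(balanced)
skewed_score = class_balance_entropy(skewed)
```

Under these assumptions, a perfectly balanced label set scores 1 and a skewed one scores lower. A series of such quality scores, one per augmented dataset, could then be passed to `cosine_similarity` together with the corresponding model precision values.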

Funder

Outstanding Youth Team Project of Central Universities

Publisher

MDPI AG

Subject

Electrical and Electronic Engineering, Computer Networks and Communications, Hardware and Architecture, Signal Processing, Control and Systems Engineering

Link

https://www.mdpi.com/2079-9292/12/19/4077/pdf
