Author:
Rather Ishfaq Hussain, Kumar Sushil, Gandomi Amir H.
Abstract
Justifiably, while big data is the primary interest of research and public discourse, it is essential to acknowledge that small data remains prevalent. The same technological and societal forces that generate big datasets also produce an even greater number of small datasets. Contrary to the notion that more data is inherently superior, real-world constraints such as budget limitations and increased analytical complexity present critical challenges. Quality-versus-quantity trade-offs necessitate strategic decision-making, where small data often leads to quicker, more accurate, and cost-effective insights. Concentrating AI research, particularly in deep learning (DL), on big datasets exacerbates AI inequality, as tech giants such as Meta, Amazon, Apple, Netflix, and Google (MAANG) can easily lead AI research due to their access to vast datasets, creating a barrier for small and mid-sized enterprises that lack similar access. This article addresses this imbalance by exploring DL techniques optimized for small datasets, offering a comprehensive review of historic and state-of-the-art DL models developed specifically for small datasets. This study aims to highlight the feasibility and benefits of these approaches, promoting a more inclusive and equitable AI landscape. Through a PRISMA-based literature search, 175+ relevant articles are identified and subsequently analysed based on various attributes, such as publisher, country, small-dataset technique utilized, dataset size, and performance. This article also delves into current DL models and highlights open research problems, offering recommendations for future investigations. Additionally, the article highlights the importance of developing DL models that effectively utilize small datasets, particularly in domains where data acquisition is difficult and expensive.
Publisher
Springer Science and Business Media LLC