
A survey on data‐efficient algorithms in big data era

Published:2021-01-26 Issue:1 Volume:8
ISSN:2196-1115
Container-title:Journal of Big Data
language:en
Short-container-title:J Big Data

Author:

Adadi Amina

Abstract

The leading approaches in Machine Learning are notoriously data-hungry. Unfortunately, many application domains do not have access to big data because acquiring data involves a process that is expensive or time-consuming. This has triggered a serious debate in both the industrial and academic communities calling for more data-efficient models that harness the power of artificial learners while achieving good results with less training data and in particular less human supervision. In light of this debate, this work investigates the issue of algorithms’ data hungriness. First, it surveys the issue from different perspectives. Then, it presents a comprehensive review of existing data-efficient methods and systematizes them into four categories. Specifically, the survey covers solution strategies that handle data-efficiency by (i) using non-supervised algorithms that are, by nature, more data-efficient, (ii) artificially creating more data, (iii) transferring knowledge from data-rich domains into data-poor domains, or (iv) altering data-hungry algorithms to reduce their dependency on the number of samples, so that they can perform well in the small-sample regime. Each strategy is extensively reviewed and discussed. In addition, emphasis is put on how the four strategies interplay with each other in order to motivate exploration of more robust and data-efficient algorithms. Finally, the survey delineates the limitations, discusses research challenges, and suggests future opportunities to advance the research on data-efficiency in machine learning.

Publisher

Springer Science and Business Media LLC

Subject

Information Systems and Management,Computer Networks and Communications,Hardware and Architecture,Information Systems

Link

http://link.springer.com/content/pdf/10.1186/s40537-021-00419-9.pdf

Reference: 321 articles.

1. Silver D, Huang A, Maddison CJ, Guez A, et al. Mastering the game of Go with deep neural networks and tree search. Nature. 2016;529(7587):484.

2. He K, Zhang X, Ren S, Sun J. Deep Residual Learning for Image Recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR). 2016. p. 770–778.

3. Adiwardana D, Luong M, David R, et al. Towards a human-like open-domain chatbot. arXiv preprint arXiv:2001.09977. 2020.

4. Marcus G. Deep learning: a critical appraisal. arXiv preprint arXiv:1801.00631. 2018.

5. Ford M. Architects of Intelligence: the Truth About AI From the People Building It. Kindle. Birmingham: Packt Publishing; 2018.

Cited by 163 articles.

1. Pre-trained regional models for extracting buildings from high resolution satellite imagery to support public health initiatives;Remote Sensing Applications: Society and Environment;2024-11

2. Transferable and data efficient metamodeling of storm water system nodal depths using auto-regressive graph neural networks;Water Research;2024-11

3. A systematic review and evaluation of synthetic simulated data generation strategies for deep learning applications in construction;Advanced Engineering Informatics;2024-10

4. Serendipitous, Open Big Data Management and Analytics: The SeDaSOMA Framework;Modelling;2024-09-04

5. Metagenomic profiling of rhizosphere microbiota: Unraveling the plant-soil dynamics;Physiological and Molecular Plant Pathology;2024-09