A Survey on Stability of Learning with Limited Labelled Data and its Sensitivity to the Effects of Randomness

Author:

Branislav Pecher (1, 2), Ivan Srba (2), Maria Bielikova (2, 3)

Affiliation:

1. Faculty of Information Technology, Brno University of Technology, Brno, Czech Republic

2. Kempelen Institute of Intelligent Technologies, Bratislava, Slovakia

3. Slovak.AI, Bratislava, Slovakia

Abstract

Learning with limited labelled data, such as prompting, in-context learning, fine-tuning, meta-learning or few-shot learning, aims to train a model effectively using only a small number of labelled samples. However, these approaches have been observed to be excessively sensitive to the effects of uncontrolled randomness caused by non-determinism in the training process. This randomness negatively affects the stability of the models, leading to large variance in results across training runs. When such sensitivity is disregarded, it can unintentionally, but unfortunately also intentionally, create an imaginary perception of research progress. Recently, this area has started to attract research attention, and the number of relevant studies is continuously growing. In this survey, we provide a comprehensive overview of 415 papers addressing the effects of randomness on the stability of learning with limited labelled data. We distinguish between four main tasks addressed in these papers (investigate/evaluate, determine, mitigate, and benchmark/compare/report randomness effects), providing findings for each one. Furthermore, we identify and discuss seven challenges and open problems, together with possible directions to facilitate further research. The ultimate goal of this survey is to emphasise the importance of this growing research area, which so far has not received an appropriate level of attention, and to reveal impactful directions for future research.

Publisher

Association for Computing Machinery (ACM)

References: 160 articles.

1. Rishabh Adiga, Lakshminarayanan Subramanian, and Varun Chandrasekaran. 2024. Designing Informative Metrics for Few-Shot Example Selection. arXiv preprint arXiv:2403.03861 (2024).

2. Mayank Agarwal, Mikhail Yurochkin, and Yuekai Sun. 2021. On sensitivity of meta-learning to support data. In Advances in Neural Information Processing Systems, Vol. 34. Curran Associates, Inc., 20447–20460.

3. Anirudh Ajith, Mengzhou Xia, Ameet Deshpande, and Karthik R Narasimhan. 2023. InstructEval: Systematic Evaluation of Instruction Selection Methods. In R0-FoMo: Robustness of Few-shot and Zero-shot Learning in Large Foundation Models. https://openreview.net/forum?id=6FwaSOEeKD

4. Riccardo Albertoni, Sara Colantonio, Piotr Skrzypczyński, and Jerzy Stefanowski. 2023. Reproducibility of Machine Learning: Terminology Recommendations and Open Issues. arXiv preprint arXiv:2302.12691 (2023).
