Strategic Data Navigation: Information Value-based Sample Selection

Published: 2024-03-26

Author:

Balogh Csanád Levente¹, Pelenczei Bálint², Kővári Bálint¹, Bécsi Tamás¹

Affiliation:

1. Faculty of Transportation Engineering and Vehicle Engineering, Budapest University of Technology and Economics

2. HUN-REN Institute for Computer Science and Control (SZTAKI)

Abstract

Artificial Intelligence is a rapidly expanding domain, and several industrial applications have demonstrated its superiority over traditional techniques. Despite numerous advances in Machine Learning, the subfield still faces persistent challenges, which underlines the importance of ongoing research. This study considers two of its primary branches, Supervised Learning and Reinforcement Learning, and addresses their common problem of selecting data for training. Data points differ in informational content: some samples offer more valuable information to the neural network than others. Evaluating the significance of individual data points is non-trivial, however, which creates the need for a robust method of prioritizing samples. Drawing inspiration from Reinforcement Learning principles, this paper introduces a novel sample prioritization approach for Supervised Learning that aims to improve classification accuracy through strategic data navigation, while exploring the boundary between Reinforcement and Supervised Learning techniques. We describe the methodology in detail, identify an optimal prioritization balance, and demonstrate its beneficial impact on model performance. Although classification accuracy serves as the primary validation metric, the concept of information density-based prioritization has wider applicability. The paper also investigates parallels and distinctions between Reinforcement and Supervised Learning methods, and argues that the foundational principle is equally relevant to both and can be fully adapted to Supervised Learning with appropriate adjustments for the different learning frameworks. Project page and source code are available at: https://csanad-l-balogh.github.io/sl_prioritized_sampling/.
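
The abstract does not spell out the prioritization rule, so the following is only an illustrative sketch of the general idea it borrows from Reinforcement Learning, in the spirit of prioritized experience replay: each training sample is drawn with probability proportional to a priority raised to an exponent. The function names, the use of per-sample loss as the priority, and the exponent `alpha` are assumptions made for illustration and are not taken from the paper. Setting `alpha = 0` recovers uniform sampling, which is one way to picture the prioritization balance the abstract mentions.

```python
import random


def sampling_probabilities(priorities, alpha=0.6, eps=1e-6):
    """Convert per-sample priorities into sampling probabilities.

    Assumption: a priority is a non-negative score such as the last
    observed loss of that sample. alpha = 0 yields uniform sampling,
    alpha = 1 yields sampling fully proportional to the priority.
    All names and default values here are hypothetical.
    """
    # eps keeps samples with zero priority drawable.
    scaled = [(p + eps) ** alpha for p in priorities]
    total = sum(scaled)
    return [s / total for s in scaled]


def draw_batch(priorities, batch_size, alpha=0.6, rng=None):
    """Draw dataset indices (with replacement) according to the priorities."""
    rng = rng or random.Random()
    probs = sampling_probabilities(priorities, alpha)
    return rng.choices(range(len(priorities)), weights=probs, k=batch_size)


# Hypothetical per-sample losses for a four-sample dataset.
losses = [0.05, 0.10, 2.00, 0.02]
probs = sampling_probabilities(losses)
batch = draw_batch(losses, batch_size=8, rng=random.Random(0))
```

In a training loop, the priorities would typically be refreshed after each update so that samples the model has already learned are drawn less often; how the authors actually compute and update priorities is described in the paper and on the project page, not here.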

Publisher

Springer Science and Business Media LLC
