Affiliation:
1. Sakarya University of Applied Sciences
2. Istanbul Technical University
Abstract
Machine learning methods are generally categorized as supervised, unsupervised, and reinforcement learning. Q-learning, a reinforcement learning algorithm, interacts with its environment, learns from it, and produces actions accordingly. In this study, eight different methods are proposed to determine the value of the learning-rate parameter of the Q-learning algorithm online, depending on different conditions. To test the performance of the proposed methods, the algorithms are applied to the Frozen Lake and Cart Pole systems, and the results are compared graphically and statistically. Examination of the results shows that Method 1 produced better performance for Frozen Lake, which is a discrete system, while Method 7 produced better results for the Cart Pole system, which is a continuous system.
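To make the setting concrete, the sketch below shows tabular Q-learning with a learning rate that is adjusted online, here using a simple visit-count decay as a stand-in illustration; it is not one of the paper's eight proposed methods, whose rules are not given in the abstract. It assumes the Gymnasium package and its FrozenLake-v1 environment; all hyperparameter values are illustrative.

```python
# Minimal tabular Q-learning sketch with an online-adapted learning rate.
# The visit-count rule below is a generic placeholder, NOT the paper's methods.
import numpy as np
import gymnasium as gym

env = gym.make("FrozenLake-v1", is_slippery=True)
n_states, n_actions = env.observation_space.n, env.action_space.n

Q = np.zeros((n_states, n_actions))       # action-value table
visits = np.zeros((n_states, n_actions))  # visit counts per (state, action)
gamma, epsilon = 0.99, 0.1                # discount factor, exploration rate

for episode in range(5000):
    state, _ = env.reset()
    done = False
    while not done:
        # epsilon-greedy action selection
        if np.random.rand() < epsilon:
            action = env.action_space.sample()
        else:
            action = int(np.argmax(Q[state]))

        next_state, reward, terminated, truncated, _ = env.step(action)
        done = terminated or truncated

        # Learning rate determined online: decays with the (s, a) visit count.
        # The paper's eight methods would replace this single update rule.
        visits[state, action] += 1
        alpha = 1.0 / (1.0 + visits[state, action])

        td_target = reward + gamma * np.max(Q[next_state]) * (not terminated)
        Q[state, action] += alpha * (td_target - Q[state, action])
        state = next_state
```

The same update loop applies to Cart Pole once the continuous observations are discretized into a finite state index, which is presumably how the tabular algorithm is used on that system.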
Publisher
Journal of Intelligent Systems: Theory and Applications, Harun TASKIN