Truncating Temporal Differences: On the Efficient Implementation of TD(lambda) for Reinforcement Learning

Published: 1995-01-01 Volume: 2 Page: 287-318
ISSN: 1076-9757
Container-title: Journal of Artificial Intelligence Research
Short-container-title: jair

Author:

Cichosz P.

Abstract

Temporal difference (TD) methods constitute a class of methods for learning predictions in multi-step prediction problems, parameterized by a recency factor lambda. Currently the most important application of these methods is to temporal credit assignment in reinforcement learning. Well known reinforcement learning algorithms, such as AHC or Q-learning, may be viewed as instances of TD learning. This paper examines the issues of the efficient and general implementation of TD(lambda) for arbitrary lambda, for use with reinforcement learning algorithms optimizing the discounted sum of rewards. The traditional approach, based on eligibility traces, is argued to suffer from both inefficiency and lack of generality. The TTD (Truncated Temporal Differences) procedure is proposed as an alternative, that indeed only approximates TD(lambda), but requires very little computation per action and can be used with arbitrary function representation methods. The idea from which it is derived is fairly simple and not new, but probably unexplored so far. Encouraging experimental results are presented, suggesting that using lambda > 0 with the TTD procedure allows one to obtain a significant learning speedup at essentially the same cost as usual TD(0) learning.

Publisher

AI Access Foundation

Subject

Artificial Intelligence

Cited by 18 articles.

1. Value function assessment to different RL algorithms for heparin treatment policy of patients with sepsis in ICU;Artificial Intelligence in Medicine;2024-01

2. Comparing Supervised and Unsupervised Machine Learning Techniques to Decision Support Systems in Healthcare;Advances in Intelligent Systems and Computing;2023

3. ReCom: A deep reinforcement learning approach for semi-supervised tabular data labeling;Information Sciences;2022-04

4. Real-time energy purchase optimization for a storage-integrated photovoltaic system by deep reinforcement learning;Control Engineering Practice;2021-01

5. A reinforcement learning approach to rare trajectory sampling;New Journal of Physics;2021-01-01