Online Reinforcement Learning Using a Probability Density Estimation

Published:2017-01 Issue:1 Volume:29 Page:220-246
ISSN:0899-7667
Container-title:Neural Computation
language:en
Short-container-title:Neural Computation

Author:

Agostini Alejandro¹, Celaya Enric²

Affiliation:

1. Bernstein Center for Computational Neuroscience, 37077 Göttingen, Germany

2. Institut de Robòtica i Informàtica Industrial (CSIC-UPC), 08028 Barcelona, Spain

Abstract

Function approximation in online, incremental, reinforcement learning needs to deal with two fundamental problems: biased sampling and nonstationarity. In this kind of task, biased sampling occurs because samples are obtained from specific trajectories dictated by the dynamics of the environment and are usually concentrated in particular convergence regions, which in the long term tend to dominate the approximation in the less sampled regions. The nonstationarity comes from the recursive nature of the estimations typical of temporal difference methods. This nonstationarity has a local profile, varying not only along the learning process but also along different regions of the state space. We propose to deal with these problems using an estimation of the probability density of samples represented with a gaussian mixture model. To deal with the nonstationarity problem, we use the common approach of introducing a forgetting factor in the updating formula. However, instead of using the same forgetting factor for the whole domain, we make it dependent on the local density of samples, which we use to estimate the nonstationarity of the function at any given input point. To address the biased sampling problem, the forgetting factor applied to each mixture component is modulated according to the new information provided in the updating, rather than forgetting depending only on time, thus avoiding undesired distortions of the approximation in less sampled regions.
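The last idea in the abstract, forgetting modulated by the new information rather than by time alone, can be illustrated with a minimal sketch. This is not the authors' algorithm: the class `OnlineGMM1D`, its parameters, and the rule `decay = forgetting ** responsibility` are assumptions chosen for illustration, and the density-dependent forgetting factor described in the abstract is not modelled. Each component keeps decayed sufficient statistics, and a component is only forgotten to the extent that the new sample is attributed to it.

```python
import math


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return math.exp(-0.5 * (x - mean) ** 2 / var) / math.sqrt(2.0 * math.pi * var)


class OnlineGMM1D:
    """Hypothetical 1-D Gaussian mixture kept as decayed sufficient statistics."""

    def __init__(self, means, variances, forgetting=0.99, min_var=1e-6):
        self.lam = forgetting    # base forgetting factor in (0, 1] (assumed name)
        self.min_var = min_var   # floor that keeps variances positive
        self.w = [1.0] * len(means)                               # accumulated weight
        self.s1 = list(means)                                     # weighted sum of x
        self.s2 = [v + m * m for m, v in zip(means, variances)]   # weighted sum of x^2

    def params(self):
        """Recover priors, means, and variances from the statistics."""
        total = sum(self.w)
        priors = [w / total for w in self.w]
        means = [s1 / w for s1, w in zip(self.s1, self.w)]
        variances = [max(s2 / w - m * m, self.min_var)
                     for s2, w, m in zip(self.s2, self.w, means)]
        return priors, means, variances

    def density(self, x):
        """Estimated sample density at x."""
        priors, means, variances = self.params()
        return sum(p * gaussian_pdf(x, m, v)
                   for p, m, v in zip(priors, means, variances))

    def responsibilities(self, x):
        """Posterior probability of each component given x."""
        priors, means, variances = self.params()
        joint = [p * gaussian_pdf(x, m, v)
                 for p, m, v in zip(priors, means, variances)]
        total = sum(joint)
        if total == 0.0:  # x is far from every component
            return [1.0 / len(joint)] * len(joint)
        return [j / total for j in joint]

    def update(self, x):
        """Absorb one sample; forget each component in proportion to its responsibility."""
        for i, r in enumerate(self.responsibilities(x)):
            decay = self.lam ** r  # r near 0 gives decay near 1: no forgetting
            self.w[i] = decay * self.w[i] + r
            self.s1[i] = decay * self.s1[i] + r * x
            self.s2[i] = decay * self.s2[i] + r * x * x
```

With this rule, a long run of samples near one component leaves the statistics of a distant component essentially untouched. Replacing the decay with the constant `self.lam` would instead shrink the distant component's accumulated weight geometrically with every sample, which is the kind of distortion in less sampled regions that the abstract describes.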

Publisher

MIT Press - Journals

Subject

Cognitive Neuroscience, Arts and Humanities (miscellaneous)

Link

https://www.mitpressjournals.org/doi/pdf/10.1162/NECO_a_00906
