Learning Tetris Using the Noisy Cross-Entropy Method

Published: 2006-12, Issue: 12, Volume: 18, Pages: 2936-2941
ISSN: 0899-7667
Container-title: Neural Computation
Language: en
Short-container-title: Neural Computation

Author:

Szita István, Lörincz András¹

Affiliation:

1. Department of Information Systems, Eötvös Loránd University, Budapest, Hungary H-1117

Abstract

The cross-entropy method is an efficient and general optimization algorithm. However, its applicability in reinforcement learning (RL) seems to be limited because it often converges to suboptimal policies. We apply noise for preventing early convergence of the cross-entropy method, using Tetris, a computer game, for demonstration. The resulting policy outperforms previous RL algorithms by almost two orders of magnitude.
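The idea in the abstract, adding noise so that the cross-entropy method does not converge too early, can be illustrated with a minimal sketch. This is not the authors' implementation: the function name `noisy_cross_entropy`, the Gaussian sampling distribution, the population size, the elite fraction, the initial variance, and the decreasing noise schedule are all assumptions chosen for illustration, and a toy quadratic objective stands in for a Tetris evaluation function.

```python
import numpy as np


def noisy_cross_entropy(score, dim, iterations=50, population=100,
                        elite_frac=0.1,
                        noise=lambda t: max(5.0 - t / 10.0, 0.0),
                        seed=0):
    """Maximise `score` over weight vectors with a noisy cross-entropy sketch.

    All defaults are illustrative assumptions, not values taken from the record.
    """
    rng = np.random.default_rng(seed)
    mean = np.zeros(dim)          # assumed starting mean
    var = np.full(dim, 100.0)     # assumed starting variance (broad search)
    n_elite = max(1, int(population * elite_frac))

    for t in range(iterations):
        # Draw candidate weight vectors from the current diagonal Gaussian.
        samples = rng.normal(mean, np.sqrt(var), size=(population, dim))
        scores = np.array([score(w) for w in samples])

        # Keep the best-scoring fraction of the candidates.
        elite = samples[np.argsort(scores)[-n_elite:]]

        # Refit the Gaussian to the elite set. The extra noise term keeps the
        # variance from collapsing before a good region has been found.
        mean = elite.mean(axis=0)
        var = elite.var(axis=0) + noise(t)

    return mean


def toy_score(w, target=np.array([1.0, -2.0, 3.0])):
    """Hypothetical objective: higher is better, maximum at `target`."""
    return -np.sum((w - target) ** 2)
```

Calling `noisy_cross_entropy(toy_score, dim=3, iterations=80)` should return a vector close to the toy target. Passing `noise=lambda t: 0.0` gives the plain cross-entropy update for comparison.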

Publisher

MIT Press - Journals

Subject

Cognitive Neuroscience, Arts and Humanities (miscellaneous)

Link

https://www.mitpressjournals.org/doi/pdf/10.1162/neco.2006.18.12.2936

References: 2 articles.

1. A Tutorial on the Cross-Entropy Method

2. Basis Function Adaptation in Temporal Difference Reinforcement Learning

Cited by 86 articles.

1. Learning the rational choice perspective: A reinforcement learning approach to simulating offender behaviours in criminological agent-based models; Computers, Environment and Urban Systems; 2024-09

2. Entropy adjustment by interpolation for exploration in Proximal Policy Optimization (PPO); Engineering Applications of Artificial Intelligence; 2024-07

3. FEDKA: Federated Knowledge Augmentation for Multi-Center Medical Image Segmentation on non-IID Data; ICASSP 2024 - 2024 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP); 2024-04-14

4. How fast can we play Tetris greedily with rectangular pieces?; Theoretical Computer Science; 2024-04

5. A hybrid framework for mean-CVaR portfolio selection under jump-diffusion processes: Combining cross-entropy method with beluga whale optimization; AIMS Mathematics; 2024