Abstract
Anticipatory Classifier Systems (ACS) were originally designed to address both single-step and multistep decision problems. In the multistep case, the objective was to maximize the total discounted reward, usually with algorithms based on Q-learning. Studies on other Learning Classifier Systems (LCS) have revealed many real-world sequential decision problems where the preferred objective is to maximize the average of successive rewards. This paper proposes a modification to the learning component that allows such problems to be addressed. The modified system, called AACS2 (Averaged ACS2), is tested on three multistep benchmark problems.
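The difference between the two objectives named in the abstract can be made concrete with a small sketch. The Python below is illustrative only and is not the AACS2 learning rule: the function names, the reward stream, and the discount factor 0.95 are all assumptions chosen for the example. It computes the total discounted reward and the average reward for the same sequence.

```python
def discounted_return(rewards, gamma):
    """Total discounted reward: sum of gamma**t * r_t over the sequence."""
    total = 0.0
    for t, r in enumerate(rewards):
        total += (gamma ** t) * r
    return total


def average_reward(rewards):
    """Average of successive rewards over the sequence."""
    return sum(rewards) / len(rewards)


# Hypothetical reward stream (an assumption, not data from the paper):
# a payoff of 1000 arrives on every fourth step.
rewards = [0, 0, 0, 1000] * 50

# The discounted objective depends on gamma and on where in the cycle
# the agent starts; the average objective does not.
print(discounted_return(rewards, 0.95))      # starting three steps before a payoff
print(discounted_return(rewards[3:], 0.95))  # starting right at a payoff
print(average_reward(rewards))               # 250.0
```

Under the discounted criterion, the value of the same cyclic behaviour changes with the starting point and with the choice of gamma, while the average criterion assigns it a single figure per step. This is one reason the average criterion can be preferred for ongoing tasks.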
Subject
Fluid Flow and Transfer Processes, Computer Science Applications, Process Chemistry and Technology, General Engineering, Instrumentation, General Materials Science