Affiliation:
1. Nihonbashi-Horidomecho Chuo-ku, Tokyo , Japan
Abstract
Abstract
One of the essential aspect in biological agents is dynamic stability. This aspect, called homeostasis, is widely discussed in ethology, neuroscience and during the early stages of artificial intelligence. Ashby’s homeostats are general-purpose learning machines for stabilizing essential variables of the agent in the face of general environments. However, despite their generality, the original homeostats couldn’t be scaled because they searched their parameters randomly. In this paper, first we re-define the objective of homeostats as the maximization of a multi-step survival probability from the view point of sequential decision theory and probabilistic theory. Then we show that this optimization problem can be treated by using reinforcement learning algorithms with special agent architectures and theoretically-derived intrinsic reward functions. Finally we empirically demonstrate that agents with our architecture automatically learn to survive in a given environment, including environments with visual stimuli. Our survival agents can learn to eat food, avoid poison and stabilize essential variables through theoretically-derived single intrinsic reward formulations.
Reference42 articles.
1. Ashby, W. R. 1960. Design for a Brain. Springer Science & Business Media.
2. Barto, A. G.; Singh, S.; and Chentanez, N. 2004. Intrinsically motivated learning of hierarchical collections of skills. In Proc. 3rd Int. Conf. Development Learn, 112-119.
3. Cañamero, D. 1997. Modeling motivations and emotions as a basis for intelligent behavior. In Proceedings of the first international conference on Autonomous agents, 148-155. ACM.
4. Dawkins, R. 1976. The Selfish Gene. Oxford University Press, Oxford, UK.
5. Dayan, P., and Hinton, G. E. 1996. Varieties of Helmholtz machine. Neural Networks 9(8):1385-1403.
Cited by
11 articles.
订阅此论文施引文献
订阅此论文施引文献,注册后可以免费订阅5篇论文的施引文献,订阅后可以查看论文全部施引文献