1. Avrachenkov, K.E., Altman, E.: Sensitive discount optimality via nested linear programs for ergodic Markov decision processes. In: Proceedings of Information Decision and Control 1999, Adelaide, Australia, pp. 53–58. IEEE, Los Alamitos (1999)
2. Berry, D.A., Fristedt, B.: Bandit Problems: Sequential Allocation of Experiments. Chapman and Hall, London (1985)
3. Bertsekas, D.P., Tsitsiklis, J.N.: Neuro-Dynamic Programming. Athena Scientific, Belmont, MA (1996)
4. Frederick, S., Loewenstein, G., O’Donoghue, T.: Time discounting and time preference: A critical review. Journal of Economic Literature 40, 351–401 (2002)
5. Hutter, M.: LNAI (2002)