1. Wald, A.: Sequential Analysis. John Wiley & Sons, New York (1947) (Republished by Dover in 2004)
2. DeGroot, M.H.: Optimal Statistical Decisions. McGraw-Hill, New York (1970) (Republished by John Wiley & Sons in 2004)
3. Bellman, R.E.: A problem in the sequential design of experiments. Sankhyā 16, 221–229 (1956)
4. Mannor, S., Tsitsiklis, J.N.: The sample complexity of exploration in the multi-armed bandit problem. Journal of Machine Learning Research 5, 623–648 (2004)
5. Dearden, R., Friedman, N., Russell, S.J.: Bayesian Q-learning. In: AAAI/IAAI, pp. 761–768 (1998)