1. Peter Auer, Nicolo Cesa-Bianchi, and Paul Fischer. 2002. Finite-time analysis of the multiarmed bandit problem. Machine Learning 47, 2--3 (2002), 235--256. 10.1023/A:1013689704352
2. Christopher M. Bishop. 2006. Pattern Recognition and Machine Learning. Springer.
3. A Survey of Monte Carlo Tree Search Methods