1. Yasin Abbasi-Yadkori, Dávid Pál, and Csaba Szepesvári. 2011. Improved algorithms for linear stochastic bandits. In Advances in Neural Information Processing Systems. 2312--2320.
2. Yasin Abbasi-Yadkori and Csaba Szepesvári. 2011. Regret bounds for the adaptive control of linear quadratic systems. In Conference on Learning Theory. 1--26.
3. The role and use of the stochastic linear-quadratic-Gaussian problem in control system design
4. Dimitri Bertsekas. 2012. Dynamic programming and optimal control: Volume I. Vol. 1. Athena Scientific.
5. Omar Besbes, Yonatan Gur, and Assaf Zeevi. 2014. Stochastic multi-armed-bandit problem with non-stationary rewards. Advances in Neural Information Processing Systems, Vol. 27 (2014), 199--207.