1. Naive exploration is optimal for online LQR;Simchowitz
2. Regret bounds for the adaptive control of linear quadratic systems;Abbasi-Yadkori
3. Optimism-Based Adaptive Regulation of Linear-Quadratic Systems
4. Learning linear-quadratic regulators efficiently with only $\sqrt{T}$ regret;Cohen
5. Efficient optimistic exploration in linear-quadratic regulators via Lagrangian relaxation;Abeille,2020