1. Jean-Yves Audibert and Sébastien Bubeck. Best arm identification in multi-armed bandits. 2010. http://sbubeck.com/COLT10_ABM.pdf
2. Peter Auer. Using confidence bounds for exploitation-exploration trade-offs. Journal of Machine Learning Research, 2002.
3. Omar Besbes, Yonatan Gur, and Assaf Zeevi. Optimal exploration-exploitation in a multi-armed-bandit problem with nonstationary rewards. Available at SSRN 2436629, 2018.
4. John R. Boyd. Organic design for command and control (in "A Discourse on Winning and Losing"). 1987. http://www.ausairpower.net/JRB/organic_design.pdf