1. David Silver, Aja Huang, Chris J Maddison, Arthur Guez, Laurent Sifre, George van den Driessche, Julian Schrittwieser, Ioannis Antonoglou, Veda Panneershelvam, Marc Lanctot, et al. 2016. Mastering the game of Go with deep neural networks and tree search. Nature, 529(7587):484–489
2. Christopher Berner, Greg Brockman, Brooke Chan, Vicki Cheung, Przemyslaw Debiak, Christy Dennison, David Farhi, Quirin Fischer, Shariq Hashme, Christopher Hesse, Rafal Józefowicz, Scott Gray, Catherine Olsson, Jakub Pachocki, Michael Petrov, Henrique Pondé de Oliveira Pinto, Jonathan Raiman, Tim Salimans, Jeremy Schlatter, Jonas Schneider, Szymon Sidor, Ilya Sutskever, Jie Tang, Filip Wolski, and Susan Zhang. 2019. Dota 2 with large scale deep reinforcement learning. CoRR, abs/1912.06680
3. Grandmaster level in StarCraft II using multi-agent reinforcement learning
4. Learning dexterous in-hand manipulation
5. V Nguyen, SB Orbell, Dominic T Lennon, Hyungil Moon, Florian Vigneau, Leon C Camenzind, Liuqi Yu, Dominik M Zumbühl, G Andrew D Briggs, Michael A Osborne, et al. 2021. Deep reinforcement learning for efficient measurement of quantum devices. npj Quantum Information, 7(1):1–9