1. Marcin Andrychowicz, Filip Wolski, Alex Ray, Jonas Schneider, Rachel Fong, Peter Welinder, Bob McGrew, Josh Tobin, Pieter Abbeel, and Wojciech Zaremba. 2017. Hindsight experience replay. In Advances in Neural Information Processing Systems 30: Annual Conference on Neural Information Processing Systems 2017, December 4-9, 2017, Long Beach, CA, USA, Isabelle Guyon, Ulrike von Luxburg, Samy Bengio, Hanna M. Wallach, Rob Fergus, S. V. N. Vishwanathan, and Roman Garnett (Eds.). 5048--5058.
2. Xueying Bai, Jian Guan, and Hongning Wang. 2019. A Model-Based Reinforcement Learning with Adversarial Training for Online Recommendation. In Advances in Neural Information Processing Systems, H. Wallach, H. Larochelle, A. Beygelzimer, F. d'Alché-Buc, E. Fox, and R. Garnett (Eds.), Vol. 32. Curran Associates, Inc., 10735--10746. https://proceedings.neurips.cc/paper/2019/file/e49eb6523da9e1c347bc148ea8ac55d3-Paper.pdf
3. Large-Scale Interactive Recommendation with Tree-Structured Policy Gradient
4. Stabilizing Reinforcement Learning in Dynamic Environment with Application to Online Recommendation
5. Knowledge-guided Deep Reinforcement Learning for Interactive Recommendation