1. Jacob Abernethy, Pranjal Awasthi, Matthäus Kleindessner, Jamie Morgenstern, Chris Russell, and Jie Zhang. 2020. Active sampling for min-max fairness. arXiv preprint arXiv:2006.06879 (2020).
2. Richard E Bellman. 2010. Dynamic programming. Princeton University Press.
3. Greg Brockman, Vicki Cheung, Ludwig Pettersson, Jonas Schneider, John Schulman, Jie Tang, and Wojciech Zaremba. 2016. OpenAI Gym. arXiv preprint arXiv:1606.01540 (2016).
4. High-Value Prioritized Experience Replay for Off-Policy Reinforcement Learning
5. Erick Delage and Yinyu Ye. 2010. Distributionally robust optimization under moment uncertainty with application to data-driven problems. Operations Research 58, 3 (2010), 595–612.