1. Andrychowicz, M., Crow, D., Ray, A., Schneider, J., Fong, R., Welinder, P., McGrew, B., Tobin, J., Abbeel, P., & Zaremba, W. (2017). Hindsight Experience Replay. In Advances in neural information processing systems 30: Annual conference on neural information processing systems (pp. 5048–5058).
2. Arratia (2014). Statistics of financial time series.
3. Bellemare (2017). A distributional perspective on reinforcement learning.
4. Bellman (1957). A Markovian decision process. Indiana University Mathematics Journal.
5. Brim, A. (2020). Deep Reinforcement Learning Pairs Trading with a Double Deep Q-Network. In 10th annual computing and communication workshop and conference (pp. 0222–0227).