Abstract
In recent years, deep reinforcement learning (DRL) has shown potential for solving portfolio management problems. The Twin Delayed Deep Deterministic policy gradient (TD3) algorithm is an actor-critic method and a typical DRL method for continuous action spaces. Despite the success of DRL in financial trading, most of the literature ignores risk control. This study combines long- and short-term risk (LSTR) control with the TD3 algorithm to build a portfolio model with risk management capabilities. We train and validate the proposed model on Chinese stock data from the Shanghai Stock Exchange and compare its performance with that of a TD3 model without risk control. The results indicate that the proposed model offers better risk control and better investment returns.
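The abstract does not spell out how the long- and short-term risk terms enter the model. As a rough illustration only, one common way to make a DRL trading agent risk-aware is to subtract volatility penalties from the per-step reward. The sketch below assumes that approach; the function name, window lengths, penalty weights, and the use of standard deviation as the risk measure are all hypothetical and are not taken from the paper.

```python
from statistics import pstdev


def risk_adjusted_reward(returns, short_window=5, long_window=20,
                         lam_short=0.5, lam_long=0.5):
    """Hypothetical reward: latest portfolio return minus risk penalties.

    `returns` is a list of past per-period portfolio returns, oldest first.
    All parameters are illustrative assumptions, not values from the paper.
    """
    if not returns:
        raise ValueError("returns must contain at least one period")
    # Short-term risk: volatility over the most recent few periods.
    short_risk = pstdev(returns[-short_window:])
    # Long-term risk: volatility over a longer trailing window.
    long_risk = pstdev(returns[-long_window:])
    # Penalize the most recent return by both risk estimates.
    return returns[-1] - lam_short * short_risk - lam_long * long_risk
```

Under these assumptions, a perfectly steady return stream is not penalized, while a volatile one receives a reward below its latest raw return. A reward of this kind could be fed to any actor-critic learner, TD3 included, in place of the raw return.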
Subject
General Mathematics, Engineering (miscellaneous), Computer Science (miscellaneous)
Cited by
4 articles.