A Selective Portfolio Management Algorithm with Off-Policy Reinforcement Learning Using Dirichlet Distribution

Published:2022-11-23 Issue:12 Volume:11 Page:664
ISSN:2075-1680
Container-title:Axioms
Language:en
Short-container-title:Axioms

Author:

Yang Hyunjun, Park Hyeonjun, Lee Kyungjae

Abstract

Existing portfolio management methods deterministically produce a single optimal portfolio. However, according to modern portfolio theory, there is a trade-off between a portfolio’s expected return and its risk. Therefore, no single optimal portfolio exists; instead there are several, and relying on only one deterministic portfolio is disadvantageous for risk management. We propose the Dirichlet Distribution Trader (DDT), an algorithm that computes multiple optimal portfolios by using a Dirichlet distribution as its policy. The DDT algorithm produces several optimal portfolios according to risk level. In addition, by obtaining the π value from the distribution and applying importance sampling in off-policy learning, it uses samples efficiently. Furthermore, the model architecture is scalable because the information for each stock in the portfolio is fed forward independently. This means that even if untrained stocks are added to the portfolio, the optimal weights can still be adjusted. We conducted three experiments. In the scalability experiment, the extended DDT model, trained on only three stocks, showed little difference in performance from the DDT model trained on all the stocks in the portfolio. In the experiment comparing the off-policy and on-policy algorithms, the off-policy algorithm performed well regardless of the stock price trend. In the experiment comparing investment results by risk level, a higher return or a better Sharpe ratio could be obtained through risk control.
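The two mechanisms named in the abstract, a Dirichlet distribution used as a stochastic policy over portfolio weights and an importance-sampling ratio for off-policy learning, can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes the "π value" is the policy's probability density at the sampled weights, and the concentration parameters `alpha_behavior` and `alpha_target` are hypothetical fixed values, whereas in the paper they would presumably come from the trained model.

```python
import math
import random


def sample_dirichlet(alpha, rng):
    """Draw portfolio weights (positive, summing to 1) from Dirichlet(alpha)."""
    # Normalizing independent Gamma(alpha_i, 1) draws yields a Dirichlet sample.
    draws = [rng.gammavariate(a, 1.0) for a in alpha]
    total = sum(draws)
    return [d / total for d in draws]


def dirichlet_log_pdf(weights, alpha):
    """Log density of Dirichlet(alpha) evaluated at the given weights."""
    log_norm = math.lgamma(sum(alpha)) - sum(math.lgamma(a) for a in alpha)
    return log_norm + sum((a - 1.0) * math.log(w) for a, w in zip(alpha, weights))


def importance_ratio(weights, alpha_target, alpha_behavior):
    """Ratio pi_target(w) / pi_behavior(w) used to reweight an off-policy sample."""
    return math.exp(
        dirichlet_log_pdf(weights, alpha_target)
        - dirichlet_log_pdf(weights, alpha_behavior)
    )


rng = random.Random(0)
alpha_behavior = [2.0, 2.0, 2.0]  # hypothetical: policy that collected the sample
alpha_target = [3.0, 2.0, 1.5]    # hypothetical: policy currently being updated

weights = sample_dirichlet(alpha_behavior, rng)
ratio = importance_ratio(weights, alpha_target, alpha_behavior)
print("weights:", weights)
print("importance ratio:", ratio)
```

Because every Dirichlet sample lies on the probability simplex, each draw is already a valid long-only allocation, and repeated draws give several candidate portfolios rather than one deterministic answer.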

Funder

Korea Government

Publisher

MDPI AG

Subject

Geometry and Topology, Logic, Mathematical Physics, Algebra and Number Theory, Analysis

Link

https://www.mdpi.com/2075-1680/11/12/664/pdf

References (29 articles)

1. Portfolio Selection;J. Financ.,1952

2. An intelligent financial portfolio trading strategy using deep Q-learning;Expert Syst. Appl.,2020

3. Deep reinforcement learning for portfolio management of markets with a dynamic number of assets;Expert Syst. Appl.,2021

4. Schulman, J., Wolski, F., Dhariwal, P., Radford, A., and Klimov, O. (2017). Proximal Policy Optimization Algorithms. arXiv.

5. Jiang, Z., Xu, D., and Liang, J. (2017). A Deep Reinforcement Learning Framework for the Financial Portfolio Management Problem. arXiv.

Cited by 1 article.

1. Heavy-Tailed Reinforcement Learning With Penalized Robust Estimator;IEEE Access;2024