Constrained Portfolio Management Using Action Space Decomposition for Reinforcement Learning

Published:2023 Page:373-385
ISSN:0302-9743
Container-title:Advances in Knowledge Discovery and Data Mining

Author:

Winkel David, Strauß Niklas, Schubert Matthias, Ma Yunpu, Seidl Thomas

Abstract

Financial portfolio managers typically face multi-period optimization tasks such as short-selling or investing at least a particular portion of the portfolio in a specific industry sector. A common approach to tackle these problems is to use constrained Markov decision process (CMDP) methods, which may suffer from sample inefficiency, hyperparameter tuning, and lack of guarantees for constraint violations. In this paper, we propose Action Space Decomposition Based Optimization (ADBO) for optimizing a more straightforward surrogate task that allows actions to be mapped back to the original task. We examine our method on two real-world data portfolio construction tasks. The results show that our new approach consistently outperforms state-of-the-art benchmark approaches for general CMDPs.

Publisher

Springer Nature Switzerland

Link

https://link.springer.com/content/pdf/10.1007/978-3-031-33377-4_29
