Abstract
The purpose of applying reinforcement learning (RL) to portfolio management is commonly the maximization of profit. The extrinsic reward function used to learn an optimal strategy typically does not take into account any other preferences or constraints. We have developed a regularization method that ensures that strategies have global intrinsic affinities, i.e., different personalities may have preferences for certain asset classes which may change over time. We capitalize on these intrinsic policy affinities to make our RL model inherently interpretable. We demonstrate how RL agents can be trained to orchestrate such individual policies for particular personality profiles and still achieve high returns.
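The abstract does not spell out the regularization term, so the following is only a minimal sketch of one plausible reading: an extrinsic profit reward is combined with an intrinsic penalty that pulls the policy's asset-class allocation toward a personality-specific affinity profile. All names (affinity_penalty, lambda_affinity, the example profiles) are illustrative assumptions, not the paper's actual formulation.

```python
# Hypothetical sketch, assuming the affinity regularizer is a divergence
# between the policy's asset-class allocation and a personality profile.
import numpy as np

def affinity_penalty(weights: np.ndarray, affinity: np.ndarray, eps: float = 1e-8) -> float:
    """KL divergence between the portfolio weights proposed by the policy and a
    target affinity profile over asset classes (both assumed to sum to 1)."""
    w = np.clip(weights, eps, 1.0)
    a = np.clip(affinity, eps, 1.0)
    return float(np.sum(w * np.log(w / a)))

def regularized_reward(portfolio_return: float,
                       weights: np.ndarray,
                       affinity: np.ndarray,
                       lambda_affinity: float = 0.1) -> float:
    """Extrinsic profit reward minus a weighted intrinsic-affinity penalty."""
    return portfolio_return - lambda_affinity * affinity_penalty(weights, affinity)

# Example: a bond-leaning personality profile over three asset classes
# (equities, bonds, commodities) and an allocation proposed by the policy.
affinity = np.array([0.2, 0.5, 0.3])
weights = np.array([0.6, 0.2, 0.2])
print(regularized_reward(portfolio_return=0.012, weights=weights, affinity=affinity))
```

Because the affinity profile is an explicit, human-readable vector, a penalty of this kind would also support the interpretability claim: one can inspect how far a learned allocation deviates from the stated personality preferences.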
Publisher
Springer Science and Business Media LLC