Abstract
We consider a premium control problem in discrete time, formulated in terms of a Markov decision process. In a simplified setting, the optimal premium rule can be derived with dynamic programming methods. However, these classical methods are not feasible in a more realistic setting due to the dimension of the state space and the lack of explicit expressions for transition probabilities. We explore reinforcement learning techniques, using function approximation, to solve the premium control problem for realistic stochastic models. We illustrate the appropriateness of the approximate optimal premium rule compared with the true optimal premium rule in a simplified setting and further demonstrate that the approximate optimal premium rule outperforms benchmark rules in more realistic settings where classical approaches fail.
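The abstract does not give the paper's model details, so the following is a purely illustrative sketch of the kind of method it describes: semi-gradient SARSA with linear function approximation (a Fourier cosine basis) applied to a toy surplus-control MDP. The premium grid, claim distribution, cost function, and all parameter values are assumptions made for the example, not taken from the paper.

```python
# Illustrative sketch only: reinforcement learning with linear function
# approximation for a toy premium control problem. Model and parameters
# are assumptions for demonstration, not the paper's actual setup.
import numpy as np

rng = np.random.default_rng(0)

PREMIUMS = np.linspace(0.5, 2.0, 7)   # hypothetical discrete set of premium levels
TARGET = 10.0                          # hypothetical target surplus level
GAMMA = 0.95                           # discount factor
N_FEATURES = 4                         # Fourier basis terms per action

def step(surplus, premium):
    """Simulate one period: collect the premium, pay random claims, incur a cost."""
    claims = rng.gamma(shape=2.0, scale=0.5)  # assumed claim distribution
    next_surplus = float(np.clip(surplus + premium - claims, 0.0, 2.0 * TARGET))
    # Cost penalizes deviation from the target surplus and high premiums.
    cost = (next_surplus - TARGET) ** 2 + 0.1 * premium ** 2
    return next_surplus, cost

def features(surplus, a):
    """Per-action Fourier cosine features of the surplus, normalized to [0, 1]."""
    s = surplus / (2.0 * TARGET)
    basis = np.cos(np.pi * np.arange(N_FEATURES) * s)
    phi = np.zeros((len(PREMIUMS), N_FEATURES))
    phi[a] = basis
    return phi.ravel()

def q_values(w, surplus):
    """Approximate cost-to-go of each premium level in the given state."""
    return np.array([w @ features(surplus, a) for a in range(len(PREMIUMS))])

def sarsa(episodes=3000, horizon=50, alpha=0.01, eps=0.1):
    """Semi-gradient SARSA minimizing expected discounted cost."""
    w = np.zeros(len(PREMIUMS) * N_FEATURES)
    for _ in range(episodes):
        surplus = rng.uniform(0.0, 2.0 * TARGET)   # random initial surplus
        a = int(rng.integers(len(PREMIUMS)))
        for _ in range(horizon):
            next_surplus, cost = step(surplus, PREMIUMS[a])
            # Epsilon-greedy w.r.t. the current estimate (greedy = minimal cost).
            if rng.random() < eps:
                a_next = int(rng.integers(len(PREMIUMS)))
            else:
                a_next = int(np.argmin(q_values(w, next_surplus)))
            # Semi-gradient SARSA update on the linear weights.
            td_target = cost + GAMMA * (w @ features(next_surplus, a_next))
            td_error = td_target - w @ features(surplus, a)
            w += alpha * td_error * features(surplus, a)
            surplus, a = next_surplus, a_next
    return w

if __name__ == "__main__":
    w = sarsa()
    for s in (5.0, 10.0, 15.0):
        a_star = int(np.argmin(q_values(w, s)))
        print(f"surplus {s:5.1f} -> approximate optimal premium {PREMIUMS[a_star]:.2f}")
```

Note that the update uses only simulated transitions, never explicit transition probabilities, which is what makes this family of methods attractive in the realistic settings the abstract describes, where dynamic programming is infeasible.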
Publisher
Cambridge University Press (CUP)
Subject
Economics and Econometrics, Finance, Accounting