Affiliation:
1. School of Data Science, The Chinese University of Hong Kong, Shenzhen 518172, China
Abstract
This study investigates the dividend optimization problem within an entropy-regularization framework for continuous-time reinforcement learning. The exploratory Hamilton-Jacobi-Bellman (HJB) equation is established, and the optimal exploratory dividend policy is shown to be a truncated exponential distribution. We show that, for suitable choices of the maximal dividend-paying rate and the temperature parameter, the value function of the exploratory dividend optimization problem can differ significantly from the value function of the classical dividend optimization problem. In particular, the value function of the exploratory problem falls into one of three cases according to its monotonicity. Numerical examples illustrate the effect of the temperature parameter on the solution. Our results suggest that insurance companies can adopt new exploratory dividend payout strategies in unknown market environments.
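As an illustration of what drawing a dividend rate from a truncated exponential policy could look like, the following is a minimal sketch, not the authors' implementation. It samples from a generic density proportional to exp(c * u) on [0, a] by inverting the CDF. Here a stands for the maximal dividend-paying rate mentioned above, while c is a hypothetical shape parameter introduced only for this example; the abstract does not state how it depends on the value function or the temperature parameter.

```python
import math
import random


def sample_truncated_exponential(c, a, rng=random):
    """Draw one sample from the density proportional to exp(c * u) on [0, a].

    c is a hypothetical shape parameter (an assumption of this sketch);
    a is the upper bound of the support, e.g. a maximal dividend rate.
    """
    if a <= 0:
        raise ValueError("a must be positive")
    p = rng.random()  # uniform draw on [0, 1)
    if abs(c * a) < 1e-12:
        # Limiting case c -> 0: the density is uniform on [0, a].
        return a * p
    # CDF: F(u) = (exp(c*u) - 1) / (exp(c*a) - 1); solve F(u) = p for u.
    # Note: math.expm1 overflows for very large c * a (roughly above 709).
    return math.log1p(p * math.expm1(c * a)) / c


def truncated_exponential_mean(c, a):
    """Analytic mean of the same density, useful as a sanity check."""
    if abs(c * a) < 1e-12:
        return a / 2.0
    return a * math.exp(c * a) / math.expm1(c * a) - 1.0 / c
```

With c > 0 the samples concentrate near the upper bound a, with c < 0 near zero, and as c approaches zero the draw becomes uniform on [0, a], which corresponds to pure exploration in this generic setting.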
Funder
National Science Foundation of China
Shenzhen Science and Technology Program