Optimizing the CVaR via Sampling

Published:2015-02-21 Issue:1 Volume:29
ISSN:2374-3468
Container-title:Proceedings of the AAAI Conference on Artificial Intelligence
Short-container-title:AAAI

Author:

Tamar Aviv, Glassner Yonatan, Mannor Shie

Abstract

Conditional Value at Risk (CVaR) is a prominent risk measure used extensively in various domains. We develop a new formula for the gradient of the CVaR in the form of a conditional expectation. Based on this formula, we propose a novel sampling-based estimator for the gradient of the CVaR, in the spirit of the likelihood-ratio method. We analyze the bias of the estimator and prove that a corresponding stochastic gradient descent algorithm converges to a local CVaR optimum. Our method makes it possible to consider CVaR optimization in new domains. As an example, we consider a reinforcement learning application and learn a risk-sensitive controller for the game of Tetris.
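
The abstract does not state the gradient formula itself, so the following is only an illustrative sketch of what a sampling-based, likelihood-ratio style CVaR gradient estimator can look like. It assumes the commonly cited conditional-expectation form, in which the gradient is the expectation of score times (reward minus VaR) over the lower alpha-tail, and it substitutes the empirical alpha-quantile for the true VaR. The function name `cvar_gradient_estimate` and the Gaussian toy problem are hypothetical and are not taken from the paper.

```python
import numpy as np


def cvar_gradient_estimate(rewards, scores, alpha):
    """Sketch of a likelihood-ratio style CVaR gradient estimate.

    rewards: shape (n,), sampled rewards r_i.
    scores:  shape (n, d), score vectors grad_theta log f(x_i; theta).
    alpha:   tail level in (0, 1); the lower alpha-tail of rewards is used.

    Assumed form (not quoted from the source):
        (1 / (alpha * n)) * sum_i score_i * (r_i - VaR_hat) * 1[r_i <= VaR_hat]
    """
    rewards = np.asarray(rewards, dtype=float)
    scores = np.asarray(scores, dtype=float)
    n = rewards.shape[0]

    # Empirical alpha-quantile stands in for the unknown VaR.
    var_hat = np.quantile(rewards, alpha)

    # Only samples in the lower tail contribute to the estimate.
    in_tail = rewards <= var_hat
    weights = (rewards - var_hat) * in_tail

    return (scores * weights[:, None]).sum(axis=0) / (alpha * n)


# Toy check: X ~ N(theta, 1) with reward r = X. Shifting theta shifts the
# whole distribution, so the true CVaR gradient with respect to theta is 1.
rng = np.random.default_rng(0)
theta = 0.5
x = rng.normal(theta, 1.0, size=200_000)
score = (x - theta)[:, None]  # d/dtheta log N(x; theta, 1)
grad = cvar_gradient_estimate(x, score, alpha=0.05)
print(grad)  # a one-element array; should be close to 1 for this toy problem
```

In a stochastic gradient scheme, an estimate like `grad` would be plugged into an update of the form `theta += step_size * grad` to increase the CVaR of the reward; the step-size schedule and any projection step are left out of this sketch.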

Publisher

Association for the Advancement of Artificial Intelligence (AAAI)

Subject

General Medicine

Cited by 16 articles.

1. Error Density-dependent Empirical Risk Minimization;Expert Systems with Applications;2024-11

2. Constrained Risk-Sensitive Deep Reinforcement Learning for eMBB-URLLC Joint Scheduling;IEEE Transactions on Wireless Communications;2024-09

3. An Asymptotic CVaR Measure of Risk for Markov Chains;2024 International Conference on Signal Processing and Communications (SPCOM);2024-07-01

4. A survey on model-based reinforcement learning;Science China Information Sciences;2024-01-23

5. Learning Robust Communication by Adversarial Training in Networked System Control;Lecture Notes in Electrical Engineering;2024