SEMI-MARKOV DECISION PROCESSES

Published:2007-10 Issue:4 Volume:21 Page:635-657
ISSN:0269-9648
Container-title:Probability in the Engineering and Informational Sciences
language:en
Short-container-title:Prob. Eng. Inf. Sci.

Author:

Baykal-Gürsoy M., Gürsoy K.

Abstract

Considered are semi-Markov decision processes (SMDPs) with finite state and action spaces. We study two criteria: the expected average reward per unit time subject to a sample path constraint on the average cost per unit time and the expected time-average variability. Under a certain condition, for communicating SMDPs, we construct (randomized) stationary policies that are ε-optimal for each criterion; the policy is optimal for the first criterion under the unichain assumption and the policy is optimal and pure for a specific variability function in the second criterion. For general multichain SMDPs, by using a state space decomposition approach, similar results are obtained.
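
As a rough illustration of the first criterion in the abstract, the sketch below computes the long-run average reward per unit time of a finite SMDP once a stationary policy has been fixed. It is not taken from the paper. It assumes the unichain case, where the renewal-reward ratio (expected reward per transition divided by expected sojourn time per transition, both weighted by the stationary distribution of the embedded chain) gives the average reward. The function names and the two-state data are hypothetical.

```python
def stationary_distribution(P, iters=5000):
    """Approximate the stationary distribution of the embedded chain P.

    Iterates the lazy chain (I + P) / 2, which has the same stationary
    distribution as P and avoids oscillation when P is periodic.
    Assumption: P is unichain.
    """
    n = len(P)
    pi = [1.0 / n] * n
    for _ in range(iters):
        step = [sum(pi[i] * P[i][j] for i in range(n)) for j in range(n)]
        pi = [0.5 * a + 0.5 * b for a, b in zip(pi, step)]
    return pi


def average_reward_per_unit_time(P, reward, sojourn):
    """Renewal-reward ratio under a fixed stationary policy.

    reward[i]  : expected reward earned per visit to state i
    sojourn[i] : expected time spent in state i per visit
    """
    pi = stationary_distribution(P)
    expected_reward = sum(p * r for p, r in zip(pi, reward))
    expected_time = sum(p * t for p, t in zip(pi, sojourn))
    return expected_reward / expected_time


# Hypothetical two-state example: the process alternates between the states.
P = [[0.0, 1.0],
     [1.0, 0.0]]
reward = [4.0, 0.0]    # reward per visit
sojourn = [1.0, 3.0]   # mean holding time per visit

gain = average_reward_per_unit_time(P, reward, sojourn)
print(gain)
```

In this example each cycle earns 4 units of reward over 4 units of time, so the ratio is 1. The sample path constraint, the variability criterion, and the multichain decomposition discussed in the abstract are outside the scope of this sketch.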

Publisher

Cambridge University Press (CUP)

Subject

Industrial and Manufacturing Engineering; Management Science and Operations Research; Statistics, Probability and Uncertainty; Statistics and Probability

Reference: 39 articles.

1. Markov-Renewal Programming. II: Infinite Return Models, Example

2. Multichain Markov Decision Processes with a Sample Path Constraint: A Decomposition Approach

3. Constrained Semi-Markov decision processes with average rewards

4. Maximal Average-Reward Policies for Semi-Markov Decision Processes With Arbitrary State and Action Space

Cited by 15 articles.

1. Multi-agent Reinforcement Learning for Unmanned Aerial Vehicle Capture-the-Flag Game Behavior;Lecture Notes in Networks and Systems;2024

2. To Code or Not to Code: When and How to Use Network Coding in Energy Harvesting Wireless Multi-Hop Networks;IEEE Access;2024

3. Semi-Markovian Maintenance Optimization for Reinforced Concrete Enabled by a Synthesized Deterioration Model;IEEE Transactions on Reliability;2021

4. Delay-Optimal Edge Cache Replacement with Non-Markovian Content Fetching;GLOBECOM 2020 - 2020 IEEE Global Communications Conference;2020-12

5. Q-Learning Classifier;Intrusion Detection;2020