Affiliation:
1. School of Operations Research and Information Engineering, Cornell University, Ithaca, New York 14853;
2. Kellogg School of Management, Northwestern University, Evanston, Illinois 60208
Abstract
Markov Decision Process Tayloring for Approximation Design
Optimal control problems on large state spaces are difficult to solve exactly, which calls for approximate solution methods. In “A Low-rank Approximation for MDPs via Moment Coupling,” Zhang and Gurvich introduce a framework for approximating Markov decision processes (MDPs) that stands on two pillars: (i) state aggregation, as the algorithmic infrastructure, and (ii) central-limit-theorem-type approximations, as the mathematical underpinning. The theoretical guarantees are grounded in the approximation of the Bellman equation by a partial differential equation (PDE) in which, in the spirit of the central limit theorem, the transition matrix of the controlled Markov chain is reduced to its local first and second moments. Instead of solving the PDE, the algorithm introduced in the paper constructs a “sister” (controlled) Markov chain whose two local transition moments are approximately identical to those of the focal chain. Because of this moment matching, the original chain and its sister are coupled through the PDE, which facilitates optimality guarantees. Embedded into standard soft aggregation, moment matching provides a disciplined mechanism for tuning the aggregation and disaggregation probabilities.
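The idea of matching local first and second transition moments can be illustrated with a small sketch. This is not the algorithm from the paper: it is a toy, uncontrolled example under assumed parameters, in which a focal chain with unit jumps is paired with a hypothetical "sister" chain that jumps two states at a time, with jump probabilities chosen so that the local drift and second moment agree at interior states. The helper names `local_moments` and `jump_chain` and all numerical values are assumptions made for illustration.

```python
import numpy as np

def local_moments(P, states):
    """Local drift and second moment of the jump (y - x) from each state x."""
    diff = states[None, :] - states[:, None]   # diff[x, y] = y - x
    drift = (P * diff).sum(axis=1)
    second = (P * diff ** 2).sum(axis=1)
    return drift, second

def jump_chain(n, step, p_up, p_down):
    """Chain on {0, ..., n-1} with jumps of +/- step; blocked jumps stay put."""
    P = np.zeros((n, n))
    for x in range(n):
        if x + step < n:
            P[x, x + step] = p_up
        if x - step >= 0:
            P[x, x - step] = p_down
        P[x, x] = 1.0 - P[x].sum()             # remaining mass: stay at x
    return P

n = 21
states = np.arange(n, dtype=float)

# Focal chain (assumed): unit jumps, up w.p. 0.3 and down w.p. 0.2.
focal = jump_chain(n, 1, 0.3, 0.2)
m, s = 0.3 - 0.2, 0.3 + 0.2                    # interior drift and second moment

# Sister chain (hypothetical): jumps of size k, probabilities solved from
#   k * (q_up - q_down) = m   and   k**2 * (q_up + q_down) = s.
k = 2
q_up = (s / k ** 2 + m / k) / 2
q_down = (s / k ** 2 - m / k) / 2
sister = jump_chain(n, k, q_up, q_down)

drift_f, second_f = local_moments(focal, states)
drift_s, second_s = local_moments(sister, states)

interior = slice(k, n - k)                     # states where no jump is blocked
print(np.allclose(drift_f[interior], drift_s[interior]))
print(np.allclose(second_f[interior], second_s[interior]))
```

The two chains have different transition matrices but the same two local moments away from the boundary, which is the property the abstract describes as coupling the focal chain and its sister through the PDE. Boundary states are excluded from the comparison because blocked jumps change the moments there.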
Publisher
Institute for Operations Research and the Management Sciences (INFORMS)
Subject
Management Science and Operations Research, Computer Science Applications