Randomized Shortest-Path Problems: Two Related Models

Published: 2009-08 Issue: 8 Volume: 21 Page: 2363-2404
ISSN: 0899-7667
Container-title: Neural Computation
Language: en
Short-container-title: Neural Computation

Author:

Saerens Marco¹, Achbany Youssef¹, Fouss François², Yen Luh¹

Affiliation:

1. Information Systems Unit and Machine Learning Group, Université catholique de Louvain, Louvain-la-Neuve B-1348, Belgium

2. Information Systems Unit and Machine Learning Group, Université catholique de Louvain, Louvain-la-Neuve B-1348, Belgium, and Management Sciences Department, Facultés Universitaires Catholiques de Mons, Mons 7000, Belgium

Abstract

This letter addresses the problem of designing the transition probabilities of a finite Markov chain (the policy) in order to minimize the expected cost for reaching a destination node from a source node while maintaining a fixed level of entropy spread throughout the network (the exploration). It is motivated by the following scenario. Suppose you have to route agents through a network in some optimal way, for instance, by minimizing the total travel cost. Nothing particular up to now: you could use a standard shortest-path algorithm. Suppose, however, that you want to avoid pure deterministic routing policies in order, for instance, to allow some continual exploration of the network, avoid congestion, or avoid complete predictability of your routing strategy. In other words, you want to introduce some randomness or unpredictability in the routing policy (i.e., the routing policy is randomized). This problem, which will be called the randomized shortest-path problem (RSP), is investigated in this work. The global level of randomness of the routing policy is quantified by the expected Shannon entropy spread throughout the network and is provided a priori by the designer. Then, necessary conditions to compute the optimal randomized policy, the one minimizing the expected routing cost, are derived. Iterating these necessary conditions, reminiscent of Bellman's value iteration equations, allows computing an optimal policy, that is, a set of transition probabilities in each node. Interestingly and surprisingly enough, this first model, while formulated in a totally different framework, is equivalent to Akamatsu's model (1996), appearing in transportation science, for a special choice of the entropy constraint. We therefore revisit Akamatsu's model by recasting it into a sum-over-paths statistical physics formalism allowing easy derivation of all the quantities of interest in an elegant, unified way. For instance, it is shown that the unique optimal policy can be obtained by solving a simple linear system of equations. This second model is therefore more convincing because of its computational efficiency and soundness. Finally, simulation results obtained on simple, illustrative examples show that the models behave as expected.
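
The abstract states that the optimal policy of the second model follows from a simple linear system but gives no formulas. The sketch below illustrates one commonly presented form of that sum-over-paths computation; it is an assumption-laden illustration, not the authors' code. The function name `rsp_policy`, the uniform reference probabilities `p_ref`, the parameter `theta` (assumed here to control the degree of randomness), and the four-node toy graph are all hypothetical.

```python
import numpy as np

def rsp_policy(cost, p_ref, target, theta):
    """Sketch of a randomized routing policy toward `target` (illustrative names)."""
    n = cost.shape[0]
    # Weight each reference transition by exp(-theta * cost).
    # Missing edges are encoded by p_ref == 0, so their weight is 0.
    w = p_ref * np.exp(-theta * cost)
    # Paths stop at the target: it has no outgoing weight.
    w[target, :] = 0.0
    # z[i] accumulates the weight of all paths from i to target.
    # It solves the linear system (I - W) z = e_target.
    e = np.zeros(n)
    e[target] = 1.0
    z = np.linalg.solve(np.eye(n) - w, e)
    # Transition probabilities: p[i, j] = w[i, j] * z[j] / z[i].
    # Assumes the target is reachable from every node, so z > 0.
    return w * z[None, :] / z[:, None]

# Toy graph (assumed): two routes from node 0 to node 3,
# 0 -> 1 -> 3 with total cost 2 and 0 -> 2 -> 3 with total cost 3.
cost = np.array([
    [0.0, 1.0, 2.0, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
])
# Reference policy: uniform over outgoing edges.
p_ref = np.array([
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 1.0],
    [0.0, 0.0, 0.0, 0.0],
])

policy = rsp_policy(cost, p_ref, target=3, theta=1.0)
print(policy[0])  # probabilities of leaving node 0
```

In this toy graph the probability of taking the cheaper first step from node 0 works out to 1 / (1 + exp(-theta)): close to 0.5 (the reference policy) when `theta` is near zero, and close to 1 (deterministic shortest-path routing) when `theta` is large.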

Publisher

MIT Press - Journals

Subject

Cognitive Neuroscience, Arts and Humanities (miscellaneous)

Link

https://www.mitpressjournals.org/doi/pdf/10.1162/neco.2009.11-07-643

References: 63 articles.

1. Optimal Tuning of Continual Online Exploration in Reinforcement Learning

2. Tuning continual exploration in reinforcement learning: An optimality property of the Boltzmann strategy

3. Algorithms for Games

4. Cyclic flows, Markov process and stochastic traffic assignment

5. Decomposition of Path Choice Entropy in General Transport Networks

Cited by 79 articles.

1. On-Demand Meal Delivery: A Markov Model for Circulating Couriers; Transportation Science; 2024-09-10

2. Advances and challenges in ecological connectivity science; Ecology and Evolution; 2024-09

3. Sparse randomized policies for Markov decision processes based on Tsallis divergence regularization; Knowledge-Based Systems; 2024-09

4. Identifying the environmental drivers of corridors and predicting connectivity between seasonal ranges in multiple populations of Alpine ibex (Capra ibex) as tools for conserving migration; Diversity and Distributions; 2024-06-04

5. Sensitivity to network perturbations in the randomized shortest paths framework: theory and applications in ecological connectivity; Journal of Physics: Complexity; 2024-05-29