Monte Carlo tree search for dynamic shortest‐path interdiction

Published: 2024-07-10
ISSN: 0028-3045
Journal: Networks
Language: en

Author:

Bochkarev Alexey A.¹, Smith J. Cole²

Affiliation:

1. Department of Mathematics, RPTU Kaiserslautern‐Landau, Kaiserslautern, Germany

2. Department of Electrical Engineering and Computer Science, Syracuse University, Syracuse, New York, USA

Abstract

We present a reinforcement learning‐based heuristic for a two‐player interdiction game called the dynamic shortest path interdiction problem (DSPI). The DSPI involves an evader and an interdictor who take turns in the problem, with the interdictor selecting a set of arcs to attack and the evader choosing an arc to traverse at each step of the game. Our model employs the Monte Carlo tree search framework to learn a policy for the players using randomized roll‐outs. This policy is stored as an asymmetric game tree and can be further refined as the game unfolds. We leverage alpha–beta pruning and existing bounding schemes in the literature to prune suboptimal branches. Our numerical experiments demonstrate that the prescribed approach yields near‐optimal solutions in many cases and allows for flexibility in balancing solution quality and computational effort.
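
The turn structure and the randomized roll‐outs mentioned in the abstract can be illustrated with a small Python sketch. It is not the authors' implementation and covers only the roll‐out step, not the tree search, pruning, or bounding. Several details are assumptions made for illustration: the toy instance `ARCS`, an attack budget counted in arcs, an additive delay on each attacked arc, and an interdictor that attacks only arcs leaving the evader's current node.

```python
import random

# Hypothetical toy instance: arc -> (base cost, extra delay if attacked).
ARCS = {
    ("s", "a"): (1, 4), ("s", "b"): (2, 1),
    ("a", "t"): (1, 3), ("b", "t"): (2, 2),
}
TARGET = "t"


def out_arcs(node):
    """Return the arcs leaving `node`."""
    return [arc for arc in ARCS if arc[0] == node]


def rollout(node, attacked, budget, rng):
    """Play one randomized game from `node`; return the evader's total cost."""
    attacked = set(attacked)  # copy so the caller's set is not modified
    total = 0
    while node != TARGET:
        # Interdictor's turn (assumed simplification): attack a random
        # subset of the not-yet-attacked arcs leaving the current node,
        # without exceeding the remaining budget.
        candidates = [a for a in out_arcs(node) if a not in attacked]
        k = rng.randint(0, min(budget, len(candidates)))
        attacked.update(rng.sample(candidates, k))
        budget -= k
        # Evader's turn: traverse a uniformly random outgoing arc.
        arc = rng.choice(out_arcs(node))
        base, delay = ARCS[arc]
        total += base + (delay if arc in attacked else 0)
        node = arc[1]
    return total


def estimate_value(node, attacked, budget, n_rollouts=1000, seed=0):
    """Average the cost of `n_rollouts` random games started from a state."""
    rng = random.Random(seed)
    costs = [rollout(node, attacked, budget, rng) for _ in range(n_rollouts)]
    return sum(costs) / n_rollouts
```

Calling `estimate_value("s", set(), 2)` averages random play from the source with two attacks available. In a Monte Carlo tree search, estimates of this kind would be backed up through the game tree to guide which branches to expand next.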

Publisher

Wiley

Link

https://onlinelibrary.wiley.com/doi/pdf/10.1002/net.22243
