Learning 2-Opt Heuristics for Routing Problems via Deep Reinforcement Learning

Published:2021-07-23 Issue:5 Volume:2 Page:
ISSN:2662-995X
Container-title:SN Computer Science
language:en
Short-container-title:SN COMPUT. SCI.

Author:

da Costa Paulo, Rhuggenaath Jason, Zhang Yingqian, Akcay Alp, Kaymak Uzay

Abstract

Recent works using deep learning to solve routing problems such as the traveling salesman problem (TSP) have focused on learning construction heuristics. Such approaches find good quality solutions but require additional procedures such as beam search and sampling to improve solutions and achieve state-of-the-art performance. However, few studies have focused on improvement heuristics, where a given solution is improved until reaching a near-optimal one. In this work, we propose to learn a local search heuristic based on 2-opt operators via deep reinforcement learning. We propose a policy gradient algorithm to learn a stochastic policy that selects 2-opt operations given a current solution. Moreover, we introduce a policy neural network that leverages a pointing attention mechanism, which can be easily extended to more general k-opt moves. Our results show that the learned policies can improve even over random initial solutions and approach near-optimal solutions faster than previous state-of-the-art deep learning methods for the TSP. We also show we can adapt the proposed method to two extensions of the TSP: the multiple TSP and the Vehicle Routing Problem, achieving results on par with classical heuristics and learned methods.
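For readers unfamiliar with the operator named in the abstract, the following is a minimal sketch of a single 2-opt move on a TSP tour, assuming Euclidean 2D coordinates. It is not the authors' implementation: the function names (`tour_length`, `two_opt_move`, `random_two_opt_search`) are hypothetical, and a uniform random choice of the index pair stands in for the learned stochastic policy described in the paper.

```python
import math
import random


def tour_length(points, tour):
    """Total length of the closed tour visiting points in the given order."""
    n = len(tour)
    return sum(
        math.dist(points[tour[k]], points[tour[(k + 1) % n]]) for k in range(n)
    )


def two_opt_move(tour, i, j):
    """Apply one 2-opt move: reverse the segment tour[i..j] (inclusive).

    This removes the edges entering position i and leaving position j
    and reconnects the tour with the segment in reverse order.
    """
    return tour[:i] + tour[i:j + 1][::-1] + tour[j + 1:]


def random_two_opt_search(points, tour, steps, seed=0):
    """Apply randomly chosen 2-opt moves and keep the best tour seen.

    Assumption: uniform random (i, j) selection is a placeholder for a
    learned policy; moves that worsen the current tour are still applied.
    """
    rng = random.Random(seed)
    current = list(tour)
    best, best_len = list(current), tour_length(points, current)
    for _ in range(steps):
        i, j = sorted(rng.sample(range(len(current)), 2))
        current = two_opt_move(current, i, j)
        current_len = tour_length(points, current)
        if current_len < best_len:
            best, best_len = list(current), current_len
    return best, best_len


if __name__ == "__main__":
    # Four corners of a unit square, visited in a self-crossing order.
    points = [(0.0, 0.0), (1.0, 1.0), (1.0, 0.0), (0.0, 1.0)]
    tour = [0, 1, 2, 3]
    print(tour_length(points, tour))
    print(two_opt_move(tour, 1, 2))
```

On the unit-square example, reversing positions 1 to 2 turns the crossing tour `[0, 1, 2, 3]` into `[0, 2, 1, 3]`, which walks the square's perimeter.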

Funder

Nederlandse Organisatie voor Wetenschappelijk Onderzoek

Publisher

Springer Science and Business Media LLC

Link

https://link.springer.com/content/pdf/10.1007/s42979-021-00779-2.pdf

Reference: 40 articles.

1. Angeniol B, Vaubois GDLC, Le Texier JY. Self-organizing feature maps and the travelling salesman problem. Neural Netw. 1988;1(4):289–93.

2. Applegate DL, Bixby RE, Chvatal V, Cook WJ. The traveling salesman problem: a computational study. Princeton: Princeton University Press; 2006.

3. Arora S. Polynomial time approximation schemes for Euclidean traveling salesman and other geometric problems. J ACM. 1998;45(5):753–82.

4. Bektas T. The multiple traveling salesman problem: an overview of formulations and solution procedures. Omega. 2006;34(3):209–19.

5. Bello I, Pham H. Neural combinatorial optimization with reinforcement learning. In: Proceedings of the 5th international conference on learning representations (ICLR), 2017.

Cited by 35 articles.

1. A scalable learning approach for the capacitated vehicle routing problem;Computers & Operations Research;2024-11

2. Optimization of UAV Flight Paths in Multi-UAV Networks for Efficient Data Collection;Arabian Journal for Science and Engineering;2024-07-29

3. Collaborative orchard pesticide spraying routing problem with multi-vehicles supported multi-UAVs;Journal of Cleaner Production;2024-06

4. Machine Learning to Solve Vehicle Routing Problems: A Survey;IEEE Transactions on Intelligent Transportation Systems;2024-06

5. Generalization in Deep RL for TSP Problems via Equivariance and Local Search;SN Computer Science;2024-03-29