Multi-Task Multi-Objective Evolutionary Search Based on Deep Reinforcement Learning for Multi-Objective Vehicle Routing Problems with Time Windows

Published:2024-08-12 Issue:8 Volume:16 Page:1030
ISSN:2073-8994
Container-title:Symmetry
language:en
Short-container-title:Symmetry

Author:

Deng Jianjun¹, Wang Junjie², Wang Xiaojun², Cai Yiqiao², Liu Peizhong³

Affiliation:

1. Chengdu Aeronautic Polytechnic, Chengdu 610100, China

2. College of Computer Science and Technology, Huaqiao University, Xiamen 361021, China

3. College of Engineering, Huaqiao University, Quanzhou 362000, China

Abstract

The vehicle routing problem with time windows (VRPTW) is a combinatorial optimization problem in supply chains and logistics that has been widely studied over the last decade. Recent research has explored deep reinforcement learning (DRL) as a promising approach to the VRPTW. However, addressing the VRPTW with many conflicting objectives (MOVRPTW) remains a challenge for DRL. The MOVRPTW considers five conflicting objectives simultaneously: minimizing the number of vehicles required, the total travel distance, the travel time of the longest route, the total waiting time for early arrivals, and the total delay time for late arrivals. To tackle the MOVRPTW, this study introduces the MTMO/DRL-AT, a multi-task multi-objective evolutionary search algorithm that makes full use of both DRL and the multitasking mechanism. In the MTMO/DRL-AT, a two-objective MOVRPTW is constructed as an assisted task whose objectives are to minimize the total travel distance and the travel time of the longest route. The main task and the assisted task are solved simultaneously in a multitasking scenario. Each task is decomposed into scalar optimization subproblems, which are then solved by an attention model trained using DRL. The outputs of these trained models serve as the initial solutions for the MTMO/DRL-AT. The proposed algorithm then applies knowledge transfer and multiple local search operators to further improve the quality of these promising solutions. Simulation results on real-world benchmarks show that the MTMO/DRL-AT outperforms several other algorithms in solving the MOVRPTW.
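To make the five objectives and the idea of a scalar subproblem concrete, the following is a minimal Python sketch, not the authors' implementation. The names `evaluate` and `weighted_sum` are hypothetical. The sketch also rests on several assumptions the abstract does not state: Euclidean distances, unit vehicle speed, soft time windows, a route duration that includes waiting and service time, and a weighted-sum scalarization (the abstract says only that each task is decomposed into scalar subproblems).

```python
import math

def evaluate(routes, coords, windows, service, speed=1.0):
    """Return the five MOVRPTW objectives for a candidate solution.

    routes  : list of routes; each route lists customer indices
              (the depot, index 0, is implicit at both ends)
    coords  : (x, y) position per node, index 0 is the depot
    windows : (earliest, latest) service-start time per node
    service : service duration per node
    Assumption: time windows are soft, so late arrivals are allowed
    and contribute to the total delay.
    """
    total_dist = longest_time = total_wait = total_delay = 0.0
    for route in routes:
        t, prev = 0.0, 0
        for node in route + [0]:          # visit customers, then return to depot
            d = math.dist(coords[prev], coords[node])
            total_dist += d
            t += d / speed
            if node != 0:
                early, late = windows[node]
                if t < early:             # early arrival: wait until window opens
                    total_wait += early - t
                    t = early
                elif t > late:            # late arrival: record the delay
                    total_delay += t - late
                t += service[node]
            prev = node
        # Assumption: route duration includes waiting and service time.
        longest_time = max(longest_time, t)
    return (len(routes), total_dist, longest_time, total_wait, total_delay)

def weighted_sum(objectives, weights):
    """One scalar subproblem: a weighted sum of the objective values."""
    return sum(w * f for w, f in zip(weights, objectives))

# Toy instance with hypothetical data: one vehicle serving two customers.
coords = [(0, 0), (3, 4), (6, 8)]
windows = [(0, 100), (10, 20), (0, 8)]
service = [0, 1, 1]
main_task = evaluate([[1, 2]], coords, windows, service)
assisted_task = main_task[1:3]            # total distance and longest-route time
scalar_value = weighted_sum(main_task, [0.2] * 5)
```

Under these assumptions, the two-objective assisted task is simply the slice of the main objective vector holding the total travel distance and the longest-route time, and each weight vector defines one scalar subproblem.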

Funder

Natural Science Foundation of Fujian Province of China

Fujian Provincial Science and Technology Major Project

Quanzhou Science and Technology Major Project

Publisher

MDPI AG

Link

https://www.mdpi.com/2073-8994/16/8/1030/pdf
