Abstract
This paper proposes DRL-MOPFSP, an algorithm framework based on deep reinforcement learning (DRL) for the multi-objective permutation flow shop scheduling problem (MOPFSP), with the objectives of minimizing the maximum completion time (makespan) and the energy consumption. First, a DRL-PFSP method models the PFSP with a pointer network, which is trained with Actor-Critic reinforcement learning to minimize the makespan. Next, a neighborhood search based on the critical path further improves the solutions produced by DRL-PFSP. An energy-saving strategy based on job setback is then applied to optimize the energy consumption objective. Finally, simulations and comparative experiments against classical multi-objective algorithms are conducted on 24 instances of different scales. The results show that DRL-MOPFSP solves instances quickly, scales to different problem sizes without a size restriction, and generalizes well.
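For readers unfamiliar with the makespan objective, the following minimal sketch shows how the maximum completion time of a given job permutation can be evaluated in a permutation flow shop using the standard completion-time recurrence. It is an illustration only and not the paper's implementation: the function name `makespan`, the `proc` matrix layout, and the toy data are assumptions introduced here.

```python
from typing import Sequence


def makespan(proc: Sequence[Sequence[int]], perm: Sequence[int]) -> int:
    """Return the maximum completion time of a permutation schedule.

    proc[j][m] is the (assumed) processing time of job j on machine m;
    every job visits machines 0..M-1 in the same order.
    """
    n_machines = len(proc[0])
    # completion[m] = completion time on machine m of the last job scheduled so far
    completion = [0] * n_machines
    for job in perm:
        # On the first machine, jobs simply run back to back.
        completion[0] += proc[job][0]
        for m in range(1, n_machines):
            # A job starts on machine m once machine m is free and the job
            # has finished on machine m - 1.
            completion[m] = max(completion[m], completion[m - 1]) + proc[job][m]
    return completion[-1]


# Hypothetical toy instance: 3 jobs, 2 machines.
proc_times = [[3, 2], [1, 4], [2, 1]]
print(makespan(proc_times, [0, 1, 2]))  # 10
print(makespan(proc_times, [1, 0, 2]))  # 8
```

The two calls show that the job order alone changes the makespan, which is the quantity a sequence-generating policy such as a pointer network would be trained to reduce.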