Affiliation:
1. Ordos Power Supply Branch of Inner Mongolia Electric Power (Group) Co., Ltd., Ordos 017004, Inner Mongolia, China
2. School of Control and Computer Engineering, North China Electric Power University, Baoding 071000, China
Abstract
<abstract>
<p>In park-level integrated energy systems (PIES), the subsystems operate on different time scales, and the growing complexity of the system structure and the increasing number of control variables can expose deep reinforcement learning (DRL) to the "curse of dimensionality," which hinders further improvement of the economic benefits and energy utilization efficiency of PIES. This article proposes a multi-time-scale energy management method for PIES based on the Proximal Policy Optimization (PPO) reinforcement learning algorithm. First, the PIES is divided into an upper layer, which contains the power and thermal systems, and a lower layer, which contains the gas system. An energy management model is built for each layer based on PPO; the two layers then formulate energy management schemes for the power, thermal, and gas systems on a long (30 min) and a short (6 min) time scale. Confirmatory and comparative experiments show that the proposed method not only effectively overcomes the curse of dimensionality of DRL algorithms during training but also develops energy management plans for the different PIES subsystems on differentiated time scales, improving the overall economic benefits of the system and reducing carbon emissions.</p>
</abstract>
Publisher
American Institute of Mathematical Sciences (AIMS)