A Review of Reinforcement Learning-Based Powertrain Controllers: Effects of Agent Selection for Mixed-Continuity Control and Reward Formulation

Published: 2023-04-14 Issue: 8 Volume: 16 Page: 3450
ISSN: 1996-1073
Container-title: Energies
Language: en
Short-container-title: Energies

Author:

Egan Daniel¹, Zhu Qilun¹, Prucka Robert¹

Affiliation:

1. Department of Automotive Engineering, Clemson University, Clemson, SC 29634, USA

Abstract

One major cost of improving automotive fuel economy while simultaneously reducing tailpipe emissions is increased powertrain complexity. This complexity has consequently increased the resources (both time and money) needed to develop such powertrains. Powertrain performance is heavily influenced by the quality of the controller/calibration. Since traditional control development processes are becoming resource-intensive, better alternative methods are worth pursuing. Recently, reinforcement learning (RL), a machine learning technique, has proven capable of creating optimal controllers for complex systems. The model-free nature of RL has the potential to streamline the control development process, possibly reducing the time and money required. This article reviews the impact of choices in two areas on the performance of RL-based powertrain controllers to provide a better awareness of their benefits and consequences. First, we examine how RL algorithm action continuities and control–actuator continuities are matched, via native operation or conversion. Second, we discuss the formulation of the reward function. RL is able to optimize control policies defined by a wide spectrum of reward functions, including some functions that are difficult to implement with other techniques. RL action and control–actuator continuity matching affects the ability of the RL-based controller to understand and operate the powertrain, while the reward function defines optimal behavior. Finally, opportunities for future RL-based powertrain control development are identified and discussed.

Publisher

MDPI AG

Subject

Energy (miscellaneous); Energy Engineering and Power Technology; Renewable Energy, Sustainability and the Environment; Electrical and Electronic Engineering; Control and Optimization; Engineering (miscellaneous); Building and Construction

Link

https://www.mdpi.com/1996-1073/16/8/3450/pdf

Cited by 4 articles.

1. Comparative study of real-time A-ECMS and rule-based energy management strategies in long haul heavy-duty PHEVs;Energy Conversion and Management: X;2024-07

2. Synergizing Transfer Learning and Multi-Agent Systems for Thermal Parametrization in Induction Traction Motors;Applied Sciences;2024-05-23

3. Deep reinforcement learning implementation on IC engine idle speed control;Ain Shams Engineering Journal;2024-05

4. Comparative Study of Real-Time A-Ecms and Rule-Based Energy Management Strategies in Long Haul Heavy-Duty Phevs;2024