Reliability assessment of off-policy deep reinforcement learning: A benchmark for aerodynamics

Published:2024 Volume:5
ISSN:2632-6736
Container-title:Data-Centric Engineering
language:en
Short-container-title:DCE

Author:

Berger Sandrine, Arroyo Ramo Andrea, Guillet Valentin, Lahire Thibault, Martin Brice, Jardin Thierry, Rachelson Emmanuel, Bauerheim Michaël

Abstract

Deep reinforcement learning (DRL) is promising for solving control problems in fluid mechanics, but it is a new field with many open questions. Possibilities are numerous and guidelines are rare concerning the choice of algorithms or the best formulation for a given problem. Moreover, DRL algorithms learn a control policy by collecting samples from an environment, which can be very costly when the environment is a Computational Fluid Dynamics (CFD) solver. Algorithms must therefore minimize the number of samples required for learning (sample efficiency) and generate a usable policy from each training (reliability). This paper aims to (a) evaluate three existing algorithms (DDPG, TD3, and SAC) on a fluid mechanics problem with respect to reliability and sample efficiency across a range of training configurations, (b) establish a fluid mechanics benchmark of increasing data collection cost, and (c) provide practical guidelines and insights for the fluid dynamics practitioner. The benchmark consists of controlling an airfoil to reach a target. The problem is solved with either a low-cost, low-order model or a high-fidelity CFD approach. The study found that DDPG and TD3 have learning stability issues that depend strongly on DRL hyperparameters and reward formulation, and therefore require significant tuning. In contrast, SAC is shown to be both reliable and sample efficient across a wide range of parameter setups, making it well suited to solving fluid mechanics problems and setting up new cases without tremendous effort. In particular, SAC is resistant to small replay buffers, which could be critical if full flow fields were to be stored.
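The two evaluation criteria named in the abstract can be illustrated with a minimal Python sketch. It is not the paper's procedure: the success criterion (final evaluation return at or above a threshold), the function names, and the numbers in `example_runs` are all assumptions made for illustration. Reliability is taken here as the fraction of repeated trainings that end with a usable policy, and sample efficiency as the number of environment samples collected before a successful run first reaches the threshold.

```python
from typing import Optional, Sequence


def samples_to_threshold(returns: Sequence[float], samples_per_eval: int,
                         threshold: float) -> Optional[int]:
    """Samples collected when the evaluation return first reaches `threshold`.

    Returns None if the threshold is never reached.
    """
    for i, value in enumerate(returns):
        if value >= threshold:
            return (i + 1) * samples_per_eval
    return None


def assess(runs: Sequence[Sequence[float]], samples_per_eval: int,
           threshold: float) -> dict:
    """Summarize repeated trainings of one algorithm configuration.

    Assumption: a run is "usable" if its final evaluation return meets the
    threshold. Other criteria are equally plausible.
    """
    successes = [run for run in runs if run and run[-1] >= threshold]
    reliability = len(successes) / len(runs) if runs else 0.0
    # Every successful run reaches the threshold at least once (its last point).
    costs = [samples_to_threshold(run, samples_per_eval, threshold)
             for run in successes]
    mean_cost = sum(costs) / len(costs) if costs else None
    return {"reliability": reliability, "mean_samples_to_threshold": mean_cost}


# Hypothetical evaluation returns for four trainings (one per random seed),
# evaluated every 1000 environment samples. Made-up numbers, not paper data.
example_runs = [
    [-5.0, -2.0, -0.5, -0.4],  # converges at the third evaluation
    [-5.0, -0.8, -0.6, -0.5],  # converges at the second evaluation
    [-5.0, -0.9, -3.0, -4.0],  # reaches the threshold, then diverges
    [-6.0, -5.5, -5.0, -4.8],  # never reaches the threshold
]

summary = assess(example_runs, samples_per_eval=1000, threshold=-1.0)
print(summary)
```

Counting the third run as a failure reflects the abstract's definition of reliability: a training that touches the target performance but ends with an unusable policy still wastes the samples spent on it.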

Publisher

Cambridge University Press (CUP)

Subject

Applied Mathematics,Computer Science Applications,General Engineering,Statistics and Probability
