Authors:
Hohbach Annika, Jordaan Hendrik Willem, Engelbrecht Japie
Abstract
Payload transport using rotary-wing unmanned aerial vehicles (RUAVs) has grown in popularity. Suspending the payload from the RUAV expands the range of use cases but changes the system dynamics in a way that may result in instability. This paper explores how a model-free solution can be used to plan robust trajectories that account for the added oscillations when the payload parameters vary. The twin-delayed deep deterministic policy gradient (TD3) algorithm, a model-free reinforcement learning method, is used to train an agent that acts as a local planner, generating optimal trajectories that complete two defined circuits while minimizing the swing of the payload. A non-linear model predictive controller (NMPC) is implemented as a model-based benchmark to evaluate the capabilities of the model-free approach and determine its viability, without declaring either approach superior. The results indicate that the model-free TD3 agent performs comparably to the model-based NMPC on the two defined circuits, with increased robustness to payload uncertainty when trained with different payload parameters.
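For illustration only (not part of the paper): a minimal sketch of how a TD3 agent could be trained for such a planning task, assuming the Stable-Baselines3 and Gymnasium libraries. The standard Pendulum-v1 environment stands in for a slung-payload RUAV simulator, which the paper does not specify, and the noise settings are placeholder choices.

```python
# Minimal TD3 training sketch (assumes Stable-Baselines3 + Gymnasium are installed).
# Pendulum-v1 is only a stand-in for a slung-payload RUAV environment, which is
# not specified here; hyperparameters are illustrative placeholders.
import gymnasium as gym
import numpy as np
from stable_baselines3 import TD3
from stable_baselines3.common.noise import NormalActionNoise

env = gym.make("Pendulum-v1")  # stand-in continuous-control task

# Gaussian exploration noise on the continuous action space, as TD3 typically uses.
n_actions = env.action_space.shape[-1]
action_noise = NormalActionNoise(mean=np.zeros(n_actions),
                                 sigma=0.1 * np.ones(n_actions))

model = TD3("MlpPolicy", env, action_noise=action_noise, verbose=1)
model.learn(total_timesteps=50_000)

# After training, the agent can be queried like a local planner:
obs, _ = env.reset()
action, _ = model.predict(obs, deterministic=True)
```

Robustness to payload uncertainty could, in a similar spirit, be encouraged by randomizing payload parameters across training episodes, though the exact training setup used in the paper is not described in this abstract.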
Subject
Computer Networks and Communications, Hardware and Architecture, Software