Affiliation:
1. Department of Operational Sciences, Air Force Institute of Technology, USA
Abstract
The military medical evacuation (MEDEVAC) dispatching problem involves determining optimal policies for evacuating combat casualties to maximize patient survivability during military operations. This study explores a variation of the MEDEVAC dispatching problem, focusing on the control of armed escorts using a Markov decision process (MDP) model and model-based reinforcement learning (RL) approaches. A discounted, continuous-time, infinite-horizon MDP model is developed to maximize the expected total discounted reward of the system. Two model-based RL solution approaches are proposed: one utilizing semi-gradient descent Q-learning and another employing semi-gradient descent SARSA. A computational example, set in western and central Africa during contingency operations, assesses the performance of the RL-generated policies against the myopic policy that military medical planners currently employ. Solution quality is measured by expected response time, a key determinant of life-saving potential in MEDEVAC operations. Sensitivity analyses and excursion scenarios further evaluate the RL-generated policies. By explicitly controlling armed escort assets, dispatching authorities can better manage the location and allocation of these resources throughout combat operations. The findings of this study have the potential to inform military medical planning, operations, and tactics, ultimately leading to improved MEDEVAC system performance and higher patient survivability rates.
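For context on the two learning schemes named in the abstract, the textbook semi-gradient SARSA update for a parametric action-value approximation $\hat{q}(s,a,\mathbf{w})$ is sketched below (per Sutton and Barto); the Q-learning variant replaces the on-policy bootstrap with a maximization over actions. This is the standard discrete-step form, not necessarily the paper's exact continuous-time, discounted implementation, whose feature design and discounting scheme are not specified here.
\[
\mathbf{w} \leftarrow \mathbf{w} + \alpha \Big[ R_{t+1} + \gamma\, \hat{q}(S_{t+1}, A_{t+1}, \mathbf{w}) - \hat{q}(S_t, A_t, \mathbf{w}) \Big] \nabla_{\mathbf{w}}\, \hat{q}(S_t, A_t, \mathbf{w})
\qquad \text{(SARSA)}
\]
\[
\mathbf{w} \leftarrow \mathbf{w} + \alpha \Big[ R_{t+1} + \gamma \max_{a} \hat{q}(S_{t+1}, a, \mathbf{w}) - \hat{q}(S_t, A_t, \mathbf{w}) \Big] \nabla_{\mathbf{w}}\, \hat{q}(S_t, A_t, \mathbf{w})
\qquad \text{(Q-learning)}
\]
In both updates, $\alpha$ is the step size, $\gamma$ the discount factor, and the bracketed term the temporal-difference error; "semi-gradient" refers to differentiating only the current estimate $\hat{q}(S_t, A_t, \mathbf{w})$, not the bootstrapped target.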
Funder
Air Force Office of Scientific Research