Clinical VMAT machine parameter optimization for localized prostate cancer using deep reinforcement learning

Author:

Hrinivich William T. (1), Bhattacharya Mahasweta (1), Mekki Lina (1), McNutt Todd (1), Jia Xun (1), Li Heng (1), Song Daniel Y. (1), Lee Junghoon (1)

Affiliation:

1. Department of Radiation Oncology and Molecular Radiation Sciences, Johns Hopkins University, Baltimore, Maryland, USA

Abstract

Background: Volumetric modulated arc therapy (VMAT) machine parameter optimization (MPO) remains computationally expensive and sensitive to input dose objectives, creating challenges for manual and automatic planning. Reinforcement learning (RL) involves machine learning through extensive trial-and-error and has demonstrated performance exceeding that of humans and existing algorithms in several domains.

Purpose: To develop and evaluate an RL approach for VMAT MPO for localized prostate cancer to rapidly and automatically generate deliverable VMAT plans for a clinical linear accelerator (linac) and compare resultant dosimetry to clinical plans.

Methods: We extended our previous RL approach to enable VMAT MPO of a 3D beam model for a clinical linac through a policy network. The network accepts an input state describing the current control point and predicts continuous machine parameters for the next control point, which are used to update the input state, repeating until plan termination. RL training was conducted to minimize a dose-based cost function for a prescription of 60 Gy in 20 fractions using CT scans and contours from 136 retrospective localized prostate cancer patients, 20 of whom had existing plans used to initialize training. Data augmentation was employed to mitigate over-fitting, and parameter exploration was achieved using Gaussian perturbations. Following training, RL VMAT was applied to an independent cohort of 15 patients, and the resultant dosimetry was compared to clinical plans. We also combined the RL approach with our clinical treatment planning system (TPS) to automate final plan refinement, creating the potential for manual review and edits as required for clinical use.

Results: RL training was conducted for 5000 iterations, producing 40 000 plans during exploration. Mean ± SD execution time to produce deliverable VMAT plans in the test cohort was 3.3 ± 0.5 s; these plans were then automatically refined in the TPS, taking an additional 77.4 ± 5.8 s. When normalized to provide equivalent target coverage, the RL+TPS plans provided a similar mean ± SD overall maximum dose of 63.2 ± 0.6 Gy and a lower mean rectum dose of 17.4 ± 7.4 compared to 63.9 ± 1.5 Gy (p = 0.061) and 21.0 ± 6.0 (p = 0.024) for the clinical plans.

Conclusions: An approach for VMAT MPO using RL for a clinical linac model was developed and applied to automatically generate deliverable plans for localized prostate cancer patients, and when combined with the clinical TPS it shows potential to rapidly generate high-quality plans. The RL VMAT approach shows promise to discover advanced linac control policies through trial-and-error, and algorithm limitations and future directions are identified and discussed.
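The control-point loop described in the Methods (a policy reads the current state, predicts the next machine parameters, and the state is updated until the plan ends) can be illustrated with a minimal sketch. Everything in it is an assumption made for illustration and not the authors' implementation: the linear `toy_policy` stands in for the policy network, `toy_cost` stands in for the dose-based cost function, and the state layout, the number of control points, and the number of parameters are hypothetical. Gaussian perturbation of the predicted parameters is shown as one simple way to explore; using 8 perturbed rollouts mirrors the reported ratio of 40 000 plans over 5000 iterations, which averages to 8 plans per iteration.

```python
import numpy as np

N_CONTROL_POINTS = 8  # hypothetical; a clinical arc has many more control points
N_PARAMS = 4          # hypothetical number of machine parameters per control point


def toy_policy(state, weights):
    """Stand-in for the policy network: maps a state to continuous parameters."""
    return np.tanh(weights @ state)


def rollout(weights, rng=None, sigma=0.0):
    """Generate one plan by stepping the policy through every control point.

    With sigma > 0, Gaussian noise perturbs each prediction (exploration).
    """
    params = np.zeros(N_PARAMS)
    plan = []
    for cp in range(N_CONTROL_POINTS):
        # Assumed state: normalized control point index plus current parameters.
        state = np.concatenate(([cp / N_CONTROL_POINTS], params))
        params = toy_policy(state, weights)
        if sigma > 0.0:
            params = params + rng.normal(0.0, sigma, size=N_PARAMS)
        plan.append(params)
    return np.stack(plan)


def toy_cost(plan):
    """Placeholder for the dose-based cost: a simple quadratic in the parameters."""
    return float(np.sum((plan - 0.5) ** 2))


rng = np.random.default_rng(0)
weights = rng.normal(0.0, 0.1, size=(N_PARAMS, N_PARAMS + 1))

greedy_plan = rollout(weights)                                   # no exploration
explored = [rollout(weights, rng, sigma=0.1) for _ in range(8)]  # perturbed plans
best = min(explored + [greedy_plan], key=toy_cost)               # lowest-cost plan
print(greedy_plan.shape, toy_cost(best) <= toy_cost(greedy_plan))
```

In an actual training loop, the lower-cost explored plans would be used to update the policy weights; that update step is omitted here because the abstract does not specify it.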

Funder

Commonwealth Fund

Publisher

Wiley
