Purpose: High-dose-rate (HDR) brachytherapy (BT) is an effective cancer treatment in which the radiation source is placed within the body. Treatment planning is critical to a successful outcome. Almost all currently proposed treatment planning methods are built on stochastic heuristic algorithms, which limits the generation of higher-quality plans. This study proposed a novel treatment planning method that adjusts dwell times in a human-like fashion to improve plan quality.
Methods: We built an intelligent treatment planner network (ITPN) based on deep reinforcement learning (DRL). The network architecture of ITPN is a Dueling Double Deep Q-Network. The state is the dwell time at each dwell position, and the action specifies which dwell time to adjust and how to adjust it. A hybrid equivalent uniform dose objective function was established, and rewards were assigned according to changes in its value. Training used the epsilon-greedy algorithm, and experience replay was performed with a SumTree data structure.
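The two training components named in the Methods, epsilon-greedy action selection and a SumTree-backed replay buffer, can be illustrated with a minimal sketch. It is not the authors' implementation: the class and function names (`SumTree`, `epsilon_greedy`), the stored items, and the priority values are assumptions made for illustration, and the abstract does not say how priorities were computed.

```python
import random


class SumTree:
    """Array-backed binary tree; each internal node stores the sum of its
    children's priorities, so sampling proportional to priority is O(log n)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.tree = [0.0] * (2 * capacity - 1)  # internal nodes, then leaves
        self.data = [None] * capacity           # stored transitions
        self.write = 0                          # next slot to overwrite

    def total(self):
        return self.tree[0]

    def update(self, leaf, priority):
        # Set a leaf's priority and propagate the difference up to the root.
        change = priority - self.tree[leaf]
        self.tree[leaf] = priority
        while leaf != 0:
            leaf = (leaf - 1) // 2
            self.tree[leaf] += change

    def add(self, priority, item):
        # Store an item, overwriting the oldest entry once the buffer is full.
        leaf = self.write + self.capacity - 1
        self.data[self.write] = item
        self.update(leaf, priority)
        self.write = (self.write + 1) % self.capacity

    def get(self, value):
        # Walk down from the root to the leaf whose cumulative range holds value.
        idx = 0
        while 2 * idx + 1 < len(self.tree):
            left = 2 * idx + 1
            if value < self.tree[left]:
                idx = left
            else:
                value -= self.tree[left]
                idx = left + 1
        return idx, self.tree[idx], self.data[idx - self.capacity + 1]


def epsilon_greedy(q_values, epsilon):
    """With probability epsilon pick a random action, otherwise the greedy one."""
    if random.random() < epsilon:
        return random.randrange(len(q_values))
    return max(range(len(q_values)), key=q_values.__getitem__)


# Hypothetical usage: four stored transitions with made-up priorities.
buffer = SumTree(capacity=4)
for priority, item in [(1.0, "a"), (2.0, "b"), (3.0, "c"), (4.0, "d")]:
    buffer.add(priority, item)

leaf, priority, item = buffer.get(random.uniform(0.0, buffer.total()))
action = epsilon_greedy([0.1, 0.9, 0.3], epsilon=0.1)
```

In this sketch, an item with priority 4.0 is drawn four times as often as one with priority 1.0, which is the property a prioritized replay buffer relies on.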
Results: In the evaluation of ITPN using 20 patient cases, D, D and V showed no significant difference compared with inverse planning simulated annealing (IPSA) optimization. However, D of bladder, rectum and sigmoid, V and V were significantly reduced, and homogeneity index and conformity index were significantly increased.
Conclusion: Using its learned dwell time adjustment policy, the proposed ITPN was able to generate higher-quality plans than IPSA. This is the first artificial intelligence system that can directly determine the dwell times of HDR BT, demonstrating the potential feasibility of solving optimization problems via DRL.
DOI: http://dx.doi.org/10.1016/j.ejmp.2021.12.009