Purpose: To investigate deep reinforcement learning (DRL) based on historical treatment plans for developing automated radiation adaptation protocols for non-small cell lung cancer (NSCLC) patients that aim to maximize tumor local control at reduced rates of radiation pneumonitis grade 2 (RP2).

Methods: In a retrospective population of 114 NSCLC patients who received radiotherapy, a three-component neural network framework was developed for deep reinforcement learning (DRL) of dose fractionation adaptation. Large-scale patient characteristics included clinical, genetic, and imaging radiomics features in addition to tumor and lung dosimetric variables. First, a generative adversarial network (GAN) was employed to learn the patient population characteristics necessary for DRL training from a relatively limited sample size. Second, a radiotherapy artificial environment (RAE) was reconstructed by a deep neural network (DNN) utilizing both original and synthetic (GAN-generated) data to estimate the transition probabilities for adaptation of personalized radiotherapy treatment courses. Third, a deep Q-network (DQN) was applied to the RAE to choose the optimal dose in a response-adapted treatment setting. This multicomponent reinforcement learning approach was benchmarked against real clinical decisions applied in an adaptive dose escalation clinical protocol, in which 34 patients were treated based on avid PET signal in the tumor and constrained by a 17.2% normal tissue complication probability (NTCP) limit for RP2. The uncomplicated cure probability (P+) was used as the baseline reward function in the DRL.
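
To make the three-component pipeline concrete, here is a minimal sketch (Python/PyTorch) of the third component: a DQN choosing a per-fraction dose from a discretized 1-5 Gy grid against a transition model. The RAE below is a random stand-in for the paper's DNN-based environment trained on original and GAN-synthesized patients, and every name, dimension, and hyperparameter (e.g., N_FEATURES, rae_step, the 64-unit layers) is an illustrative assumption rather than the authors' implementation.

```python
import random
import torch
import torch.nn as nn

N_FEATURES = 8                         # assumed state size (clinical/radiomic/dosimetric)
DOSES = torch.arange(1.0, 5.01, 0.5)   # candidate dose per fraction, 1-5 Gy

class QNet(nn.Module):
    """Maps a patient state vector to one Q-value per candidate dose."""
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(N_FEATURES, 64), nn.ReLU(),
            nn.Linear(64, 64), nn.ReLU(),
            nn.Linear(64, len(DOSES)),
        )

    def forward(self, x):
        return self.net(x)

def rae_step(state, dose):
    """Stand-in for the learned RAE transition model; in the paper this is a
    DNN trained on real plus GAN-synthesized patient data."""
    return state + 0.01 * dose * torch.randn_like(state)

qnet, target = QNet(), QNet()
target.load_state_dict(qnet.state_dict())
opt = torch.optim.Adam(qnet.parameters(), lr=1e-3)
gamma, eps, horizon = 0.99, 0.1, 5

for episode in range(200):
    state = torch.randn(N_FEATURES)    # synthetic patient (GAN output in the paper)
    for t in range(horizon):
        # epsilon-greedy selection over the discretized dose grid
        a = (random.randrange(len(DOSES)) if random.random() < eps
             else qnet(state).argmax().item())
        next_state = rae_step(state, DOSES[a])
        terminal = (t == horizon - 1)
        # terminal reward is a placeholder for the P+ estimate at end of course
        r = random.random() if terminal else 0.0
        with torch.no_grad():          # one-step TD target with a frozen target net
            y = r if terminal else r + gamma * target(next_state).max().item()
        loss = (qnet(state)[a] - y) ** 2
        opt.zero_grad(); loss.backward(); opt.step()
        state = next_state
    if episode % 20 == 0:              # periodic target-network sync
        target.load_state_dict(qnet.state_dict())
```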

Results: Taking our adaptive dose escalation protocol as a blueprint for the proposed DRL (GAN + RAE + DQN) architecture, we obtained an automated dose adaptation estimate for use at ∼2/3 of the way into the radiotherapy treatment course. When the DQN component was allowed to freely control the estimated adaptive dose per fraction (ranging from 1 to 5 Gy), the DRL automatically favored dose escalation/de-escalation between 1.5 and 3.8 Gy, a range similar to that used in the clinical protocol. The same DQN, evaluated with two reward-function variants, yielded two patterns of dose adaptation for the 34 test patients. First, using the baseline P+ reward function, the individual adaptive fraction doses of the DQN followed tendencies similar to the clinical data, with an RMSE = 0.76 Gy, although the adaptations suggested by the DQN were generally lower in magnitude (less aggressive). Second, by adjusting the P+ reward function to place higher emphasis on mitigating local failure, better matching of doses between the DQN and the clinical protocol was achieved, with an RMSE = 0.5 Gy. Moreover, the decisions selected by the DQN appeared to have better concordance with patients' eventual outcomes. In comparison, the traditional temporal difference (TD) algorithm for reinforcement learning yielded an RMSE = 3.3 Gy owing to numerical instabilities and insufficient learning.
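
For reference, the reward variants discussed above can be sketched as follows. P+ is conventionally the probability of achieving tumor control (TCP) without complication (NTCP); the abstract does not give the exact weighted form used in the second variant, so the w_tcp knob below is purely an illustrative assumption, as is the independence approximation in p_plus.

```python
def p_plus(tcp: float, ntcp: float) -> float:
    """Baseline reward: uncomplicated cure probability, here taken as the
    probability of tumor control without an RP2 complication, assuming
    independence of the two events (a common approximation)."""
    return tcp * (1.0 - ntcp)

def adjusted_reward(tcp: float, ntcp: float, w_tcp: float = 2.0) -> float:
    """Variant placing higher emphasis on mitigating local failure; the
    up-weighting of TCP via w_tcp is an assumed parameterization, not a
    value taken from the paper."""
    return w_tcp * tcp - ntcp

def rmse(pred_gy, clinical_gy):
    """Agreement metric used above: root-mean-square error (Gy) between
    DQN-suggested and clinically delivered adaptive fraction doses."""
    n = len(pred_gy)
    return (sum((p - c) ** 2 for p, c in zip(pred_gy, clinical_gy)) / n) ** 0.5

# e.g. rmse([2.0, 3.5], [2.2, 3.0]) -> ~0.38 Gy
```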

Conclusion: We demonstrated that automated dose adaptation by DRL is a feasible and promising approach for achieving results similar to those chosen by clinicians. The process may require customization of the reward function when individual cases are considered. However, developing this framework into a fully credible autonomous system for clinical decision support would require further validation on larger multi-institutional datasets.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5734677
DOI: http://dx.doi.org/10.1002/mp.12625

Similar Publications

In this paper, a deep reinforcement learning (DRL) approach based on generative adversarial imitation learning (GAIL) and long short-term memory (LSTM) is proposed to solve tracking control problems for robotic manipulators with saturation constraints and random disturbances, without learning the dynamic and kinematic model of the manipulator. Specifically, the torque and joint angle are limited to a certain range. First, to cope with instability during training and obtain a stable policy, soft actor-critic (SAC) and LSTM are combined.

ARS: AI-Driven Recovery Controller for Quadruped Robot Using Single-Network Model.
Biomimetics (Basel), December 2024.
Department of Mechanical Engineering, Sungkyunkwan University, 2066, Seobu-ro, Jangan-gu, Suwon 16419, Republic of Korea.

Legged robots, especially quadruped robots, are widely used in various environments due to their advantage in overcoming rough terrain. However, falls are inevitable, so the ability to recover from a falling state is essential for legged robots.

Although the COVID-19 pandemic has been brought under control to some extent globally, there is still debate in the industry about the feasibility of using artificial intelligence (AI) to generate COVID small-molecule inhibitors. In this study, we explored the feasibility of using AI to design effective inhibitors of COVID-19. By combining a generative model with reinforcement learning and molecular docking, we designed small-molecule inhibitors targeting the COVID-19 3CLpro enzyme.

Learning alters cortical representations and improves perception. Apical tuft dendrites in cortical layer 1, which are unique in their connectivity and biophysical properties, may be a key site of learning-induced plasticity. We used both two-photon and SCAPE microscopy to longitudinally track tuft-wide calcium spikes in apical dendrites of layer 5 pyramidal neurons in barrel cortex as mice learned a tactile behavior.

The differential outcomes procedure (DOP) is an easily applicable method for enhancing discriminative learning and recognition memory. Its effectiveness in improving the recognition of facial expressions of emotion has been explored recently, with mixed success. This study explores whether the expectancies generated via the DOP are reflected as differences in event-related potentials (ERPs) between participants in differential (DOP) and non-differential (NOP) conditions in a task requiring labeling facial expressions of complex emotions.
