The integration of pharmacokinetic-pharmacodynamic (PK-PD) modeling and simulation with artificial intelligence/machine learning algorithms is one of the most attractive areas of pharmacometric research. These hybrid techniques are currently under investigation for several tasks, including precision dosing. In this context, this paper presents and evaluates a new framework that embeds PK-PD models into a reinforcement learning (RL) algorithm, Q-learning (QL), to personalize pharmacological treatment. Each patient is represented by a set of PK-PD parameters and has a personal QL agent that optimizes the individual treatment. In the training phase, the QL agent uses PK-PD simulations to assess different actions and find the optimal dosing rules; the actions are defined consistently with clinical knowledge, so that only plausible dose adjustments are considered. The proposed framework was evaluated by optimizing erdafitinib treatment in patients with metastatic urothelial carcinoma. This drug was approved by the US Food and Drug Administration (FDA) with a dose-adaptive protocol based on monitoring serum phosphate levels, which serve as a biomarker of both treatment efficacy and toxicity. To evaluate the flexibility of the methodology, a heterogeneous virtual population of 141 patients was generated using an erdafitinib population PK (PopPK)-PD model from the literature. For each patient, treatment response was simulated under both the QL-optimized protocol and the clinical one. The QL agents outperformed the approved dose-adaptive rules, increasing the efficacy and safety of treatment by more than 10% at each end point. The results confirm the great potential of integrating PopPK-PD models with RL algorithms to optimize precision dosing tasks.
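
The per-patient training loop described above can be illustrated with a minimal tabular Q-learning sketch. This is not the authors' implementation: the one-compartment-style biomarker model in `simulate_step`, the dose levels, the target window, the reward values, and all hyperparameters are hypothetical placeholders chosen only to show how a simulator can stand in for the environment while the agent is restricted to one-level dose adjustments.

```python
import random

# All constants below are illustrative assumptions, not values from the paper.
DOSES = [4.0, 6.0, 8.0, 9.0]   # hypothetical daily dose levels (mg)
ACTIONS = [-1, 0, 1]           # step down / keep / step up by one dose level
LOW, HIGH = 5.5, 7.0           # hypothetical biomarker target window (mg/dL)


def simulate_step(level, dose, patient):
    """Toy stand-in for a PK-PD simulation: the biomarker relaxes
    toward a dose-dependent plateau between two monitoring visits."""
    plateau = patient["baseline"] + patient["sensitivity"] * dose
    return level + patient["rate"] * (plateau - level)


def observe(level, dose_idx):
    """State = (biomarker band, current dose index)."""
    if level < LOW:
        band = 0
    elif level <= HIGH:
        band = 1
    else:
        band = 2
    return (band, dose_idx)


def reward(level):
    """+1 inside the window, -1 above it (toxicity), 0 below it."""
    if level > HIGH:
        return -1.0
    return 1.0 if level >= LOW else 0.0


def apply_action(dose_idx, action_idx):
    """Move at most one dose level, clamped to the allowed range."""
    return min(max(dose_idx + ACTIONS[action_idx], 0), len(DOSES) - 1)


def greedy_action(q, state):
    """Index of the action with the highest Q-value (unseen pairs count as 0)."""
    return max(range(len(ACTIONS)), key=lambda a: q.get((state, a), 0.0))


def train_agent(patient, episodes=500, visits=12,
                alpha=0.1, gamma=0.9, epsilon=0.2, seed=0):
    """Epsilon-greedy tabular Q-learning for one virtual patient."""
    rng = random.Random(seed)
    q = {}
    for _ in range(episodes):
        level, dose_idx = patient["baseline"], 2   # start at DOSES[2]
        state = observe(level, dose_idx)
        for _ in range(visits):
            if rng.random() < epsilon:
                a = rng.randrange(len(ACTIONS))    # explore
            else:
                a = greedy_action(q, state)        # exploit
            dose_idx = apply_action(dose_idx, a)
            level = simulate_step(level, DOSES[dose_idx], patient)
            next_state = observe(level, dose_idx)
            best_next = max(q.get((next_state, b), 0.0)
                            for b in range(len(ACTIONS)))
            old = q.get((state, a), 0.0)
            # Standard Q-learning update.
            q[(state, a)] = old + alpha * (reward(level) + gamma * best_next - old)
            state = next_state
    return q


def rollout(q, patient, visits=12):
    """Follow the learned greedy policy and return the biomarker trajectory."""
    level, dose_idx = patient["baseline"], 2
    levels = []
    for _ in range(visits):
        a = greedy_action(q, observe(level, dose_idx))
        dose_idx = apply_action(dose_idx, a)
        level = simulate_step(level, DOSES[dose_idx], patient)
        levels.append(level)
    return levels


if __name__ == "__main__":
    # One hypothetical virtual patient; a population would repeat this per patient.
    patient = {"baseline": 3.5, "sensitivity": 0.45, "rate": 0.5}
    q_table = train_agent(patient)
    print([round(x, 2) for x in rollout(q_table, patient)])
```

Training one agent per parameter set mirrors the one-agent-per-patient idea in the abstract; comparing such a policy against a fixed clinical rule would amount to running `rollout` under each rule with the same simulated patient.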

Source: http://dx.doi.org/10.1002/cpt.3176
