Learning parametric policies and transition probability models of Markov decision processes from data.

Tingting Xu, Henghui Zhu, Ioannis Ch. Paschalidis

Eur J Control

Department of Electrical and Computer Engineering, Division of Systems Engineering, and Department of Biomedical Engineering, Boston University, 8 St. Mary's St., Boston, MA 02215.

Published: January 2021

We consider the problem of estimating the policy and transition probability model of a Markov Decision Process from data consisting of (state, action, next state) tuples. The transition probability and policy are assumed to be parametric functions of a sparse set of features associated with the tuples. We propose two regularized maximum likelihood estimation algorithms for learning the transition probability model and policy, respectively. An upper bound is established on the regret, which is the difference between the average reward of the estimated policy under the estimated transition probabilities and that of the original unknown policy under the true (unknown) transition probabilities. We provide a sample complexity result showing that we can achieve a low regret with a relatively small number of training samples. We illustrate the theoretical results with a healthcare example and a robot navigation experiment.
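The regularized maximum likelihood step described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact parameterization, penalty, or solver, so everything here is an assumption made for illustration: a softmax (Boltzmann-type) policy pi_theta(a|s) proportional to exp(theta . phi(s, a)), an l1 penalty to encourage a sparse parameter vector, and proximal gradient descent (ISTA) as the solver. The names softmax, neg_log_likelihood, fit_l1_policy, and sample_actions are hypothetical helpers, not the authors' code.

```python
import numpy as np

def softmax(z):
    """Row-wise softmax with the usual max-shift for numerical stability."""
    z = z - z.max(axis=1, keepdims=True)
    e = np.exp(z)
    return e / e.sum(axis=1, keepdims=True)

def neg_log_likelihood(theta, phi, actions):
    """Mean negative log-likelihood of the observed actions.

    phi:     (n, A, d) features of every candidate action at each sampled state
    actions: (n,) index of the action actually taken in each sample
    """
    probs = softmax(phi @ theta)  # (n, A)
    return -np.mean(np.log(probs[np.arange(len(actions)), actions]))

def fit_l1_policy(phi, actions, lam=0.01, step=0.1, iters=1000):
    """Assumed estimator: minimize NLL(theta) + lam * ||theta||_1 with ISTA."""
    n, A, d = phi.shape
    theta = np.zeros(d)
    chosen = phi[np.arange(n), actions]  # (n, d) features of taken actions
    for _ in range(iters):
        probs = softmax(phi @ theta)
        # Gradient of the NLL: expected features under the policy
        # minus the features of the actions that were observed.
        expected = np.einsum("na,nad->nd", probs, phi)
        grad = (expected - chosen).mean(axis=0)
        theta = theta - step * grad
        # Soft-thresholding is the proximal operator of the l1 penalty.
        theta = np.sign(theta) * np.maximum(np.abs(theta) - step * lam, 0.0)
    return theta

def sample_actions(probs, rng):
    """Draw one action per row of a (n, A) probability matrix."""
    u = rng.random((probs.shape[0], 1))
    idx = (probs.cumsum(axis=1) < u).sum(axis=1)
    return np.minimum(idx, probs.shape[1] - 1)

# Synthetic demonstration: only 2 of the 8 features actually matter.
rng = np.random.default_rng(0)
n, A, d = 1500, 3, 8
phi = rng.normal(size=(n, A, d))
theta_true = np.zeros(d)
theta_true[0], theta_true[3] = 2.0, -1.5
actions = sample_actions(softmax(phi @ theta_true), rng)

theta_hat = fit_l1_policy(phi, actions)
print("estimated theta:", np.round(theta_hat, 2))
print("NLL at estimate:", neg_log_likelihood(theta_hat, phi, actions))
```

The same pattern would apply to the transition model by treating the next state as the predicted label, again under the assumed softmax form. The penalty weight lam trades sparsity against fit and would normally be chosen by cross-validation.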

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7944408
DOI: http://dx.doi.org/10.1016/j.ejcon.2020.04.003

Similar Publications

Joint trajectories of physical activity, health, and income before and after statutory retirement: A 22-year follow-up.

PLoS One

January 2025

Department of Public Health, University of Helsinki, Helsinki, Finland.

Tea Lallukka, Petteri Kolmonen, Ossi Rahkonen, Eero Lahelma, Jouni Lahti

Background: Health behaviors, health, and income change during aging. However, no previous studies have examined how they develop together over the transition to statutory retirement. We aimed to examine their joint development and to identify the determinants of any distinct trajectories.

Transitions through the HIV continuum of care in people enrolling in care with advanced HIV disease in Latin America.

IJID Reg

March 2025

Instituto Nacional de Ciencias Médicas y Nutrición Salvador Zubirán, Mexico City, Mexico.

Pablo F Belaunzarán-Zamudio, Peter F Rebeiro, Yanink Caro-Vega, Jessica Castilho, Brenda E Crabtree-Ramírez

Objectives: Advanced HIV disease (AHD) at HIV care enrollment is common in Latin America and may bias cross-sectional care continuum estimates. We therefore explored the impact of AHD on HIV care continuum outcomes using a longitudinal approach.

Methods: We analyzed trajectories of 26,174 adult people with HIV enrolled at Caribbean, Central and South America network for HIV epidemiology (CCASAnet) sites (2003-2019) using multi-state Cox regression across five stages: (i) enrolled without antiretroviral therapy (no-ART); (ii) on ART without viral suppression (viral load ≥200 copies/ml; ART + non-VS); (iii) on ART with viral suppression (viral load <200 copies/ml; ART + VS); (iv) lost to follow-up; (v) death.

Development and Validation of a Competitive Risk Model in Elderly Patients with Transitional Cell Bladder Carcinoma.

Med Sci Monit

January 2025

Department of Urology, Ningbo Municipal Hospital of Traditional Chinese Medicine (TCM), Affiliated Hospital of Zhejiang Chinese Medical University, Ningbo, Zhejiang, China.

Libin Yang, Chao Chen, Qianghui Wang, Zhiliang Zhuang, Tao Sun

Background: Transitional cell bladder carcinoma (tcBC) is the predominant form of bladder cancer, making up around 95% of reported cases. Prognostic factors for older individuals with tcBC differ from those affecting younger patients. The main purpose of this study was to establish a prognostic competing risk model for elderly patients with tcBC.

Cardiovascular disease, bone fracture, and all-cause mortality risks among postmenopausal women by arthritis and veteran status: A multistate Markov transition analysis.

Geroscience

January 2025

U.S. Department of Veterans Affairs, VA National Center On Homelessness Among Veterans, Washington, DC, USA.

Hind A Beydoun, May A Beydoun, Jordan Weiss, Robert Brunner, Nishant K Mishra

Arthritis, a chronic inflammatory condition linked to cardiovascular disease (CVD) and bone fracture, is more frequent among military veterans and postmenopausal women. This study examined correlates of arthritis and relationships of arthritis with risks of developing CVD, bone fractures, and mortality among postmenopausal veteran and non-veteran women. We analyzed longitudinal data on 135,790 (3,436 veteran and 132,354 non-veteran) postmenopausal women from the Women's Health Initiative who were followed up for an average of 16 years between enrollment (1993-1998) and February 17, 2024.

"Multimodal Sleep Signal Tensor Decomposition and Hidden Markov Modeling for Temazepam-Induced Anomalies Across Age Groups".

J Neurosci Methods

January 2025

School of Electrical and Computer Engineering, Gallogly College of Engineering, University of Oklahoma, Norman, OK 73019, USA.

Parikshat Sirpal, William A Sikora, Hazem H Refai

Background: Recent advances in multimodal signal analysis enable the identification of subtle drug-induced anomalies in sleep that traditional methods often miss.

New Method: We develop and introduce the Dynamic Representation of Multimodal Activity and Markov States (DREAMS) framework, which embeds explainable artificial intelligence (XAI) techniques to model hidden state transitions during sleep using tensorized EEG, EMG, and EOG signals from 22 subjects across three age groups (18-29, 30-49, and 50-66 years). By combining Tucker decomposition with probabilistic Hidden Markov Modeling, we quantified age-specific, temazepam-induced hidden states and significant differences in transition probabilities.
