Conditional Generative Models for Synthetic Tabular Data: Applications for Precision Medicine and Diverse Representations.

Annu Rev Biomed Data Sci

Departments of Bioengineering and Genetics, Stanford University, Stanford, California, USA.

Published: January 2025

Tabular medical datasets, like electronic health records (EHRs), biobanks, and structured clinical trial data, are rich sources of information with the potential to advance precision medicine and optimize patient care. However, real-world medical datasets have limited patient diversity and cannot simulate hypothetical outcomes, both of which are necessary for equitable and effective medical research. Fueled by recent advancements in machine learning, generative models offer a promising solution to these data limitations by generating enhanced synthetic data. This review highlights the potential of conditional generative models (CGMs) to create patient-specific synthetic data for a variety of precision medicine applications. We survey CGM approaches that tackle two medical applications: correcting for data representation biases and simulating digital health twins. We additionally explore how the surveyed methods handle modeling tabular medical data and briefly discuss evaluation criteria. Finally, we summarize the technical, medical, and ethical challenges that must be addressed before CGMs can be effectively and safely deployed in the medical field.

Source
http://dx.doi.org/10.1146/annurev-biodatasci-103123-094844

