Synthetic data generation (SDG) is the process of using machine learning methods to train a model that captures the patterns in a real dataset; new, synthetic data can then be generated from that trained model. The synthetic data does not have a one-to-one mapping to the original data or to real patients, and therefore has the potential to preserve privacy. There is growing interest in applying synthetic data across health and life sciences, but fully realizing the benefits will require further education, research, and policy innovation. This article summarizes the opportunities and challenges of SDG for health data and provides directions for how this technology can be leveraged to accelerate data access for secondary purposes.
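
The two steps described above (fit a model to real data, then sample new records from it) can be illustrated with a minimal sketch. Everything here is an assumption made for illustration and does not come from the article: the "real" records are made-up (age, systolic blood pressure) pairs, the model is a simple bivariate Gaussian rather than the machine learning generators the article has in mind, and the function names are hypothetical. A toy model like this also offers no privacy guarantee by itself.

```python
import math
import random
import statistics


def fit_bivariate_gaussian(rows):
    """Step 1: learn the patterns (means, spreads, correlation) of the real data."""
    xs = [r[0] for r in rows]
    ys = [r[1] for r in rows]
    mx, my = statistics.fmean(xs), statistics.fmean(ys)
    sx, sy = statistics.stdev(xs), statistics.stdev(ys)
    cov = sum((x - mx) * (y - my) for x, y in rows) / (len(rows) - 1)
    return {"mx": mx, "my": my, "sx": sx, "sy": sy, "rho": cov / (sx * sy)}


def sample_synthetic(model, n, seed=0):
    """Step 2: draw new records from the fitted model, not from the real rows."""
    rng = random.Random(seed)
    rho = model["rho"]
    out = []
    for _ in range(n):
        z1, z2 = rng.gauss(0, 1), rng.gauss(0, 1)
        x = model["mx"] + model["sx"] * z1
        # Mix z1 and z2 so that x and y keep the learned correlation rho.
        y = model["my"] + model["sy"] * (rho * z1 + math.sqrt(1 - rho**2) * z2)
        out.append((x, y))
    return out


# Hypothetical "real" dataset: 1000 made-up (age, systolic BP) records.
rng = random.Random(42)
real = []
for _ in range(1000):
    age = rng.gauss(60, 10)
    sbp = 100 + 0.5 * age + rng.gauss(0, 8)
    real.append((age, sbp))

model = fit_bivariate_gaussian(real)
synthetic = sample_synthetic(model, n=1000)

# The synthetic rows follow similar statistics but are not copies of real rows.
refit = fit_bivariate_gaussian(synthetic)
print(f"real rho={model['rho']:.2f}, synthetic rho={refit['rho']:.2f}")
```

In this sketch no synthetic row is derived from any single real row, which is the sense in which there is no one-to-one mapping; whether a given generator actually protects privacy has to be assessed separately.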

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9619172 (PMC)
http://dx.doi.org/10.1016/j.isci.2022.105331 (DOI)

Similar Publications

Motivation: Understanding the associations between traits and microbial composition is a fundamental objective in microbiome research. Recently, researchers have turned to machine learning (ML) models to achieve this goal with promising results. However, the effectiveness of advanced ML models is often limited by the unique characteristics of microbiome data, which are typically high-dimensional, compositional, and imbalanced.

Developing a decision support tool to predict delayed discharge from hospitals using machine learning.

BMC Health Serv Res

January 2025

Department of Industrial Engineering, Dalhousie University, PO Box 15000, Halifax, B3H 4R2, NS, Canada.

Background: The growing demand for healthcare services challenges patient flow management in health systems. Alternative Level of Care (ALC) patients who no longer need acute care yet face discharge barriers contribute to prolonged stays and hospital overcrowding. Predicting these patients at admission allows for better resource planning, reducing bottlenecks, and improving flow.

Biomedical research increasingly relies on three-dimensional (3D) cell culture models, and artificial-intelligence-based analysis can potentially facilitate detailed and accurate feature extraction at the single-cell level. However, this requires precise segmentation of 3D cell datasets, which in turn demands high-quality ground truth for training. Manual annotation, the gold standard for ground truth data, is too time-consuming and thus not feasible for generating large 3D training datasets.

The purpose of this article is to infer patient-level outcomes from population-level randomized controlled trials (RCTs). In this pursuit, we utilize the recently proposed synthetic nearest neighbors (SNN) estimator. At its core, SNN leverages information across patients to impute missing data associated with each patient of interest.

AntiT2DMP-Pred: Leveraging feature fusion and optimization for superior machine learning prediction of type 2 diabetes mellitus.

Methods

January 2025

Department of Physiology, Ajou University School of Medicine, Suwon 16499, Republic of Korea; Department of Molecular Science and Technology, Ajou University, Suwon 16499, Republic of Korea.

Pancreatic α-amylase breaks down starch into isomaltose and maltose, which are further hydrolyzed by α-glucosidase in the intestine into monosaccharides, rapidly raising blood sugar levels and contributing to type 2 diabetes mellitus (T2DM). Synthetic inhibitors of carbohydrate-digesting enzymes are used to manage T2DM but may harm organ function over time. Bioactive peptides offer a safer alternative, avoiding such adverse effects.
