Privacy preserving Generative Adversarial Networks to model Electronic Health Records.

Neural Netw

School of Computer Science and Electronic Engineering, University of Essex, Wivenhoe Park, Colchester CO4 3SQ, United Kingdom.

Published: September 2022

Hospitals and General Practitioner (GP) surgeries within the National Health Service (NHS) routinely collect patient information to create personal health records covering, for example, family medical history, chronic diseases, medications and dosing. This information could be used to train machine learning models that simplify the work of NHS staff. However, such Electronic Health Records are not made publicly available due to privacy concerns. In this paper, we propose a privacy-preserving Generative Adversarial Network (pGAN) that generates high-quality synthetic data while preserving the privacy and statistical properties of the source data. pGAN is evaluated on two distinct datasets, one posing a classification task and the other a regression task. The privacy score of the generated data is calculated using the Nearest Neighbour Adversarial Accuracy. Cosine similarity scores indicate that the synthetic data generated by the proposed model is similar in nature to the source data, but not identical. Additionally, the proposed model preserved privacy while maintaining high utility. Machine learning models trained on synthetic and original data achieved accuracies of 74.3% and 74.5%, respectively, on the classification dataset, and R² scores of 0.84 and 0.85, respectively, on the regression task. Our results therefore indicate that synthetic data from the proposed model could replace original data for machine learning while preserving privacy.
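The abstract names two of its evaluation measures, Nearest Neighbour Adversarial Accuracy and cosine similarity, without giving formulas. The sketch below shows one common way to compute them with NumPy, as an illustration only. It assumes the widely used definition of the adversarial accuracy (the average of two indicator terms comparing cross-set and within-set nearest-neighbour distances), Euclidean distance, and purely numeric, already-scaled feature matrices. The function names and the choice to summarise cosine similarity as a mean of per-record maxima are hypothetical and are not taken from the paper.

```python
import numpy as np


def _min_dists(a, b, exclude_self=False):
    """Distance from each row of `a` to its nearest row in `b` (Euclidean)."""
    d = np.linalg.norm(a[:, None, :] - b[None, :, :], axis=2)
    if exclude_self:
        # When a and b are the same set, ignore each point's zero
        # distance to itself (leave-one-out nearest neighbour).
        np.fill_diagonal(d, np.inf)
    return d.min(axis=1)


def nn_adversarial_accuracy(real, synth):
    """Nearest Neighbour Adversarial Accuracy between two sample sets.

    Assumed definition: 0.5 * (mean[d_RS > d_RR] + mean[d_SR > d_SS]).
    Values near 0.5 suggest the sets are hard to tell apart; values
    near 0 suggest synthetic records sit on top of real ones; values
    near 1 suggest the two sets are easily separated.
    """
    d_rs = _min_dists(real, synth)                     # real -> nearest synthetic
    d_rr = _min_dists(real, real, exclude_self=True)   # real -> nearest other real
    d_sr = _min_dists(synth, real)                     # synthetic -> nearest real
    d_ss = _min_dists(synth, synth, exclude_self=True)
    return 0.5 * (np.mean(d_rs > d_rr) + np.mean(d_sr > d_ss))


def mean_max_cosine_similarity(real, synth):
    """Mean, over synthetic rows, of the highest cosine similarity to any real row.

    Assumes no all-zero rows (their norm would be zero).
    """
    rn = real / np.linalg.norm(real, axis=1, keepdims=True)
    sn = synth / np.linalg.norm(synth, axis=1, keepdims=True)
    return float((sn @ rn.T).max(axis=1).mean())


if __name__ == "__main__":
    rng = np.random.default_rng(0)
    real = rng.normal(size=(200, 5))    # stand-in for real records
    synth = rng.normal(size=(200, 5))   # stand-in for generated records
    print("adversarial accuracy:", nn_adversarial_accuracy(real, synth))
    print("mean max cosine similarity:", mean_max_cosine_similarity(real, synth))
```

Under this definition, a privacy estimate is often obtained by computing the adversarial accuracy once against the training records and once against held-out records and comparing the two; whether the paper does exactly this is not stated in the abstract.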

Source
http://dx.doi.org/10.1016/j.neunet.2022.06.022

Publication Analysis

Top Keywords

synthetic data: 16
health records: 12
machine learning: 12
proposed model: 12
original data: 12
data: 10
generative adversarial: 8
electronic health: 8
preserving privacy: 8
regression task: 8

Similar Publications

The use of synthetic data is a promising solution to facilitate the sharing and reuse of health-related data beyond its initial collection while addressing privacy concerns. However, there is still no consensus on a standardized approach for systematically evaluating the privacy and utility of synthetic data, impeding its broader adoption. In this work, we present a comprehensive review and systematization of current methods for evaluating synthetic health-related data, focusing on both privacy and utility aspects.

Recent barcoding technologies allow reconstructing lineage trees while capturing paired single-cell RNA-sequencing (scRNA-seq) data. Such datasets provide opportunities to compare gene expression memory maintenance through lineage branching and pinpoint critical genes in these processes. Here we develop Permutation, Optimization, and Representation learning based single Cell gene Expression and Lineage ANalysis (PORCELAN) to identify lineage-informative genes or subtrees where lineage and expression are tightly coupled.

A dual-domain network with division residual connection and feature fusion for CBCT scatter correction.

Phys Med Biol

January 2025

School of Biomedical Engineering, ShanghaiTech University, No. 1 Zhongke Road, Pudong New Area, Shanghai, Shanghai, 201210, CHINA.

Objective: This study aims to propose a dual-domain network that not only reduces scatter artifacts but also retains structure details in CBCT.

Approach: The proposed network comprises a projection-domain sub-network and an image-domain sub-network. The projection-domain sub-network utilizes a division residual network to amplify the difference between scatter signals and imaging signals, facilitating the learning of scatter signals.

The chemical investigation of the fruits of Garcinia schomburgkiana growing in Vietnam led to the isolation of a new anofinic acid derivative, 5-hydroxy-8-methoxyanofinic acid (1), a new xanthone, xanthoschome C (2), and a known synthetic phenolic analogue, 4-(2-hydroxybenzyl)-2-(4-hydroxybenzyl) phenol (3), along with seven known xanthones (4-10). The structures of all isolated compounds were determined using spectroscopic techniques (NMR and MS), in conjunction with comparison to existing literature data. All isolated compounds were assessed for their α-glucosidase inhibitory activity and showed significant inhibition, with IC50 values ranging from 12.

Adeno-associated virus (AAV) inverted terminal repeats (ITRs) induce p53-dependent apoptosis in human embryonic stem cells (hESCs). To interrogate this phenomenon, a synthetic ITR (SynITR), harboring substitutions in putative p53 binding sites was generated and evaluated for vector production and gene delivery. Replication of SynITR flanked transgenic genome was similar compared to wild type (wt) ITR, with a modest increase in vector titers.
