Privacy-preserving heterogeneous health data sharing.

J Am Med Inform Assoc

Department of Computer Science and Software Engineering, Concordia University, Montreal, Quebec, Canada.

Published: May 2013

Objective: Privacy-preserving data publishing addresses the problem of disclosing sensitive data when mining for useful information. Among existing privacy models, ε-differential privacy provides one of the strongest privacy guarantees and makes no assumptions about an adversary's background knowledge. All existing solutions that ensure ε-differential privacy handle the problem of disclosing relational and set-valued data in a privacy-preserving manner separately. In this paper, we propose an algorithm that considers both relational and set-valued data in differentially private disclosure of healthcare data.
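For reference, the standard definition of ε-differential privacy can be written as below. The symbols $\mathcal{A}$ (a randomized algorithm), $D$ and $D'$ (two data sets differing in at most one record), and $S$ (any set of possible outputs) are notation introduced here for illustration and do not appear in the abstract.

```latex
\[
  \Pr\bigl[\mathcal{A}(D) \in S\bigr]
  \;\le\;
  e^{\varepsilon} \cdot \Pr\bigl[\mathcal{A}(D') \in S\bigr]
  \qquad \text{for all neighboring } D, D' \text{ and all } S \subseteq \mathrm{Range}(\mathcal{A}).
\]
```

A smaller ε forces the two output distributions closer together, which is why the guarantee does not depend on what an adversary already knows.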

Methods: The proposed approach makes a simple yet fundamental switch in differentially private algorithm design: instead of listing all possible records (ie, a contingency table) for noise addition, records are generalized before noise addition. The algorithm first generalizes the raw data in a probabilistic way, and then adds noise to guarantee ε-differential privacy.
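The generalize-then-add-noise idea can be illustrated with a minimal sketch. This is not the authors' algorithm: the paper chooses generalizations probabilistically, whereas the sketch below assumes a fixed, data-independent bucketing of a single hypothetical `age` attribute, and all function names are invented for illustration. It shows only the second step, adding Laplace noise with scale 1/ε to the count of every generalized group (including empty ones).

```python
import random
from collections import Counter


def laplace_noise(scale, rng):
    # The difference of two i.i.d. exponential draws with mean `scale`
    # follows a Laplace(0, scale) distribution.
    return rng.expovariate(1.0 / scale) - rng.expovariate(1.0 / scale)


def generalize_age(age, width=20):
    # Hypothetical fixed generalization: map an exact age to a range label.
    low = (age // width) * width
    return f"{low}-{low + width - 1}"


def noisy_generalized_counts(records, domain, epsilon, rng):
    # records: iterable of (age, class_label) pairs.
    # domain: every possible (age_range, class_label) group, chosen
    #         independently of the data so that empty groups also get noise.
    counts = Counter((generalize_age(age), label) for age, label in records)
    # Adding or removing one record changes exactly one count by 1,
    # so the L1 sensitivity is 1 and the Laplace scale is 1 / epsilon.
    scale = 1.0 / epsilon
    return {group: counts.get(group, 0) + laplace_noise(scale, rng)
            for group in domain}


if __name__ == "__main__":
    rng = random.Random(0)
    records = [(23, "yes"), (37, "no"), (41, "yes"), (45, "yes")]
    domain = [(g, c) for g in ("20-39", "40-59") for c in ("yes", "no")]
    for group, value in noisy_generalized_counts(records, domain, 1.0, rng).items():
        print(group, round(value, 2))
```

Because noise is added to a handful of generalized groups rather than to every cell of a full contingency table, far fewer noisy counts are needed, which is the design switch the Methods paragraph describes.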

Results: We showed that the disclosed data could be used effectively to build a decision tree induction classifier. Experimental results demonstrated that the proposed algorithm is scalable and performs better than existing solutions for classification analysis.

Limitation: The resulting utility may degrade when the output domain size is very large, making it potentially inappropriate to generate synthetic data for large health databases.

Conclusions: Unlike existing techniques, the proposed algorithm allows the disclosure of health data containing both relational and set-valued data in a differentially private manner, and can retain essential information for discriminative analysis.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC3628047
DOI: http://dx.doi.org/10.1136/amiajnl-2012-001027

Publication Analysis

Top Keywords

relational set-valued: 12
set-valued data: 12
differentially private: 12
data: 10
health data: 8
problem disclosing: 8
ε-differential privacy: 8
existing solutions: 8
data differentially: 8
noise addition: 8

Similar Publications

This paper aims to develop the Euler implicit time-discretization of multivariable sliding-mode controllers to solve the numerical chattering problem without modifying the continuous-time control law. To this end, a continuous-time multi-input plant under multivariable sliding-mode control is studied, and it is shown that the implicit discretization of the continuous-time sliding-mode controller leads to a multivariable generalized equation with several set-valued terms that cannot be solved using graphical interpretations. Subsequently, an algorithm is proposed to solve the multivariable generalized equation required to synthesize the implicit sliding-mode control signal at each time step.


In this study, we examine the subregularity of generalized equations in Asplund spaces using a novel approach. We obtain sufficient conditions for a family of multifunctions to be metrically subregular; these conditions are stronger than the known sufficient conditions thanks to a modification of the well-known coderivative concept and of partial sequential normal compactness.


Toward assessing clinical trial publications for reporting transparency.

J Biomed Inform

April 2021

Urban Vitality Center of Expertise, Faculty of Health, Amsterdam University of Applied Sciences, Amsterdam, the Netherlands; Department of Cardiology Heart Center, Amsterdam UMC, University of Amsterdam, the Netherlands.

Objective: To annotate a corpus of randomized controlled trial (RCT) publications with the checklist items of the CONSORT reporting guidelines and to use the corpus to develop text mining methods for RCT appraisal.

Methods: We annotated a corpus of 50 RCT articles at the sentence level using 37 fine-grained CONSORT checklist items. A subset (31 articles) was double-annotated and adjudicated, while 19 were annotated by a single annotator and reconciled by another.


In this article, finite-time synchronization and finite-time adaptive synchronization are investigated for impulsive memristive neural networks (IMNNs) with discontinuous activation functions (DAFs) and hybrid impulsive effects, where stabilizing impulses (SIs), inactive impulses (IIs), and destabilizing impulses (DIs) are each taken into account. Unlike several earlier works, a more extensive range of impulses is analyzed without using the known average impulsive interval strategy (AIIS). Based on the theories of differential inclusions and set-valued maps, as well as impulsive control, new sufficient criteria with respect to the estimated settling time for synchronization of the related IMNNs are established using two types of switching control approaches, which make full use of information from not only the SIs, DIs, and DAFs but also the impulse sequences.


We present a method called SMDG (Single Multi-Disease Genes) for the systematic discovery of monogenic causes of multi-diseases. Multi-disease conditions, which are quite common in older populations, are difficult to treat because precise medical guidelines for them are lacking and they require the attention of multiple health care providers. Finding monogenic causes of these diseases would enable new therapeutic approaches focused on remediating mutations of single genes.

