Algorithms to anonymize structured medical and healthcare data: A systematic review.

Front Bioinform

Open Source Research Collaboration, Aalborg, Denmark.

Published: December 2022

With many anonymization algorithms developed for structured medical health data (SMHD) in the last decade, our systematic review provides a comprehensive bird's-eye view of algorithms for SMHD anonymization. This systematic review was conducted according to the recommendations in the Cochrane Handbook for Systematic Reviews of Interventions and reported according to the Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA). Eligible articles from the PubMed, ACM Digital Library, Medline, IEEE, Embase, Web of Science Collection, Scopus, and ProQuest Dissertations and Theses Global databases were identified through systematic searches. The following parameters were extracted from the eligible studies: author, year of publication, sample size, and relevant algorithms and/or software applied to anonymize SMHD, along with the summary of outcomes. Among 1,804 initial hits, the present study considered 63 records including research articles, reviews, and books. Seventy-five evaluated the anonymization of demographic data, 18 assessed diagnosis codes, and 3 assessed genomic data. One of the most common approaches was k-anonymity, which was utilized mainly for demographic data, often in combination with another algorithm, e.g., l-diversity. No approaches have yet been developed for protection against membership disclosure attacks on diagnosis codes. This study reviewed and categorized different anonymization approaches for SMHD according to the anonymized data types (demographics, diagnosis codes, and genomic data). Further research is needed to develop more efficient algorithms for the anonymization of diagnosis codes and genomic data. The risk of reidentification can be minimized with adequate application of the addressed anonymization approaches. Systematic review registration: http://www.crd.york.ac.uk/prospero, identifier CRD42021228200.
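The two privacy models named in the abstract can be illustrated with a minimal Python sketch. This is not code from the review or from any of the studies it covers. The function names, column names, and toy records are hypothetical, and the l-diversity check uses the simplest "distinct" variant (counting distinct sensitive values per group), which is an assumption made for brevity.

```python
from collections import defaultdict


def equivalence_classes(records, quasi_identifiers):
    """Group records that share the same quasi-identifier values."""
    groups = defaultdict(list)
    for rec in records:
        key = tuple(rec[q] for q in quasi_identifiers)
        groups[key].append(rec)
    return groups


def k_anonymity(records, quasi_identifiers):
    """Return the largest k for which the table is k-anonymous:
    the size of the smallest equivalence class."""
    groups = equivalence_classes(records, quasi_identifiers)
    return min(len(group) for group in groups.values())


def distinct_l_diversity(records, quasi_identifiers, sensitive):
    """Return the largest l for which the table is distinct l-diverse:
    the smallest number of distinct sensitive values in any class."""
    groups = equivalence_classes(records, quasi_identifiers)
    return min(len({rec[sensitive] for rec in group}) for group in groups.values())


# Hypothetical, already-generalized toy table (age bands, truncated ZIP codes).
records = [
    {"age": "30-39", "zip": "92**", "diagnosis": "J45"},
    {"age": "30-39", "zip": "92**", "diagnosis": "E11"},
    {"age": "40-49", "zip": "90**", "diagnosis": "I10"},
    {"age": "40-49", "zip": "90**", "diagnosis": "I10"},
]

k = k_anonymity(records, ["age", "zip"])
l = distinct_l_diversity(records, ["age", "zip"], "diagnosis")
print(f"k = {k}, l = {l}")
```

In this toy table every combination of age band and ZIP prefix appears twice, so the table is 2-anonymous. The second group nevertheless shares a single diagnosis code, so its members' diagnosis is still revealed. This is the kind of gap that combining k-anonymity with l-diversity is meant to close.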


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9815524
DOI: http://dx.doi.org/10.3389/fbinf.2022.984807

Publication Analysis

Top Keywords

diagnosis codes: 16
systematic review: 12
genomic data: 12
structured medical: 8
data: 8
demographic data: 8
anonymization approaches: 8
codes genomic: 8
anonymization: 6
algorithms: 5

