Background: With the growing adoption of electronic medical records, there are increasing demands for the use of this electronic clinical data in observational research. A frequent ethics board requirement for such secondary use of personal health information in observational research is that the data be de-identified. De-identification heuristics are provided in the Health Insurance Portability and Accountability Act Privacy Rule, funding agency and professional association privacy guidelines, and common practice.
Objective: The aim of the study was to evaluate whether the re-identification risk due to record linkage is sufficiently low when following common de-identification heuristics and whether that risk is stable across sample sizes and data sets.
Methods: We followed two methods to construct identification data sets and simulated re-identification attacks on them. For each data set, we varied the sample size down to 30 individuals and, for each sample size, evaluated the risk of re-identification for all combinations of quasi-identifiers. Combinations of quasi-identifiers that were low risk more than 50% of the time were considered stable.
Results: The identification data sets we were able to construct were the list of all physicians and the list of all lawyers registered in Ontario, using 1% sampling fractions. The quasi-identifiers of region, gender, and year of birth were found to be low risk more than 50% of the time across both data sets. The combination of gender and region was also found to be low risk more than 50% of the time. We were not able to create an identification data set for the whole population.
Conclusions: Existing Canadian federal and provincial privacy laws help explain why it is difficult to create an identification data set for the whole population. That such examples of high re-identification risk exist for mainstream professions makes a strong case for not disclosing the high-risk variables and their combinations identified here. For professional subpopulations with published membership lists, many variables often needed by researchers would have to be excluded or generalized to ensure consistently low re-identification risk. Data custodians and researchers need to consider other statistical disclosure techniques for protecting privacy.
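The evaluation loop described under Methods can be sketched roughly as follows. This is a minimal illustration, not the study's actual procedure: the abstract does not state the risk measure or the low-risk cutoff, so the sketch assumes risk is the share of sampled records whose quasi-identifier values match exactly one record in the identification data set, and it assumes a hypothetical cutoff `risk_threshold`. The function names, the `trials` parameter, and the choice to draw samples from the identification data set itself are likewise assumptions made for illustration.

```python
import random
from collections import Counter
from itertools import combinations


def reid_risk(sample, identification_db, qis):
    """Assumed risk measure: fraction of sample records whose values on the
    quasi-identifiers `qis` match exactly one identification_db record."""
    def key(record):
        return tuple(record[q] for q in qis)

    counts = Counter(key(r) for r in identification_db)
    unique = sum(1 for r in sample if counts[key(r)] == 1)
    return unique / len(sample)


def stable_combinations(identification_db, qis, sample_sizes,
                        trials=100, risk_threshold=0.05, seed=0):
    """Return the quasi-identifier combinations that are low risk in more
    than 50% of simulated samples. `risk_threshold` is a hypothetical
    cutoff, not a value taken from the study."""
    rng = random.Random(seed)
    stable = []
    # Enumerate every non-empty combination of quasi-identifiers.
    for size in range(1, len(qis) + 1):
        for combo in combinations(qis, size):
            low = total = 0
            for n in sample_sizes:
                for _ in range(trials):
                    # Assumption: the disclosed sample is drawn at random
                    # from the identification data set.
                    sample = rng.sample(identification_db, n)
                    risk = reid_risk(sample, identification_db, combo)
                    low += risk <= risk_threshold
                    total += 1
            if low / total > 0.5:
                stable.append(combo)
    return stable
```

Calling `stable_combinations(records, ["region", "gender", "year_of_birth"], sample_sizes=[30, 100, 300])` on a list of dictionaries would return the combinations that meet the assumed stability rule. The field names here are placeholders, and a different risk measure or cutoff would change which combinations qualify.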
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC1794009 | PMC |
| http://dx.doi.org/10.2196/jmir.8.4.e28 | DOI Listing |
Ophthalmol Sci
November 2024
Division of Ophthalmology Informatics and Data Science, Viterbi Family Department of Ophthalmology, Shiley Eye Institute, University of California, San Diego, La Jolla, California.
Purpose: The aim is to assess GPT-4V's (OpenAI) diagnostic accuracy and its capability to identify glaucoma-related features compared to expert evaluations.
Design: Evaluation of multimodal large language models for reviewing fundus images in glaucoma.
Subjects: A total of 300 fundus images from 3 public datasets (ACRIMA, ORIGA, and RIM-One v3) that included 139 glaucomatous and 161 nonglaucomatous cases were analyzed.
EClinicalMedicine
February 2025
Emergency Centre, Zhongnan Hospital of Wuhan University, Wuhan, Hubei, China.
Background: Sepsis is a significant health burden on a global scale. Timely identification and treatment of sepsis can greatly improve patient outcomes, including survival rates. However, time-consuming laboratory results are often needed for screening sepsis.
Belitung Nurs J
January 2025
Doctorate Program of Medical and Health Science, Faculty of Medicine, Public Health, and Nursing, Universitas Gadjah Mada, Yogyakarta, Indonesia.
Background: Acupressure is a popular form of complementary nursing among adults. It stimulates the body's self-healing processes by enhancing energy flow, or "chi," along meridian channels. Acupressure can address many health issues and is also used on children because it is painless and straightforward.
Biometrika
October 2024
Department of Statistics, North Carolina State University, 2311 Stinson Drive, Raleigh, North Carolina 27695, USA.
Functional principal component analysis has been shown to be invaluable for revealing variation modes of longitudinal outcomes, which serve as important building blocks for forecasting and model building. Decades of research have advanced methods for functional principal component analysis, often assuming independence between the observation times and longitudinal outcomes. Yet such assumptions are fragile in real-world settings where observation times may be driven by outcome-related processes.
Front Cardiovasc Med
January 2025
Clinical Laboratory, Children's Hospital Affiliated to Shandong University, Jinan, China.
Background: The nomogram is a powerful and robust tool in disease risk prediction that summarizes complex variables into a visual model that is interpretable with a quantified risk probability. In the current study, a nomogram was developed to predict the occurrence of coronary artery lesions (CALs) among patients with Kawasaki disease (KD). This is especially valuable in the early identification of the risk of CALs, which will lead to proper diagnosis and treatment to reduce their associated complications.