MEASURES USED TO PROTECT SUBJECTS in publicly distributed microdata files often have a significant negative impact on key analytic uses of the data. For example, it may be important to analyze subpopulations within a data file, such as racial minorities, yet these subjects may present the greatest disclosure risk because their records tend to stand out or be unique. Files or records that are linkable create another type of disclosure risk: common elements between two files can be used to link files with sensitive data to externally available files that disclose identity. Examples of disclosure limitation methods used to address these types of issues include blanking out data, coarsening response categories, or withholding data altogether. However, the very detail that creates the greatest risk also provides insight into differences that are of greatest interest to analysts. Restricted-use agreements that provide unaltered versions of the data may not be available, or only selectively so. The public-use version of the data is very important because it is likely to be the only one to which most researchers, policy analysts, teaching faculty, and students will ever have access. Hence, it is the version from which much of the utility of the data is extracted, and it often effectively becomes the historical record of the data collection. This underscores how important it is that the disclosure review committee strike a good balance between protection and utility. In this paper we describe our disclosure review committee's (DRC) analysis and resulting data protection plans for two national studies and one administrative data system. Three distinct disclosure limitation methods were employed, taking key uses of the data into consideration, to protect respondents while still providing statistically accurate and highly useful public-use data. The techniques include data swapping, microaggregation, and suppression of detailed geographic data.
We describe the characteristics of the data sets that led to the selection of these methods, provide measures of the statistical impact, and give details of their implementations so that others may also utilize them. We briefly discuss the composition of our DRC, highlighting what we believe to be the important disciplines and experience represented by the group.
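
As a rough illustration of one of the techniques named above, the following is a minimal sketch of univariate microaggregation, assuming the simplest fixed-size variant: records are ranked on a single numeric variable, grouped into runs of at least k, and each value is replaced by its group mean. The function name `microaggregate` and the parameter `k` are hypothetical and chosen for illustration; this is not the implementation described in the paper, which may group records differently.

```python
def microaggregate(values, k=3):
    """Replace each value with the mean of its rank-ordered group of >= k records.

    Illustrative sketch only: fixed-size grouping on one variable.
    """
    if k < 1:
        raise ValueError("k must be at least 1")
    n = len(values)
    # Indices of the records, ordered by the value being protected.
    order = sorted(range(n), key=lambda i: values[i])
    result = [0.0] * n
    start = 0
    while start < n:
        end = start + k
        # Fold a short tail into the last group so no group has fewer than k records.
        if n - end < k:
            end = n
        group = order[start:end]
        group_mean = sum(values[i] for i in group) / len(group)
        for i in group:
            result[i] = group_mean
        start = end
    return result


incomes = [10, 50, 20, 40, 30, 1000, 60]
protected = microaggregate(incomes, k=3)
print(protected)
```

Because every value is replaced by a group mean, the overall total (and therefore the overall mean) of the variable is unchanged, while an outlying value such as 1000 no longer appears in the released data.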

Source
http://dx.doi.org/10.1525/jer.2006.1.3.63
