Background: Over the previous four decennial censuses, the population of the United States has grown older: the proportion of individuals aged 90 years or older in the 2010 census was more than two and a half times that in the 1980 census. This suggests that the age threshold introduced in the Safe Harbor method of the Health Insurance Portability and Accountability Act (HIPAA) in 1996 could be raised without exceeding the original level of risk. Raising it is desirable because it would maintain or even increase the utility of affected data sets without compromising privacy.
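To make the age threshold concrete, the following is a minimal sketch of top-coding: ages at or above a cutoff are collapsed into a single category, and the share of records affected is compared for two cutoffs. The function names, the sample ages, and the alternative cutoff of 95 are assumptions made for illustration and do not come from the abstract.

```python
def top_code_age(age, threshold=90):
    """Return the age unchanged below the threshold, else one capped label."""
    if age < 0:
        raise ValueError("age must be non-negative")
    return age if age < threshold else f"{threshold}+"


def share_at_or_above(ages, threshold):
    """Fraction of records that would be collapsed into the top category."""
    if not ages:
        return 0.0
    return sum(a >= threshold for a in ages) / len(ages)


# Made-up ages, for illustration only.
ages = [34, 67, 89, 90, 93, 101]

print([top_code_age(a) for a in ages])                 # cutoff of 90
print([top_code_age(a, threshold=95) for a in ages])   # hypothetical higher cutoff
print(share_at_or_above(ages, 90), share_at_or_above(ages, 95))
```

A higher cutoff leaves more exact ages in the data, which is the utility gain the abstract refers to; whether the remaining top category is still large enough to keep risk at the original level is the empirical question.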
Accurate record linkage depends on the availability and quality of features such as first name and last name. Privacy-preserving record linkage methods that use tokenization are sensitive to perturbations in the patient features used as inputs. In this study, we evaluated the impact of name transformations on the accuracy of patient matching using a large commercial dataset.
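The sensitivity of tokens to input perturbations can be illustrated with a minimal sketch. It assumes a token is a keyed hash (HMAC-SHA256) over normalized first name, last name, and date of birth; the field choice, the normalization rule, and the key are hypothetical and are not the scheme evaluated in the study.

```python
import hashlib
import hmac

# Hypothetical shared secret; a real deployment would manage keys securely.
SECRET_KEY = b"demo-key-not-for-production"


def normalize(value):
    """Remove all whitespace and uppercase the value."""
    return "".join(value.split()).upper()


def make_token(first, last, dob, key=SECRET_KEY):
    """Build a keyed hash over the normalized fields (illustrative scheme)."""
    message = "|".join([normalize(first), normalize(last), dob]).encode("utf-8")
    return hmac.new(key, message, hashlib.sha256).hexdigest()


a = make_token("John", "Smith", "1950-01-31")
b = make_token(" john ", "SMITH", "1950-01-31")  # differs only in case and spacing
c = make_token("Jon", "Smith", "1950-01-31")     # one-letter spelling variant

print(a == b)  # True: normalization absorbs the difference
print(a == c)  # False: the variant produces an unrelated token
```

Because the hash output changes completely when any input character changes, differences that normalization does not absorb (nicknames, typos, transposed names) turn a true match into a non-match under exact token comparison.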
Objective: Our objective was to evaluate tokens commonly used by clinical research consortia to aggregate clinical data across institutions.
Methods: This study compares tokens alone and token-based matching algorithms against manual annotation for entity resolution, using 20,002 record pairs extracted from the University of Texas Houston's clinical data warehouse (CDW).
Results: The highest precision achieved was 99.