Background: Healthcare providers and other institutions collect huge amounts of data. However, data protection regulations limit the secondary use of these data, e.g. for research. Where several data sources are obtained without universal identifiers, record linkage methods need to be applied to obtain a comprehensive dataset.

Objectives: In this study, we aimed to link two datasets of ergometric performance test data so that reference values were available for the free text annotations, allowing the data quality of those annotations to be assessed.

Methods: We applied an iterative, distance-based time series record linkage algorithm to find corresponding entries in the two datasets. Subsequently, we assessed the resulting matching rate. The algorithm was implemented in Matlab.
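
The abstract does not specify the distance measure, the iteration scheme, or the acceptance criterion, so the following is only a minimal Python sketch of one way an iterative, distance-based linkage of time series could work; it is not the authors' Matlab implementation. It assumes each record is a short numeric series, uses a length-normalised Euclidean distance, and repeatedly accepts the closest unmatched pair until no pair falls within a threshold. The names `series_distance`, `link_records`, `matching_rate`, `max_distance`, and the sample data are all hypothetical.

```python
import math


def series_distance(a, b):
    """Root-mean-square difference over the overlapping prefix of two series."""
    n = min(len(a), len(b))
    if n == 0:
        return math.inf
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a[:n], b[:n])) / n)


def link_records(left, right, max_distance):
    """Greedy, iterative one-to-one linkage (assumed scheme, for illustration).

    left, right: dicts mapping a record id to a list of numeric samples.
    Returns a dict mapping each linked left id to its right id.
    """
    unmatched_left = dict(left)
    unmatched_right = dict(right)
    links = {}
    while unmatched_left and unmatched_right:
        # Find the closest pair among all records that are still unmatched.
        dist, lid, rid = min(
            (
                (series_distance(ls, rs), lid, rid)
                for lid, ls in unmatched_left.items()
                for rid, rs in unmatched_right.items()
            ),
            key=lambda t: t[0],
        )
        if dist > max_distance:
            break  # no remaining pair is close enough to accept
        links[lid] = rid
        del unmatched_left[lid]
        del unmatched_right[rid]
    return links


def matching_rate(links, left):
    """Share of left-hand records that received a link."""
    return len(links) / len(left) if left else 0.0


# Made-up sample data: workload-like series for three patients and three tests.
patients = {
    "p1": [50, 75, 100, 125],
    "p2": [25, 50, 75, 100],
    "p3": [200, 220, 240, 260],
}
ergometry = {
    "e1": [26, 51, 74, 100],
    "e2": [50, 76, 99, 125],
    "e3": [0, 0, 0, 0],
}

links = link_records(patients, ergometry, max_distance=5.0)
rate = matching_rate(links, patients)
```

In this toy example two of the three patient records find a partner within the threshold, so the matching rate is 2/3. The threshold trades matching rate against the risk of false links, and without a gold standard the correctness of the accepted links cannot be checked.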

Results: The matching rate of our record linkage algorithm was 74.5% for matching patients' records with their ergometry records. The highest rate of appropriate free text annotations was 87.9%.

Conclusion: For the given scenario, our algorithm matched 74.5% of the patients. However, we had no gold standard for validating our results. Most of the free text annotations contained the expected values.

