Background: Despite numerous efforts toward the semantic harmonization of Alzheimer's disease (AD) cohort studies, no automatic harmonization tool has yet been developed.

Objective: Cohort studies form the basis of data-driven analysis, so harmonizing them is crucial for cross-cohort analysis. We aimed to accelerate this task by constructing an automatic harmonization tool.

Methods: We created a common data model (CDM) by cross-mapping data from 20 cohorts, three CDMs, and ontology terms, and then used it to fine-tune a BioBERT model. Finally, we evaluated the model on three previously unseen cohorts and compared its performance to that of a string-matching baseline model.
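
To make the idea of a string-matching baseline concrete, the following is a minimal sketch of one way such a baseline could work. It is not the baseline used in the study, whose details are not given here. The CDM variable names, the `normalize` and `best_match` helpers, and the 0.6 threshold are all hypothetical and chosen only for illustration.

```python
from difflib import SequenceMatcher

# Hypothetical CDM variable names, invented for illustration only.
CDM_VARIABLES = [
    "age",
    "sex",
    "mmse_total_score",
    "apoe_genotype",
    "years_of_education",
]


def normalize(name: str) -> str:
    """Lowercase a name and turn '_' and '-' into single spaces."""
    return " ".join(name.lower().replace("_", " ").replace("-", " ").split())


def best_match(cohort_variable, cdm_variables=CDM_VARIABLES, threshold=0.6):
    """Return (cdm_variable, score) for the most similar name.

    Similarity is difflib's character-level ratio in [0, 1]. If the best
    score is below the (assumed) threshold, return (None, score).
    """
    query = normalize(cohort_variable)
    scored = [
        (SequenceMatcher(None, query, normalize(candidate)).ratio(), candidate)
        for candidate in cdm_variables
    ]
    score, candidate = max(scored)
    return (candidate, score) if score >= threshold else (None, score)


if __name__ == "__main__":
    for raw_name in ["MMSE_Total", "Age", "Gender"]:
        print(raw_name, "->", best_match(raw_name))
```

In this sketch, a cohort column such as `MMSE_Total` maps to `mmse_total_score` because the names share most of their characters, whereas `Gender` finds no match for `sex` because the two names are semantically related but lexically different. Cases of this kind are where a model based on semantic similarity could plausibly help.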

Results: Here, we present our AD-Mapper interface for automatic harmonization of AD cohort studies, which outperformed a string-matching baseline on previously unseen cohort studies. We showcase our CDM comprising 1218 unique variables.

Conclusion: AD-Mapper leverages semantic similarities in naming conventions across cohorts to improve mapping performance.

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11191441
http://dx.doi.org/10.3233/JAD-240116
