Harmonizing Norwegian registries onto OMOP common data model: Mapping challenges and opportunities for pregnancy and COVID-19 research.

Int J Med Inform

PharmacoEpidemiology and Drug Safety Research Group, Department of Pharmacy, Faculty of Mathematics and Natural Sciences, University of Oslo, Oslo, Norway; Department of Child Health and Development, Norwegian Institute of Public Health, Oslo, Norway.

Published: November 2024

AI Article Synopsis

  • The study aimed to harmonize Norwegian health registries onto the OMOP common data model, bringing population-wide real-world data into a format that can support research and emergency preparedness related to COVID-19.
  • Researchers mapped 1.5 billion rows of health data for 5,673,845 individuals from six registries and identified 382,516 positive COVID-19 tests in 380,794 patients; 3,700 of 3,705 data quality checks passed.
  • The integrated data strengthen the potential for federated, collaborative research on COVID-19, and the mapping approach can serve as a reference for data partners holding similar Nordic health registries.

Article Abstract

Objective: Norwegian health registries covering the entire population are used for administration, research, and emergency preparedness. We harmonized these data onto the Observational Medical Outcomes Partnership common data model (OMOP CDM) and enriched real-world data in OMOP format with COVID-19 related data.

Methods: Data from six registries (2018-2021) covering birth registrations, selected primary and secondary care events, vaccinations, and communicable disease notifications were mapped onto the OMOP CDM v5.3. An Extract-Transform-Load (ETL) pipeline was developed on simulated data using data characterization documents and scanning tools. We ran dashboard quality checks, generated cohorts, investigated differences between source and mapped data, and refined the ETL accordingly.
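As an illustration of the transform step described in the Methods, the following is a minimal sketch of mapping one source event to a row shaped like the OMOP CDM v5.3 condition_occurrence table. It is not the authors' pipeline. The lookup tables, the "LOCAL" vocabulary name, the code "PREG_TERM_X", and both concept IDs are hypothetical placeholders; a real ETL would resolve codes against the OMOP vocabulary tables. The sketch assumes two common OMOP conventions: unmapped codes receive concept ID 0, and locally defined custom concepts use IDs above 2,000,000,000.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical lookup tables (placeholders, not real OMOP concept IDs).
STANDARD_MAP = {("ICD10", "U07.1"): 1001}
# Assumption: custom concepts use IDs above 2,000,000,000 by OMOP convention.
CUSTOM_MAP = {("LOCAL", "PREG_TERM_X"): 2_000_000_001}


@dataclass
class SourceEvent:
    """One coded event as it might arrive from a registry extract."""
    person_id: int
    vocabulary: str
    code: str
    event_date: date


def to_concept_id(vocabulary: str, code: str) -> int:
    """Resolve a source code: standard concept first, then custom, else 0."""
    key = (vocabulary, code)
    if key in STANDARD_MAP:
        return STANDARD_MAP[key]
    if key in CUSTOM_MAP:
        return CUSTOM_MAP[key]
    return 0  # Assumption: 0 marks "no matching concept".


def transform(event: SourceEvent) -> dict:
    """Build a partial condition_occurrence row, keeping the source code."""
    return {
        "person_id": event.person_id,
        "condition_concept_id": to_concept_id(event.vocabulary, event.code),
        "condition_start_date": event.event_date,
        "condition_source_value": event.code,
    }


if __name__ == "__main__":
    print(transform(SourceEvent(1, "ICD10", "U07.1", date(2021, 1, 5))))
```

Keeping the original code in condition_source_value is what makes it possible to compare source and mapped data afterwards, as the Methods describe.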

Results: We mapped 1.5 billion rows of data for 5,673,845 individuals. Among these, there were 804,277 pregnancies, 483,585 mothers together with 792,477 children, and 472,948 fathers. We identified 382,516 positive tests for COVID-19 in 380,794 patients. These figures are consistent with results from source data. In addition to 11 million source codes mapped automatically, we mapped 237 non-standard codes to standard concepts and introduced 38 custom concepts to accommodate pregnancy-related terminologies that were not supported by OMOP CDM vocabularies. A total of 3,700/3,705 (99.8%) checks passed. The 5 failed checks could be explained by the nature of the data and only represent a small number of records.
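The figures in the Results can be cross-checked with simple arithmetic. The short sketch below uses only numbers quoted above; the variable names are illustrative. It derives the number of failed checks, the pass rate (about 99.87%, which the abstract gives as 99.8%), and how many positive tests exceed one per patient.

```python
# Figures quoted in the Results paragraph.
checks_passed, checks_total = 3_700, 3_705
positive_tests, positive_patients = 382_516, 380_794

# Failed checks and pass rate as a percentage.
failed_checks = checks_total - checks_passed
pass_rate = 100 * checks_passed / checks_total

# Positive tests beyond the first for each patient.
extra_tests = positive_tests - positive_patients

print(f"failed checks: {failed_checks}")
print(f"pass rate: {pass_rate:.2f}%")
print(f"positive tests beyond one per patient: {extra_tests}")
```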

Discussion and Conclusion: Norwegian registry data were successfully harmonized onto the OMOP CDM with a high level of concordance and provide a valuable source for federated COVID-19 related research. Our mapping experience is highly valuable for data partners with Nordic health registries.

Source
http://dx.doi.org/10.1016/j.ijmedinf.2024.105602

Publication Analysis

Top Keywords

omop cdm: 16
data: 13
common data: 8
data model: 8
health registries: 8
omop: 6
mapped: 5
harmonizing norwegian: 4
registries: 4
norwegian registries: 4
