Background: Given the geographical sparsity of rare diseases (RDs), assembling a cohort is often challenging. Common data models (CDMs) can harmonize disparate data sources, which can then serve as the basis for decision support systems and artificial intelligence-based studies and lead to new insights in the field. This work sought to support the design of large-scale multi-center studies for rare diseases.
Methods: In an interdisciplinary group, we iteratively derived a list of RD data elements in three medical domains (endocrinology, gastroenterology, and pneumonology) based on specialist knowledge and clinical guidelines. We then defined an RD data structure that matched all our data elements and built Extract, Transform, Load (ETL) processes to transfer the structure to a joint CDM. To ensure interoperability of the developed CDM and its subsequent use for further RD domains, we ultimately mapped it to the Observational Medical Outcomes Partnership (OMOP) CDM. We then included a fourth domain, hematology, as a proof of concept and mapped an acute myeloid leukemia (AML) dataset to the developed CDM.
Results: We developed an OMOP-based rare diseases common data model (RD-CDM) using data elements from the three domains (endocrinology, gastroenterology, and pneumonology) and tested the CDM using data from the hematology domain. The total study cohort included 61,697 patients. After aligning our modules with those of the Medical Informatics Initiative (MII) Core Dataset (CDS), we leveraged its ETL process. This facilitated the transfer of the demographic information, diagnoses, procedures, laboratory results, and medication modules from our RD-CDM to the OMOP CDM. For the phenotypes and genotypes, we developed a second ETL process. Finally, we derived lessons learned for customizing our RD-CDM for different RDs.
Discussion: This work can serve as a blueprint for other domains, as its modularized structure could be extended to novel data types. Reaching a comprehensive CDM requires an interdisciplinary group of stakeholders who actively support the project's progress.
Conclusion: The customized data structure related to our RD-CDM can be used to perform multi-center studies to test data-driven hypotheses on a larger scale and take advantage of the analytical tools offered by the OHDSI community.
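The kind of ETL transform step described in the Methods can be illustrated with a minimal sketch that maps a source diagnosis record to a row shaped like the OMOP `condition_occurrence` table. This is not the authors' pipeline: the `Diagnosis` record, the use of ICD-10 as the source coding, and the `ICD10_TO_CONCEPT` lookup with its concept IDs are hypothetical placeholders chosen for illustration. Only the target column names and the use of concept ID 0 for unmapped values follow the OMOP CDM convention.

```python
from dataclasses import dataclass
from datetime import date

# Hypothetical lookup from source codes to OMOP standard concept IDs.
# The IDs are placeholders for illustration, not real vocabulary entries.
ICD10_TO_CONCEPT = {"E84.0": 1000001, "C92.0": 1000002}

# OMOP convention: concept_id 0 marks a source value with no mapped concept.
UNMAPPED_CONCEPT_ID = 0


@dataclass
class Diagnosis:
    """Hypothetical source record as it might leave the extract step."""
    patient_id: int
    icd10_code: str
    onset: date


def to_condition_occurrence(row_id: int, dx: Diagnosis) -> dict:
    """Transform one source diagnosis into a condition_occurrence-shaped row."""
    return {
        "condition_occurrence_id": row_id,
        "person_id": dx.patient_id,
        "condition_concept_id": ICD10_TO_CONCEPT.get(
            dx.icd10_code, UNMAPPED_CONCEPT_ID
        ),
        "condition_start_date": dx.onset.isoformat(),
        # Keep the original code so unmapped values can be reviewed later.
        "condition_source_value": dx.icd10_code,
    }


def transform(diagnoses: list[Diagnosis]) -> list[dict]:
    """Assign sequential row IDs and transform every diagnosis."""
    return [
        to_condition_occurrence(i, dx)
        for i, dx in enumerate(diagnoses, start=1)
    ]
```

Keeping the source value alongside the mapped concept ID means that rows with concept ID 0 can be audited and remapped when the lookup is extended to further RD domains.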
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11325822 | PMC |
http://dx.doi.org/10.1186/s13023-024-03312-9 | DOI Listing |
J Occup Environ Med
November 2024
Objectives: Chronic skin diseases (CSD) may lead to productivity losses. This mixed-methods study investigated symptom severity, social challenges, need for workplace accommodation, sick leave and their association with perceived impaired work performance (IWP) among workers with CSD.
Methods: Data were collected from April to June 2023.
Ann Intern Med
January 2025
Center of Innovation to Accelerate Discovery and Practice Transformation, Durham Veterans Affairs Health Care System; Department of Population Health Sciences, Duke University School of Medicine; and Durham Evidence Synthesis Program, Durham Veterans Affairs Health Care System, Durham, North Carolina (J.M.G.).
Background: Postdischarge contacts (PDCs) after hospitalization are common practice, but their effectiveness in reducing use of acute care after discharge remains unclear.
Purpose: To assess the effects of PDC on 30-day emergency department (ED) visits, 30-day hospital readmissions, and patient satisfaction.
Data Sources: MEDLINE, Embase, and CINAHL searched from 2012 to 25 May 2023.
J Occup Environ Hyg
January 2025
Center for Environmental Solutions and Emergency Response, United States Environmental Protection Agency, Cincinnati, Ohio.
Chemical release data are essential for performing chemical risk assessments to understand the potential exposures arising from industrial processes. Often, these data are unknown or unavailable and must be estimated. A case study of volatile organic compound releases during extrusion-based additive manufacturing is used here to explore the viability of various regression methods for predicting chemical releases to inform chemical assessments.
Dev Med Child Neurol
January 2025
Department of Community Health Sciences, Max Rady College of Medicine, Rady Faculty of Health Sciences, University of Manitoba, Winnipeg, Manitoba, Canada.
Aim: To quantify optic nerve hypoplasia (ONH) and septo-optic-pituitary dysplasia (SOD) morbidities and comorbidities.
Method: A retrospective population-based study with a case-control design was undertaken using administrative health data from Manitoba, Canada. Cases were 124 patients with ONH or SOD (70 males, 54 females; age range 6 months-36 years 8 months [mean 13 years, SD 7 years 2 months]) diagnosed from 1990 to 2019, matched to 620 unrelated population-based controls (350 males, 270 females; age range 0-36 years 8 months [mean 12 years 5 months, SD 7 years 2 months]) on birth year, sex, and area of residence.
PLoS Comput Biol
January 2025
Wisconsin Institute for Discovery, University of Wisconsin-Madison, Madison, Wisconsin, United States of America.
High-dimensional mixed-effects models are an increasingly important form of regression in which the number of covariates rivals or exceeds the number of samples, which are collected in groups or clusters. The penalized likelihood approach to fitting these models relies on a coordinate descent algorithm that lacks guarantees of convergence to a global optimum. Here, we empirically study the behavior of this algorithm on simulated and real examples of three types of data that are common in modern biology: transcriptome, genome-wide association, and microbiome data.