Secondary use of data for research purposes is especially important in rare diseases (RD), since, by definition, data are sparse. The European Joint Programme on Rare Diseases (EJP RD) aims to develop an RD infrastructure that supports the secondary use of data. Significant amounts of RD data are a) distributed and b) available only in pseudonymised form.
Background: Multisite clinical studies are increasingly using real-world data to gain real-world evidence. However, due to the heterogeneity of source data, it is difficult to analyze such data in a unified way across clinics. Therefore, the implementation of Extract-Transform-Load (ETL) or Extract-Load-Transform (ELT) processes for harmonizing local health data is necessary, in order to guarantee the data quality for research.
The current state-of-the-art analysis of central nervous system (CNS) tumors through DNA methylation profiling relies on the tumor classifier developed by Capper and colleagues, which centrally harnesses DNA methylation data provided by users. Here, we present a distributed-computing-based approach for CNS tumor classification that achieves a comparable performance to centralized systems while safeguarding privacy. We utilize the t-distributed stochastic neighbor embedding (t-SNE) model for dimensionality reduction and visualization of tumor classification results in two-dimensional graphs in a distributed approach across multiple sites (DistSNE).
Background: Many research initiatives aim at using data from electronic health records (EHRs) in observational studies. Participating sites of the German Medical Informatics Initiative (MII) established data integration centers to integrate EHR data within research data repositories to support local and federated analyses. To address concerns regarding possible data quality (DQ) issues of hospital routine data compared with data specifically collected for scientific purposes, we have previously presented a data quality assessment (DQA) tool providing a standardized approach to assess DQ of the research data repositories at the MIRACUM consortium's partner sites.
Stud Health Technol Inform
September 2019
Demand for interoperability in healthcare is growing because heterogeneous data sources complicate information transfer. Metadata repositories can address these interoperability issues. They help ensure syntactic interoperability, such as compatible data formats or value ranges; semantic interoperability, however, remains challenging.