Background: Enriched electronic health records (EHRs) contain crucial information about disease progression, and this information can support decision-making in health care. Data analytics is considered one of the essential processes for accelerating clinical research. However, processing and analyzing EHR data are common bottlenecks in health care data analytics.

Methods: The R package provides mechanisms for integration, wrangling, and visualization of clinical data, including diagnosis and procedure records. First, the package helps users transform International Classification of Diseases (ICD) codes to a uniform format. After code format transformation, the package supports four strategies for grouping clinical diagnostic data. For clinical procedure data, two grouping methods are available. After EHRs are integrated, users can employ a set of flexible built-in querying functions to divide data into case and control groups according to specified criteria and to split records into periods before and after an event based on the record date. Subsequently, the integrated long-format data can be converted into wide, analysis-ready data suitable for statistical analysis and visualization.
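The workflow described above can be illustrated with a minimal Python sketch. It is not the package's API: every function name, the toy grouping table, and the sample records are hypothetical, and the sketch assumes ICD-9-CM codes that only need their decimal point removed to reach a uniform short format.

```python
from collections import defaultdict
from datetime import date

# Toy prefix-to-group table (illustrative only, not a complete comorbidity definition).
TOY_GROUPS = {
    "276": "Fluid and electrolyte disorders",
    "424": "Valvular disease",
}

def to_short_format(code):
    """Normalize an ICD-9-CM code to short (undotted) form, e.g. '747.0' -> '7470'."""
    return code.strip().upper().replace(".", "")

def group_of(short_code, groups=TOY_GROUPS):
    """Return the name of the group whose prefix matches the code, or None."""
    for prefix, name in groups.items():
        if short_code.startswith(prefix):
            return name
    return None

def split_case_control(records, case_prefix):
    """Cases have at least one code starting with case_prefix; the rest are controls."""
    patients = {pid for pid, _, _ in records}
    cases = {pid for pid, code, _ in records
             if to_short_format(code).startswith(case_prefix)}
    return cases, patients - cases

def split_by_event(records, index_dates):
    """Split records into those dated before each patient's index date and the rest."""
    before, after = [], []
    for rec in records:
        pid, _, when = rec
        (before if when < index_dates[pid] else after).append(rec)
    return before, after

def long_to_wide(records, groups=TOY_GROUPS):
    """Convert long records to one row per patient with a 0/1 column per group."""
    columns = sorted(set(groups.values()))
    wide = defaultdict(lambda: dict.fromkeys(columns, 0))
    for pid, code, _ in records:
        row = wide[pid]  # creates an all-zero row on first sight of a patient
        name = group_of(to_short_format(code), groups)
        if name is not None:
            row[name] = 1
    return dict(wide)

# Hypothetical long-format records: (patient id, ICD code, record date).
records = [
    ("p1", "747.0", date(2101, 1, 2)),
    ("p1", "276.1", date(2101, 1, 3)),
    ("p2", "4240", date(2101, 5, 9)),
    ("p3", "v30.00", date(2101, 7, 1)),
]
cases, controls = split_case_control(records, "7470")
index_dates = {"p1": date(2101, 1, 3), "p2": date(2101, 1, 1), "p3": date(2102, 1, 1)}
before, after = split_by_event(records, index_dates)
wide = long_to_wide(records)
```

Keeping the records in long format until the final step means the case-control and before-after splits can work on individual dated records; the wide table is produced only once those selections are made.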

Results: Using the package, we processed comorbidity data for a cohort of newborns from the Medical Information Mart for Intensive Care-III (n = 7,833). We first defined patent ductus arteriosus (PDA) cases as patients who had at least one PDA diagnosis (ICD, Ninth Revision, Clinical Modification [ICD-9-CM] 7470*). Controls were defined as patients who never had a PDA diagnosis. In total, 381 patients with PDA and 7,452 patients without PDA were included in our study population. We then grouped the diagnoses into defined comorbidities. Finally, we observed a statistically significant difference between patients with and without PDA in 8 of the 16 comorbidities, including fluid and electrolyte disorders and valvular disease.
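The abstract does not state which statistical test was used for the comparison. As one possible illustration, the sketch below applies a two-sided Fisher's exact test to a 2x2 table of comorbidity presence by PDA status. The test choice, the function name, and the counts are assumptions made for this example and are not taken from the study.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Rows: cases / controls. Columns: comorbidity present / absent.
    Sums the probabilities of all tables with the same margins that are
    no more likely than the observed table.
    """
    row1, row2, col1 = a + b, c + d, a + c
    n = row1 + row2

    def prob(x):
        # Hypergeometric probability of x cases having the comorbidity.
        return comb(row1, x) * comb(row2, col1 - x) / comb(n, col1)

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Small relative tolerance guards against floating-point ties.
    return sum(p for p in (prob(x) for x in range(lo, hi + 1))
               if p <= p_obs * (1 + 1e-9))

# Hypothetical toy counts, NOT study data: 3 of 4 cases and 1 of 4 controls
# have the comorbidity.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
```

With 16 comorbidities tested, an analysis of this kind would normally also need to consider multiple-comparison correction; the abstract does not say whether one was applied.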

Conclusions: This package helps clinical data analysts address the common bottleneck caused by clinical data characteristics such as heterogeneity and sparseness.


Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC8176530 (PMC)
http://dx.doi.org/10.7717/peerj-cs.520 (DOI)
