Motivation: The increasing availability of Electronic Health Record (EHR) systems has created enormous potential for translational research. Recent advances in representation learning have produced effective large-scale representations of EHR concepts, along with knowledge graphs that empower downstream EHR studies. However, most existing methods require training on patient-level data, which limits their ability to extend training to multi-institutional EHR data. Scalable approaches that require only summary-level data, on the other hand, do not incorporate temporal dependencies between concepts.

Methods: We introduce a DirectiOnal Medical Embedding (DOME) algorithm that encodes temporally directional relationships between medical concepts using summary-level EHR data. Specifically, DOME first aggregates patient-level EHR data into an asymmetric co-occurrence matrix. It then computes two Positive Pointwise Mutual Information (PPMI) matrices that encode, respectively, the pairwise prior and posterior dependencies between medical concepts. Finally, a joint matrix factorization of the two PPMI matrices yields three vectors for each concept: a semantic embedding and two directional context embeddings. Together, these vectors provide a comprehensive depiction of the temporal relationships between EHR concepts.
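
The three steps described above can be illustrated with a minimal sketch in Python. This is not the authors' implementation (the released code is the reference): the function names, the toy timelines, the choice to count every ordered pair within a patient, the use of transposed counts for the second PPMI matrix, and the stacked truncated SVD as the joint factorization are all assumptions made here for illustration.

```python
import numpy as np


def directional_cooccurrence(timelines, n_concepts):
    """C[i, j] counts how often concept i occurs before concept j in a patient.

    Assumption: every ordered pair within a timeline is counted; a real
    pipeline would likely restrict pairs to a time window.
    """
    C = np.zeros((n_concepts, n_concepts))
    for codes in timelines:  # codes: concept ids sorted by time
        for a in range(len(codes)):
            for b in range(a + 1, len(codes)):
                C[codes[a], codes[b]] += 1.0
    return C


def ppmi(C):
    """Positive pointwise mutual information of a count matrix."""
    row = C.sum(axis=1, keepdims=True)
    col = C.sum(axis=0, keepdims=True)
    with np.errstate(divide="ignore", invalid="ignore"):
        pmi = np.log(C * C.sum() / (row @ col))
    pmi[~np.isfinite(pmi)] = 0.0  # zero counts carry no association
    return np.maximum(pmi, 0.0)


def joint_factorize(P_after, P_before, dim):
    """Shared-factor factorization: P_after ~ W @ A.T and P_before ~ W @ B.T.

    Assumption: a truncated SVD of the column-stacked matrices stands in for
    the joint objective, which the abstract does not spell out.
    """
    n = P_after.shape[0]
    U, s, Vt = np.linalg.svd(np.hstack([P_after, P_before]), full_matrices=False)
    scale = np.sqrt(s[:dim])
    semantic = U[:, :dim] * scale    # one semantic vector per concept
    contexts = Vt[:dim].T * scale    # stacked context vectors, shape (2n, dim)
    return semantic, contexts[:n], contexts[n:]


# Hypothetical toy data: three patients, concept ids 0..2 in temporal order.
timelines = [[0, 1, 2], [0, 2], [1, 2]]
C = directional_cooccurrence(timelines, n_concepts=3)
P_after = ppmi(C)      # dependencies on concepts that follow
P_before = ppmi(C.T)   # assumption: transposed counts for concepts that precede
semantic, ctx_after, ctx_before = joint_factorize(P_after, P_before, dim=3)
```

Because only the aggregated count matrix C enters the PPMI and factorization steps, a sketch of this shape needs patient-level timelines only for the first function, which is consistent with the summary-level framing of the method.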

Results: We highlight the advantages and translational potential of DOME through three sets of validation studies. First, DOME consistently improves on existing direction-agnostic embedding vectors for disease risk prediction across several diseases, for example achieving a relative gain of 5.5% in the area under the receiver operating characteristic curve (AUROC) for lung cancer. Second, DOME excels at directional drug-disease relationship inference, successfully differentiating between drug side effects and indications and achieving relative AUROC gains over state-of-the-art methods of 10.8% and 6.6%, respectively. Finally, DOME effectively constructs directional knowledge graphs that distinguish disease risk factors from comorbidities, thereby revealing disease progression trajectories. The source code is available at https://github.com/celehs/Directional-EHR-embedding.


Source
http://dx.doi.org/10.1016/j.jbi.2024.104768

