Motivation: The increasing availability of Electronic Health Record (EHR) systems has created enormous potential for translational research. Recent developments in representation learning have produced effective large-scale representations of EHR concepts, along with knowledge graphs that empower downstream EHR studies. However, most existing methods require training on patient-level data, which limits their ability to scale training to multi-institutional EHR data. Scalable approaches that require only summary-level data, on the other hand, do not incorporate temporal dependencies between concepts.
Methods: We introduce a DirectiOnal Medical Embedding (DOME) algorithm to encode temporally directional relationships between medical concepts using summary-level EHR data. Specifically, DOME first aggregates patient-level EHR data into an asymmetric co-occurrence matrix. It then computes two Positive Pointwise Mutual Information (PPMI) matrices that encode, respectively, the pairwise prior and posterior dependencies between medical concepts. Finally, a joint matrix factorization is performed on the two PPMI matrices, which yields three vectors for each concept: a semantic embedding and two directional context embeddings. Together, these vectors provide a comprehensive depiction of the temporal relationships between EHR concepts.
Results: We highlight the advantages and translational potential of DOME through three sets of validation studies. First, DOME consistently improves on existing direction-agnostic embedding vectors for disease risk prediction in several diseases, for example achieving a relative gain of 5.5% in the area under the receiver operating characteristic curve (AUROC) for lung cancer. Second, DOME excels in directional drug-disease relationship inference by successfully differentiating between drug side effects and indications, achieving relative AUROC gains over state-of-the-art methods of 10.8% and 6.6%, respectively. Finally, DOME effectively constructs directional knowledge graphs that distinguish disease risk factors from comorbidities, thereby revealing disease progression trajectories. The source code is available at https://github.com/celehs/Directional-EHR-embedding.
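The three steps named in the Methods paragraph (asymmetric co-occurrence counts, two PPMI matrices, joint factorization) can be illustrated with a minimal NumPy sketch. This is not the authors' implementation, which lives in the repository linked above. The function names, the `window` parameter, the use of the transposed count matrix for the second PPMI matrix, and the truncated SVD of the concatenated matrices as the joint factorization are all assumptions made for illustration; the published algorithm may define each step differently.

```python
import numpy as np


def directional_cooccurrence(timelines, n_concepts, window):
    """Count ordered pairs (i, j) where concept j occurs strictly after
    concept i within `window` time units (hypothetical definition)."""
    counts = np.zeros((n_concepts, n_concepts))
    for events in timelines:  # events: list of (time, concept_id) tuples
        events = sorted(events)
        for a, (t_a, i) in enumerate(events):
            for t_b, j in events[a + 1:]:
                if t_b - t_a > window:
                    break  # events are time-sorted, so later ones are out of range too
                if t_b > t_a:  # skip same-time events: no direction defined
                    counts[i, j] += 1
    return counts


def ppmi(counts):
    """Positive pointwise mutual information of a count matrix."""
    with np.errstate(divide="ignore", invalid="ignore"):
        total = counts.sum()
        row = counts.sum(axis=1, keepdims=True)
        col = counts.sum(axis=0, keepdims=True)
        pmi = np.log(counts * total / (row @ col))
    pmi[~np.isfinite(pmi)] = 0.0  # zero counts or zero marginals carry no signal
    return np.maximum(pmi, 0.0)


def joint_factorize(ppmi_after, ppmi_before, dim):
    """Factorize both matrices with a shared left factor via truncated SVD
    of their horizontal concatenation (an assumed stand-in for the paper's
    joint factorization)."""
    n = ppmi_after.shape[1]
    u, s, vt = np.linalg.svd(np.hstack([ppmi_after, ppmi_before]),
                             full_matrices=False)
    scale = np.sqrt(s[:dim])
    semantic = u[:, :dim] * scale   # one shared semantic vector per concept
    context = vt[:dim].T * scale    # stacked context vectors, shape (2n, dim)
    return semantic, context[:n], context[n:]


# Toy example: three patients, three concepts, times in arbitrary units.
timelines = [[(0, 0), (1, 1), (2, 2)], [(0, 0), (3, 1)], [(0, 1), (1, 2)]]
counts = directional_cooccurrence(timelines, n_concepts=3, window=5)
p_after = ppmi(counts)      # what tends to follow each concept
p_before = ppmi(counts.T)   # what tends to precede each concept
semantic, ctx_after, ctx_before = joint_factorize(p_after, p_before, dim=2)
print(semantic.shape, ctx_after.shape, ctx_before.shape)
```

Only the count matrix depends on patient-level records in this sketch; the PPMI and factorization steps operate on aggregated counts, which matches the abstract's point that summary-level data suffices once the co-occurrence matrix is built.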
DOI: http://dx.doi.org/10.1016/j.jbi.2024.104768
JAMIA Open
February 2025
Department of Medicine, University of Wisconsin-Madison, Madison, WI 53792, United States.
Objective: To evaluate large language models (LLMs) for pre-test diagnostic probability estimation and compare their uncertainty estimation performance with a traditional machine learning classifier.
Materials And Methods: We assessed 2 instruction-tuned LLMs, Mistral-7B-Instruct and Llama3-70B-chat-hf, on predicting binary outcomes for Sepsis, Arrhythmia, and Congestive Heart Failure (CHF) using electronic health record (EHR) data from 660 patients. Three uncertainty estimation methods (Verbalized Confidence, Token Logits, and LLM Embedding+XGB) were compared against an eXtreme Gradient Boosting (XGB) classifier trained on raw EHR data.
Contemp Clin Trials Commun
February 2025
Department of Medicine, Division of General Internal Medicine and Center for Health Information Partnerships, Institute for Public Health and Medicine, Northwestern University Feinberg School of Medicine, USA.
Background: Unhealthy alcohol use is a leading cause of preventable mortality and a risk factor for an array of social and health problems. The Intervention in Small primary care Practices to Implement Reduction in unhealthy alcohol use (INSPIRE) study is part of a nationwide campaign to improve the identification and treatment of patients engaging in unhealthy alcohol use.
Methods: We conducted a single-arm, pragmatic study of seventeen primary care practices in the Chicago metropolitan area, Wisconsin, and California across two waves, with a 6-month latent period, a 12-month intervention period, and a 6-month sustainability period.
J Pathol Inform
January 2025
Harvard Medical School, Boston, MA, United States of America.
Objective: Thrombocytopenia is a common complication of hematopoietic stem-cell transplantation (HSCT), and many patients become immune refractory to platelet transfusions over time. We built and evaluated an electronic health record (EHR)-integrated, standards-based application that enables blood-bank clinicians to match platelet inventory with patients using data not previously available at the point of care, such as human leukocyte antigen (HLA) data for donors and recipients.
Materials And Methods: The web-based application launches as an EHR-embedded application or as a standalone application.
J Pediatr Nurs
January 2025
Department of Medical Surgical Nursing, College of Nursing, University of Ha'il, Ha'il City, Saudi Arabia; Department of Nursing, Faculty of Medicine and Health Sciences, Hodeida University, Hodeida, Yemen.
Aim: This study aimed to evaluate the knowledge, attitudes, and acceptance of Electronic Health Records (EHRs) among nurses in Egypt.
Methods: A descriptive cross-sectional study was conducted involving 1217 nurses from 33 public and private hospitals. Data were collected using a self-administered online questionnaire, which assessed knowledge, attitudes, and acceptance of EHRs.
Open Heart
January 2025
Department of Internal Medicine I, Universitätsklinikum Würzburg, Würzburg, BY, Germany
Background And Aims: Hypertrophic cardiomyopathy (HCM) has various aetiologies, including genetic conditions like Fabry disease (FD), a lysosomal storage disorder. FD prevalence in high-risk HCM populations ranges from 0.3% to 11.