Objective: Electronic health record (EHR) systems contain a wealth of clinical data stored as both codified data and free-text narrative notes (NLP). The complexity of EHR presents challenges in feature representation, information extraction, and uncertainty quantification. To address these challenges, we proposed an efficient Aggregated naRrative Codified Health (ARCH) records analysis to generate a large-scale knowledge graph (KG) for a comprehensive set of EHR codified and narrative features.
Motivation: The increasing availability of Electronic Health Record (EHR) systems has created enormous potential for translational research. Recent developments in representation learning techniques have led to effective large-scale representations of EHR concepts along with knowledge graphs that empower downstream EHR studies. However, most existing methods require training with patient-level data, limiting their abilities to expand the training with multi-institutional EHR data.
Importance: The American Heart Association proposed Life's Essential 8 (LE8) as an enhanced measurement tool for cardiovascular health.
Objective: To examine the association of LE8 with risk of atherosclerotic cardiovascular disease (ASCVD) incidence and prognosis in veterans.
Design, Setting, and Participants: This was a prospective cohort study of US veterans enrolled in the Department of Veterans Affairs (VA) Million Veteran Program (MVP) between 2011 and 2022.
medRxiv
September 2024
Obesity is a heritable disease, characterised by excess adiposity that is measured by body mass index (BMI). While over 1,000 genetic loci are associated with BMI, less is known about the genetic contribution to adiposity trajectories over adulthood. We derive adiposity-change phenotypes from 24.
Background: Whether tumor necrosis factor inhibitor (TNFi) use is cardioprotective among individuals with radiographic axial spondyloarthritis (r-axSpA), who have heightened cardiovascular (CV) risk, is unclear. We tested the association of TNFi use with incident CV outcomes in r-axSpA.
Methods: We identified a r-axSpA cohort within a Veterans Affairs database between 2002 and 2019 using novel phenotyping methods and secondarily using ICD codes.
Online J Public Health Inform
May 2024
Background: Post-COVID-19 condition (colloquially known as "long COVID-19"), characterized as postacute sequelae of SARS-CoV-2, has no universal clinical case definition. Recent efforts have focused on understanding long COVID-19 symptoms, and electronic health record (EHR) data provide a unique resource for understanding this condition. The introduction of the International Classification of Diseases, Tenth Revision (ICD-10) code U09.
Few studies examining the patient outcomes of concurrent neurological manifestations during acute COVID-19 leveraged multinational cohorts of adults and children or distinguished between central and peripheral nervous system (CNS vs. PNS) involvement. Using a federated multinational network in which local clinicians and informatics experts curated the electronic health records data, we evaluated the risk of prolonged hospitalization and mortality in hospitalized COVID-19 patients from 21 healthcare systems across 7 countries.
The Phenome-Wide Association Study (PheWAS) is increasingly used to broadly screen for potential treatment effects, e.g., IL6R variant as a proxy for IL6R antagonists.
Objective: Development of clinical phenotypes from electronic health records (EHRs) can be resource intensive. Several phenotype libraries have been created to facilitate reuse of definitions. However, these platforms vary in target audience and utility.
Electronic health record (EHR) data are increasingly used to support real-world evidence studies but are limited by the lack of precise timings of clinical events. Here, we propose a label-efficient incident phenotyping (LATTE) algorithm to accurately annotate the timing of clinical events from longitudinal EHR data. By leveraging the pre-trained semantic embeddings, LATTE selects predictive features and compresses their information into longitudinal visit embeddings through visit attention learning.
Background: Red meat consumption was associated with an increased risk of cardiovascular disease (CVD) in prospective cohort studies and a profile of biomarkers favoring high CVD risk in short-term controlled trials. However, several recent systematic reviews and meta-analyses concluded with no or weak evidence for limiting red meat intake.
Objectives: To prospectively examine the associations between red meat intake and incident CVD in an ongoing cohort study with diverse socioeconomic and racial or ethnic backgrounds.
Importance: A significant proportion of SARS-CoV-2 infected individuals experience post-COVID-19 condition months after initial infection.
Objective: To determine the rates, clinical setting, risk factors, and symptoms associated with the documentation of International Statistical Classification of Diseases, Tenth Revision (ICD-10) code U09.9 for post-COVID-19 condition after acute infection.
Though electronic health record (EHR) systems are a rich repository of clinical information with large potential, the use of EHR-based phenotyping algorithms is often hindered by inaccurate diagnostic records, the presence of many irrelevant features, and the requirement for a human-labeled training set. In this paper, we describe a knowledge-driven online multimodal automated phenotyping (KOMAP) system that i) generates a list of informative features by an online narrative and codified feature search engine (ONCE) and ii) enables the training of a multimodal phenotyping algorithm based on summary data. Powered by composite knowledge from multiple EHR sources, online article corpora, and a large language model, features selected by ONCE show high concordance with the state-of-the-art AI models (GPT4 and ChatGPT) and encourage large-scale phenotyping by providing a smaller but highly relevant feature set.
medRxiv
August 2023
Genome-wide association studies (GWAS) have underrepresented individuals from non-European populations, impeding progress in characterizing the genetic architecture and consequences of health and disease traits. To address this, we present a population-stratified phenome-wide GWAS followed by a multi-population meta-analysis for 2,068 traits derived from electronic health records of 635,969 participants in the Million Veteran Program (MVP), a longitudinal cohort study of diverse U.S.
Objective: Electronic health record (EHR) systems contain a wealth of clinical data stored as both codified data and free-text narrative notes, covering hundreds of thousands of clinical concepts available for research and clinical care. The complex, massive, heterogeneous, and noisy nature of EHR data imposes significant challenges for feature representation, information extraction, and uncertainty quantification. To address these challenges, we proposed an efficient Aggregated naRrative Codified Health (ARCH) records analysis to generate a large-scale knowledge graph (KG) for a comprehensive set of EHR codified and narrative features.