Objective: The objective of this study was to describe the prevalence of obesity, obesity-related conditions (ORCs), and antiobesity medication (AOM) eligibility and prescribing practice among eligible patients in a large health care system.
Methods: In this cross-sectional analysis of the multicenter Mass General Brigham health care system (Boston, Massachusetts) spanning 2018 to 2022, adults eligible for AOMs (BMI ≥ 30 kg/m² or BMI 27-29.9 kg/m² with ≥1 ORC) were identified.
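The eligibility rule stated in the Methods above can be written as a small predicate. This is a minimal sketch, not the study's actual code: the function name `is_aom_eligible` and the parameter `orc_count` are hypothetical, and the range "27-29.9" is assumed to mean 27 ≤ BMI < 30.

```python
def is_aom_eligible(bmi: float, orc_count: int) -> bool:
    """Return True if an adult meets the AOM eligibility rule from the abstract.

    Assumptions (not from the source): BMI is in kg/m^2, orc_count is the
    number of documented obesity-related conditions, and the 27-29.9 band
    is treated as 27 <= BMI < 30.
    """
    if bmi >= 30.0:
        # BMI of 30 or more qualifies on its own.
        return True
    # BMI in the 27-29.9 band qualifies only with at least one ORC.
    return 27.0 <= bmi < 30.0 and orc_count >= 1
```

For example, `is_aom_eligible(28.4, 1)` is true, while `is_aom_eligible(28.4, 0)` is false.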
A majority of patients with heart failure (HF) do not receive adequate medical therapy as recommended by clinical guidelines. One major obstacle encountered by population health management (PHM) programs to improve medication usage is the substantial burden placed on clinical staff, who must manually sift through electronic health records (EHRs) to ascertain patients' eligibility for guideline-recommended therapy. As a potential solution, the study team developed a rule-based system (RBS) that automatically parses the EHR to identify patients with HF who may be eligible for guideline-directed therapy.
The COVID-19 pandemic generated tremendous interest in using real world data (RWD). Many consortia across the public and private sectors formed in 2020 with the goal of rapidly producing high-quality evidence from RWD to guide medical decision-making, public health priorities, and more. Experiences were gathered from five large consortia on rapid multi-institutional evidence generation during the COVID-19 pandemic.
Background: Subject screening is a key aspect of all clinical trials; however, traditionally, it is a labor-intensive and error-prone task, demanding significant time and resources. With the advent of large language models (LLMs) and related technologies, a paradigm shift in natural language processing capabilities offers a promising avenue for increasing both the quality and efficiency of screening efforts. This study aimed to test whether Generative Pretrained Transformer Version 4 (GPT-4), enabled by a Retrieval-Augmented Generation (RAG) process, could accurately identify and report on inclusion and exclusion criteria for a clinical trial.
Background: Characterizing post-acute sequelae of COVID (SARS-CoV-2 infection), or PASC, has been challenging due to the multitude of sub-phenotypes, temporal attributes, and definitions. Scalable characterization of PASC sub-phenotypes can enhance screening capacities, disease management, and treatment planning.
Methods: We conducted a retrospective multi-centre observational cohort study, leveraging longitudinal electronic health record (EHR) data of 30,422 patients from three healthcare systems in the Consortium for the Clinical Characterization of COVID-19 by EHR (4CE).
Physical and psychological symptoms lasting months following an acute COVID-19 infection are now recognized as post-acute sequelae of COVID-19 (PASC). Accurate tools for identifying such patients could enhance screening capabilities for the recruitment for clinical trials, improve the reliability of disease estimates, and allow for more accurate downstream cohort analysis. In this retrospective cohort study, we analyzed the EHR of hospitalized COVID-19 patients across three healthcare systems to develop a pipeline for better identifying patients with persistent PASC symptoms (dyspnea, fatigue, or joint pain) after their SARS-CoV-2 infection.
Importance: Blood pressure (BP) and cholesterol control remain challenging. Remote care can deliver more effective care outside of traditional clinician-patient settings but scaling and ensuring access to care among diverse populations remains elusive.
Objective: To implement and evaluate a remote hypertension and cholesterol management program across a diverse health care network.
Motivation: The i2b2 platform is used at major academic health institutions and research consortia for querying electronic health data. However, a major obstacle to wider utilization of the platform is the complexity of data loading, which entails a steep learning curve for the platform's complex data schemas. To address this problem, we have developed the i2b2-etl package, which simplifies the data loading process and will facilitate wider deployment and utilization of the platform.
Background: Models predicting atrial fibrillation (AF) risk, such as Cohorts for Heart and Aging Research in Genomic Epidemiology AF (CHARGE-AF), have not performed as well in electronic health records. Natural language processing (NLP) may improve models by using narrative electronic health record text. Methods and Results: From a primary care network, we included patients aged ≥65 years with visits between 2003 and 2013 in development (n=32 960) and internal validation cohorts (n=13 992).
Analysis of health data typically requires development of queries using structured query language (SQL) by a data analyst. As the SQL queries are manually created, they are prone to errors. In addition, accurate implementation of the queries depends on effective communication with clinical experts, which makes the analysis even more error-prone.
The risk profiles of post-acute sequelae of COVID-19 (PASC) have not been well characterized in multi-national settings with appropriate controls. We leveraged electronic health record (EHR) data from 277 international hospitals representing 414,602 patients with COVID-19, 2.3 million control patients without COVID-19 in the inpatient and outpatient settings, and over 221 million diagnosis codes to systematically identify new-onset conditions enriched among patients with COVID-19 during the post-acute period.
Objective: To assess changes in international mortality rates and laboratory recovery rates during hospitalisation for patients hospitalised with SARS-CoV-2 between the first wave (1 March to 30 June 2020) and the second wave (1 July 2020 to 31 January 2021) of the COVID-19 pandemic.
Design, Setting And Participants: This is a retrospective cohort study of 83 178 hospitalised patients admitted between 7 days before or 14 days after PCR-confirmed SARS-CoV-2 infection within the Consortium for Clinical Characterization of COVID-19 by Electronic Health Record, an international multihealthcare system collaborative of 288 hospitals in the USA and Europe. The laboratory recovery rates and mortality rates over time were compared between the two waves of the pandemic.
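The inclusion window described above can be illustrated with a short date check. This is a sketch under one reading of the abstract, namely that the admission date must fall from 7 days before to 14 days after the positive PCR date; the function and parameter names are hypothetical and not taken from the consortium's code.

```python
from datetime import date


def in_admission_window(
    admit: date,
    pcr_positive: date,
    days_before: int = 7,
    days_after: int = 14,
) -> bool:
    """Check whether an admission falls inside the inclusion window.

    Assumed interpretation: the admission date lies between `days_before`
    days before and `days_after` days after the PCR-positive date,
    with both ends inclusive.
    """
    # Negative offset: admitted before the positive test; positive: after it.
    offset = (admit - pcr_positive).days
    return -days_before <= offset <= days_after
```

For example, an admission on 1 April 2020 with a positive PCR on 5 April 2020 has an offset of -4 days and falls inside the window.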
Objective: The increasing translation of artificial intelligence (AI)/machine learning (ML) models into clinical practice brings an increased risk of direct harm from modeling bias; however, bias remains incompletely measured in many medical AI applications. This article aims to provide a framework for objective evaluation of medical AI from multiple aspects, focusing on binary classification models.
Materials And Methods: Using data from over 56 000 Mass General Brigham (MGB) patients with confirmed severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2), we evaluate unrecognized bias in 4 AI models developed during the early months of the pandemic in Boston, Massachusetts that predict risks of hospital admission, ICU admission, mechanical ventilation, and death after a SARS-CoV-2 infection purely based on their pre-infection longitudinal medical records.
Background: Admissions are generally classified as COVID-19 hospitalizations if the patient has a positive SARS-CoV-2 polymerase chain reaction (PCR) test. However, because 35% of SARS-CoV-2 infections are asymptomatic, patients admitted for unrelated indications with an incidentally positive test could be misclassified as a COVID-19 hospitalization. Electronic health record (EHR)-based studies have been unable to distinguish between a hospitalization specifically for COVID-19 versus an incidental SARS-CoV-2 hospitalization.
Admissions are generally classified as COVID-19 hospitalizations if the patient has a positive SARS-CoV-2 polymerase chain reaction (PCR) test. However, because 35% of SARS-CoV-2 infections are asymptomatic, patients admitted for unrelated indications with an incidentally positive test could be misclassified as a COVID-19 hospitalization. EHR-based studies have been unable to distinguish between a hospitalization specifically for COVID-19 versus an incidental SARS-CoV-2 hospitalization.
Objectives: Hypertension is a modifiable risk factor for numerous comorbidities and treating hypertension can greatly improve health outcomes. We sought to increase the efficiency of a virtual hypertension management program through workflow automation processes.
Methods: We developed a customer relationship management (CRM) solution at our institution for the purpose of improving processes and workflow for a virtual hypertension management program and describe here the development, implementation, and initial experience of this CRM system.
Background: Many countries have experienced 2 predominant waves of COVID-19-related hospitalizations. Comparing the clinical trajectories of patients hospitalized in separate waves of the pandemic enables further understanding of the evolving epidemiology, pathophysiology, and health care dynamics of the COVID-19 pandemic.
Objective: In this retrospective cohort study, we analyzed electronic health record (EHR) data from patients with SARS-CoV-2 infections hospitalized in participating health care systems representing 315 hospitals across 6 countries.
Coincident with the tsunami of COVID-19-related publications, there has been a surge of studies using real-world data, including those obtained from the electronic health record (EHR). Unfortunately, several of these high-profile publications were retracted because of concerns regarding the soundness and quality of the studies and the EHR data they purported to analyze. These retractions highlight that although a small community of EHR informatics experts can readily identify strengths and flaws in EHR-derived studies, many medical editorial teams and otherwise sophisticated medical readers lack the framework to fully critically appraise these studies.
Objective: The Consortium for Clinical Characterization of COVID-19 by EHR (4CE) is an international collaboration addressing coronavirus disease 2019 (COVID-19) with federated analyses of electronic health record (EHR) data. We sought to develop and validate a computable phenotype for COVID-19 severity.
Materials And Methods: Twelve 4CE sites participated.
This study aims to predict death after COVID-19 using only the past medical information routinely collected in electronic health records (EHRs) and to understand the differences in risk factors across age groups. Combining computational methods and clinical expertise, we curated clusters that represent 46 clinical conditions as potential risk factors for death after a COVID-19 infection. We trained age-stratified generalized linear models (GLMs) with component-wise gradient boosting to predict the probability of death based on what we know from the patients before they contracted the virus.
Importance: Optimal treatment of heart failure with reduced ejection fraction (HFrEF) is scripted by treatment guidelines, but many eligible patients do not receive guideline-directed medical therapy (GDMT) in clinical practice.
Objective: To determine whether a remote, algorithm-driven, navigator-administered medication optimization program could enhance implementation of GDMT in HFrEF.
Design, Setting, And Participants: In this case-control study, a population-based sample of patients with HFrEF was offered participation in a quality improvement program directed at GDMT optimization.
Electronic health records (EHRs) contain important temporal information about the progression of disease and treatment outcomes. This paper proposes a transitive sequencing approach for constructing temporal representations from EHR observations for downstream machine learning. Using clinical data from a cohort of patients with congestive heart failure, we mined temporal representations by transitive sequencing of EHR medication and diagnosis records for classification and prediction tasks.
Despite the widespread use of the "Informatics for Integrating Biology and the Bedside" (i2b2) platform, there are substantial challenges for loading electronic health records (EHR) into i2b2 and for querying i2b2. We have previously presented a simplified framework for semantic abstraction of EHR records into i2b2. Building on our previous work, we have created a proof-of-concept implementation of cloud services on an i2b2 data store for cohort identification.
Motivation: Expert-labeled data are essential to train phenotyping algorithms for cohort identification. However, expert labeling is time- and labor-intensive, and the costs remain prohibitive for scaling phenotyping to wider use cases.
Results: We present an approach, referred to as polar labeling (PL), to create a silver standard for training machine learning (ML) models for disease classification.