AI Article Synopsis

  • Clinical EHR data is diverse and complex, which makes it difficult for machine learning models to accurately predict outcomes due to high variability within the same category.
  • To improve prediction accuracy, a new supervised pre-training model using an embedded k-nearest-neighbor positive sampling strategy is proposed.
  • This approach demonstrates strong performance, achieving an AUROC score of 0.872 in predicting patient mortality from COVID-19 data, outperforming existing methods, especially in cases with limited training data.

Article Abstract

Clinical EHR data is naturally heterogeneous and contains abundant sub-phenotypes. Such diversity creates challenges for outcome prediction with machine learning models because it leads to high intra-class variance. To address this issue, we propose a supervised pre-training model with a unique embedded k-nearest-neighbor positive sampling strategy. We demonstrate the value of this framework theoretically and show that it yields highly competitive experimental results in predicting patient mortality in real-world COVID-19 EHR data from over 7,000 patients admitted to a large, urban health system. Our method achieves an AUROC of 0.872, outperforming alternative pre-training models and traditional machine learning methods. Additionally, our method performs much better when the training data size is small (345 training instances).
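
The abstract does not spell out how the k-nearest-neighbor positive sampling is computed, so the following is only a minimal sketch of one plausible reading, not the authors' implementation. It assumes cosine similarity in the embedding space, treats each anchor's k nearest same-label neighbors as its positives, and plugs them into a generic supervised-contrastive-style loss. The function names, the temperature value, and the toy data are all hypothetical.

```python
import numpy as np

def knn_positive_indices(emb, labels, k):
    """For each anchor, pick its k nearest same-label neighbors (assumed: cosine similarity)."""
    z = emb / np.linalg.norm(emb, axis=1, keepdims=True)  # assumes no all-zero rows
    sim = z @ z.T
    np.fill_diagonal(sim, -np.inf)                 # an anchor is never its own positive
    same = labels[:, None] == labels[None, :]
    sim = np.where(same, sim, -np.inf)             # mask out samples from other classes
    order = np.argsort(-sim, axis=1)[:, :k]        # most similar first
    valid = np.take_along_axis(sim, order, axis=1) > -np.inf  # False where a class has < k peers
    return order, valid

def knn_supcon_loss(emb, labels, k=2, temperature=0.1):
    """Contrastive loss whose positives are restricted to the k nearest same-label neighbors."""
    z = emb / np.linalg.norm(emb, axis=1, keepdims=True)
    logits = z @ z.T / temperature
    np.fill_diagonal(logits, -np.inf)              # exclude self-comparison from the softmax
    m = logits.max(axis=1, keepdims=True)
    log_prob = logits - m - np.log(np.exp(logits - m).sum(axis=1, keepdims=True))
    pos, valid = knn_positive_indices(emb, labels, k)
    pos_log_prob = np.where(valid, np.take_along_axis(log_prob, pos, axis=1), 0.0)
    n_pos = valid.sum(axis=1)
    has_pos = n_pos > 0                            # skip anchors with no same-label peer
    return (-pos_log_prob[has_pos].sum(axis=1) / n_pos[has_pos]).mean()

# Toy embeddings: two loose clusters standing in for two outcome classes.
emb = np.array([[1.0, 0.0], [0.9, 0.1], [0.0, 1.0], [0.1, 0.9], [1.0, 0.05]])
labels = np.array([0, 0, 1, 1, 0])
pos, valid = knn_positive_indices(emb, labels, k=1)
print(pos.ravel(), knn_supcon_loss(emb, labels, k=1))
```

Restricting positives to nearby same-label samples, rather than using every same-label sample, is one way to avoid pulling together patients who share an outcome but belong to different sub-phenotypes, which is the intra-class variance problem the abstract describes.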

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9365529
DOI: http://dx.doi.org/10.1145/3535508.3545541

Publication Analysis

Top Keywords

ehr data: 8
machine learning: 8
supervised pretraining: 4
pretraining contrastive: 4
contrastive categorical: 4
categorical positive: 4
positive samplings: 4
samplings improve: 4
improve covid-19: 4
covid-19 mortality: 4

Similar Publications

Evaluation of COVID-19 Diagnosis Codes for Identification of SARS-CoV-2 Infections in a Nursing Home Cohort, 2022-2023.

J Am Med Dir Assoc

January 2025

Division of Healthcare Quality Promotion, National Center for Emerging and Zoonotic Infectious Diseases, Centers for Disease Control and Prevention, Atlanta, GA, USA.

Objectives: This study aimed to evaluate the utility of electronic health record (EHR) diagnosis codes for monitoring SARS-CoV-2 infections among nursing home residents.

Design: A retrospective cohort study design was used to analyze data collected from nursing homes operating under the tradename Signature Healthcare between January 2022 and June 2023.

Setting And Participants: Data from 31,136 nursing home residents across 76 facilities in Kentucky, Tennessee, Indiana, Ohio, North Carolina, Georgia, Alabama, and Virginia were included.

AI-based models to predict decompensation on traumatic brain injury patients.

Comput Biol Med

January 2025

INESC TEC - Institute for Systems and Computer Engineering, Technology and Science, Porto, Portugal; FCTUC - Faculty of Sciences and Technology of the University of Coimbra, Coimbra, Portugal.

Traumatic Brain Injury (TBI) is a form of brain injury caused by external forces, resulting in temporary or permanent impairment of brain function. Despite advancements in healthcare, TBI mortality rates can reach 30%-40% in severe cases. This study aims to assist clinical decision-making and enhance patient care for TBI-related complications by employing Artificial Intelligence (AI) methods and data-driven approaches to predict decompensation.

Background: Postoperative delirium (POD) is a common complication after major surgery and is associated with poor outcomes in older adults. Early identification of patients at high risk of POD can enable targeted prevention efforts. However, existing POD prediction models require inpatient data collected during the hospital stay, which delays predictions and limits scalability.

Amid an aging global population, heart failure has become a leading cause of hospitalization among older people. Its high prevalence and mortality rates underscore the importance of accurate mortality prediction for swift disease progression assessment and better patient outcomes. The evolution of artificial intelligence (AI) presents new avenues for predicting heart failure mortality.

Background: Natural language processing (NLP) and machine learning (ML) techniques may help harness unstructured free-text electronic health record (EHR) data to detect adverse drug events (ADEs) and thus improve pharmacovigilance. However, evidence of their real-world effectiveness remains unclear.

Objective: To summarise the evidence on the effectiveness of NLP/ML in detecting ADEs from unstructured EHR data and ultimately improve pharmacovigilance in comparison to other data sources.
