Federated learning of medical concepts embedding using BEHRT.

JAMIA Open

Department of Software and Information Systems Engineering, Ben-Gurion University of the Negev, Be'er Sheva, Israel.

Published: December 2024

AI Article Synopsis

  • The study addresses the challenge of sharing sensitive electronic health record (EHR) data across medical centers by using federated learning (FL) to train a shared predictive model for next-visit prediction.
  • The proposed FL method uses the BEHRT model to train medical concept embeddings without centralizing data, so that the embeddings can support downstream prediction tasks.
  • Results show that FL models perform competitively with centralized models and outperform local models, and that pretraining a masked language model improves prediction performance.

Article Abstract

Objectives: Electronic health record (EHR) data is often considered sensitive medical information. As a result, EHR data from different medical centers often cannot be shared, which makes it difficult to build prediction models on multicenter EHR data, even though such data is essential for the models' robustness and generalizability. Federated learning (FL) is an algorithmic approach that allows learning a shared model from data held in multiple locations without the need to store all data in a single central place. Our study aims to evaluate an FL approach using the BEHRT model for predictive tasks on EHR data, focusing on next-visit prediction.
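To make the idea of learning a shared model without pooling data concrete, the following is a minimal sketch of one server-side aggregation step. It assumes a FedAvg-style rule (a size-weighted average of client parameters); the abstract does not state which aggregation rule the authors use, and the function name `federated_average` and the toy parameter layout are hypothetical.

```python
from typing import Dict, List

# Hypothetical layout: parameter name -> flat list of floats.
Weights = Dict[str, List[float]]


def federated_average(client_weights: List[Weights],
                      client_sizes: List[int]) -> Weights:
    """Size-weighted average of client parameters (FedAvg-style, assumed)."""
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one size per client and at least one client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of samples must be positive")

    averaged: Weights = {}
    for name in client_weights[0]:
        length = len(client_weights[0][name])
        # Each client contributes in proportion to its local sample count.
        averaged[name] = [
            sum(w[name][i] * n for w, n in zip(client_weights, client_sizes)) / total
            for i in range(length)
        ]
    return averaged


# Two hypothetical medical centers with 1 and 3 patients respectively.
center_a = {"embedding": [0.0, 4.0]}
center_b = {"embedding": [4.0, 0.0]}
global_weights = federated_average([center_a, center_b], [1, 3])
```

Only parameters leave each center in this scheme; the patient records themselves stay local.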

Materials And Methods: We propose an FL approach for learning medical concept embeddings. The resulting pretrained model can then be fine-tuned for specific downstream tasks. Our approach is based on an embedding model such as BEHRT, a deep neural sequence transduction model for EHR. We train both the masked language modeling (MLM) model and the next-visit downstream model using FL.
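The MLM pretraining step can be illustrated with a small masking routine over a patient's sequence of medical codes. This is a sketch under stated assumptions, not the authors' implementation: the 15% masking rate follows the common BERT convention and is not given in the abstract, and the `[MASK]` token, the function name `mask_codes`, and the example codes are hypothetical.

```python
import random
from typing import List, Optional, Tuple

MASK = "[MASK]"  # hypothetical mask token


def mask_codes(codes: List[str],
               mask_prob: float = 0.15,  # assumed rate, not from the abstract
               rng: Optional[random.Random] = None
               ) -> Tuple[List[str], List[Optional[str]]]:
    """Replace a random subset of codes with MASK.

    Returns the masked input sequence and a label sequence holding the
    original code at masked positions and None elsewhere.
    """
    rng = rng or random.Random()
    inputs: List[str] = []
    labels: List[Optional[str]] = []
    for code in codes:
        if rng.random() < mask_prob:
            inputs.append(MASK)
            labels.append(code)   # the model must recover this code
        else:
            inputs.append(code)
            labels.append(None)   # position is not scored
    return inputs, labels


# Illustrative diagnosis codes for one patient history.
history = ["I10", "E11", "N18", "I50"]
masked_inputs, targets = mask_codes(history, rng=random.Random(0))
```

During pretraining the model is trained to predict the original code at each masked position; the learned embeddings are then reused when fine-tuning for next-visit prediction.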

Results: We demonstrate our approach on the MIMIC-IV dataset. We compare the performance of a model trained with FL to one trained on centralized data, observing a difference in average precision ranging from 0% to 3% (absolute), depending on the length of the patients' visit history. Moreover, our approach improves average precision by 4%-10% (absolute) compared to local models. In addition, we show the importance of using a pretrained MLM for the next-visit diagnosis prediction task.
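For readers unfamiliar with the metric, average precision for a single label can be computed as the mean of the precision values at the rank of each true positive. The sketch below is illustrative only: the abstract does not say how scores are averaged across diagnoses or how ties are handled, and the function name `average_precision` is hypothetical.

```python
from typing import List


def average_precision(y_true: List[int], scores: List[float]) -> float:
    """Mean of precision@k taken at the rank k of each positive example."""
    if len(y_true) != len(scores):
        raise ValueError("y_true and scores must have the same length")
    n_pos = sum(y_true)
    if n_pos == 0:
        raise ValueError("average precision is undefined without positives")

    # Rank examples from highest to lowest predicted score.
    ranked = sorted(zip(scores, y_true), key=lambda pair: pair[0], reverse=True)

    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label:
            hits += 1
            precision_sum += hits / rank  # precision at this positive's rank
    return precision_sum / n_pos


# Positives are ranked 1st and 3rd: (1/1 + 2/3) / 2
ap = average_precision([1, 0, 1, 0], [0.9, 0.8, 0.7, 0.1])
```

An absolute difference of 3% then means, for example, an average precision of 0.40 versus 0.43, rather than a 3% relative change.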

Discussion And Conclusion: We find that our FL approach comes very close to the performance of a centralized model and outperforms local models in terms of average precision. We also show that a pretrained MLM improves average precision in the next-visit diagnosis prediction task compared to a model without MLM pretraining.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11498200
DOI: http://dx.doi.org/10.1093/jamiaopen/ooae110

