An in-depth evaluation of federated learning on biomedical natural language processing for information extraction.

Le Peng, Gaoxiang Luo, Sicheng Zhou, Jiandong Chen, Ziyue Xu, Ju Sun, Rui Zhang

NPJ Digit Med

Division of Computational Health Sciences, Department of Surgery, University of Minnesota, Minneapolis, MN, USA.

Published: May 2024

Language models (LMs) such as BERT and GPT have revolutionized natural language processing (NLP). However, the medical field faces challenges in training LMs due to limited data access and privacy constraints imposed by regulations like the Health Insurance Portability and Accountability Act (HIPAA) and the General Data Protection Regulation (GDPR). Federated learning (FL) offers a decentralized solution that enables collaborative learning while ensuring data privacy. In this study, we evaluated FL on 2 biomedical NLP tasks encompassing 8 corpora using 6 LMs. Our results show that: (1) FL models consistently outperformed models trained on individual clients' data and sometimes performed comparably with models trained on pooled data; (2) with a fixed total amount of data, FL models trained with more clients produced inferior performance, but pre-trained transformer-based models exhibited great resilience; and (3) FL models significantly outperformed pre-trained large language models (LLMs) with few-shot prompting.
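
The abstract does not say which FL algorithm was used. As a rough illustration of the general idea only, the sketch below shows the aggregation step of federated averaging (FedAvg), a widely used FL algorithm assumed here for illustration: each client trains locally and shares only model parameters, and a server averages them weighted by local dataset size. The function name `fedavg`, the toy parameter vectors, and the client sizes are all hypothetical and not taken from the paper.

```python
from typing import List, Sequence


def fedavg(client_weights: Sequence[Sequence[float]],
           client_sizes: Sequence[int]) -> List[float]:
    """Average client parameter vectors, weighted by local dataset size.

    Illustrative only: real models have many tensors, and this sketch
    treats each client's parameters as one flat list of floats.
    """
    if not client_weights or len(client_weights) != len(client_sizes):
        raise ValueError("need one dataset size per client")
    total = sum(client_sizes)
    if total <= 0:
        raise ValueError("total number of examples must be positive")

    dim = len(client_weights[0])
    aggregated = [0.0] * dim
    for weights, size in zip(client_weights, client_sizes):
        if len(weights) != dim:
            raise ValueError("all clients must share the same model shape")
        for i, value in enumerate(weights):
            # Each client contributes in proportion to its share of the data.
            aggregated[i] += value * size / total
    return aggregated


# Hypothetical round: three clients (e.g. hospitals) with toy 2-parameter models.
# Raw text never leaves a client; only these parameter vectors are shared.
clients = [[1.0, 2.0], [3.0, 4.0], [5.0, 6.0]]
sizes = [100, 100, 200]
global_weights = fedavg(clients, sizes)
print(global_weights)
```

In a full training loop the server would send `global_weights` back to the clients for another round of local training; that loop, and any privacy mechanisms layered on top, are omitted here.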

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11096157
DOI: http://dx.doi.org/10.1038/s41746-024-01126-4
