Building machine learning models without sharing patient data: A simulation-based analysis of distributed learning by ensembling.

J Biomed Inform

Department of Radiology, University of Calgary, Calgary, Alberta, Canada; Hotchkiss Brain Institute, University of Calgary, Calgary, Alberta, Canada; Department of Clinical Neurosciences, University of Calgary, Calgary, Alberta, Canada; Alberta Children's Hospital Research Institute, University of Calgary, Calgary, Alberta, Canada.

Published: June 2020

The development of machine learning solutions in medicine is often hindered by difficulties associated with sharing patient data. Distributed learning aims to train machine learning models locally without requiring data sharing. However, the utility of distributed learning for rare diseases, with only a few training examples at each contributing local center, has not been investigated. The aim of this work was to simulate distributed learning by ensembling artificial neural networks (ANN), support vector machines (SVM), and random forests (RF), and to evaluate the resulting models on four medical datasets. Distributed learning by ensembling locally trained agents improved performance compared to models trained using the data from a single institution, even when only very few training examples were available per local center. Performance of distributed learning improved as more locally trained models were added to the ensemble. Local class imbalance reduced distributed SVM performance but did not impact distributed RF and ANN classification. Our results suggest that distributed learning by ensembling can be used to train machine learning models without sharing patient data and is suitable for use with small datasets.
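The general idea of ensembling locally trained models can be illustrated with a minimal sketch. It is not the authors' implementation: the nearest-centroid classifier stands in for the ANN, SVM, and RF models named in the abstract, the data are synthetic, the ensemble rule (majority vote) is an assumption, and all function names and parameters (nine centers, three examples per class) are hypothetical choices made for illustration.

```python
import random
from collections import Counter
from statistics import mean


def make_center_data(n_per_class, rng):
    """Synthetic two-class, two-feature data for one hypothetical center."""
    data = []
    for label, mu in ((0, -1.0), (1, 1.0)):
        for _ in range(n_per_class):
            data.append(([rng.gauss(mu, 1.0), rng.gauss(mu, 1.0)], label))
    return data


def train_local_model(data):
    """Fit a nearest-centroid classifier using one center's data only.

    Stand-in for a locally trained ANN, SVM, or RF (an assumption).
    """
    centroids = {}
    for label in {y for _, y in data}:
        rows = [x for x, y in data if y == label]
        centroids[label] = [mean(col) for col in zip(*rows)]
    return centroids


def predict_local(centroids, x):
    """Return the label whose centroid is closest to x."""
    def sq_dist(c):
        return sum((a - b) ** 2 for a, b in zip(x, c))
    return min(centroids, key=lambda label: sq_dist(centroids[label]))


def predict_ensemble(models, x):
    """Majority vote over the locally trained models."""
    votes = Counter(predict_local(m, x) for m in models)
    return votes.most_common(1)[0][0]


def accuracy(predict, data):
    """Fraction of examples for which predict(x) equals the true label."""
    return mean(1.0 if predict(x) == y else 0.0 for x, y in data)


rng = random.Random(0)
# Each center keeps its own data; only the fitted models are pooled.
centers = [make_center_data(3, rng) for _ in range(9)]
models = [train_local_model(d) for d in centers]
test_set = make_center_data(200, rng)

single_acc = accuracy(lambda x: predict_local(models[0], x), test_set)
ensemble_acc = accuracy(lambda x: predict_ensemble(models, x), test_set)
print(f"single center: {single_acc:.3f}  ensemble: {ensemble_acc:.3f}")
```

Only the fitted centroids leave each simulated center, which mirrors the constraint that patient-level data are never shared. An odd number of centers is used so that a two-class majority vote cannot tie.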

Source
http://dx.doi.org/10.1016/j.jbi.2020.103424
