Federated learning for computational pathology on gigapixel whole slide images.

Med Image Anal

Department of Pathology, Brigham and Women's Hospital, Harvard Medical School, Boston, MA, United States; Cancer Program, Broad Institute of Harvard and MIT, Cambridge, MA, United States; Data Science Department, Dana-Farber/Harvard Cancer Center, Boston, MA, United States; Harvard Data Science Initiative, Harvard University, Cambridge, MA, United States.

Published: February 2022

AI Article Synopsis

  • Deep Learning algorithms in computational pathology excel at various tasks like identifying morphological features and predicting molecular changes from histology samples, but require large, high-quality annotated datasets for effective model training.
  • Multi-centric collaboration can supply the diverse data these models need, but privacy concerns and the complexity of sharing data across institutions remain significant barriers.
  • This paper introduces a privacy-preserving federated learning approach that allows for model development using distributed histology data while maintaining patient privacy, and offers a framework for survival prediction and patient stratification from whole slide images.

Article Abstract

Deep Learning-based computational pathology algorithms have demonstrated a profound ability to excel in a wide array of tasks that range from characterization of well-known morphological phenotypes to predicting non-human-identifiable features from histology such as molecular alterations. However, the development of robust, adaptable and accurate deep learning-based models often relies on the collection and time-costly curation of large, high-quality annotated training data that should ideally come from diverse sources and patient populations to cater for the heterogeneity that exists in such datasets. Multi-centric and collaborative integration of medical data across multiple institutions can naturally help overcome this challenge and boost model performance, but it is limited by privacy concerns, among other difficulties that may arise in the complex data sharing process as models scale towards using hundreds of thousands of gigapixel whole slide images. In this paper, we introduce privacy-preserving federated learning for gigapixel whole slide images in computational pathology using weakly-supervised attention multiple instance learning and differential privacy. We evaluated our approach on two different diagnostic problems using thousands of histology whole slide images with only slide-level labels. Additionally, we present a weakly-supervised learning framework for survival prediction and patient stratification from whole slide images and demonstrate its effectiveness in a federated setting. Our results show that using federated learning, we can effectively develop accurate weakly-supervised deep learning models from distributed data silos without direct data sharing and its associated complexities, while also preserving differential privacy using randomized noise generation. We also make available an easy-to-use software package for federated learning in computational pathology: http://github.com/mahmoodlab/HistoFL.
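The aggregation step described in the abstract can be illustrated with a minimal sketch. It is not the HistoFL implementation: the function names, the layer names, and the noise scale below are all hypothetical, and the abstract states only that randomized noise is used, not which mechanism. The sketch assumes each site perturbs its locally trained weights with Gaussian noise before sending them to a central server, which then averages them in proportion to the number of slides at each site. A formal differential privacy guarantee would also require calibrating the noise to a bounded sensitivity, which is omitted here.

```python
import numpy as np


def add_gaussian_noise(weights, noise_scale, rng):
    """Return a noisy copy of one site's weights.

    Hypothetical privacy step: noise_scale is an illustrative value,
    not a calibrated differential-privacy parameter.
    """
    return {
        name: w + rng.normal(0.0, noise_scale, size=w.shape)
        for name, w in weights.items()
    }


def federated_average(site_weights, site_sizes):
    """Average per-site weights, weighting each site by its slide count."""
    total = float(sum(site_sizes))
    names = site_weights[0].keys()
    return {
        name: sum(w[name] * (n / total) for w, n in zip(site_weights, site_sizes))
        for name in names
    }


# Toy stand-ins for locally trained model parameters at two institutions.
rng = np.random.default_rng(0)
site_a = {"attention": np.ones((2, 2)), "classifier": np.zeros(2)}
site_b = {"attention": 3 * np.ones((2, 2)), "classifier": np.ones(2)}
sizes = [100, 300]  # assumed number of slides held by each site

# Each site adds noise locally; only the noisy weights leave the site.
noisy = [add_gaussian_noise(w, noise_scale=0.01, rng=rng) for w in (site_a, site_b)]
global_weights = federated_average(noisy, sizes)
```

Only model weights cross institutional boundaries in this sketch; the slides themselves stay at each site, which is the property the abstract emphasizes.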

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9340569
DOI: http://dx.doi.org/10.1016/j.media.2021.102298

Publication Analysis

Top Keywords

  • slide images: 20
  • federated learning: 16
  • computational pathology: 16
  • gigapixel slide: 12
  • learning computational: 8
  • deep learning-based: 8
  • data sharing: 8
  • differential privacy: 8
  • learning: 6
  • federated: 5
