Histopathology language-image representation learning for fine-grained digital pathology cross-modal retrieval.

Dingyi Hu, Zhiguo Jiang, Jun Shi, Fengying Xie, Kun Wu, Kunming Tang, Ming Cao, Jianguo Huai, Yushan Zheng

Med Image Anal

Beijing Advanced Innovation Center for Biomedical Engineering, School of Engineering Medicine, Beihang University, Beijing, 100191, China.

Published: July 2024

The analysis of large-scale digital whole slide image (WSI) datasets has gained significant attention in computer-aided cancer diagnosis. Content-based histopathological image retrieval (CBHIR) is a technique that searches a large database for samples that match a query in both detail and semantics, offering relevant diagnostic information to pathologists. However, current methods are limited by the gigapixel scale and variable size of WSIs and by their dependence on manual annotations. In this work, we propose a novel histopathology language-image representation learning framework for fine-grained digital pathology cross-modal retrieval, which uses paired diagnosis reports to learn fine-grained semantics from the WSI. An anchor-based WSI encoder is built to extract hierarchical region features, and a prompt-based text encoder is introduced to learn fine-grained semantics from the diagnosis reports. The framework is trained with a multivariate cross-modal loss function to learn semantic information from the diagnosis report at both the instance level and the region level. After training, it can perform four types of retrieval tasks on the multi-modal database to support diagnostic requirements. We evaluated the proposed method on an in-house dataset and a public dataset. The experiments demonstrate its effectiveness and its advantages over existing histopathology retrieval methods. The code is available at https://github.com/hudingyi/FGCR.
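To make the retrieval step concrete, the following is a minimal sketch of ranking database entries by cosine similarity in a shared embedding space, which is a common way to perform cross-modal retrieval once image and text encoders have been trained. It is not the authors' implementation: the function names `cosine` and `retrieve`, the three-dimensional toy embeddings, and the choice of cosine similarity are all assumptions made for illustration, and the encoders that would produce real embeddings are out of scope here.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0.0 or norm_b == 0.0:
        return 0.0
    return dot / (norm_a * norm_b)

def retrieve(query, database, top_k=3):
    """Return the top_k (index, similarity) pairs, most similar first.

    Ties are broken by the lower database index so the ranking is deterministic.
    """
    scored = [(cosine(query, emb), i) for i, emb in enumerate(database)]
    scored.sort(key=lambda t: (-t[0], t[1]))
    return [(i, s) for s, i in scored[:top_k]]

# Hypothetical embeddings: in practice these would come from trained
# WSI and report encoders that map both modalities into one space.
wsi_embeddings = [
    [1.0, 0.0, 0.0],
    [0.0, 1.0, 0.0],
    [0.7, 0.7, 0.0],
]
report_embeddings = [
    [0.9, 0.1, 0.0],
    [0.1, 0.9, 0.0],
    [0.6, 0.8, 0.0],
]

# Text-to-image: use a report embedding to search the WSI database.
text_to_image = retrieve(report_embeddings[0], wsi_embeddings, top_k=2)

# Image-to-text: use a WSI embedding to search the report database.
image_to_text = retrieve(wsi_embeddings[1], report_embeddings, top_k=2)
```

Because queries and database entries live in the same space, the same `retrieve` call also covers image-to-image and text-to-text search by passing a query and a database from the same modality.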

DOI: http://dx.doi.org/10.1016/j.media.2024.103163

