Building RadiologyNET: an unsupervised approach to annotating a large-scale multimodal medical database.

Mateja Napravnik, Franko Hržić, Sebastian Tschauner, Ivan Štajduhar

BioData Min

Faculty of Engineering, University of Rijeka, Vukovarska 58, Rijeka, 51000, Croatia.

Published: July 2024

Background: The use of machine learning in medical diagnosis and treatment has grown significantly in recent years with the development of computer-aided diagnosis systems, often based on annotated medical radiology images. However, the lack of large annotated image datasets remains a major obstacle, as the annotation process is time-consuming and costly. This study aims to overcome this challenge by proposing an automated method for annotating a large database of medical radiology images based on their semantic similarity.

Results: An automated, unsupervised approach is used to create a large annotated dataset of medical radiology images originating from the Clinical Hospital Centre Rijeka, Croatia. The pipeline is built by data-mining three different types of medical data: images, DICOM metadata and narrative diagnoses. The optimal feature extractors are integrated into a multimodal representation, which is then clustered, yielding an automated pipeline that labels a precursor dataset of 1,337,926 medical images by grouping them into 50 clusters of visually similar images. The quality of the clusters is assessed by examining their homogeneity and mutual information, taking into account the anatomical region and modality representation.
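
As a rough illustration of the cluster-quality assessment described above, the following Python sketch computes a homogeneity score and a normalised mutual information score between cluster assignments and a reference label such as imaging modality. It is a minimal sketch under stated assumptions, not the authors' implementation: the abstract does not give the exact metric definitions, so the function names, the use of natural logarithms, the arithmetic-mean normalisation and the toy labels are all hypothetical choices made here.

```python
from collections import Counter
from math import log


def entropy(labels):
    """Shannon entropy (in nats) of a sequence of labels."""
    n = len(labels)
    return -sum((c / n) * log(c / n) for c in Counter(labels).values())


def mutual_information(clusters, classes):
    """Mutual information (in nats) between cluster ids and reference classes."""
    n = len(clusters)
    joint = Counter(zip(clusters, classes))
    k_counts = Counter(clusters)
    c_counts = Counter(classes)
    return sum(
        (n_kc / n) * log(n * n_kc / (k_counts[k] * c_counts[c]))
        for (k, c), n_kc in joint.items()
    )


def homogeneity(clusters, classes):
    """1 - H(class | cluster) / H(class), which equals I(class; cluster) / H(class).

    A value of 1.0 means every cluster contains a single reference class.
    """
    h_c = entropy(classes)
    if h_c == 0.0:
        return 1.0  # only one class present: trivially homogeneous
    return mutual_information(clusters, classes) / h_c


def normalized_mutual_information(clusters, classes):
    """Mutual information divided by the mean of the two entropies (an assumed normalisation)."""
    h_k, h_c = entropy(clusters), entropy(classes)
    if h_k == 0.0 and h_c == 0.0:
        return 1.0
    return 2.0 * mutual_information(clusters, classes) / (h_k + h_c)


# Toy example (hypothetical data): six images, three clusters, modality as reference label.
cluster_ids = [0, 0, 1, 1, 2, 2]
modalities = ["CT", "CT", "MR", "MR", "XR", "CT"]
print("homogeneity:", homogeneity(cluster_ids, modalities))
print("NMI:", normalized_mutual_information(cluster_ids, modalities))
```

The same functions could be called a second time with anatomical region as the reference label. Homogeneity rewards clusters that each hold a single class even when a class is split across many clusters, whereas the normalised mutual information also penalises such over-splitting, which is one reason to report both.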

Conclusions: The results indicate that fusing the embeddings of all three data sources provides the best results for the task of unsupervised clustering of large-scale medical data and leads to the most concise clusters. Hence, this work marks the initial step towards building a much larger and more fine-grained annotated dataset of medical radiology images.

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11245804
DOI: http://dx.doi.org/10.1186/s13040-024-00373-1

