Annu Int Conf IEEE Eng Med Biol Soc
July 2024
Vision-language models have emerged as a powerful tool for previously challenging multi-modal classification problems in the medical domain. This development has led to the exploration of automated image description generation for multi-modal clinical scans, particularly for radiology report generation. Existing research has focused on clinical descriptions for specific modalities or body regions, leaving a gap for a model that provides entire-body, multi-modal descriptions.
Ultrasound scanners image the anatomy modulated by their characteristic texture. For certain anatomical regions, such as the liver, the scanner's characteristic texture itself becomes the anatomical marker. Deep Learning (DL) models trained on one scanner type not only model the anatomical content but also learn that scanner's characteristic texture.