Purpose: This study aimed to evaluate the performance of large language models (LLMs) and multimodal LLMs in interpreting Breast Imaging Reporting and Data System (BI-RADS) categories and providing clinical management recommendations in breast radiology, using both text-based and visual questions.
Methods: This cross-sectional observational study involved two steps. In the first step, we compared ten LLMs (namely ChatGPT 4o, ChatGPT 4, ChatGPT 3.
The advent of large language models (LLMs) marks a significant advance in natural language processing, with clear potential in radiology, particularly for improving the accuracy and efficiency of coronary artery disease (CAD) diagnosis. While previous studies have explored the capabilities of specific LLMs such as ChatGPT in cardiac imaging, a comprehensive evaluation comparing multiple LLMs in the context of CAD-RADS 2.0 has been lacking.
This study evaluates the integration of LLMs in interpreting Lung-RADS categories for lung cancer screening, highlighting their potential to enhance radiological practice. Our findings reveal that Claude 3 Opus and Perplexity achieved a 96% accuracy rate, outperforming the other models.
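Per-model accuracy figures of this kind are obtained by scoring each model's predicted category against the reference category for every case. The sketch below is illustrative only and is not the authors' code; the case identifiers, ground-truth labels, and model answers are hypothetical placeholders, and the script simply counts exact matches between predicted and reference RADS categories.

    # Minimal sketch: per-model accuracy on RADS categorization.
    # All data below are hypothetical placeholders, not study data.
    from collections import defaultdict

    # Reference (ground-truth) category for each case
    ground_truth = {
        "case_01": "Lung-RADS 2",
        "case_02": "Lung-RADS 4A",
        "case_03": "Lung-RADS 3",
    }

    # Model outputs keyed by (model_name, case_id)
    model_answers = {
        ("Claude 3 Opus", "case_01"): "Lung-RADS 2",
        ("Claude 3 Opus", "case_02"): "Lung-RADS 4A",
        ("Claude 3 Opus", "case_03"): "Lung-RADS 3",
        ("Perplexity", "case_01"): "Lung-RADS 2",
        ("Perplexity", "case_02"): "Lung-RADS 4B",
        ("Perplexity", "case_03"): "Lung-RADS 3",
    }

    def accuracy_by_model(answers, truth):
        """Fraction of cases each model categorized correctly (exact match)."""
        correct = defaultdict(int)
        total = defaultdict(int)
        for (model, case_id), predicted in answers.items():
            total[model] += 1
            if predicted.strip().lower() == truth[case_id].strip().lower():
                correct[model] += 1
        return {model: correct[model] / total[model] for model in total}

    for model, acc in accuracy_by_model(model_answers, ground_truth).items():
        print(f"{model}: {acc:.0%}")

In practice, the comparison (exact string match here) would be adapted to whatever answer format the models return, for example by extracting the category token from a free-text response before scoring.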