Background: It is unclear whether artificial intelligence (AI) explanations help or hurt radiologists and other physicians in AI-assisted radiologic diagnostic decision-making. Purpose: To test whether the type of AI explanation and the correctness and confidence level of AI advice impact physician diagnostic performance, perception of AI advice usefulness, and trust in AI advice for chest radiograph diagnosis. Materials and Methods: A multicenter, prospective, randomized study was conducted from April 2022 to September 2022.
Background: Natural language processing (NLP) is commonly used to annotate radiology datasets for training deep learning (DL) models. However, the accuracy and potential biases of these NLP methods have not been thoroughly investigated, particularly across different demographic groups. Purpose: To evaluate the accuracy and demographic bias of four NLP radiology report labeling tools on two chest radiograph datasets.
Deep learning (DL) tools developed on adult datasets may not generalize well to pediatric patients, posing potential safety risks. We evaluated the performance of TotalSegmentator, a state-of-the-art adult-trained CT organ segmentation model, on a subset of organs in a pediatric CT dataset and explored optimization strategies to improve pediatric segmentation performance. TotalSegmentator was retrospectively evaluated on abdominal CT scans from an external adult dataset (n = 300) and an external pediatric dataset (n = 359).
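As a minimal sketch of how such an organ-level evaluation can be scored, the snippet below computes a Dice coefficient between a predicted mask and a reference mask. The file paths, the liver example, and the reference-mask layout are hypothetical placeholders, not the study's actual pipeline.

```python
# Illustrative only: score one organ segmentation against a reference mask
# using the Dice similarity coefficient. Paths are hypothetical.
import nibabel as nib
import numpy as np

def dice(pred: np.ndarray, ref: np.ndarray) -> float:
    """Dice similarity coefficient for two binary masks."""
    pred, ref = pred.astype(bool), ref.astype(bool)
    denom = pred.sum() + ref.sum()
    return 1.0 if denom == 0 else 2.0 * np.logical_and(pred, ref).sum() / denom

# TotalSegmentator writes one NIfTI mask per structure (e.g., liver.nii.gz)
# when run on a CT volume; the directory names here are assumptions.
pred_mask = nib.load("segs/liver.nii.gz").get_fdata() > 0
ref_mask = nib.load("reference/liver.nii.gz").get_fdata() > 0
print(f"Liver Dice: {dice(pred_mask, ref_mask):.3f}")
```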
Segmentation of infarcts is clinically important in ischemic stroke management and prognostication. It is unclear what role the combination of DWI, ADC, and FLAIR MRI sequences plays in deep learning-based infarct segmentation. Recent advances in model self-configuration have promised greater performance and generalizability through automated optimization.
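To make the sequence-combination idea concrete, the sketch below stacks DWI, ADC, and FLAIR volumes as input channels for a segmentation model. It assumes the three volumes are already co-registered and resampled to the same grid; the file names and the z-score normalization are illustrative choices, not the study's method.

```python
# Minimal sketch: combine DWI, ADC, and FLAIR as channels of one input array.
import nibabel as nib
import numpy as np

def load_volume(path: str) -> np.ndarray:
    vol = nib.load(path).get_fdata().astype(np.float32)
    # Per-volume z-score normalization, a common preprocessing choice.
    return (vol - vol.mean()) / (vol.std() + 1e-8)

dwi = load_volume("case001_dwi.nii.gz")      # hypothetical file names
adc = load_volume("case001_adc.nii.gz")
flair = load_volume("case001_flair.nii.gz")

# Shape (3, X, Y, Z): the channel-first layout expected by most
# multichannel 3D segmentation pipelines (e.g., nnU-Net-style frameworks).
multichannel = np.stack([dwi, adc, flair], axis=0)
print(multichannel.shape)
```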
Purpose: To evaluate whether a commercial AI tool for intracranial hemorrhage (ICH) detection on head CT exhibited sociodemographic biases.
Methods: Our retrospective study reviewed 9736 consecutive adult non-contrast head CT scans performed between November 2021 and February 2022 in a single healthcare system. Each CT scan was evaluated by a commercial ICH AI tool and by a board-certified neuroradiologist; ground truth was defined as the final radiologist determination of ICH presence or absence.
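A hedged illustration of how such a bias analysis can be tabulated: the snippet below computes the tool's sensitivity and specificity per patient subgroup against the neuroradiologist-defined ground truth. The CSV file and column names are hypothetical, not the study's actual data schema.

```python
# Illustrative subgroup analysis; column names are assumptions.
import pandas as pd

df = pd.read_csv("ich_results.csv")  # columns: ai_positive, ich_present, race, sex, age_group

def subgroup_metrics(g: pd.DataFrame) -> pd.Series:
    tp = ((g.ai_positive == 1) & (g.ich_present == 1)).sum()
    fn = ((g.ai_positive == 0) & (g.ich_present == 1)).sum()
    tn = ((g.ai_positive == 0) & (g.ich_present == 0)).sum()
    fp = ((g.ai_positive == 1) & (g.ich_present == 0)).sum()
    return pd.Series({
        "n": len(g),
        "sensitivity": tp / (tp + fn) if tp + fn else float("nan"),
        "specificity": tn / (tn + fp) if tn + fp else float("nan"),
    })

# Compare performance across one sociodemographic grouping at a time.
print(df.groupby("race").apply(subgroup_metrics))
```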
Background: Radiographic diagnosis of necrotizing enterocolitis (NEC) is challenging. Deep learning models may improve accuracy by recognizing subtle imaging patterns. We hypothesized that a deep learning model would perform with accuracy comparable to that of senior surgical residents.
As the adoption of artificial intelligence (AI) systems in radiology grows, the accompanying demand for bandwidth and computational resources can drive up infrastructure costs for healthcare providers and AI vendors. To address this, we developed ISLE, an intelligent streaming framework targeting inefficiencies in current imaging infrastructures. Our framework draws inspiration from video-on-demand platforms to intelligently stream medical images to AI vendors at an optimal resolution for inference from a single high-resolution copy using progressive encoding.
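The snippet below is not the ISLE implementation; it is a minimal sketch of the underlying idea under assumed parameters: keep one high-resolution copy and derive a smaller, inference-ready version rather than shipping full-resolution pixels to every AI consumer. The source image path, target resolution, and JPEG settings are illustrative.

```python
# Compare the encoded size of the full-resolution copy with a reduced copy
# streamed for inference. Pillow's progressive JPEG stands in for the
# progressive encoding concept; it is not the framework's actual codec.
import io
from PIL import Image

def encoded_size(img: Image.Image, **save_kwargs) -> int:
    buf = io.BytesIO()
    img.save(buf, **save_kwargs)
    return buf.getbuffer().nbytes

full = Image.open("chest_radiograph.png").convert("L")  # hypothetical source image
reduced = full.resize((512, 512))                       # resolution assumed adequate for the model

full_bytes = encoded_size(full, format="JPEG", quality=95, progressive=True)
reduced_bytes = encoded_size(reduced, format="JPEG", quality=95, progressive=True)
print(f"full: {full_bytes} B, streamed for inference: {reduced_bytes} B "
      f"({100 * reduced_bytes / full_bytes:.1f}% of full size)")
```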
AJR Am J Roentgenol, September 2024
Large language models (LLMs) hold immense potential to revolutionize radiology. However, their integration into practice requires careful consideration. Artificial intelligence (AI) chatbots and general-purpose LLMs have potential pitfalls related to privacy, transparency, and accuracy, limiting their current clinical readiness.
As applications of AI in medicine continue to expand, there is an increasing focus on integration into clinical practice. An underappreciated aspect of this clinical translation is where the AI fits into the clinical workflow and, in turn, which outputs the AI must generate to support clinician interaction at that point in the workflow. For instance, in the canonical use case of AI for medical image interpretation, the AI could prioritize cases before clinician review or even interpret the images autonomously without clinician review.
Purpose: To evaluate the robustness of an award-winning bone age deep learning (DL) model to extensive variations in image appearance. Materials and Methods: In December 2021, the DL bone age model that won the 2017 RSNA Pediatric Bone Age Challenge was retrospectively evaluated using the RSNA validation set (1425 pediatric hand radiographs; internal test set in this study) and the Digital Hand Atlas (DHA) (1202 pediatric hand radiographs; external test set). Each test image underwent seven types of transformations (rotations, flips, brightness, contrast, inversion, laterality marker, and resolution) to represent a range of image appearances, many of which simulate real-world variations.
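The sketch below illustrates the kinds of appearance perturbations described above applied to a hand radiograph with Pillow; the parameter values are illustrative rather than the study's exact settings, and the laterality-marker overlay is omitted.

```python
# Generate perturbed variants of a hand radiograph for robustness testing.
# Input file name and transformation parameters are illustrative.
from PIL import Image, ImageEnhance, ImageOps

img = Image.open("hand_radiograph.png").convert("L")  # hypothetical input

variants = {
    "rotated": img.rotate(15, expand=True),
    "flipped": ImageOps.mirror(img),
    "brighter": ImageEnhance.Brightness(img).enhance(1.5),
    "low_contrast": ImageEnhance.Contrast(img).enhance(0.5),
    "inverted": ImageOps.invert(img),
    "low_resolution": img.resize((img.width // 4, img.height // 4)).resize(img.size),
}

for name, variant in variants.items():
    # Each variant would then be scored by the bone age model and compared
    # with the prediction on the unaltered radiograph.
    variant.save(f"variant_{name}.png")
```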
Purpose: To evaluate the use of ChatGPT as a tool to simplify answers to common questions about breast cancer prevention and screening. Materials and Methods: In this retrospective, exploratory study, ChatGPT was asked in March and August 2023 to simplify its responses to 25 questions about breast cancer to a sixth-grade reading level. Simplified responses were evaluated for clinical appropriateness.
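One way to sanity-check a sixth-grade reading-level target is with a standard readability score, as in the hedged sketch below; the textstat package, the sample answer, and the grade threshold are assumptions and not necessarily part of the study's evaluation.

```python
# Hedged illustration: estimate the reading grade of a simplified response
# with the Flesch-Kincaid grade from the textstat package.
import textstat

simplified_answer = (
    "A mammogram is a picture of the breast taken with a small amount of x-rays. "
    "It helps doctors find breast cancer early, when it is easier to treat."
)

grade = textstat.flesch_kincaid_grade(simplified_answer)
print(f"Flesch-Kincaid grade level: {grade:.1f}")
if grade > 6:
    print("Above the sixth-grade target; consider asking the model to simplify further.")
```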
Purpose: To compare the effectiveness of weak supervision (ie, with examination-level labels only) and strong supervision (ie, with image-level labels) in training deep learning models for detection of intracranial hemorrhage (ICH) on head CT scans. Materials and Methods: In this retrospective study, an attention-based convolutional neural network was trained with either local (ie, image level) or global (ie, examination level) binary labels on the Radiological Society of North America (RSNA) 2019 Brain CT Hemorrhage Challenge dataset of 21 736 examinations (8876 [40.8%] ICH) and 752 422 images (107 784 [14.
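The contrast between the two labeling schemes can be shown with a toy example: image-level (strong) targets assign one binary label per CT image, while the examination-level (weak) target can be derived as the per-exam maximum. The values below are illustrative, not drawn from the challenge data.

```python
# Toy contrast between strong (per-image) and weak (per-exam) supervision targets.
import numpy as np

# Strong supervision: one binary label per CT image (slice) in the exam.
image_labels = np.array([0, 0, 1, 1, 0, 0], dtype=np.float32)

# Weak supervision: a single binary label for the whole examination.
exam_label = image_labels.max()  # 1.0 if any image shows hemorrhage

print("strong (per-image) targets:", image_labels)
print("weak (per-exam) target:", exam_label)
# In training, the strongly supervised model receives a loss term for every image,
# whereas the weakly supervised model's attention mechanism must learn which
# images drive the single exam-level prediction.
```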
Objective: To assess the feasibility of using large language models (LLMs), specifically ChatGPT-4, to generate concise and accurate layperson summaries of musculoskeletal radiology reports.
Methods: Sixty radiology reports, comprising 20 MR shoulder, 20 MR knee, and 20 MR lumbar spine reports, were obtained via PACS. The reports were deidentified and then submitted to ChatGPT-4 with the prompt "Produce an organized and concise layperson summary of the findings of the following radiology report."
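For readers who want to reproduce this kind of workflow programmatically, the sketch below issues the same prompt through the OpenAI Python SDK. The study's interface (likely the ChatGPT application) and the exact model identifier are not specified here, so the model name and placeholder report text are assumptions.

```python
# Sketch of an equivalent programmatic call via the OpenAI Python SDK.
from openai import OpenAI

client = OpenAI()  # expects OPENAI_API_KEY in the environment

deidentified_report = "..."  # deidentified report text (placeholder)

response = client.chat.completions.create(
    model="gpt-4",  # assumed model identifier
    messages=[{
        "role": "user",
        "content": (
            "Produce an organized and concise layperson summary of the findings "
            "of the following radiology report.\n\n" + deidentified_report
        ),
    }],
)
print(response.choices[0].message.content)
```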