Objective: Radiology reporting is an essential component of clinical diagnosis and decision-making. With the advent of advanced artificial intelligence (AI) models like GPT-4 (Generative Pre-trained Transformer 4), there is growing interest in evaluating their potential for optimizing or generating radiology reports. This study aimed to compare the quality and content of radiologist-generated and GPT-4 AI-generated radiology reports.

Methods: In this comparative study, 100 anonymized radiology reports were randomly selected and analyzed. Each report was processed by GPT-4 to generate a corresponding AI-generated report. Quantitative and qualitative analyses were used to assess similarities and differences between the two sets of reports.

Results: The AI-generated reports showed comparable quality to radiologist-generated reports in most categories. Significant differences were observed in clarity (p = 0.027), ease of understanding (p = 0.023), and structure (p = 0.050), favoring the AI-generated reports. AI-generated reports were more concise, with 34.53 fewer words and 174.22 fewer characters on average, but had greater variability in sentence length. Content similarity was high, with an average Cosine Similarity of 0.85, Sequence Matcher Similarity of 0.52, BLEU Score of 0.5008, and BERTScore F1 of 0.8775.
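
As an illustration of how such text-level comparisons can be computed, the following is a minimal sketch using only the Python standard library. It is not the study's actual pipeline: the abstract does not state how the text was tokenized or vectorized, so the bag-of-words cosine similarity, the character-level SequenceMatcher ratio, the sentence splitting rule, and the two example reports are all assumptions made for illustration. BLEU and BERTScore require external packages and are left out.

```python
import math
import re
import statistics
from collections import Counter
from difflib import SequenceMatcher


def tokenize(text):
    """Lowercase the text and split it into alphanumeric word tokens."""
    return re.findall(r"[a-z0-9]+", text.lower())


def cosine_similarity(a, b):
    """Cosine similarity between bag-of-words term-frequency vectors.

    Assumption: raw term counts; the study may have used another weighting.
    """
    va, vb = Counter(tokenize(a)), Counter(tokenize(b))
    dot = sum(va[t] * vb[t] for t in va.keys() & vb.keys())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def sequence_similarity(a, b):
    """Order-sensitive, character-level ratio from difflib.SequenceMatcher."""
    return SequenceMatcher(None, a, b).ratio()


def sentence_length_stdev(text):
    """Population standard deviation of sentence lengths, measured in words."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    lengths = [len(tokenize(s)) for s in sentences]
    return statistics.pstdev(lengths) if lengths else 0.0


def compare_reports(original, generated):
    """Collect the length and similarity measures for one report pair."""
    return {
        "word_difference": len(tokenize(original)) - len(tokenize(generated)),
        "char_difference": len(original) - len(generated),
        "cosine_similarity": cosine_similarity(original, generated),
        "sequence_similarity": sequence_similarity(original, generated),
        "stdev_original": sentence_length_stdev(original),
        "stdev_generated": sentence_length_stdev(generated),
    }


if __name__ == "__main__":
    # Hypothetical report texts, invented for this example only.
    radiologist_report = (
        "No focal consolidation is seen. The heart size is within normal "
        "limits. There is no pleural effusion."
    )
    ai_report = (
        "Lungs: No focal consolidation. Heart: Normal size. "
        "Pleura: No effusion."
    )
    for name, value in compare_reports(radiologist_report, ai_report).items():
        print(f"{name}: {value}")
```

Averaging these per-pair values over all report pairs would give summary figures of the kind reported above. The two similarity measures answer different questions: the bag-of-words cosine ignores word order, while the SequenceMatcher ratio is sensitive to it, which is one plausible reason a report set can score high on the former and moderate on the latter.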

Conclusion: The results of this proof-of-concept study suggest that GPT-4 can be a reliable tool for generating standardized radiology reports, offering potential benefits such as improved efficiency, better communication, and simplified data extraction and analysis. However, limitations and ethical implications must be addressed to ensure the safe and effective implementation of this technology in clinical practice.

Clinical Relevance Statement: The findings of this study suggest that GPT-4 (Generative Pre-trained Transformer 4), an advanced AI model, has the potential to significantly contribute to the standardization and optimization of radiology reporting, offering improved efficiency and communication in clinical practice.

Key Points:
• Large language model-generated radiology reports exhibited high content similarity and moderate structural resemblance to radiologist-generated reports.
• Performance metrics highlighted the strong matching of word selection and order, as well as high semantic similarity between AI-generated and radiologist-generated reports.
• The large language model demonstrated potential for generating standardized radiology reports, improving efficiency and communication in clinical settings.

Source: http://dx.doi.org/10.1007/s00330-023-10384-x
