Differentiating between GPT-generated and human-written feedback for radiology residents.

Curr Probl Diagn Radiol

Department of Diagnostic Radiology, Queen's University, 76 Stuart Street, Kingston, ON K7L 2V7, Canada.

Published: February 2025

Purpose: The recent implementation of competency-based medical education (CBME) within Canadian radiology programs has required faculty to conduct more assessments. The rise of narrative feedback in CBME, coinciding with the rise of large language models (LLMs), raises questions about whether these models can generate informative comments that match those of human experts, and about the associated challenges. This study compares human-written feedback with GPT-3.5-generated feedback for radiology residents and assesses how well raters can differentiate between the two sources.

Methods: Assessments were completed by 28 faculty members for 10 residents within a Canadian Diagnostic Radiology program (2019-2023). Comments were extracted from Elentra, de-identified, and parsed into sentences, of which 110 were randomly selected for analysis. Eleven of these comments were entered into GPT-3.5, generating 110 synthetic comments that were mixed with the actual comments. Two faculty raters and GPT-3.5 read each comment and predicted whether it was human-written or GPT-generated.
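The sampling and mixing step described in the Methods can be sketched roughly as follows. This is a minimal illustration, not the authors' procedure: the function name `build_blinded_pool`, the fixed seed, and the placeholder sentences are all assumptions introduced here, and the abstract does not say how the random selection or mixing was actually performed.

```python
import random

def build_blinded_pool(human_sentences, synthetic_sentences, n_human=110, seed=0):
    """Sample human sentences, mix them with synthetic ones, and hide the source.

    Hypothetical helper: returns the shuffled texts (what a rater would see)
    and a parallel answer key (what a rater would not see).
    """
    rng = random.Random(seed)  # fixed seed only so the sketch is reproducible
    sampled = rng.sample(human_sentences, n_human)  # random selection without replacement
    pool = [(text, "human") for text in sampled]
    pool += [(text, "gpt") for text in synthetic_sentences]
    rng.shuffle(pool)  # mix the two sources so order gives no hint
    items = [text for text, _ in pool]
    key = [source for _, source in pool]
    return items, key

# Placeholder data standing in for the de-identified comments (not study data).
human = [f"human sentence {i}" for i in range(500)]
synthetic = [f"synthetic sentence {i}" for i in range(110)]

items, key = build_blinded_pool(human, synthetic)
print(len(items), key.count("human"), key.count("gpt"))  # 220 110 110
```

Keeping the answer key separate from the texts is one simple way to ensure that raters, human or model, see only the comment itself.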

Results: Actual comments from humans were often longer and more specific than synthetic comments, especially when describing clinical procedures and patient interactions. Differentiating the source was more difficult when both feedback types were similarly vague. Agreement between the responses of GPT-3.5 and those of the human raters was low (κ = -0.237). Human raters were also more accurate (80.5%) than GPT-3.5 (50%) at distinguishing actual from synthetic comments.
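The two statistics reported in the Results, rater accuracy and an agreement coefficient, can be illustrated with a short sketch. It assumes the reported coefficient is Cohen's kappa, which the abstract does not state explicitly, and it uses made-up toy labels rather than the study's data. The function names are hypothetical.

```python
from collections import Counter

def accuracy(predicted, truth):
    """Fraction of items whose predicted source matches the true source."""
    return sum(p == t for p, t in zip(predicted, truth)) / len(truth)

def cohen_kappa(a, b):
    """Cohen's kappa for two raters labelling the same items.

    kappa = (observed agreement - chance agreement) / (1 - chance agreement).
    Undefined (division by zero) if chance agreement equals 1.
    """
    n = len(a)
    observed = sum(x == y for x, y in zip(a, b)) / n
    counts_a, counts_b = Counter(a), Counter(b)
    # Chance agreement from each rater's marginal label frequencies.
    expected = sum(counts_a[label] * counts_b[label]
                   for label in set(a) | set(b)) / (n * n)
    return (observed - expected) / (1 - expected)

# Toy labels for four comments (illustrative only, not study data).
truth   = ["human", "human", "gpt", "gpt"]
rater_a = ["human", "human", "gpt", "human"]
rater_b = ["gpt",   "human", "human", "gpt"]

print(accuracy(rater_a, truth))       # 0.75
print(accuracy(rater_b, truth))       # 0.5
print(cohen_kappa(rater_a, rater_b))  # -0.5
```

A negative kappa, as in the toy example and in the reported value, means the two raters agreed less often than their label frequencies alone would predict by chance.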

Conclusion: Currently, GPT-3.5 cannot match human experts in delivering specific, nuanced feedback for radiology residents. Compared to humans, GPT-3.5 also performs worse in distinguishing between actual and synthetic comments. These insights could guide the development of more sophisticated algorithms to produce higher-quality feedback, supporting faculty development.

Source
http://dx.doi.org/10.1067/j.cpradiol.2025.02.002

Publication Analysis

Top Keywords

synthetic comments: 16
feedback radiology: 12
radiology residents: 12
comments: 9
human-written feedback: 8
human experts: 8
actual comments: 8
actual synthetic: 8
feedback: 7
gpt-3.5: 6

Similar Publications

Social media platforms are vital for expressing opinions and understanding public sentiment, yet many analytical tools overlook passive users who mainly consume content without engaging actively. To address this, we introduce UniPoll, an advanced framework designed to automatically generate polls from social media posts using sophisticated natural language generation (NLG) techniques. Unlike traditional methods that struggle with social media's informal and context-sensitive nature, UniPoll leverages enriched contexts from user comments and employs multiobjective optimization to enhance poll relevance and engagement.

Salt Metathesis: An Ultimate Click Reaction.

Precis Chem

February 2025

Department of Chemistry, State University of New York at Oswego, Oswego, New York 13126, United States.

In this Comment, we suggest salt metathesis (or ion exchange) as an ultimate click reaction, extending click chemistry principles beyond covalent bonds to ionic interactions. These universal and robust reactions, which nature utilizes in marine organisms' biomineralization processes, proceed spontaneously under mild conditions with minimal waste, embodying the core principles of click philosophy. This perspective expands the traditional scope of click chemistry and opens new opportunities in synthetic accessibility across organic, inorganic, and materials science spaces.

The widespread use of synthetic polymers since the mid-twentieth century has led to significant environmental pollution from microplastics (MPs). These MPs, which persist in ecosystems, can interact with various pollutants, including pesticides such as tebuconazole (TEB). The subject paper investigates the sorption behaviour of TEB on different types of MPs (polystyrene, polypropylene, and polyamide-6), focusing on the kinetics and isotherms of these interactions.

Governing synthetic data in medical research: the time is now.

Lancet Digit Health

February 2025

Kavli Centre for Ethics, Science, and the Public, Faculty of Education, University of Cambridge, Cambridge CB4 8PQ, UK; Wellcome Sanger Institute, Cambridge, UK.
