Lung Cancer Staging Using Chest CT and FDG PET/CT Free-Text Reports: Comparison Among Three ChatGPT Large Language Models and Six Human Readers of Varying Experience.

AJR Am J Roentgenol

Department of Radiology, Research Institute for Convergence of Biomedical Science and Technology, Pusan National University Yangsan Hospital, Pusan National University School of Medicine, 20, Geumo-ro, Mulgeum-eup, Yangsan 50612, Korea.

Published: December 2024

Although radiology reports are commonly used for lung cancer staging, this task can be challenging given radiologists' variable reporting styles as well as reports' potentially ambiguous and/or incomplete staging-related information. The purpose of this study was to compare the performance of ChatGPT large language models (LLMs) and human readers of varying experience in lung cancer staging using chest CT and FDG PET/CT free-text reports. This retrospective study included 700 patients (mean age, 73.8 ± 29.5 [SD] years; 509 men, 191 women) from four institutions in Korea who underwent chest CT or FDG PET/CT for non-small cell lung cancer initial staging from January 2020 to December 2023. The examination reports used a free-text format, written exclusively in English or in mixed English and Korean. Two thoracic radiologists in consensus determined the overall stage group (IA, IB, IIA, IIB, IIIA, IIIB, IIIC, IVA, or IVB) for each report using the 8th-edition staging system to establish the reference standard. Three ChatGPT models (GPT-4o, GPT-4, GPT-3.5) determined an overall stage group for each report using a script-based application programming interface, zero-shot learning, and a prompt incorporating a staging system summary. The code for this web application was made publicly available through a GitHub repository (https://github.com/elmidion/GPT_Information_Extractor). Six human readers (two fellowship-trained radiologists with less experience than the radiologists who determined the reference standard, two fellows, and two residents) also independently determined overall stage groups. GPT-4o's overall accuracy for determining the correct stage among the nine groups was compared with that of the other LLMs and human readers using McNemar tests.
GPT-4o had an overall staging accuracy of 74.1%, significantly better than the accuracy of GPT-4 (70.1%, p = .02), GPT-3.5 (57.4%, p < .001), and resident 2 (65.7%, p < .001); significantly worse than the accuracy of fellowship-trained radiologist 1 (82.3%, p < .001) and fellowship-trained radiologist 2 (85.4%, p < .001); and not significantly different from the accuracy of fellow 1 (77.7%, p = .09), fellow 2 (75.6%, p = .53), and resident 1 (72.3%, p = .42). The best-performing model, GPT-4o, showed no significant difference in staging accuracy versus fellows but showed significantly worse performance versus fellowship-trained radiologists. The findings do not support use of LLMs for lung cancer staging in place of expert health care professionals. The findings indicate the importance of domain expertise for performing complex specialized tasks such as cancer staging.
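
The paired accuracy comparison described above can be illustrated with a minimal Python sketch of a McNemar test. The abstract does not say which variant of the test was used, so the exact binomial form here is an assumption, and the function names and the sample inputs are hypothetical rather than taken from the study or its repository.

```python
from math import comb


def discordant_counts(a_correct, b_correct):
    """Count the reports on which two readers disagree in correctness.

    a_correct, b_correct: equal-length sequences of booleans, one entry per
    report, True when that reader's stage matched the reference standard.
    Returns (b, c): b = A correct and B wrong, c = A wrong and B correct.
    """
    if len(a_correct) != len(b_correct):
        raise ValueError("both readers must stage the same reports")
    b = sum(1 for x, y in zip(a_correct, b_correct) if x and not y)
    c = sum(1 for x, y in zip(a_correct, b_correct) if y and not x)
    return b, c


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value from the discordant pair counts.

    Under the null hypothesis of equal accuracy, each discordant report is
    equally likely to favor either reader, so min(b, c) follows a
    Binomial(b + c, 0.5) distribution.
    """
    n = b + c
    if n == 0:
        return 1.0  # no disagreements, so no evidence of a difference
    k = min(b, c)
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


if __name__ == "__main__":
    # Hypothetical toy data for four reports, not values from the study.
    model = [True, True, False, False]
    reader = [True, False, True, False]
    b, c = discordant_counts(model, reader)
    print(b, c, mcnemar_exact(b, c))
```

Only the discordant reports enter the test, which is why two readers with similar overall accuracy can still differ significantly when their errors fall on different reports.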

Source: http://dx.doi.org/10.2214/AJR.24.31696

Publication Analysis

Top Keywords

lung cancer: 20
cancer staging: 20
human readers: 16
chest fdg: 12
fdg pet/ct: 12
determined stage: 12
staging: 9
staging chest: 8
pet/ct free-text: 8
free-text reports: 8
