Purpose: To assess the performance of Chat Generative Pre-Trained Transformer-4 in providing accurate diagnoses to retina teaching cases from OCTCases.
Design: Cross-sectional study.
Subjects: Retina teaching cases from OCTCases.
Methods: We prompted a custom chatbot with 69 retina cases containing multimodal ophthalmic images, asking it to provide the most likely diagnosis. In a sensitivity analysis, we inputted increasing amounts of clinical information pertaining to each case until the chatbot achieved a correct diagnosis. We performed multivariable logistic regressions on Stata v17.0 (StataCorp LLC) to investigate associations between the amount of text-based information inputted per prompt and the odds of the chatbot achieving a correct diagnosis, adjusting for the laterality of cases, number of ophthalmic images inputted, and imaging modalities.
Main Outcome Measures: Our primary outcome was the proportion of cases for which the chatbot was able to provide a correct diagnosis. Our secondary outcome was the chatbot's performance in relation to the amount of text-based information accompanying ophthalmic images.
Results: Across 69 retina cases collectively containing 139 ophthalmic images, the chatbot was able to provide a definitive, correct diagnosis for 35 (50.7%) cases. The chatbot needed variable amounts of clinical information to achieve a correct diagnosis, where the entire patient description as presented by OCTCases was required for a majority of correctly diagnosed cases (23 of 35 cases, 65.7%). Relative to when the chatbot was only prompted with a patient's age and sex, the chatbot achieved a higher odds of a correct diagnosis when prompted with an entire patient description (odds ratio = 10.1, 95% confidence interval = 3.3-30.3, < 0.01). Despite providing an incorrect diagnosis for 34 (49.3%) cases, the chatbot listed the correct diagnosis within its differential diagnosis for 7 (20.6%) of these incorrectly answered cases.
Conclusions: This custom chatbot was able to accurately diagnose approximately half of the retina cases requiring multimodal input, albeit relying heavily on text-based contextual information that accompanied ophthalmic images. The diagnostic ability of the chatbot in interpretation of multimodal imaging without text-based information is currently limited. The appropriate use of the chatbot in this setting is of utmost importance, given bioethical concerns.
Financial Disclosures: Proprietary or commercial disclosure may be found in the Footnotes and Disclosures at the end of this article.
Download full-text PDF |
Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11321281 | PMC |
http://dx.doi.org/10.1016/j.xops.2024.100556 | DOI Listing |
Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!