Background: We aimed to evaluate the capability of three publicly available large language models, Chat Generative Pre-trained Transformer (ChatGPT-3.5), ChatGPT-4 and Google Gemini, in analysing retinal detachment cases and suggesting the best possible surgical planning.
Methods: Fifty-four retinal detachment records were entered into the ChatGPT and Gemini interfaces. After submitting the prompt 'Specify what kind of surgical planning you would suggest and the eventual intraocular tamponade.' and collecting the answers given, we assessed their level of agreement with the consensus opinion of three expert vitreoretinal surgeons. Moreover, ChatGPT and Gemini answers were graded from 1 (poor quality) to 5 (excellent quality) according to the Global Quality Score (GQS).
Results: After excluding 4 controversial cases, 50 cases were included. Overall, the surgical choices of ChatGPT-3.5, ChatGPT-4 and Google Gemini agreed with those of the vitreoretinal surgeons in 40/50 (80%), 42/50 (84%) and 35/50 (70%) of cases, respectively. Google Gemini was unable to respond in five cases. Contingency analysis showed a significant difference between ChatGPT-4 and Gemini (p=0.03). Mean GQS was 3.9±0.8 for ChatGPT-3.5 and 4.2±0.7 for ChatGPT-4, while Gemini scored 3.5±1.1. There was no statistically significant difference between the two ChatGPT versions (p=0.22), while both outperformed Gemini (p=0.03 and p=0.002, respectively). The main source of error was the choice of endotamponade (14% for both ChatGPT-3.5 and ChatGPT-4, and 12% for Google Gemini). Only ChatGPT-4 was able to suggest a combined phacovitrectomy approach.
Conclusion: Google Gemini and ChatGPT evaluated vitreoretinal patient records coherently, showing a good level of agreement with expert surgeons. According to the GQS, ChatGPT's recommendations were more accurate and precise than Gemini's.
DOI: http://dx.doi.org/10.1136/bjo-2023-325143
Adv Physiol Educ
January 2025
College of Medicine, Alfaisal University, Kingdom of Saudi Arabia.
Despite extensive studies on large language models and their capability to respond to questions from various licensing examinations, there has been limited focus on employing chatbots for specific subjects within the medical curriculum, specifically medical neuroscience. This research compared the performance of Claude 3.5 Sonnet (Anthropic), GPT-3.
Adv Physiol Educ
January 2025
Department of Kinesiology and Outdoor Recreation, Southern Utah University, Cedar City, UT, USA.
Learning Objectives (LOs) are a pillar of course design and execution, and thus a focus of curricular reforms. This study explored the extent to which the creation and usage of LOs might be facilitated by three leading chatbots: ChatGPT-4o, Claude 3.5 Sonnet, and Google Gemini Advanced.
Updates Surg
January 2025
Alluri Sitarama Raju Academy of Medical Sciences, Eluru, India.
There is growing importance for patients to be able to easily access information regarding their medical conditions to improve their understanding of, and participation in, health care decisions. Artificial Intelligence (AI) has proven to be a fast, efficient, and effective tool for educating patients about their health care conditions. The aim of this study was to compare the responses provided by the AI tools ChatGPT and Google Gemini, assessing the conciseness and understandability of the information provided for the medical conditions deep vein thrombosis, decubitus ulcers, and hemorrhoids.
Transplant Proc
January 2025
Department of Urology, Sun Yat-sen Memorial Hospital, Guangzhou, China.
This study evaluated the capability of three AI chatbots (ChatGPT 4.0, Claude 3.0, and Gemini Pro), as well as Google, in responding to common post-kidney transplantation inquiries.
Facial Plast Surg Aesthet Med
January 2025
Department of Otolaryngology-Head and Neck Surgery, University of California, Irvine, California, USA.
Various large language models (LLMs) can provide human-level medical discussions, but they have not been compared regarding rhinoplasty knowledge. The aim was to compare leading LLMs in answering complex rhinoplasty consultation questions, as evaluated by plastic surgeons. Ten open-ended rhinoplasty consultation questions were presented to the ChatGPT-4o, Google Gemini, Claude, and Meta AI LLMs.