Background: We aimed to evaluate the capability of three publicly available large language models, Chat Generative Pre-trained Transformer (ChatGPT-3.5), ChatGPT-4 and Google Gemini, in analysing retinal detachment cases and suggesting the best possible surgical planning.
Methods: Fifty-four retinal detachment records were entered into the ChatGPT and Gemini interfaces. After submitting the prompt 'Specify what kind of surgical planning you would suggest and the eventual intraocular tamponade.' and collecting the answers given, we assessed their level of agreement with the consensus opinion of three expert vitreoretinal surgeons. Moreover, ChatGPT and Gemini answers were graded from 1 (poor quality) to 5 (excellent quality) according to the Global Quality Score (GQS).
Results: After excluding 4 controversial cases, 50 cases were included. Overall, the surgical choices of ChatGPT-3.5, ChatGPT-4 and Google Gemini agreed with those of the vitreoretinal surgeons in 40/50 (80%), 42/50 (84%) and 35/50 (70%) of cases, respectively. Google Gemini was unable to respond in five cases. Contingency analysis showed a significant difference between ChatGPT-4 and Gemini (p=0.03). Mean GQS was 3.9±0.8 for ChatGPT-3.5 and 4.2±0.7 for ChatGPT-4, while Gemini scored 3.5±1.1. There was no statistically significant difference between the two ChatGPT versions (p=0.22), while both outperformed Gemini (p=0.03 and p=0.002, respectively). The main source of error was the choice of endotamponade (14% for both ChatGPT-3.5 and ChatGPT-4, and 12% for Google Gemini). Only ChatGPT-4 was able to suggest a combined phacovitrectomy approach.
Conclusion: Google Gemini and ChatGPT evaluated vitreoretinal patient records coherently, showing a good level of agreement with expert surgeons. According to the GQS, ChatGPT's recommendations were more accurate and precise than Gemini's.
DOI: http://dx.doi.org/10.1136/bjo-2023-325143
Adv Physiol Educ
January 2025
College of Medicine, Alfaisal University, Kingdom of Saudi Arabia.
Despite extensive studies on large language models and their capability to respond to questions from various licensing examinations, there has been limited focus on employing chatbots for specific subjects within the medical curriculum, specifically medical neuroscience. This research compared the performance of Claude 3.5 Sonnet (Anthropic), GPT-3.
Adv Physiol Educ
January 2025
Department of Kinesiology and Outdoor Recreation, Southern Utah University, Cedar City, UT, USA.
Learning Objectives (LOs) are a pillar of course design and execution, and thus a focus of curricular reforms. This study explored the extent to which the creation and usage of LOs might be facilitated by three leading chatbots: ChatGPT-4o, Claude 3.5 Sonnet, and Google Gemini Advanced.
Updates Surg
January 2025
Alluri Sitarama Raju Academy of Medical Sciences, Eluru, India.
There is growing importance for patients to be able to easily access information regarding their medical conditions to improve their understanding of, and participation in, health care decisions. Artificial Intelligence (AI) has proven to be a fast, efficient, and effective tool for educating patients about their health care conditions. The aim of this study was to compare the responses provided by the AI tools ChatGPT and Google Gemini, assessing the conciseness and understandability of the information provided for the medical conditions deep vein thrombosis, decubitus ulcers, and hemorrhoids.
Transplant Proc
January 2025
Department of Urology, Sun Yat-sen Memorial Hospital, Guangzhou, China.
This study evaluated the capability of three AI chatbots (ChatGPT 4.0, Claude 3.0, and Gemini Pro), as well as Google, in responding to common post-kidney transplantation inquiries.
Facial Plast Surg Aesthet Med
January 2025
Department of Otolaryngology-Head and Neck Surgery, University of California, Irvine, California, USA.
Various large language models (LLMs) can provide human-level medical discussions, but they have not been compared regarding rhinoplasty knowledge. The aim was to compare leading LLMs in answering complex rhinoplasty consultation questions, as evaluated by plastic surgeons. Ten open-ended rhinoplasty consultation questions were presented to the ChatGPT-4o, Google Gemini, Claude, and Meta AI LLMs.