Purpose: To compare the performance of large language models (LLMs) in analyzing and responding to a series of difficult ophthalmic cases.
Design: A comparative case series involving LLMs that met inclusion criteria tested on twenty difficult case studies posed in open-text format.
Methods: Fifteen LLMs accessible to ophthalmologists were tested against twenty case studies published in JAMA Ophthalmology. Each case was presented to each LLM as identical open-ended text, and open-ended responses regarding differential diagnosis, next diagnostic tests, and recommended treatments were requested. Responses were recorded and assessed for accuracy against the published correct answers. The main outcome was the accuracy of the LLMs against the correct answers. Secondary outcomes included comparative performance on the differential diagnosis, ancillary testing, and treatment subtests, as well as readability of responses.
Results: Scores were normally distributed and ranged from 0 to 35 (out of a maximum score of 60), with a mean ± standard deviation of 19 ± 9. Scores for three of the LLMs (ChatGPT 3.5, Claude Pro, and Copilot Pro) were statistically significantly higher than the mean. Two of the high-performing LLMs were paid subscriptions (Claude Pro and Copilot Pro) and one was free (ChatGPT 3.5). While there were no clinical or statistical differences between ChatGPT 3.5 and Claude Pro, there was a separation of +5 points, or 0.56 standard deviations, between Copilot Pro and the other highly ranked LLMs. Readability of responses from all tested programs was above the eighth-grade reading level that the AMA (American Medical Association) recommends for public consumers.
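The standardized separation quoted in the Results can be checked with a short calculation. The sketch below only re-derives the 0.56 figure from the two reported numbers (a 5-point gap and a standard deviation of 9); the function name `separation_in_sd` is a hypothetical helper for illustration and is not part of the study's methods.

```python
def separation_in_sd(point_gap: float, sd: float) -> float:
    """Express a raw score gap as a multiple of the standard deviation."""
    if sd <= 0:
        raise ValueError("standard deviation must be positive")
    return point_gap / sd


# Figures reported in the Results: a 5-point gap and a standard deviation of 9.
gap_sd = separation_in_sd(5, 9)
print(f"{gap_sd:.2f}")  # prints 0.56
```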
Conclusion: Subscription LLMs were more prevalent among the highly ranked LLMs, suggesting that these perform better as ophthalmic assistants. While readability was poor for the average person, the content was understood by a board-certified ophthalmologist. The accuracy of LLMs is not high enough to recommend their use for patient care in standalone mode, but their use in aiding clinicians with patient care and preventing oversights is promising.
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11568767 | PMC |
http://dx.doi.org/10.2147/OPTH.S488232 | DOI Listing |
Elife
January 2025
Department of Chemistry & Biochemistry, University of Delaware, Newark, United States.
The SARS-CoV-2 main protease (Mpro or Nsp5) is critical for production of viral proteins during infection and, like many viral proteases, also targets host proteins to subvert their cellular functions. Here, we show that the human tRNA methyltransferase TRMT1 is recognized and cleaved by SARS-CoV-2 Mpro. TRMT1 installs the N2,N2-dimethylguanosine (m2,2G) modification on mammalian tRNAs, which promotes cellular protein synthesis and redox homeostasis.
Biol Sport
January 2025
Institute of Sport Sciences, University of Lausanne, Lausanne, Switzerland.
Oxidative stress is augmented under hypoxic environments, which may be attenuated with antioxidant supplementation. We investigated the effects of dietary nitrate (NO3-) supplementation combined with high-intensity training performed under hypoxic conditions on antioxidant/pro-oxidant balance. Thirty trained participants were assigned to one of three groups - HNO: hypoxia (13% FiO2) + NO3-; HPL: hypoxia + placebo; CON: normoxia (20.
World J Mens Health
December 2024
Division of Urology, Department of Surgery, Far Eastern Memorial Hospital, New Taipei, Taiwan.
Purpose: Information retrieval (IR) and risk assessment (RA) from multi-modality imaging and pathology reports are critical to prostate cancer (PC) treatment. This study aims to evaluate the performance of four general-purpose large language models (LLMs) in IR and RA tasks.
Materials And Methods: We conducted a study using simulated text reports from computed tomography, magnetic resonance imaging, bone scans, and biopsy pathology on stage IV PC patients.
EBioMedicine
December 2024
Research Institute of Internal Medicine, Oslo University Hospital Rikshospitalet, Oslo, Norway; Faculty of Medicine, Institute of Clinical Medicine, University of Oslo, Oslo, Norway; Section for Clinical Immunology and Infectious Diseases, Oslo University Hospital Rikshospitalet, Oslo, Norway.
Background: The Bari-SolidAct randomized controlled trial compared baricitinib with placebo in patients with severe COVID-19. A post hoc analysis revealed a higher incidence of serious adverse events (SAEs) among SARS-CoV-2-vaccinated participants who had received baricitinib. This sub-study aimed to investigate whether vaccination influences the safety profile of baricitinib in patients with severe COVID-19.