A descriptive study based on the comparison of ChatGPT and evidence-based neurosurgeons.

Jiayu Liu, Jiqi Zheng, Xintian Cai, Dongdong Wu, Chengliang Yin

iScience

Faculty of Medicine, Macau University of Science and Technology, Macau 999078, China.

Published: September 2023

ChatGPT, developed by OpenAI, was evaluated for its ability to respond accurately to neurosurgical questions based on evidence-based medicine.
A total of 50 questions were posed to both GPT-3.5 and GPT-4.0, and responses were also obtained from neurosurgeons of varying seniority levels.
The study found that GPT-3.5's performance was similar to that of less experienced neurosurgeons, while GPT-4.0 showcased capabilities comparable to more experienced neurosurgeons, indicating potential for future improvements in AI-assisted medical responses.

ChatGPT is an artificial intelligence product developed by OpenAI. This study aims to investigate whether ChatGPT can respond in accordance with evidence-based medicine in neurosurgery. We generated 50 neurosurgical questions covering neurosurgical diseases. Each question was posed three times to GPT-3.5 and GPT-4.0. We also recruited three neurosurgeons with high, middle, and low seniority to respond to questions. The results were analyzed regarding ChatGPT's overall performance score, mean scores by the items' specialty classification, and question type. In conclusion, GPT-3.5's ability to respond in accordance with evidence-based medicine was comparable to that of neurosurgeons with low seniority, and GPT-4.0's ability was comparable to that of neurosurgeons with high seniority. Although ChatGPT is yet to be comparable to a neurosurgeon with high seniority, future upgrades could enhance its performance and abilities.
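The aggregation described in the abstract (repeated attempts per question, then an overall score and mean scores by specialty classification and question type) could be sketched roughly as follows. This is a minimal illustration only, not the authors' analysis: the record layout, the 0/1 scoring scale, the specialty and question-type labels, and every score value are hypothetical placeholders.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records: (responder, question_id, specialty, question_type, score).
# Every value here is an illustrative placeholder, NOT data from the study.
RECORDS = [
    ("GPT-3.5", 1, "tumor", "diagnosis", 1),
    ("GPT-3.5", 1, "tumor", "diagnosis", 0),
    ("GPT-3.5", 1, "tumor", "diagnosis", 1),
    ("GPT-4.0", 1, "tumor", "diagnosis", 1),
    ("GPT-4.0", 1, "tumor", "diagnosis", 1),
    ("GPT-4.0", 1, "tumor", "diagnosis", 1),
    ("GPT-3.5", 2, "vascular", "treatment", 0),
    ("GPT-3.5", 2, "vascular", "treatment", 0),
    ("GPT-3.5", 2, "vascular", "treatment", 1),
    ("GPT-4.0", 2, "vascular", "treatment", 1),
    ("GPT-4.0", 2, "vascular", "treatment", 0),
    ("GPT-4.0", 2, "vascular", "treatment", 1),
]


def per_question_scores(records):
    """Average the repeated attempts so each (responder, question) counts once."""
    attempts = defaultdict(list)
    labels = {}
    for responder, qid, specialty, qtype, score in records:
        attempts[(responder, qid)].append(score)
        labels[(responder, qid)] = (specialty, qtype)
    # Each output row: (responder, question_id, specialty, question_type, mean_score)
    return [(r, q, *labels[(r, q)], mean(s)) for (r, q), s in attempts.items()]


def overall_mean(rows):
    """Overall mean score per responder."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[0]].append(row[4])
    return {responder: mean(scores) for responder, scores in groups.items()}


def mean_by(rows, key_index):
    """Mean score per (responder, label); key_index 2 = specialty, 3 = question type."""
    groups = defaultdict(list)
    for row in rows:
        groups[(row[0], row[key_index])].append(row[4])
    return {key: mean(scores) for key, scores in groups.items()}


if __name__ == "__main__":
    rows = per_question_scores(RECORDS)
    print(overall_mean(rows))
    print(mean_by(rows, 2))  # grouped by specialty
    print(mean_by(rows, 3))  # grouped by question type
```

Averaging the three attempts per question first keeps each question equally weighted in the group means; whether the study weighted attempts this way is not stated in the abstract.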

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10495632
DOI: http://dx.doi.org/10.1016/j.isci.2023.107590

Similar Publications

Minimally important change on the Columbia Impairment Scale and Strengths and Difficulties Questionnaire in youths seeking mental healthcare.

BMJ Ment Health

January 2025

Cundill Centre for Child and Youth Depression, Centre for Addiction and Mental Health, Toronto, Ontario, Canada.

Karolin R Krause, Alina Lee, Di Shan, Katherine Tombeau Cost, Lisa D Hawke

Background: Evidence-based mental health requires patient-relevant outcome data, but many indicators lack clinical meaning and fail to consider youth perceptions. The minimally important change (MIC) indicator designates change as meaningful to patients, yet is rarely reported in youth mental health trials.

Objective: This study aimed to establish MIC thresholds for two patient-reported outcome measures (PROMs), the Columbia Impairment Scale (CIS) and the Strengths and Difficulties Questionnaire (SDQ), using different estimation methods.

Improving Mental Health and Well-Being Through the Paradym App: Quantitative Study of Real-World Data.

JMIR Form Res

January 2025

Paradym, London, United Kingdom.

Athina Marina Metaxa, Shaun Liverpool, Mia Eisenstadt, John Pollard, Courtney Carlsson

Background: With growing evidence suggesting that levels of emotional well-being have been decreasing globally over the past few years, demand for easily accessible, convenient, and affordable well-being and mental health support has increased. Although mental health apps designed to tackle this demand by targeting diagnosed conditions have been shown to be beneficial, less research has focused on apps aiming to improve emotional well-being. There is also a dearth of research on well-being apps structured around users' lived experiences and emotional patterns and a lack of integration of real-world evidence of app usage.

The concept of difficult-to-treat disease in rheumatology: where next?

Lancet Rheumatol

January 2025

College of Medical, Veterinary and Life Sciences, University of Glasgow, Glasgow, UK.

György Nagy, Lilla Gunkl-Tóth, András M Dorgó, Iain B McInnes

New pathogenesis-based therapeutics and evidence-based consensus treatment recommendations, often with predefined treatment goals, have remarkably improved outcomes across many chronic diseases. However, a clinically significant subgroup of patients responds poorly to interventions and shows a progressive decline in the disease trajectory, which poses an increasing health-care challenge. Difficult-to-treat approaches exist in several areas of medicine and the need for similar definitions has recently also emerged in rheumatology.

Evaluating the Acceptance and Usability of an Independent, Noncommercial Search Engine for Medical Information: Cross-Sectional Questionnaire Study and User Behavior Tracking Analysis.

JMIR Hum Factors

January 2025

Institute of General Practice, Faculty of Medicine and Medical Center, University of Freiburg, Freiburg, Germany.

Lisa Specht, Raphael Scheible, Martin Boeker, Erik Farin-Glattacker, Nikolas Kampel

Background: The internet is a key source of health information, but the quality of content from popular search engines varies, posing challenges for users, especially those with low health or digital health literacy. To address this, the "tala-med" search engine was developed in 2020 to provide access to high-quality, evidence-based content. It prioritizes German health websites based on trustworthiness, recency, user-friendliness, and comprehensibility, offering category-based filters while ensuring privacy by avoiding data collection and advertisements.

Identifying the Minimal Clinically Important Difference in Emotion Regulation Among Youth Using the JoyPop App: Survey Study.

JMIR Form Res

January 2025

Department of Psychology, Lakehead University, Thunder Bay, ON, Canada.

Jaidyn Charlton, Ishaq Malik, Angela M Ashley, Amanda Newton, Elaine Toombs

Background: The minimal clinically important difference (MCID) is an important threshold to consider when evaluating the meaningfulness of improvement following an intervention. The JoyPop app is an evidence-based smartphone app designed to improve resilience and emotion regulation. Information is needed regarding the JoyPop app's MCID among culturally diverse youth.
