Responses From ChatGPT-4 Show Limited Correlation With Expert Consensus Statement on Anterior Shoulder Instability.

Alexander Artamonov, Ira Bachar-Avnieli, Eyal Klang, Omri Lubovsky, Ehud Atoun, Alexander Bermant, Philip J Rosinsky

Arthrosc Sports Med Rehabil

Orthopedic Department, Barzilai Medical Center, Ashkelon, Israel.

Published: June 2024

- The study aimed to compare responses from GPT-4 with a consensus statement on diagnosing and managing anterior shoulder instability (ASI) to evaluate the AI's reliability in this context.
- Responses from GPT-4 were rated for similarity to expert opinions by 2 shoulder surgeons, who found only limited concordance (high similarity in 25.8% of questions). GPT-4 rated its own responses more favorably than the surgeons did, and the two sets of ratings disagreed on 41.9% of questions.
- The findings suggest that while GPT-4 can provide relevant information, its responses do not closely align with expert recommendations, highlighting the need for caution in using AI for medical guidance.

Purpose: To compare the similarity of answers provided by Generative Pretrained Transformer-4 (GPT-4) with those of a consensus statement on diagnosis, nonoperative management, and Bankart repair in anterior shoulder instability (ASI).

Methods: An expert consensus statement on ASI published by Hurley et al. in 2022 was reviewed, and the questions posed to the expert panel were extracted. GPT-4, the subscription version of ChatGPT, was queried using the same set of questions. Answers provided by GPT-4 were compared with those of the expert panel and subjectively rated for similarity by 2 experienced shoulder surgeons. GPT-4 was then used to rate the similarity of its own responses to the consensus statement, classifying them as low, medium, or high. Rates of similarity as classified by the shoulder surgeons and by GPT-4 were then compared, and interobserver reliability was calculated using weighted κ scores.
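For readers unfamiliar with the weighted κ statistic named in the Methods, the following is a minimal sketch of how it can be computed for two raters using the ordinal scale low, medium, high. The abstract does not state whether linear or quadratic weights were used, so the `weights` parameter, the function name `weighted_kappa`, and the example ratings are all illustrative assumptions and not the study's data or code.

```python
LEVELS = ["low", "medium", "high"]  # ordinal scale described in the Methods


def weighted_kappa(rater_a, rater_b, levels=LEVELS, weights="linear"):
    """Weighted Cohen's kappa for two raters on an ordinal scale.

    The weighting scheme is an assumption: the abstract does not say
    whether linear or quadratic weights were used.
    """
    if len(rater_a) != len(rater_b) or not rater_a:
        raise ValueError("both raters must rate the same non-empty item set")
    if weights not in ("linear", "quadratic"):
        raise ValueError("weights must be 'linear' or 'quadratic'")

    k = len(levels)
    index = {level: i for i, level in enumerate(levels)}
    n = len(rater_a)

    # Observed joint proportions of (rater A category, rater B category).
    observed = [[0.0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        observed[index[a]][index[b]] += 1.0 / n

    # Marginal proportions for each rater.
    row = [sum(observed[i]) for i in range(k)]
    col = [sum(observed[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i, j):
        # 0 for identical categories, 1 for the two extreme categories.
        d = abs(i - j) / (k - 1)
        return d if weights == "linear" else d * d

    obs = sum(disagreement(i, j) * observed[i][j]
              for i in range(k) for j in range(k))
    exp = sum(disagreement(i, j) * row[i] * col[j]
              for i in range(k) for j in range(k))
    if exp == 0:
        raise ValueError("kappa is undefined when both raters use one category")
    return 1.0 - obs / exp


# Hypothetical ratings for five questions (not taken from the study).
surgeons = ["low", "medium", "medium", "high", "low"]
gpt4 = ["medium", "medium", "high", "high", "low"]
print(round(weighted_kappa(surgeons, gpt4), 3))
```

A value of 1 indicates perfect agreement, 0 indicates agreement no better than chance, and negative values indicate agreement worse than chance. Weighting means that a low versus high disagreement is penalized more heavily than a low versus medium one.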

Results: The degree of similarity between the responses of GPT-4 and the ASI consensus statement, as rated by the shoulder surgeons, was high in 25.8%, medium in 45.2%, and low in 29% of questions. GPT-4 assessed similarity as high in 48.3%, medium in 41.9%, and low in 9.7% of questions. The surgeons and GPT-4 agreed on the classification of 18 questions (58.1%) and disagreed on 13 questions (41.9%).
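The percentages in the Results can be related back to question counts with a short arithmetic check. The sketch below assumes a total of 31 questions, inferred from the 18 agreed plus 13 disagreed classifications. The per-category counts it prints are back-calculated from the reported percentages and are not stated in the abstract.

```python
TOTAL_QUESTIONS = 18 + 13  # agreed + disagreed classifications, per the Results

# Percentages as reported in the abstract.
REPORTED = {
    "surgeons": {"high": 25.8, "medium": 45.2, "low": 29.0},
    "gpt4": {"high": 48.3, "medium": 41.9, "low": 9.7},
}


def implied_counts(percentages, total):
    """Back-calculate whole-question counts from reported percentages."""
    return {level: round(pct * total / 100) for level, pct in percentages.items()}


for rater, percentages in REPORTED.items():
    counts = implied_counts(percentages, TOTAL_QUESTIONS)
    print(rater, counts, "sum =", sum(counts.values()))

# Agreement and disagreement rates recomputed from the stated counts.
print("agreement %:", round(100 * 18 / TOTAL_QUESTIONS, 1))
print("disagreement %:", round(100 * 13 / TOTAL_QUESTIONS, 1))
```

Under this assumption each set of ratings sums to 31 questions, and the recomputed agreement rates match the reported 58.1% and 41.9%. Note that 15 of 31 rounds to 48.4%, so the reported 48.3% for GPT-4's high ratings may reflect truncation rather than rounding.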

Conclusions: The responses generated by artificial intelligence exhibit limited correlation with an expert statement on the diagnosis and treatment of ASI.

Clinical Relevance: As the use of artificial intelligence becomes more prevalent, it is important to understand how closely the information it generates resembles content produced by human authors.

PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11240044
DOI: http://dx.doi.org/10.1016/j.asmr.2024.100923

Publication Analysis

Top Keywords: consensus statement; shoulder surgeons; surgeons gpt-4; limited correlation; correlation expert; expert consensus; anterior shoulder; shoulder instability; answers provided; gpt-4

Similar Publications

Third Proceedings of The North American Airway Collaborative (NoAAC): Consensus Statement on Trial Design for Airway Stenosis.

JAMA Otolaryngol Head Neck Surg

January 2025

Department of Otolaryngology-Head and Neck Surgery, Vanderbilt University Medical Center, Nashville, Tennessee.

Ruth J Davis, Lee M Akst, Clint T Allen, Richard J Battafarano, Hayley L Born

Importance: Airway stenosis is a rare but debilitating disorder that significantly degrades the quality of life in affected patients. Treatments are primarily surgical, and disease management lacks established medical therapies. The North American Airway Collaborative held its third symposium at The Johns Hopkins Hospital in Baltimore, Maryland, on April 15, 2024, focused on strategies to advance the care of these patients.

Global Delphi consensus on treatment goals for generalised pustular psoriasis.

Br J Dermatol

January 2025

Department of Dermatology, Yale University, New Haven, CT, USA.

Jonathan N Barker, Emmylou Casanova, Siew Eng Choon, Peter Foley, Hideki Fujita

Background: Generalised pustular psoriasis (GPP) is a chronic, systemic, neutrophilic inflammatory disease. A previous Delphi panel established areas of consensus on GPP, although patient perspectives were not included, and aspects of treatment goals remain unclear.

Objectives: To identify and achieve consensus on refined, specific treatment goals for GPP treatment via a Delphi panel with patient participation.

Position statement on longitudinal cracks and fractures of teeth.

Int Endod J

January 2025

Centre for Oral, Clinical & Translational Sciences, King's College London, London, UK.

Shanon Patel, Peng-Hui Teng, Wan-Chuen Liao, Matthew Craig Davis, Ales Fidler

This position statement is a consensus view of an expert committee convened by the European Society of Endodontology (ESE). The statement is based on current clinical and scientific evidence as well as the collective reflective practice of the committee. The aim is to provide clinicians with evidence-based, authoritative information on the aetiology, clinical presentation, and management of cracks and fractures that typically manifest along the long axis of the crown and/or root.

Update on the treatment navigation for functional cure of chronic hepatitis B: expert consensus 2.0.

Clin Mol Hepatol

January 2025

Department of Infectious Diseases, Tongji Hospital, Tongji Medical College and State Key Laboratory for Diagnosis and Treatment of Severe Zoonotic Infectious Disease, Huazhong University of Science and Technology, Wuhan, China.

Di Wu, Jia-Horng Kao, Teerha Piratvisuth, Xiaojing Wang, Patrick T F Kennedy

As new evidence emerges, treatment strategies toward the functional cure of chronic hepatitis B are evolving. In 2019, a panel of national hepatologists published a Consensus Statement on the functional cure of chronic hepatitis B. Currently, an international group of hepatologists has been assembled to evaluate research since the publication of the original consensus, and to collaboratively develop the updated statements.

Rome Foundation Working Team Report: Consensus Statement on the Design and Conduct of Behavioural Clinical Trials for Disorders of Gut-Brain Interaction.

Aliment Pharmacol Ther

January 2025

Division of Gastroenterology, Icahn School of Medicine at Mount Sinai, New York, New York, USA.

Helen Burton-Murray, Livia Guadagnoli, Kendra Kamp, Inês A Trindade, Lynda H Powell

Background: Brain-gut behaviour therapies (BGBT) have gained widespread acceptance as therapeutic modalities for the management of disorders of gut-brain interaction (DGBI). However, existing treatment evaluation methods in the medical field fail to capture the specific elements of scientific rigour unique to behavioural trial evaluation.

Aims: To offer the first consensus on the development and testing of BGBT in DGBI.
