Impact of artificial intelligence in managing musculoskeletal pathologies in physiatry: a qualitative observational study evaluating the potential use of ChatGPT versus Copilot for patient information and clinical advice on low back pain.

Christophe Ah-Yan, Ève Boissonnault, Mathieu Boudier-Revéret, Christopher Mares

J Yeungnam Med Sci

Department of Physical Medicine and Rehabilitation, Centre Hospitalier de l'Université de Montréal, Montreal, QC, Canada.

Published: November 2024

Background: The self-management of low back pain (LBP) through patient information interventions offers significant benefits in terms of cost, reduced work absenteeism, and overall healthcare utilization. Using a large language model (LLM), such as ChatGPT (OpenAI) or Copilot (Microsoft), could potentially enhance these outcomes further. Thus, it is important to evaluate how well ChatGPT and Copilot provide medical advice for LBP and to assess the impact of clinical context on the quality of their responses.

Methods: This was a qualitative comparative observational study. It was conducted within the Department of Physical Medicine and Rehabilitation, University of Montreal in Montreal, QC, Canada. ChatGPT and Copilot were used to answer 27 common questions related to LBP, with and without a specific clinical context. The responses were evaluated by physiatrists for validity, safety, and usefulness using a 4-point Likert scale (4, most favorable).

Results: Both ChatGPT and Copilot demonstrated good performance across all measures. Validity scores were 3.33 for ChatGPT and 3.18 for Copilot, safety scores were 3.19 for ChatGPT and 3.13 for Copilot, and usefulness scores were 3.60 for ChatGPT and 3.57 for Copilot. The inclusion of clinical context did not significantly change the results.

Conclusion: LLMs, such as ChatGPT and Copilot, can provide reliable medical advice on LBP, irrespective of the detailed clinical context, supporting their potential to aid in patient self-management.

DOI: http://dx.doi.org/10.12701/jyms.2024.01151

Similar Publications

Assessing online chat-based artificial intelligence models for weight loss recommendation appropriateness and bias in the presence of guideline incongruence.

Int J Obes (Lond)

January 2025

Department of Gastroenterology and Hepatology, University of Illinois College of Medicine, Peoria, IL, USA.

Eugene Annor, Joseph Atarere, Nneoma Ubah, Oladoyin Jolaoye, Bryce Kunkle

Background and Aim: Managing obesity requires a comprehensive approach that involves therapeutic lifestyle changes, medications, or metabolic surgery. Many patients seek health information from online sources and artificial intelligence models like ChatGPT, Google Gemini, and Microsoft Copilot before consulting health professionals. This study aims to evaluate the appropriateness of the responses of Google Gemini and Microsoft Copilot to questions on pharmacologic and surgical management of obesity and to assess whether their responses are biased toward either the ADA or AACE guidelines.

Evaluation of Chatbots in the Emergency Management of Avulsion Injuries.

Dent Traumatol

January 2025

Department of Paediatric Dentistry, Faculty of Dentistry, Mersin University, Mersin, Turkey.

Şeyma Mustuloğlu, Büşra Pınar Deniz

Background: This study assessed the accuracy and consistency of responses provided by six Artificial Intelligence (AI) applications, ChatGPT version 3.5 (OpenAI), ChatGPT version 4 (OpenAI), ChatGPT version 4.0 (OpenAI), Perplexity (Perplexity.

Diet Quality and Caloric Accuracy in AI-Generated Diet Plans: A Comparative Study Across Chatbots.

Nutrients

January 2025

Division of Nutrition, Food & Dietetics, School of Biosciences, University of Nottingham, Leics LE12 5RD, UK.

Hüsna Kaya Kaçar, Ömer Furkan Kaçar, Amanda Avery

Article Synopsis

The study evaluates the ability of three AI chatbots (Gemini, Microsoft Copilot, and ChatGPT 4.0) to create personalized weight-loss diet plans for different genders and caloric levels (1400-1800 kcal).
All chatbots produced satisfactory diet quality scores (DQI-I > 70), but struggled with achieving optimal balance in macronutrient distributions.
ChatGPT 4.0 excelled in caloric accuracy, while Gemini had notable inconsistencies, suggesting that while AI shows promise in personalized nutrition, it still needs improvement and should complement, not replace, human dietetic expertise.

A Comparison of Prostate Cancer Screening Information Quality on Standard and Advanced Versions of ChatGPT, Google Gemini, and Microsoft Copilot: A Cross-Sectional Study.

Am J Health Promot

January 2025

College of Social Work, University of South Carolina, Columbia, SC, USA.

Otis L Owens, Michael Leonard

Purpose: Artificially Intelligent (AI) chatbots have the potential to produce information to support shared prostate cancer (PrCA) decision-making. Therefore, our purpose was to evaluate and compare the accuracy, completeness, readability, and credibility of responses from standard and advanced versions of popular chatbots: ChatGPT-3.5, ChatGPT-4.
