Background: The self-management of low back pain (LBP) through patient information interventions offers significant benefits, including lower costs, reduced work absenteeism, and decreased overall healthcare utilization. Large language models (LLMs) such as ChatGPT (OpenAI) and Copilot (Microsoft) could potentially enhance these outcomes further. It is therefore important to evaluate ChatGPT and Copilot in providing medical advice for LBP and to assess the impact of clinical context on the quality of their responses.
Methods: This qualitative comparative observational study was conducted in the Department of Physical Medicine and Rehabilitation, University of Montreal, Montreal, QC, Canada. ChatGPT and Copilot were used to answer 27 common questions related to LBP, with and without a specific clinical context. Physiatrists rated the responses for validity, safety, and usefulness on a 4-point Likert scale (4 = most favorable).
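The abstract does not describe how the prompts were delivered to the models. As an illustration of the with/without-context design only, here is a minimal Python sketch assuming programmatic access through the OpenAI SDK; the model name, sample question, and clinical vignette are hypothetical, not details taken from the study.

```python
# Minimal sketch of the two-arm prompting scheme: each LBP question is sent
# once as-is and once prefixed with a clinical vignette. Assumes the OpenAI
# Python SDK; the model name and context text below are hypothetical.
from openai import OpenAI

client = OpenAI()  # reads OPENAI_API_KEY from the environment

CLINICAL_CONTEXT = (
    "Patient context: 45-year-old office worker, 6 weeks of non-specific "
    "low back pain, no red flags, no radicular symptoms."
)

def ask(question: str, with_context: bool) -> str:
    prompt = f"{CLINICAL_CONTEXT}\n\n{question}" if with_context else question
    response = client.chat.completions.create(
        model="gpt-4",  # hypothetical choice for illustration
        messages=[{"role": "user", "content": prompt}],
    )
    return response.choices[0].message.content

question = "Should I rest in bed or stay active with low back pain?"
plain_answer = ask(question, with_context=False)
contextual_answer = ask(question, with_context=True)
```

A Copilot arm would follow the same pattern with that service's own access method; the point of the sketch is only the paired with/without-context prompt construction.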
Results: Both ChatGPT and Copilot performed well on all measures. Validity scores were 3.33 for ChatGPT and 3.18 for Copilot; safety scores were 3.19 and 3.13, respectively; and usefulness scores were 3.60 and 3.57. Including a clinical context did not significantly change the results.
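The abstract does not state which statistical test was used for the context comparison. One plausible approach for paired ordinal ratings is a Wilcoxon signed-rank test, sketched below on fabricated placeholder scores, not data from the study.

```python
# Hedged illustration of the context comparison: paired Wilcoxon signed-rank
# test on per-question Likert ratings with and without clinical context.
# The rating lists are fabricated placeholders, NOT data from the study.
from statistics import mean
from scipy.stats import wilcoxon

ratings_without_context = [4, 3, 3, 4, 2, 4, 3, 3, 4, 3, 4, 2]
ratings_with_context    = [3, 3, 4, 4, 3, 4, 2, 4, 4, 2, 3, 3]

print(f"mean without context: {mean(ratings_without_context):.2f}")
print(f"mean with context:    {mean(ratings_with_context):.2f}")

stat, p = wilcoxon(ratings_without_context, ratings_with_context)
print(f"Wilcoxon statistic={stat}, p={p:.3f}")  # large p -> no significant difference
```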
Conclusion: LLMs such as ChatGPT and Copilot can provide reliable medical advice on LBP, irrespective of detailed clinical context, supporting their potential to aid patient self-management.
DOI: http://dx.doi.org/10.12701/jyms.2024.01151
Int J Obes (Lond)
January 2025
Department of Gastroenterology and Hepatology, University of Illinois College of Medicine, Peoria, IL, USA.
Background and Aim: Managing obesity requires a comprehensive approach involving therapeutic lifestyle changes, medications, or metabolic surgery. Many patients seek health information from online sources and artificial intelligence models such as ChatGPT, Google Gemini, and Microsoft Copilot before consulting health professionals. This study aims to evaluate the appropriateness of Google Gemini's and Microsoft Copilot's responses to questions on the pharmacologic and surgical management of obesity, and to assess their responses for bias toward either the ADA or AACE guidelines.
Dent Traumatol
January 2025
Department of Paediatric Dentistry, Faculty of Dentistry, Mersin University, Mersin, Turkey.
Background: This study assessed the accuracy and consistency of responses provided by six Artificial Intelligence (AI) applications: ChatGPT version 3.5 (OpenAI), ChatGPT version 4 (OpenAI), ChatGPT version 4.0 (OpenAI), Perplexity (Perplexity…
Nutrients
January 2025
Division of Nutrition, Food & Dietetics, School of Biosciences, University of Nottingham, Leics LE12 5RD, UK.
Am J Health Promot
January 2025
College of Social Work, University of South Carolina, Columbia, SC, USA.
Purpose: Artificially Intelligent (AI) chatbots have the potential to produce information to support shared prostate cancer (PrCA) decision-making. Therefore, our purpose was to evaluate and compare the accuracy, completeness, readability, and credibility of responses from standard and advanced versions of popular chatbots: ChatGPT-3.5, ChatGPT-4…