Background: The internet is a common source of health information for patients. Interactive online artificial intelligence (AI) may be a more reliable source of health-related information than traditional search engines. This study aimed to assess the quality and perceived utility of chat-based AI responses related to 3 common gastrointestinal (GI) surgical procedures.
Methods: A survey of 24 questions covering general perioperative information on cholecystectomy, pancreaticoduodenectomy (PD), and colectomy was created. Each question was posed to Chat Generative Pre-trained Transformer (ChatGPT) in June 2023, and the generated responses were recorded. The quality and perceived utility of each response were independently and subjectively graded by expert respondents in the corresponding surgical field. Grades were classified as "poor," "fair," "good," "very good," or "excellent."
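As a rough illustration of the data-collection step described above, the sketch below poses a batch of perioperative questions to a chat model and records the replies for later grading. This is a minimal sketch under stated assumptions, not the authors' workflow: the abstract does not specify how questions were posed to ChatGPT, and the example questions, model name, and output file below are illustrative placeholders rather than the study's actual 24 survey items.

```python
# Minimal sketch: pose perioperative questions to a chat model and record the replies.
# Assumes the OpenAI Python client (>=1.0) and an OPENAI_API_KEY in the environment.
# The questions, model name, and output file are illustrative placeholders, not the study's items.
import csv
from openai import OpenAI

client = OpenAI()

questions = [
    "What should I expect during recovery after a laparoscopic cholecystectomy?",  # placeholder
    "How long will I stay in the hospital after a pancreaticoduodenectomy?",       # placeholder
    "What are the common complications of a colectomy?",                           # placeholder
]

with open("chatgpt_responses.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["question", "response"])
    for q in questions:
        reply = client.chat.completions.create(
            model="gpt-3.5-turbo",  # placeholder model name
            messages=[{"role": "user", "content": q}],
        )
        writer.writerow([q, reply.choices[0].message.content])
```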
Results: Among the 45 respondents (general surgeon [n = 13], surgical oncologist [n = 18], colorectal surgeon [n = 13], and transplant surgeon [n = 1]), most practiced at an academic facility (95.6%). Respondents had been in practice for a mean of 12.3 years (general surgeon, 14.5 ± 7.2; surgical oncologist, 12.1 ± 8.2; colorectal surgeon, 10.2 ± 8.0) and performed a mean of 53 index operations annually (cholecystectomy, 47 ± 28; PD, 28 ± 27; colectomy, 81 ± 44). Most of the 1080 quality grades assigned were "fair" or "good" (n = 622, 57.6%). Similarly, most of the 1080 utility grades were "fair" (n = 279, 25.8%) or "good" (n = 344, 31.9%), whereas only 129 utility grades (11.9%) were "poor." Of note, ChatGPT responses related to cholecystectomy (45.3% ["very good"/"excellent"] vs 18.1% ["poor"/"fair"]) were deemed to be of better quality than AI responses about PD (18.9% ["very good"/"excellent"] vs 46.9% ["poor"/"fair"]) or colectomy (31.4% ["very good"/"excellent"] vs 38.3% ["poor"/"fair"]). Overall, only 20.0% of the experts deemed ChatGPT an accurate source of information, whereas 15.6% found it unreliable. Moreover, roughly 1 in 3 surgeons deemed ChatGPT responses unlikely to reduce patient-physician correspondence (31.1%) or not comparable to in-person surgeon responses (35.6%).
Conclusions: Although a potential resource for patient education, ChatGPT responses to common GI perioperative questions were deemed to be of only modest quality and utility to patients. In addition, the relative quality of AI responses varied markedly on the basis of procedure type.
DOI: http://dx.doi.org/10.1016/j.gassur.2023.11.019
Ann Rheum Dis
January 2025
Masters and Doctoral Programs in Physical Therapy, Universidade Cidade de Sao Paulo, Sao Paulo, Brazil; Discipline of Physiotherapy, Graduate School of Health, Faculty of Health, University of Technology, Sydney, New South Wales, Australia.
Objectives: The aim of this study was to assess the accuracy and readability of the answers generated by large language model (LLM)-chatbots to common patient questions about low back pain (LBP).
Methods: This cross-sectional study analysed responses to 30 LBP-related questions, covering self-management, risk factors and treatment. The questions were developed by experienced clinicians and researchers and were piloted with a group of consumer representatives with lived experience of LBP.
J Manag Care Spec Pharm
January 2025
Academy of Managed Care Pharmacy Foundation, Alexandria, VA.
Background: Over the past 5 years, managed care pharmacy has been shaped by a global pandemic, advancements in generative artificial intelligence (AI), Medicare drug price negotiation policies, and significant therapeutic developments. Collective intelligence methods can be used to anticipate future developments in practice to help organizations plan and develop new strategies around those changes.
Objective: To identify emerging trends in managed care pharmacy.
Urol Res Pract
January 2025
Clinic of Urology, Ankara Acibadem Hospital, Ankara, Türkiye.
Objective: Erectile dysfunction (ED) is a common cause of male sexual dysfunction. We aimed to evaluate the quality of responses from ChatGPT and Gemini to the most frequently asked questions about ED.
Methods: This study was conducted as a cross-sectional, observational study.
J Dent Sci
January 2025
Division of Physiology, Department of Health Promotion, Kyushu Dental University, Kitakyushu, Japan.
Background/purpose: OpenAI's GPT-4V and Google's Gemini Pro, Large Language Models (LLMs) equipped with image recognition capabilities, have the potential to be utilized in future medical diagnosis and treatment and to serve as valuable educational support tools for students. This study compared and evaluated the image recognition capabilities of GPT-4V and Gemini Pro using questions from the Japanese National Dental Examination (JNDE) to investigate their potential as educational support tools.
Materials And Methods: We analyzed 160 questions from the 116th JNDE, administered in March 2023, using ChatGPT-4V and Gemini Pro, which have image recognition functions.
Cureus
January 2025
Medical Oncology, Kartal Dr. Lütfi Kirdar City Hospital, Health Science University, Istanbul, TUR.
Integrating artificial intelligence (AI) into oncology can revolutionize decision-making by providing accurate information. This study evaluates the performance of ChatGPT-4o Oncology Expert (OpenAI, San Francisco, CA) in addressing open-ended clinical oncology questions. Thirty-seven treatment-related questions on solid organ tumors were selected from a hematology-oncology textbook.