AI Article Synopsis

  • The internet is a popular place for people to find health information, and this study examined how well an AI chatbot (ChatGPT) can answer questions about common digestive system (gastrointestinal) surgeries.
  • Researchers created a quiz with 24 questions about three types of surgeries and asked ChatGPT to answer them, then experts rated the quality of those answers.
  • Most of the AI responses were rated as "fair" or "good," but responses about one surgery, cholecystectomy, were judged to be better than the others, while answers for pancreatic surgery were not as good.

Article Abstract

Background: The internet is a common source of health information for patients. Interactive online artificial intelligence (AI) may be a more reliable source of health-related information than traditional search engines. This study aimed to assess the quality and perceived utility of chat-based AI responses related to 3 common gastrointestinal (GI) surgical procedures.

Methods: A survey of 24 questions covering general perioperative information on cholecystectomy, pancreaticoduodenectomy (PD), and colectomy was created. Each question was posed to Chat Generative Pre-trained Transformer (ChatGPT) in June 2023, and the generated responses were recorded. The quality and perceived utility of responses were independently and subjectively graded by expert respondents specific to each surgical field. Grades were classified as "poor," "fair," "good," "very good," or "excellent."

Results: Among the 45 respondents (general surgeon [n = 13], surgical oncologist [n = 18], colorectal surgeon [n = 13], and transplant surgeon [n = 1]), most practiced at an academic facility (95.6%). Respondents had been in practice for a mean of 12.3 years (general surgeon, 14.5 ± 7.2; surgical oncologist, 12.1 ± 8.2; colorectal surgeon, 10.2 ± 8.0) and performed a mean of 53 index operations annually (cholecystectomy, 47 ± 28; PD, 28 ± 27; colectomy, 81 ± 44). Overall, most quality grades assigned were "fair" or "good" (n = 622/1080, 57.6%). Most of the 1080 total utility grades were "fair" (n = 279, 25.8%) or "good" (n = 344, 31.9%), whereas only 129 utility grades (11.9%) were "poor." Of note, ChatGPT responses related to cholecystectomy (45.3% ["very good"/"excellent"] vs 18.1% ["poor"/"fair"]) were deemed to be of better quality than AI responses about PD (18.9% ["very good"/"excellent"] vs 46.9% ["poor"/"fair"]) or colectomy (31.4% ["very good"/"excellent"] vs 38.3% ["poor"/"fair"]). Overall, only 20.0% of the experts deemed ChatGPT to be an accurate source of information, whereas 15.6% found it unreliable. Moreover, approximately 1 in 3 surgeons deemed ChatGPT responses unlikely to reduce patient-physician correspondence (31.1%) or not comparable to in-person surgeon responses (35.6%).

Conclusions: Although a potential resource for patient education, ChatGPT responses to common GI perioperative questions were deemed to be of only modest quality and utility to patients. In addition, the relative quality of AI responses varied markedly on the basis of procedure type.
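For readers tallying the denominators above: the 1080 grades per rubric follow from each of the 24 survey questions being graded by all 45 expert respondents (24 × 45 = 1080). The short Python sketch below is purely illustrative, reusing only the utility-grade counts published in the Results; it is not the study's own analysis code.

```python
# Illustrative sketch only: reproduces the percentage arithmetic in the Results
# section from the published counts (not the study's analysis code).

questions = 24          # survey questions posed to ChatGPT
respondents = 45        # expert graders
total_grades = questions * respondents   # 24 x 45 = 1080 grades per rubric
assert total_grades == 1080

# Utility-grade counts reported in the abstract.
utility_counts = {"poor": 129, "fair": 279, "good": 344}

for grade, n in utility_counts.items():
    print(f"{grade}: {n}/{total_grades} = {n / total_grades:.1%}")
# poor: 129/1080 = 11.9%
# fair: 279/1080 = 25.8%
# good: 344/1080 = 31.9%
```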

Source
http://dx.doi.org/10.1016/j.gassur.2023.11.019

Publication Analysis

Top Keywords

chatgpt responses (12)
["very good"/"excellent"] (12)
responses (10)
online artificial (8)
artificial intelligence (8)
gastrointestinal surgical (8)
quality perceived (8)
perceived utility (8)
responses common (8)
"fair" "good" (8)

Similar Publications

Assessing the performance of AI chatbots in answering patients' common questions about low back pain.

Ann Rheum Dis

January 2025

Masters and Doctoral Programs in Physical Therapy, Universidade Cidade de Sao Paulo, Sao Paulo, Brazil; Discipline of Physiotherapy, Graduate School of Health, Faculty of Health, University of Technology, Sydney, New South Wales, Australia.

Objectives: The aim of this study was to assess the accuracy and readability of the answers generated by large language model (LLM)-chatbots to common patient questions about low back pain (LBP).

Methods: This cross-sectional study analysed responses to 30 LBP-related questions, covering self-management, risk factors and treatment. The questions were developed by experienced clinicians and researchers and were piloted with a group of consumer representatives with lived experience of LBP.

Emerging trends in managed care pharmacy: A mixed-method study.

J Manag Care Spec Pharm

January 2025

Academy of Managed Care Pharmacy Foundation, Alexandria, VA.

Background: Over the past 5 years, managed care pharmacy has been shaped by a global pandemic, advancements in generative artificial intelligence (AI), Medicare drug price negotiation policies, and significant therapeutic developments. Collective intelligence methods can be used to anticipate future developments in practice to help organizations plan and develop new strategies around those changes.

Objective: To identify emerging trends in managed care pharmacy.

Objective: Erectile dysfunction (ED) is a common cause of male sexual dysfunction. We aimed to evaluate the quality of ChatGPT and Gemini's responses to the most frequently asked questions about ED.

Methods: This study was conducted as a cross-sectional, observational study.

Background/purpose: OpenAI's GPT-4V and Google's Gemini Pro, being Large Language Models (LLMs) equipped with image recognition capabilities, have the potential to be utilized in future medical diagnosis and treatment, and to serve as valuable educational support tools for students. This study compared and evaluated the image recognition capabilities of GPT-4V and Gemini Pro using questions from the Japanese National Dental Examination (JNDE) to investigate their potential as educational support tools.

Materials And Methods: We analyzed 160 questions from the 116th JNDE, administered in March 2023, using ChatGPT-4V and Gemini Pro, both of which have image recognition functions.

Integrating artificial intelligence (AI) into oncology can revolutionize decision-making by providing accurate information. This study evaluates the performance of ChatGPT-4o (OpenAI, San Francisco, CA) Oncology Expert, in addressing open-ended clinical oncology questions. Thirty-seven treatment-related questions on solid organ tumors were selected from a hematology-oncology textbook.
