Utility and Comparative Performance of Current Artificial Intelligence Large Language Models as Postoperative Medical Support Chatbots in Aesthetic Surgery.

Background: Large language models (LLMs) have revolutionized the way plastic surgeons and their patients can access and leverage artificial intelligence (AI).

Objectives: The present study aims to compare the performance of 2 current publicly available and patient-accessible LLMs in the potential application of AI as postoperative medical support chatbots in an aesthetic surgeon's practice.

Methods: Twenty-two simulated postoperative patient presentations following aesthetic breast plastic surgery were devised and expert-validated. Complications varied in their latency within the postoperative period, as well as in the urgency of required medical attention. In response to each patient-reported presentation, OpenAI's ChatGPT and Google's Bard, in their unmodified and freely available versions, were objectively assessed for their comparative accuracy in generating an appropriate differential diagnosis, most-likely diagnosis, suggested medical disposition, treatments or interventions to begin from home, and/or red flag signs/symptoms indicating deterioration.

Results: ChatGPT cumulatively and significantly outperformed Bard across all objective assessment metrics examined (66% vs 55%, respectively; P < .05). Accuracy in generating an appropriate differential diagnosis was 61% for ChatGPT vs 57% for Bard (P = .45). ChatGPT asked an average of 9.2 history questions vs Bard's 6.8 (P < .001) and reported the most-likely diagnosis with an accuracy of 91% vs 68% for Bard (P < .01). Appropriate medical dispositions were suggested with accuracies of 50% by ChatGPT vs 41% by Bard (P = .40), appropriate home interventions/treatments with accuracies of 59% vs 55% (P = .94), and red flag signs/symptoms with accuracies of 79% vs 54% (P < .01), respectively. Detailed and comparative performance breakdowns according to complication latency and urgency are presented.

Conclusions: ChatGPT represents the superior LLM for the potential application of AI technology in postoperative medical support chatbots. Imperfect performance and limitations discussed may guide the necessary refinement to facilitate adoption.

Source
http://dx.doi.org/10.1093/asj/sjae025

Publication Analysis

Top Keywords

postoperative medical: 12
medical support: 12
support chatbots: 12
comparative performance: 8
performance current: 8
artificial intelligence: 8
large language: 8
language models: 8
chatbots aesthetic: 8
potential application: 8
