Since their release, the medical community has been actively exploring the capabilities of large language models (LLMs), which show promise in providing accurate medical knowledge. One potential application is as a patient resource. This study analyzes and compares the ability of three currently available LLMs, ChatGPT-3.5, GPT-4, and Gemini, to provide postoperative care recommendations to plastic surgery patients. We presented each model with 32 questions addressing common patient concerns after surgical cosmetic procedures and evaluated the medical accuracy, readability, understandability, and actionability of the models' responses. The three LLMs provided equally accurate information, with ChatGPT-3.5 scoring highest on the Likert scale (4.18 ± 0.93; p = 0.849), while Gemini provided significantly more readable (p = 0.001) and understandable responses (p = 0.014; p = 0.001). There was no difference in the actionability of the models' responses (p = 0.830). Although LLMs have shown potential as adjunctive tools in postoperative patient care, further refinement and research are imperative to enable their evolution into comprehensive standalone resources.
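The abstract does not name the readability instruments used; a common choice for patient-facing text is the Flesch Reading Ease and Flesch-Kincaid Grade Level formulas. The sketch below shows how such scores could be computed for an LLM response using the `textstat` Python package; the sample response text is invented for illustration, and the actual study methodology may differ.

```python
# Minimal sketch: scoring the readability of a hypothetical LLM response.
# Assumes standard Flesch metrics; the study's actual instruments are not
# specified in the abstract. Requires: pip install textstat
import textstat

# Hypothetical postoperative-care response to be scored.
response = (
    "After your procedure, keep the incision clean and dry for 48 hours. "
    "Avoid strenuous activity for two weeks, and contact your surgeon if "
    "you notice increasing redness, swelling, or discharge."
)

# Flesch Reading Ease: higher is easier (60-70 is roughly plain English).
ease = textstat.flesch_reading_ease(response)

# Flesch-Kincaid Grade Level: approximate U.S. school grade required.
grade = textstat.flesch_kincaid_grade(response)

print(f"Flesch Reading Ease: {ease:.1f}")
print(f"Flesch-Kincaid Grade Level: {grade:.1f}")
```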

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11171524
DOI: http://dx.doi.org/10.3390/healthcare12111083
