This study evaluated the capability of three AI chatbots (ChatGPT 4.0, Claude 3.0, and Gemini Pro), as well as Google, in responding to common post-kidney transplantation inquiries. We compiled a list of frequently asked post-kidney transplant questions using Google and Bing. Response quality was rated on a 5-point Likert scale, while understandability and actionability were measured with the Patient Education Materials Assessment Tool (PEMAT). Readability was assessed using the Flesch Reading Ease and Flesch-Kincaid Grade Level metrics, and statistical analysis was conducted in SPSS using non-parametric tests, specifically the Kruskal-Wallis test. We gathered 127 questions, which were addressed by the chatbots and Google. The responses were of high quality (median Likert score: 4 [4, 5]) and good understandability (median PEMAT understandability score: 72.7% [62.5%, 77.8%]), but poor actionability (median PEMAT actionability score: 20% [0%, 20%]). Readability was challenging (median Flesch Reading Ease score: 22.1 [8.7, 34.8]), with a Flesch-Kincaid Grade Level corresponding to undergraduate-level text (median score: 14.7 [12.3, 16.7]). Among the chatbots, Claude 3.0 provided the most reliable responses, though they required a higher reading level, while ChatGPT 4.0 offered the most comprehensible responses. Google did not outperform the chatbots on any of the scoring metrics.
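
The abstract names two standard readability formulas and a non-parametric group comparison. As a rough illustration only (the authors' actual analysis was run in SPSS, and their scoring pipeline is not described in code), the following Python sketch computes Flesch Reading Ease and Flesch-Kincaid Grade Level from raw word, sentence, and syllable counts, then compares per-source score distributions with a Kruskal-Wallis test via SciPy. The syllable heuristic and the per-source score lists are hypothetical stand-ins, not data from the study.

```python
# Illustrative sketch (not the authors' pipeline): Flesch readability
# metrics plus a Kruskal-Wallis comparison across response sources.
import re
from scipy.stats import kruskal

def count_syllables(word: str) -> int:
    # Naive vowel-group heuristic; production tools use pronunciation
    # dictionaries and are more accurate.
    return max(1, len(re.findall(r"[aeiouy]+", word.lower())))

def flesch_scores(text: str) -> tuple[float, float]:
    sentences = max(1, len(re.findall(r"[.!?]+", text)))
    words = re.findall(r"[A-Za-z']+", text)
    n_words = max(1, len(words))
    syllables = sum(count_syllables(w) for w in words)
    wps = n_words / sentences      # words per sentence
    spw = syllables / n_words      # syllables per word
    # Standard published formulas:
    reading_ease = 206.835 - 1.015 * wps - 84.6 * spw
    grade_level = 0.39 * wps + 11.8 * spw - 15.59
    return reading_ease, grade_level

# Hypothetical Flesch Reading Ease scores, one per response, per source:
chatgpt = [24.5, 18.2, 30.1]
claude  = [12.0,  9.8, 15.5]
gemini  = [28.3, 22.7, 35.0]
google  = [20.4, 25.1, 17.9]

# Kruskal-Wallis H test: non-parametric comparison of >2 groups.
h_stat, p_value = kruskal(chatgpt, claude, gemini, google)
print(f"H = {h_stat:.2f}, p = {p_value:.3f}")
```

For context on these scales, a Flesch Reading Ease near 22 falls in the "very difficult" band and a Flesch-Kincaid Grade Level near 14.7 corresponds to college-level reading, consistent with the abstract's characterization of the responses as challenging to read.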

Source
http://dx.doi.org/10.1016/j.transproceed.2024.12.028

