Introduction: Artificial intelligence-based chatbots offer a potential avenue for delivering personalized counseling to patients with autoimmune hepatitis. We assessed the accuracy, completeness, comprehensiveness, and safety of Chat Generative Pretrained Transformer-4 (ChatGPT-4) responses to 12 questions, selected from a pool of 40 posed by 4 patients with autoimmune hepatitis.
Methods: Questions were categorized into 3 areas: diagnosis (questions 1-3), quality of life (questions 4-8), and medical treatment (questions 9-12). Eleven key opinion leaders evaluated each response on Likert scales: 6 points for accuracy, 5 points for safety, and 3 points for completeness and comprehensiveness.
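For illustration only, a minimal sketch (in Python, with hypothetical randomly generated ratings, not the study's data) of how such Likert ratings could be organized and summarized as a median (range) per domain:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical ratings: 11 raters x 12 questions, one matrix per domain.
# Scales follow the abstract: accuracy 1-6, safety 1-5,
# completeness and comprehensiveness 1-3.
scales = {"accuracy": 6, "safety": 5, "completeness": 3, "comprehensiveness": 3}
ratings = {d: rng.integers(1, top + 1, size=(11, 12)) for d, top in scales.items()}

# Median (range) across all rater-question pairs for each domain.
for domain, r in ratings.items():
    print(f"{domain}: median {np.median(r):.0f} (range {r.min()}-{r.max()})")
```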
Results: Median scores for accuracy, completeness, and safety were 5 (4-6), 2 (2-2), and 3 (2-3), respectively; no domain stood out as better rated than the others. The question on postdiagnosis follow-up was the most challenging, with low accuracy and completeness despite safe and comprehensive responses. Agreement among key opinion leaders (Fleiss' kappa) was slight for accuracy (0.05) and poor for the remaining features (-0.05, -0.06, and -0.02, respectively).
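The interrater agreement reported above is Fleiss' kappa. Below is a minimal sketch (Python, with a hypothetical count matrix, not the study's data) of the standard computation from a subjects x categories table of rating counts:

```python
import numpy as np

def fleiss_kappa(counts: np.ndarray) -> float:
    """Fleiss' kappa for a (subjects x categories) matrix of rating counts.

    Each row must sum to the same number of raters.
    """
    counts = np.asarray(counts, dtype=float)
    n_subjects, _ = counts.shape
    n_raters = counts.sum(axis=1)[0]           # raters per subject (constant)

    # Per-subject agreement P_i and overall category proportions p_j.
    p_i = ((counts ** 2).sum(axis=1) - n_raters) / (n_raters * (n_raters - 1))
    p_j = counts.sum(axis=0) / (n_subjects * n_raters)

    p_bar = p_i.mean()                         # observed agreement
    p_e = (p_j ** 2).sum()                     # chance-expected agreement
    return (p_bar - p_e) / (1 - p_e)

# Hypothetical example: 12 questions rated by 11 raters on a 3-point scale.
rng = np.random.default_rng(1)
raw = rng.integers(1, 4, size=(12, 11))        # questions x raters
counts = np.stack([(raw == c).sum(axis=1) for c in (1, 2, 3)], axis=1)
print(round(fleiss_kappa(counts), 2))
```

Kappa values near zero or below zero, as reported here for completeness, comprehensiveness, and safety, indicate agreement no better than chance.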
Discussion: Chatbots show good comprehensibility but lack reliability. Further studies are needed before Chat Generative Pretrained Transformer can be integrated into clinical practice.
DOI: 10.14309/ajg.0000000000003179