Evaluating the accuracy and readability of ChatGPT in providing parental guidance for adenoidectomy, tonsillectomy, and ventilation tube insertion surgery.

Emre Polat, Yagmur Basak Polat, Erol Senturk, Remzi Dogan, Alper Yenigun, Selahattin Tugrul, Sabri Baki Eren, Fadlullah Aksoy, Orhan Ozturan

Int J Pediatr Otorhinolaryngol

Department of Otorhinolaryngology, Faculty of Medicine, Bezmialem Vakif University, Fatih, Istanbul, Turkey.

Published: June 2024

Objectives: This study examined the potential of ChatGPT as an accurate and readable source of information for parents seeking guidance on adenoidectomy, tonsillectomy, and ventilation tube insertion surgeries (ATVtis).

Methods: ChatGPT was asked to identify the 15 questions that parents most frequently ask on internet search engines for each of the three surgical procedures. After removing repeated questions from the initial set of 45, we asked ChatGPT to generate answers to the remaining 33 questions. Seven highly experienced otolaryngologists independently assessed the accuracy of each response on a four-level grading scale ranging from completely incorrect to comprehensive. Readability was measured with the Flesch Reading Ease (FRE) and Flesch-Kincaid Grade Level (FKGL) scores. The questions were categorized into four groups: Diagnosis and Preparation Process, Surgical Information, Risks and Complications, and Postoperative Process. The groups were then compared on accuracy grade, FRE, and FKGL scores.
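The two readability scores named above follow standard published formulas based on words per sentence and syllables per word. The abstract does not say which tool the authors used to compute them, so the following is only a minimal sketch of the formulas themselves; the function names and the sample counts are illustrative assumptions, not taken from the study.

```python
def flesch_reading_ease(words: int, sentences: int, syllables: int) -> float:
    """Flesch Reading Ease: higher scores mean easier text."""
    return 206.835 - 1.015 * (words / sentences) - 84.6 * (syllables / words)


def flesch_kincaid_grade_level(words: int, sentences: int, syllables: int) -> float:
    """Flesch-Kincaid Grade Level: approximate US school grade needed to read the text."""
    return 0.39 * (words / sentences) + 11.8 * (syllables / words) - 15.59


# Hypothetical passage (not study data): 100 words, 5 sentences, 150 syllables.
fre = flesch_reading_ease(words=100, sentences=5, syllables=150)
fkgl = flesch_kincaid_grade_level(words=100, sentences=5, syllables=150)
print(f"FRE  = {fre:.1f}")
print(f"FKGL = {fkgl:.1f}")
```

Counting syllables automatically requires a heuristic or a pronunciation dictionary, which is why this sketch takes the counts as inputs rather than raw text.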

Results: Seven evaluators each assessed 33 AI-generated responses, providing a total of 231 evaluations. Of these evaluations, 167 (72.3%) were classified as 'comprehensive,' 62 (26.8%) as 'correct but inadequate,' and two (0.9%) as 'some correct, some incorrect.' No assessor rated any response 'completely incorrect.' The mean FRE and FKGL scores were 57.15 (±10.73) and 9.95 (±1.91), respectively. Of the 33 ChatGPT responses, 3 (9.1%) were at or below the sixth-grade reading level recommended by the American Medical Association (AMA). No significant differences were found between the question groups in readability or accuracy scores (p > 0.05).
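The reported counts and percentages can be cross-checked with simple arithmetic. The sketch below uses only the figures stated in the Results; the variable names are illustrative.

```python
evaluators = 7
responses = 33
total_evaluations = evaluators * responses  # 7 x 33 = 231

# Evaluation counts per accuracy grade, as reported in the Results.
grade_counts = {
    "comprehensive": 167,
    "correct but inadequate": 62,
    "some correct, some incorrect": 2,
    "completely incorrect": 0,
}

# The grade counts should account for every evaluation.
assert sum(grade_counts.values()) == total_evaluations

# Share of all evaluations per grade, rounded to one decimal place.
grade_percent = {
    grade: round(100 * count / total_evaluations, 1)
    for grade, count in grade_counts.items()
}

# Share of the 33 responses at or below the sixth-grade reading level.
sixth_grade_percent = round(100 * 3 / responses, 1)

for grade, percent in grade_percent.items():
    print(f"{grade}: {percent}%")
print(f"at or below sixth grade: {sixth_grade_percent}%")
```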

Conclusions: ChatGPT can provide accurate answers to questions on various topics related to ATVtis. However, ChatGPT's answers may be too complex for some readers, as they are generally written at a high school level. This is above the sixth-grade reading level recommended for patient information by the AMA. According to our study, more than three-quarters of the AI-generated responses were at or above the 10th-grade reading level, raising concerns about the ChatGPT text's readability.

DOI: http://dx.doi.org/10.1016/j.ijporl.2024.111998
