This is the first study to evaluate the adequacy and reliability of the ChatGPT and Gemini chatbots on viral hepatitis. A total of 176 questions were composed from three different categories. The first group includes "questions and answers (Q&As) for the public" determined by the Centers for Disease Control and Prevention (CDC). The second group includes strong recommendations of international guidelines. The third group includes frequently asked questions on social media platforms. The answers of the chatbots were evaluated by two different infectious diseases specialists on a scoring scale from 1 to 4. Cohen's kappa coefficient was calculated to assess inter-rater reliability. The reproducibility and correlation of answers generated by ChatGPT and Gemini were analyzed. ChatGPT and Gemini's mean scores (3.55 ± 0.83 vs. 3.57 ± 0.89, p = 0.260) and completely correct response rates (71.0% vs. 78.4%, p = 0.111) were similar. In addition, in subgroup analyses with the CDC questions Sect. (90.1% vs. 91.9%, p = 0.752), the guideline questions Sect. (49.4% vs. 61.4%, p = 0.140), and the social media platform questions Sect. (82.5% vs. 90%, p = 0.335), the completely correct answers rates were similar. There was a moderate positive correlation between ChatGPT and Gemini chatbots' answers (r = 0.633, p < 0.001). Reproducibility rates of answers to questions were 91.3% in ChatGPT and 92% in Gemini (p = 0.710). According to Cohen's kappa test, there was a substantial inter-rater agreement for both ChatGPT (κ = 0.720) and Gemini (κ = 0.704). ChatGPT and Gemini successfully answered CDC questions and social media platform questions, but the correct answer rates were insufficient for guideline questions.

Download full-text PDF

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11724965PMC
http://dx.doi.org/10.1038/s41598-024-83575-1DOI Listing

Publication Analysis

Top Keywords

chatgpt gemini
16
group includes
12
viral hepatitis
8
social media
8
completely correct
8
questions
6
chatgpt
5
answers
5
comparison performances
4
performances chatgpt
4

Similar Publications

Want AI Summaries of new PubMed Abstracts delivered to your In-box?

Enter search terms and have AI summaries delivered each week - change queries or unsubscribe any time!