Performance of 4 Artificial Intelligence Chatbots in Answering Endodontic Questions.

J Endod

Department of Maxillofacial Surgery and Diagnostic Sciences, College of Dentistry, Jazan University, Jazan, Saudi Arabia.

Published: January 2025

Introduction: Artificial intelligence models have shown potential as educational tools in healthcare, for example in answering exam questions. This study aimed to assess the performance of 4 prominent chatbots (ChatGPT-4o, MedGebra GPT-4o, Meta Llama 3, and Gemini Advanced) in answering multiple-choice questions (MCQs) in endodontics.

Methods: The study used 100 MCQs, each with 4 answer options, taken from 2 well-known endodontic textbooks. Each chatbot's ability to choose the correct answer was assessed twice, 1 week apart.

Results: The stability of performance across the 2 rounds was highest for ChatGPT-4o, followed by Gemini Advanced and Meta Llama 3. MedGebra GPT-4o provided the highest percentage of correct answers in the first round (93%), followed by ChatGPT-4o in the second round (90%). Meta Llama 3 provided the lowest percentages in the first (73%) and second (75%) rounds. Although MedGebra GPT-4o performed best in the first round, it was less stable in the second round (McNemar P > .05; Kappa = 0.725, P < .001).
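
The two stability statistics reported above can be illustrated with a minimal sketch. It assumes each chatbot's answers are summarized as a 2x2 table of correct/incorrect in round 1 versus round 2. The function names and the counts are hypothetical and are not the study's data, and the exact binomial form of McNemar's test is an assumption, since the abstract does not state which variant was used.

```python
from math import comb


def cohen_kappa(a, b, c, d):
    """Cohen's kappa for a 2x2 agreement table.

    a: correct in both rounds, b: correct in round 1 only,
    c: correct in round 2 only, d: incorrect in both rounds.
    """
    n = a + b + c + d
    observed = (a + d) / n
    # Chance agreement computed from the marginal totals of each round.
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / (n * n)
    return (observed - expected) / (1 - expected)


def mcnemar_exact_p(b, c):
    """Two-sided exact (binomial) McNemar P value from the discordant counts."""
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Probability of k or fewer discordant pairs on one side under p = 0.5.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts for 100 MCQs (NOT taken from the study).
a, b, c, d = 85, 5, 3, 7
print(f"round 1 accuracy: {(a + b) / 100:.0%}")
print(f"round 2 accuracy: {(a + c) / 100:.0%}")
print(f"kappa: {cohen_kappa(a, b, c, d):.3f}")
print(f"McNemar exact P: {mcnemar_exact_p(b, c):.3f}")
```

In this framing, a McNemar P above .05 indicates that the share of correct answers did not shift significantly between rounds, while kappa measures how consistently the same questions were answered correctly or incorrectly in both rounds.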

Conclusions: ChatGPT-4o and MedGebra GPT-4o answered a high fraction of endodontic MCQs correctly, while Meta Llama 3 and Gemini Advanced showed lower performance. Further training and development are required to improve their accuracy and reliability in endodontics.

Source
http://dx.doi.org/10.1016/j.joen.2025.01.002
