AI Article Synopsis

  • Four large language models (LLMs), ChatGPT-3.5, GPT-4, Bard, and Bing Chat, were evaluated as educational tools in the dental field using text-only questions from the Japanese national dental hygienist examination.
  • GPT-4 had the highest accuracy at 75.3%, although the differences between models were not statistically significant. Performance varied across question categories, and all models scored 100% in the 'Disease mechanism and promotion of recovery process' category.
  • Although GPT-4 was the most accurate of the models tested, LLM capabilities continue to change and improve, which suggests growing potential for their use in education.

Article Abstract

Background/purpose: Large language models (LLMs) such as OpenAI's ChatGPT, Google's Bard, and Microsoft's Bing Chat have shown potential as educational tools in the medical and dental fields. This study evaluated their effectiveness using questions from the Japanese national dental hygienist examination, focusing on textual information only.

Materials and methods: We posed 73 questions from the 32nd Japanese national dental hygienist examination, conducted in March 2023, to four LLMs: ChatGPT-3.5, GPT-4, Bard, and Bing Chat. Each question was categorized into one of nine domains. Standardized prompts were used for all LLMs, and Fisher's exact test was applied for statistical analysis.

Results: GPT-4 achieved the highest accuracy (75.3%), followed by Bing Chat (68.5%), Bard (66.7%), and ChatGPT-3.5 (63.0%). There were no statistically significant differences between the LLMs. Performance varied across question categories, with all models scoring 100% in the 'Disease mechanism and promotion of recovery process' category. GPT-4 generally outperformed the other models, especially on multi-answer questions.
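
To illustrate the kind of comparison reported here, the following is a minimal sketch of a two-sided Fisher's exact test on the highest- and lowest-scoring models. It is not the authors' analysis code. The correct-answer counts (55 and 46 out of 73) are assumptions inferred from the reported percentages of 75.3% and 63.0%, since the abstract does not state raw counts, and the function name `fisher_exact_two_sided` is hypothetical.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2 = a + b, c + d
    col1 = a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x in the top-left cell, margins fixed.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    p_observed = prob(a)
    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    # Sum the probabilities of every table no more likely than the observed one.
    return sum(
        p
        for p in (prob(x) for x in range(lowest, highest + 1))
        if p <= p_observed * (1 + 1e-7)
    )


# Assumed counts, inferred from the reported accuracies (not given in the abstract).
N_QUESTIONS = 73
gpt4_correct = round(0.753 * N_QUESTIONS)   # 55
gpt35_correct = round(0.630 * N_QUESTIONS)  # 46

p_value = fisher_exact_two_sided(
    gpt4_correct, N_QUESTIONS - gpt4_correct,
    gpt35_correct, N_QUESTIONS - gpt35_correct,
)
print(f"GPT-4 vs ChatGPT-3.5: p = {p_value:.3f}")
```

Under these assumed counts the p-value comes out above 0.05, which is consistent with the abstract's statement that the differences were not statistically significant. The abstract does not say which pairs of models were compared or whether any correction for multiple comparisons was applied.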

Conclusion: GPT-4 demonstrated the highest overall accuracy among the LLMs tested, indicating its superior potential as an educational support tool in dental hygiene studies. The study highlights the varied performance of different LLMs across various question categories. While GPT-4 is currently the most effective, the capabilities of LLMs in educational settings are subject to continual change and improvement.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11437298
DOI: http://dx.doi.org/10.1016/j.jds.2024.02.019

Publication Analysis

Top Keywords

japanese national: 12
national dental: 12
dental hygienist: 12
hygienist examination: 12
bing chat: 12
large language: 8
language models: 8
bard bing: 8
potential educational: 8
highest accuracy: 8
