Background/purpose: OpenAI's GPT-4V and Google's Gemini Pro, large language models (LLMs) equipped with image recognition capabilities, have the potential to be used in future medical diagnosis and treatment and to serve as valuable educational support tools for students. This study compared the image recognition capabilities of GPT-4V and Gemini Pro using questions from the Japanese National Dental Examination (JNDE) to investigate their potential as educational support tools.
Materials and methods: We analyzed 160 questions from the 116th JNDE, administered in March 2023, using GPT-4V and Gemini Pro, both of which have image recognition functions. Standardized prompts were used for both LLMs, and statistical analysis was conducted using Fisher's exact test and the Mann-Whitney U test.
Results: For the 160 JNDE questions, the accuracy rates of GPT-4V and Gemini Pro were 35.0% and 28.1%, respectively. GPT-4V scored higher, although the difference was not statistically significant. Across dental specialties, the accuracy rates of GPT-4V were generally higher than those of Gemini Pro, with some areas showing equal accuracy. Accuracy rates tended to decrease as the number of images within a question increased, suggesting that the number of images influenced the correctness of the responses.
Conclusion: The overall superior performance of GPT-4V compared with Gemini Pro may be attributable to the continuous updates to OpenAI's model. This research demonstrates the potential of LLMs as educational support tools in dentistry, while also highlighting areas that require further technological development.
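The comparison of overall accuracy rates described in the Results can be illustrated with a short sketch of a two-sided Fisher's exact test. The counts used here are an assumption: they are back-calculated from the reported percentages (35.0% of 160 is 56 correct; 28.1% of 160 is approximately 45 correct) and are not stated in the abstract. The function `fisher_exact_two_sided` is a hypothetical pure-Python helper written for illustration; the abstract does not say which software the authors used.

```python
from math import comb


def fisher_exact_two_sided(a: int, b: int, c: int, d: int) -> float:
    """Two-sided Fisher's exact test for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of every table with the same
    margins that is no more likely than the observed table.
    """
    row1, row2 = a + b, c + d
    col1 = a + c
    total = row1 + row2
    denom = comb(total, col1)

    def prob(x: int) -> float:
        # Probability that the top-left cell equals x, given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p = sum(
        prob(x)
        for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p)


# ASSUMED counts, back-calculated from the reported accuracy rates.
N_QUESTIONS = 160
gpt4v_correct = 56   # 35.0% of 160
gemini_correct = 45  # about 28.1% of 160

p_value = fisher_exact_two_sided(
    gpt4v_correct, N_QUESTIONS - gpt4v_correct,
    gemini_correct, N_QUESTIONS - gemini_correct,
)
print(f"Two-sided Fisher's exact p-value: {p_value:.3f}")
```

Under these assumed counts, the p-value is above the conventional 0.05 threshold, which is consistent with the abstract's statement that the difference was not statistically significant.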
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11762652 | PMC |
http://dx.doi.org/10.1016/j.jds.2024.06.015 | DOI Listing |