Objectives: Bing Chat is a large language model artificial intelligence (AI) with online search and text generation capabilities. This study assessed its performance within the scope of dentistry in: (i) tackling exam questions for dental students, (ii) providing guidelines for dental practitioners, and (iii) answering patients' frequently asked questions. We discuss its potential for clinical tutoring and patient communication, and its impact on academia.
Methods: To assess the AI's performance on dental examinations, Bing Chat was presented with 532 multiple-choice questions and scored on its answers. To evaluate its guidelines for clinicians, a further set of 15 questions, each with 2 follow-up questions on clinical protocols, was presented to the AI; the answers were assessed by 4 reviewers using an electronic visual analog scale. To evaluate its answers to patients' frequently asked questions, another list of 15 common questions was presented in the same session, and the respective outputs were assessed.
Results: Bing Chat correctly answered 383 of the 532 multiple-choice questions in the dental exam section, a score of 71.99%. In outlining clinical protocols for practitioners, it achieved an overall assessment score of 81.05%. In answering patients' frequently asked questions, it achieved an overall mean score of 83.8%. The assessments demonstrated low inter-rater reliability.
Conclusions: Bing Chat's overall performance was above commonly adopted passing scores, particularly in answering patients' frequently asked questions. The generated content may, however, draw on biased sources. These results underline the importance of raising clinicians' awareness of AI's benefits and risks, of timely adaptation of dental education curricula, and of safeguarding AI's use in dentistry and healthcare in general.
Clinical Significance: Bing Chat performed above the passing threshold in all three categories, demonstrating potential for educational assistance, clinical tutoring, and answering patients' questions. We recommend familiarizing students and clinicians with its benefits and risks, while maintaining awareness of possible false information.
DOI: http://dx.doi.org/10.1016/j.jdent.2024.104927
J Fr Ophtalmol
December 2024
Department of Ophthalmology, Gaziantep Islamic Science and Technology University School of Medicine, Gaziantep, Turkey.
Purpose: To evaluate the appropriateness, understandability, actionability, and readability of responses provided by ChatGPT-3.5, Bard, and Bing Chat to frequently asked questions about keratorefractive surgery (KRS).
Method: Thirty-eight frequently asked questions about KRS were directed three times to a fresh ChatGPT-3.
Oman J Ophthalmol
October 2024
Department of Ophthalmology, Ankara Etlik City Hospital, Ankara, Turkey.
Background: This study aims to evaluate the knowledge levels of Chat Generative Pre-trained Transformer (ChatGPT), Bing, and Bard, three artificial intelligence chatbots offered free of charge by different manufacturers, regarding neuro-ophthalmological diseases; to examine their usability; and to determine whether any of them is superior to the others.
Materials And Methods: Forty questions related to neuro-ophthalmological diseases were obtained from the study questions section of the American Academy of Ophthalmology 2022-2023 Basic and Clinical Science Course Neuro-Ophthalmology book. The questions were posed to the ChatGPT, Bing, and Bard artificial intelligence chatbots.
Cornea
December 2024
Department of Ophthalmology and Vision Sciences, University of Toronto, Toronto, Ontario, Canada.
BMC Med Inform Decis Mak
November 2024
Diabetes Research Institute, Mills-Peninsula Medical Center, 100 South San Mateo Drive, Room 1165, San Mateo, CA, 94401, USA.
Background: Large language models (LLMs) released since November 30, 2022, most notably ChatGPT, have drawn increasing attention to their use in medicine, particularly for supporting clinical decision-making. However, there is little consensus in the medical community on how LLM performance in clinical contexts should be evaluated.
Methods: We performed a literature review of PubMed to identify publications between December 1, 2022, and April 1, 2024, that discussed assessments of LLM-generated diagnoses or treatment plans.
Perspect Clin Res
April 2024
Department of Business Data Processing and Management, Satyawati College (Eve.), University of Delhi, New Delhi, India.
Background: In contemporary research, selecting the appropriate statistical test is a critical and often challenging step. The emergence of large language models (LLMs) has offered a promising avenue for automating this process, potentially enhancing the efficiency and accuracy of statistical test selection.
Aim: This study aimed to assess the capability of freely available LLMs, including OpenAI's ChatGPT3.