Purpose: Chat Generative Pre-Trained Transformer (ChatGPT) may have implications as a novel educational resource. There are differences in opinion on the best resource for the Orthopaedic In-Training Exam (OITE) as information changes from year to year. This study assesses ChatGPT's performance on the OITE for use as a potential study resource for residents.

Methods: Questions for the OITE data set were sourced from the American Academy of Orthopaedic Surgeons (AAOS) website. All questions from the 2022 OITE, including those with images, were included in the analysis. The questions were formatted in the same manner as presented on the AAOS website, with the question stem, narrative text and answer choices separated by a line. Each question was evaluated in a new chat session to minimize confounding variables. Responses from ChatGPT were characterized by whether they contained logical reasoning, internal information or external information. Incorrect responses were further categorized into logical, informational or explicit fallacies.

Results: ChatGPT yielded an overall success rate of 48.3% on the 2022 AAOS OITE. ChatGPT demonstrated the ability to apply logic and stepwise thinking in 67.6% of the questions. ChatGPT effectively utilized internal information from the question stem in 68.1% of the questions. ChatGPT also demonstrated the ability to incorporate external information in 68.1% of the questions. The utilization of logical reasoning (p < 0.001), internal information (p = 0.004) and external information (p = 0.009) was greater among correct responses than incorrect responses. Informational fallacy was the most common shortcoming of ChatGPT's responses. There was no difference in correct responses based on whether or not an image was present (p = 0.320).

Conclusions: ChatGPT demonstrates logical, informational and explicit fallacies which, at this time, may lead to misinformation and hinder resident education.

Level of Evidence: Level V.


Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11693985 (PMC)
http://dx.doi.org/10.1002/jeo2.70135 (DOI)


