The Comparative Performance of Large Language Models on the Hand Surgery Self-Assessment Examination.

Hand (N Y)

Albert Einstein Healthcare Network, Philadelphia, PA, USA.

Published: September 2024

Background: Generative artificial intelligence (AI) models have emerged as capable of producing human-like responses and have showcased their potential in general medical specialties. This study explores the performance of AI systems on the American Society for Surgery of the Hand (ASSH) Self-Assessment Exams (SAE).

Methods: ChatGPT 4.0 and Bing AI were evaluated on a set of multiple-choice questions drawn from the ASSH SAE online question bank spanning 5 years (2019-2023). Each system was evaluated with 999 questions. Images and video links were inserted into question prompts to allow for complete AI interpretation. The performance of both systems was standardized using the May 2023 version of ChatGPT 4.0 and Microsoft Bing AI, both of which had web browsing and image capabilities.

Results: ChatGPT 4.0 scored an average of 66.5% on the ASSH questions. Bing AI scored higher, with an average of 75.3%, outperforming ChatGPT 4.0 by an average of 8.8 percentage points. As a benchmark, a minimum passing score of 50% was required for continuing medical education credit. On analysis of variance testing, both ChatGPT 4.0 and Bing AI performed worse on video-type and image-type questions. Responses from both models contained elements from sources such as PubMed, the Journal of Hand Surgery, and the American Academy of Orthopaedic Surgeons.
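
The reported averages can be checked with a few lines of arithmetic. The sketch below uses only the three figures given in the Results (66.5%, 75.3%, and the 50% passing threshold). The function names are illustrative, and the relative gain is a derived figure that the abstract itself does not report.

```python
# Figures taken from the Results paragraph (percent of questions correct).
chatgpt_pct = 66.5
bing_pct = 75.3
passing_pct = 50.0  # minimum passing score for CME credit


def point_gap(a: float, b: float) -> float:
    """Absolute difference a - b in percentage points, rounded to one decimal."""
    return round(a - b, 1)


def relative_gain(a: float, b: float) -> float:
    """Difference a - b expressed as a percentage of b, rounded to one decimal."""
    return round((a - b) / b * 100, 1)


gap = point_gap(bing_pct, chatgpt_pct)            # Bing AI vs. ChatGPT 4.0
rel = relative_gain(bing_pct, chatgpt_pct)        # derived, not reported in the abstract
chatgpt_margin = point_gap(chatgpt_pct, passing_pct)
bing_margin = point_gap(bing_pct, passing_pct)

print(f"Gap: {gap} percentage points ({rel}% relative to ChatGPT 4.0)")
print(f"Margin over passing: ChatGPT 4.0 {chatgpt_margin}, Bing AI {bing_margin}")
```

The absolute gap matches the 8.8 reported in the abstract. Both systems sit well above the 50% threshold, which is consistent with the conclusion that either could be expected to pass.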

Conclusions: ChatGPT 4.0 with browsing and Bing AI can both be anticipated to achieve passing scores on the ASSH SAE. Generative AI, with its ability to provide logical responses and literature citations, presents a convincing argument for use as an interactive learning aid and educational tool.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11559719
DOI: http://dx.doi.org/10.1177/15589447241279460

Publication Analysis

Top Keywords

hand surgery: 8
performance systems: 8
chatgpt bing: 8
assh sae: 8
chatgpt: 6
bing: 6
comparative performance: 4
performance large: 4
large language: 4
language models: 4

Similar Publications

Pelvic packing - status 2024.

Arch Orthop Trauma Surg

January 2025

Division of Orthopaedic Surgery, Oslo University Hospital, Oslo, Norway.

Patients with unstable hemodynamics and unstable pelvic ring injuries remain challenging with regard to initial treatment and survival. Several concepts have been reported over the last 30 years. Mechanical stabilization of the pelvis together with hemorrhage control offers the best treatment option in these patients.

Iliosacral screw osteosynthesis - state of the art.

Arch Orthop Trauma Surg

January 2025

Department of Orthopedics and Traumatology, University Medical Center Mainz, Mainz, Germany.

Iliosacral screw osteosynthesis is a widely recognized technique for stabilizing unstable posterior pelvic ring injuries, offering notable advantages, including enhanced mechanical stability, minimal invasiveness, reduced blood loss, and lower infection rates. However, the procedure presents technical challenges due to the complex anatomy of the sacrum and the proximity of critical neurovascular structures. While conventional fluoroscopy remains the primary method for intraoperative guidance, precise preoperative planning using multiplanar reconstructions and three-dimensional volume rendering is crucial for ensuring accurate placement of iliosacral or transsacral screws.

Background: Although both the lateral sagittal and costoclavicular approaches are applied at the cord level in the infraclavicular region, there is a major difference between the distributions of the two approaches. We aimed to investigate the effects of this different distribution on tissue perfusion and oxygenation.

Methods: Sixty patients undergoing elective elbow, forearm, wrist and hand surgery under infraclavicular brachial plexus block were included in the study.

The opioid epidemic has been a defining crisis in American health care. Many attempts to address the epidemic have focused on issues around opioid prescribing. Legislation at the state and federal levels has been passed; however, the results from these policies have been mixed.

Anatomical Characterization of the Motor Branch to the Fourth Lumbrical: A Cadaver Study.

J Hand Surg Am

January 2025

Division of Plastic and Reconstructive Surgery, Department of Surgery, University of Florida, Gainesville, FL.

Purpose: The branching pattern of the deep motor branch of the ulnar nerve (DBUN) in the hand is complex. The anatomy of the motor branch innervating the fourth lumbrical (4L), where paralysis results in a claw hand deformity after ulnar nerve injury, is not well defined. This cadaver study focused on mapping and defining anatomical landmarks in relation to the motor branch to the 4L.
