Evaluating Performance of ChatGPT on MKSAP Cardiology Board Review Questions.

Int J Cardiol

Florida State University College of Medicine Internal Medicine Residency Program at Lee Health, Cape Coral, Florida, USA; Lee Health Heart Institute, Fort Myers, Florida, USA; Florida Heart Associates, Fort Myers, Florida, USA.

Published: December 2024

Chat Generative Pretrained Transformer (ChatGPT) is a natural language processing tool created by OpenAI. Much of the discussion regarding artificial intelligence (AI) in medicine concerns the ability of such language tools to enhance medical practice, improve efficiency, and decrease errors. The objective of this study was to analyze the ability of ChatGPT to answer board-style cardiovascular medicine questions from the Medical Knowledge Self-Assessment Program (MKSAP). The study evaluated the performance of ChatGPT (versions 3.5 and 4), alongside internal medicine residents and internal medicine and cardiology attendings, in answering 98 multiple-choice questions (MCQs) from the Cardiovascular Medicine chapter of MKSAP. ChatGPT-4 demonstrated an accuracy of 74.5%, comparable to that of the internal medicine (IM) intern (63.3%), senior resident (63.3%), internal medicine attending physician (62.2%), and ChatGPT-3.5 (64.3%), but significantly lower than that of the cardiology attending physician (85.7%). Subcategory analysis revealed no statistical difference between ChatGPT and physicians, except in valvular heart disease, where the cardiology attending outperformed ChatGPT-3.5 (p = 0.031), and in heart failure, where ChatGPT-4 outperformed the senior resident (p = 0.046). While ChatGPT shows promise in certain subcategories, establishing AI as a reliable educational tool for medical professionals will likely require ChatGPT to surpass the accuracy of instructors, ideally achieving a near-perfect score on posed questions.
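The abstract reports accuracy only as percentages of the 98 questions. As a rough aid to reading those figures, the sketch below converts each reported percentage back to an implied number of correct answers. These counts are inferred by rounding and are not stated in the abstract; the names `reported_accuracy` and `implied_correct` are illustrative only.

```python
# Number of MKSAP cardiovascular questions, as stated in the abstract.
TOTAL_QUESTIONS = 98

# Accuracy percentages exactly as reported in the abstract.
reported_accuracy = {
    "ChatGPT-4": 74.5,
    "ChatGPT-3.5": 64.3,
    "IM intern": 63.3,
    "Senior resident": 63.3,
    "IM attending": 62.2,
    "Cardiology attending": 85.7,
}


def implied_correct(percent, total=TOTAL_QUESTIONS):
    """Return the whole number of correct answers closest to percent of total.

    This is an inference from the rounded percentage, not a reported value.
    """
    return round(percent / 100 * total)


for name, pct in reported_accuracy.items():
    n = implied_correct(pct)
    # Recompute the percentage from the inferred count as a consistency check.
    print(f"{name}: about {n}/{TOTAL_QUESTIONS} correct ({100 * n / TOTAL_QUESTIONS:.1f}%)")
```

Each inferred count reproduces the reported percentage when rounded to one decimal place, which suggests every test-taker answered all 98 questions, although the abstract does not say so explicitly.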

Source
http://dx.doi.org/10.1016/j.ijcard.2024.132576
