Artificial intelligence in healthcare education: evaluating the accuracy of ChatGPT, Copilot, and Google Gemini in cardiovascular pharmacology.

Front Med (Lausanne)

Division of Pulmonary, Critical Care, and Sleep Medicine, School of Medicine, Case Western Reserve University, Cleveland, OH, United States.

Published: February 2025

Background: Artificial intelligence (AI) is revolutionizing medical education; however, its limitations remain underexplored. This study evaluated the accuracy of three generative AI tools (ChatGPT-4, Copilot, and Google Gemini) in answering multiple-choice questions (MCQs) and short-answer questions (SAQs) related to cardiovascular pharmacology, a key subject in healthcare education.

Methods: Using free versions of each AI tool, we administered 45 MCQs and 30 SAQs across three difficulty levels: easy, intermediate, and advanced. AI-generated answers were reviewed by three pharmacology experts. The accuracy of MCQ responses was recorded as correct or incorrect, while SAQ responses were rated on a 1-5 scale based on relevance, completeness, and correctness.
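
The two scoring schemes described in the Methods can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not confirm: that MCQ accuracy is the percentage of responses marked correct, and that the "±" values reported for SAQs are standard deviations. The abstract also does not say how the three experts' ratings were combined. The function names and sample data are hypothetical and are not taken from the study.

```python
from statistics import mean, stdev


def mcq_accuracy(is_correct):
    """Return the percentage of MCQ responses marked correct."""
    return 100.0 * sum(is_correct) / len(is_correct)


def saq_summary(ratings):
    """Return (mean, sample standard deviation) of 1-5 SAQ ratings."""
    return mean(ratings), stdev(ratings)


# Hypothetical data for illustration only, not results from the study.
example_mcq = [True] * 8 + [False] * 7  # 8 of 15 responses correct
example_saq = [5, 4, 5, 4, 5]           # one rating per question

acc = mcq_accuracy(example_mcq)
avg, sd = saq_summary(example_saq)
print(f"MCQ accuracy: {acc:.0f}%")
print(f"SAQ score: {avg:.1f} ± {sd:.1f}")
```

The sketch uses the sample standard deviation (`statistics.stdev`); the population form (`statistics.pstdev`) would be the alternative if the rated questions were treated as the whole population.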

Results: ChatGPT, Copilot, and Gemini demonstrated high accuracy scores in easy and intermediate MCQs (87-100%). While all AI models showed a decline in performance on the advanced MCQ section, only Copilot (53% accuracy) and Gemini (20% accuracy) had significantly lower scores compared to their performance on easy-intermediate levels. SAQ evaluations revealed high accuracy scores for ChatGPT (overall 4.7 ± 0.3) and Copilot (overall 4.5 ± 0.4) across all difficulty levels, with no significant differences between the two tools. In contrast, Gemini's SAQ performance was markedly lower across all levels (overall 3.3 ± 1.0).

Conclusion: ChatGPT-4 demonstrates the highest accuracy in addressing both MCQ and SAQ cardiovascular pharmacology questions, regardless of difficulty level. Copilot ranks second after ChatGPT, while Google Gemini shows significant limitations in handling complex MCQs and providing accurate responses to SAQ-type questions in this field. These findings can guide the ongoing refinement of AI tools for specialized medical education.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11879995
DOI: http://dx.doi.org/10.3389/fmed.2025.1495378
