ChatGPT's Performance on Portuguese Medical Examination Questions: Comparative Analysis of ChatGPT-3.5 Turbo and ChatGPT-4o Mini.

JMIR Med Educ

Faculty of Health Sciences, University of Beira Interior, Av. Infante D. Henrique, Covilhã, 6201-506, Portugal, 351 234393150.

Published: March 2025

Background: Advancements in ChatGPT are transforming medical education by providing new tools for assessment and learning, potentially enhancing evaluations for doctors and improving instructional effectiveness.

Objective: This study evaluates the performance and consistency of ChatGPT-3.5 Turbo and ChatGPT-4o mini in solving European Portuguese medical examination questions (2023 National Examination for Access to Specialized Training; Prova Nacional de Acesso à Formação Especializada [PNA]) and compares their performance to human candidates.

Methods: ChatGPT-3.5 Turbo was tested on the first part of the examination (74 questions) on July 18, 2024, and ChatGPT-4o mini on the second part (74 questions) on July 19, 2024. Each model generated an answer using its natural language processing capabilities. To test consistency, each model was asked, "Are you sure?" after providing an answer. Differences between the first and second responses of each model were analyzed using the McNemar test with continuity correction. A one-sample t test compared the models' performance to that of human candidates. Frequencies and percentages were used for categorical variables, and means and CIs for numerical variables. Statistical significance was set at P<.05.
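As an illustration of the consistency analysis described above, the following is a minimal sketch of the McNemar test with continuity correction for paired correct/incorrect answers (before and after the "Are you sure?" prompt). The function name and the counts in the usage example are hypothetical and are not taken from the study; only the discordant pairs (answers that changed correctness) enter the statistic.

```python
import math


def mcnemar_continuity(b: int, c: int) -> tuple[float, float]:
    """McNemar test with continuity correction (hypothetical helper).

    b: questions answered correctly first, incorrectly after the follow-up
    c: questions answered incorrectly first, correctly after the follow-up
    Returns (chi-square statistic, two-sided p value) with 1 degree of freedom.
    """
    n = b + c
    if n == 0:
        # No answer changed correctness, so there is no evidence of a difference.
        return 0.0, 1.0
    # Continuity-corrected statistic: (|b - c| - 1)^2 / (b + c), floored at 0.
    stat = max(abs(b - c) - 1, 0) ** 2 / n
    # Survival function of a chi-square variable with 1 df: erfc(sqrt(x / 2)).
    p_value = math.erfc(math.sqrt(stat / 2))
    return stat, p_value


if __name__ == "__main__":
    # Illustrative counts only; these are NOT the study's data.
    stat, p = mcnemar_continuity(b=10, c=5)
    print(f"chi2 = {stat:.3f}, P = {p:.3f}")
```

A P value below the .05 threshold stated in the Methods would indicate that the follow-up prompt shifted the proportion of correct answers.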

Results: ChatGPT-4o mini achieved an accuracy rate of 65% (48/74) on the 2023 PNA examination, surpassing ChatGPT-3.5 Turbo. ChatGPT-4o mini outperformed medical candidates, while ChatGPT-3.5 Turbo had a more moderate performance.

Conclusions: This study highlights the advancements and potential of ChatGPT models in medical education, emphasizing the need for careful implementation with teacher oversight and further research.

DOI: http://dx.doi.org/10.2196/65108

Similar Publications

Midlife financial stress and cognitive and physical impairments in older age: The role of potentially modifying factors.

Soc Sci Med

January 2025

Division of Clinical Geriatrics, Center for Alzheimer's Research, Department of Neurobiology, Care Sciences and Society (NVS), Karolinska Institutet and Karolinska University Hospital, Stockholm, Sweden; Ageing Epidemiology Research Unit (AGE), School of Public Health, Faculty of Medicine, Imperial College London, London, United Kingdom.

Ingemar Kåreholt Charlotta Nilsen Miia Kivipelto Deborah Finkel Shireen Sindi

Background: Financial stress is an important source of chronic stress and has been associated with cognitive and physical impairments. The goal of this study was to investigate whether financial stress is associated with cognitive and physical impairment and their combination, the role of potential modifiable factors and potential sex differences.

Methods: The Cardiovascular Risk Factors, Aging, and Dementia population-based cohort study from Finland was used (n = 1497) (baseline data collected 1972-1987, mean age 50 years).

A survey of Swedish radiographer's need for knowledge at advanced level.

Radiography (Lond)

March 2025

Department of Health, Medicine and Caring Sciences, Linköping University, Sweden; Center for Medical Image Science and Visualization, Linköping University, Linköping, Sweden; Department of Radiology in Linköping, Sweden.

M Byenfeldt L-L Lundvall P T Olofsson J Kihlberg

Introduction: There are uncertainties about whether current advanced-level courses provide the knowledge needed to develop the profession for radiographers in Sweden. The aim of this study was to investigate Swedish radiographers' perceived need for additional post-registration knowledge in their profession and their need for education at advanced level.

Methods: Swedish radiographers were invited to participate in a national electronic survey between November and December 2022.

Clinical Practice Guideline: The Diagnosis and Treatment of Acute Spinal Cord Injury.

Dtsch Arztebl Int

April 2025

Nora Cryns Sandra Himmelhaus Sophie Irrgang Moritz Ernst Norbert Weidner

Background: In Germany, the incidence of traumatic spinal cord injury is approximately 16 per million inhabitants per year. This article aims to present evidence-based diagnostic and therapeutic measures for the first 14 days after injury to minimize neural damage, prevent complications, and preserve functioning as much as possible.

Methods: After the formulation of key questions, systematic literature searches were carried out on multiple topics.

Enhancing Large Language Models with Retrieval-augmented Generation: A Radiology-specific Approach.

Radiol Artif Intell

March 2025

Department of Radiology & Biomedical Imaging, University of California, San Francisco (UCSF), San Francisco, Calif.

Dane A Weinert Andreas M Rauschecker

Retrieval-augmented generation (RAG) is a strategy to improve performance of large language models (LLMs) by providing the LLM with an updated corpus of knowledge that can be used for answer generation in real-time. RAG may improve LLM performance and clinical applicability in radiology by providing citable, up-to-date information without requiring model fine-tuning. In this retrospective study, a radiology-specific RAG was developed using a vector database of 3,689 articles published from January 1999 to December 2023.

Perceived value and barriers of nursing specialty certifications among clinical nurses in Saudi Arabia: a cross-sectional study.

Front Med (Lausanne)

February 2025

College of Nursing, King Saud University, Riyadh, Saudi Arabia.

Alawiah T AlSadah Ahmad E Aboshaiqah Naif H Alanazi

Introduction: Specialty nursing certifications reflect nurses' knowledge and competence in certain areas. Obtaining certification allows them to advance their careers and enhance patient care standards as their role and scope of responsibility expand. This study aimed to understand how nurses view specialty certification and related challenges in three university hospitals in Riyadh, Saudi Arabia.
