Performance of ChatGPT in the In-Training Examination for Anesthesiology and Pain Medicine Residents in South Korea: Observational Study.

Soo-Hyuk Yoon, Seok Kyeong Oh, Byung Gun Lim, Ho-Jin Lee

JMIR Med Educ

Department of Anesthesiology and Pain Medicine, Seoul National University Hospital, Seoul National University College of Medicine, Seoul, Republic of Korea.

Published: September 2024

Background: ChatGPT has been tested in health care, including the US Medical Licensing Examination and specialty exams, showing near-passing results. Its performance in the field of anesthesiology has been assessed using English board examination questions; however, its effectiveness in Korea remains unexplored.

Objective: This study investigated the problem-solving performance of ChatGPT in the fields of anesthesiology and pain medicine in the Korean language context, highlighted advancements in artificial intelligence (AI), and explored its potential applications in medical education.

Methods: We investigated the performance (number of correct answers/number of questions) of GPT-4, GPT-3.5, and CLOVA X in the fields of anesthesiology and pain medicine, using the in-training examinations administered to Korean anesthesiology residents over the past 5 years, each consisting of 100 questions per year. Questions containing images, diagrams, or photographs were excluded from the analysis. Furthermore, to assess performance differences of GPT across languages, we compared GPT-4's problem-solving proficiency on the original Korean texts and on their English translations.

Results: A total of 398 questions were analyzed. GPT-4 (67.8%) demonstrated significantly better overall performance than GPT-3.5 (37.2%) and CLOVA X (36.7%). However, GPT-3.5 and CLOVA X did not differ significantly in overall performance. Additionally, GPT-4 performed better on questions translated into English, indicating a language processing discrepancy (English: 75.4% vs Korean: 67.8%; difference 7.5%; 95% CI 3.1%-11.9%; P=.001).
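The performance metric above (number of correct answers/number of questions) can be illustrated with a minimal sketch. The per-model counts of correct answers are not given in the abstract; the values below are back-calculated from the reported percentages and the 398 analyzed questions, so they are an inference, not reported data. The function names are hypothetical. Under that assumption, the sketch also suggests why the reported difference is 7.5% rather than 7.6%: the difference appears to be computed from counts, not from the already-rounded percentages. The confidence interval and P value are not reproduced here, because they depend on question-level paired results that the abstract does not provide.

```python
TOTAL_QUESTIONS = 398  # number of analyzed questions, as reported in the Results

# Overall accuracy (%) per condition, as reported in the abstract.
reported_pct = {
    "GPT-4 (Korean)": 67.8,
    "GPT-4 (English)": 75.4,
    "GPT-3.5": 37.2,
    "CLOVA X": 36.7,
}


def implied_correct(pct: float, total: int) -> int:
    """Nearest whole number of correct answers implied by a percentage.

    This is a back-calculation (an assumption), not a reported count.
    """
    return round(pct / 100 * total)


def accuracy_pct(correct: int, total: int) -> float:
    """Performance as defined in the Methods: correct answers / questions, in %."""
    return round(100 * correct / total, 1)


# Back-calculated counts of correct answers for each condition.
implied = {name: implied_correct(pct, TOTAL_QUESTIONS)
           for name, pct in reported_pct.items()}

# Check that each implied count reproduces the reported percentage.
for name, correct in implied.items():
    print(f"{name}: {correct}/{TOTAL_QUESTIONS} = "
          f"{accuracy_pct(correct, TOTAL_QUESTIONS)}%")

# English vs Korean difference for GPT-4, computed from the implied counts.
extra_correct = implied["GPT-4 (English)"] - implied["GPT-4 (Korean)"]
difference_pct = accuracy_pct(extra_correct, TOTAL_QUESTIONS)
print(f"GPT-4 English minus Korean: {difference_pct} percentage points")
```

If the implied counts are right, GPT-4 answered about 30 more questions correctly in English than in Korean out of 398.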

Conclusions: This study underscores the potential of AI tools, such as ChatGPT, in medical education and practice but emphasizes the need for cautious application and further refinement, especially in non-English medical contexts. The findings suggest that although AI advancements are promising, they require careful evaluation and development to ensure acceptable performance across diverse linguistic and professional settings.

http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11443200	PMC
http://dx.doi.org/10.2196/56859	DOI Listing

Publication Analysis

Top Keywords

anesthesiology pain

pain medicine

performance

performance chatgpt

fields anesthesiology

gpt-35 clova

questions

anesthesiology

chatgpt in-training

in-training examination

Similar Publications

ISCT MSC committee statement on the US FDA approval of allogenic bone-marrow mesenchymal stromal cells.

Cytotherapy

January 2025

Osteoarthritis Research Program, Division of Orthopedic Surgery, Schroeder Arthritis Institute, University Health Network, Toronto, Ontario, Canada; Krembil Research Institute, University Health Network, Toronto, Ontario, Canada; Institute of Biomedical Engineering, University of Toronto, Toronto, Ontario, Canada; Department of Medicine, Division of Hematology, University of Toronto, Toronto, Ontario, Canada.

Katarina Le Blanc, Francesco Dazzi, Karen English, Dominique Farge, Jacques Galipeau

The December 2024 US Food and Drug Administration (FDA) approval of Mesoblast's Ryoncil (remestemcel-L-rknd), an allogeneic bone marrow mesenchymal stromal cell (MSC(M)) therapy, in pediatric acute steroid-refractory graft-versus-host disease finally ended a long-lasting drought on approved MSC clinical products in the United States. While other jurisdictions, including Europe, Japan, India, and South Korea, have marketed autologous or allogeneic MSC products, the United States has lagged in its approval. The sponsor's significant efforts and investments, working closely with the FDA to address concerns regarding clinical efficacy and consistent MSC potency through an iterative process that spanned several years, were rewarded with this landmark approval.

Reticulocyte hemoglobin content: a new frontier in iron deficiency diagnostics for major surgical patients.

BMC Anesthesiol

January 2025

University Hospital Würzburg, Department of Anaesthesiology, Intensive Care, Emergency and Pain Medicine, Würzburg, Germany.

Suma Choorapoikayil, Mischa J Kotlyar, Lisa Kawohl, Paul P Pratz, Denana Mehic

Background: Iron deficiency (ID) is the most common nutritional deficiency among patients undergoing major surgery. Treatment of ID is straightforward; however, implementing a comprehensive anemia management strategy within clinical routines is complex. Recently, reticulocyte hemoglobin content (Ret-He) has been evaluated as an early marker for ID diagnosis.

Chronic pain is a risk factor for all-cause and cancer-specific mortality in cancer survivors: a population-based cohort study.

BMC Public Health

January 2025

Department of Oncology, Zhuji People's Hospital of Zhejiang Province, No. 9 Jianmin Road, Zhuji, Zhejiang, 311800, China.

Yeying Zhang, Yuna Guo

Background: Evidence is lacking on whether chronic pain is related to the risk of cancer mortality. This study examines the association between chronic pain and all-cause, cancer, and non-cancer death in cancer patients, based on the National Health and Nutrition Examination Survey (NHANES) database.

Methods: Cancer survivors aged at least 20 years (n = 1369) from 3 NHANES cycles (1999-2004) were included.

Association between intraoperative fluid management and postoperative outcomes in living kidney donors: a retrospective cohort study.

Sci Rep

January 2025

Department of Anesthesiology and Pain Medicine, Samsung Medical Center, Sungkyunkwan University School of Medicine, 81, Irwon-ro, Gangnam-gu, Seoul, 06351, Republic of Korea.

Ja Eun Lee, Chisong Chung, Sunghae Park, Kyo Won Lee, Gaab Soo Kim

The optimal fluid strategy for laparoscopic donor nephrectomy (LDN) remains unclear. Liberal fluid management has traditionally been used in LDN to ensure graft perfusion, but this can result in adverse outcomes due to fluid overload. We compared postoperative outcomes of living kidney donors according to intraoperative fluid management.

Influence of frailty status on the incidence of intraoperative hypotensive events in elective surgery: Hypo-Frail, a single-centre retrospective cohort study.

Br J Anaesth

January 2025

Charité - Universitätsmedizin Berlin, Corporate Member of Freie Universität Berlin and Humboldt Universität zu Berlin, Department of Anaesthesiology and Intensive Care Medicine (CCM/CVK), Berlin, Germany; Medical University of Vienna, Department of Anaesthesia, Intensive Care Medicine and Pain Medicine, Clinical Division of General Anaesthesia and Intensive Care Medicine, Vienna, Austria.

Nils Daum, Laerson Hoff, Claudia Spies, Anne Pohrt, Annika Bald

Background: Frailty is a predictor of morbidity and mortality in older patients. This study aimed to investigate the influence of frailty status on likelihood, rate, duration, and severity of intraoperative hypotension (IOH), which can lead to severe organ dysfunction.

Methods: Surgical patients (≥70 yr old) with preoperative frailty assessment were analysed retrospectively.
