Performance of ChatGPT in French language Parcours d'Accès Spécifique Santé test and in OBGYN.

Int J Gynaecol Obstet

Lady Davis Institute for Cancer Research, Jewish General Hospital, McGill University, Montreal, Quebec, Canada.

Published: March 2024

Objectives: To evaluate the performance of ChatGPT in a French medical school entrance examination.

Methods: A cross-sectional study using a consecutive sample of text-based multiple-choice practice questions for the Parcours d'Accès Spécifique Santé. ChatGPT answered the questions in French. We compared the performance of ChatGPT in obstetrics and gynecology (OBGYN) with its performance on the whole test.
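The authors queried ChatGPT directly, presumably through its chat interface; the study does not describe programmatic access. As an illustration only, a minimal sketch of how a French multiple-choice item could be posed to an OpenAI chat model is shown below. The model name, prompt wording, and example question are assumptions, not the authors' protocol.

    from openai import OpenAI

    # Illustrative sketch only -- the study does not describe programmatic access,
    # and the model name, prompt wording, and example question are assumptions.
    client = OpenAI()  # reads OPENAI_API_KEY from the environment

    question = (
        "Parmi les propositions suivantes, laquelle est exacte ?\n"
        "A. ...\nB. ...\nC. ...\nD. ...\n"
        "Répondez uniquement par la lettre de la bonne réponse."
    )

    response = client.chat.completions.create(
        model="gpt-3.5-turbo",  # assumed model; the abstract only says "ChatGPT"
        messages=[{"role": "user", "content": question}],
    )
    print(response.choices[0].message.content)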

Results: Overall, 885 questions were evaluated. The mean test score was 34.0% (306 of a maximal score of 900). ChatGPT answered 292 of the 885 questions correctly (33.0%). Its performance was lower in biostatistics (13.3% ± 19.7%) than in anatomy (34.2% ± 17.9%; P = 0.037) and in histology and embryology (40.0% ± 18.5%; P = 0.004). The OBGYN section comprised 290 questions. There was no difference between the OBGYN section and the whole entrance test in test scores (P = 0.76) or in the performance of ChatGPT (P = 0.10).
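For readers who want to reproduce this kind of proportion comparison, the sketch below recomputes the reported overall accuracy (292 of 885) and compares the OBGYN subset against the remaining questions with a chi-square test. The OBGYN correct count is a hypothetical placeholder and the chi-square test is an assumption; the abstract reports neither the OBGYN count nor the exact statistical test used.

    from scipy.stats import chi2_contingency

    # Figures reported in the abstract
    overall_correct, overall_total = 292, 885   # 33.0% overall accuracy
    obgyn_total = 290                           # number of OBGYN questions

    # HYPOTHETICAL placeholder -- the abstract does not report this count
    obgyn_correct = 96

    print(f"Overall accuracy: {overall_correct / overall_total:.1%}")
    print(f"OBGYN accuracy (hypothetical): {obgyn_correct / obgyn_total:.1%}")

    # 2x2 table of [correct, incorrect] for the OBGYN subset vs the remaining
    # questions (disjoint groups, rather than subset vs whole test)
    rest_correct = overall_correct - obgyn_correct
    rest_total = overall_total - obgyn_total
    table = [
        [obgyn_correct, obgyn_total - obgyn_correct],
        [rest_correct, rest_total - rest_correct],
    ]
    chi2, p_value, dof, expected = chi2_contingency(table)
    print(f"Chi-square = {chi2:.2f}, P = {p_value:.3f}")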

Conclusions: ChatGPT answered one-third of the questions correctly on the French entrance test preparation material. Its performance in OBGYN was similar to its performance on the test as a whole.


Source
http://dx.doi.org/10.1002/ijgo.15083


