Background: This study aimed to evaluate the performance of GPT-3.5, GPT-4, GPT-4o and Google Bard on the United States Medical Licensing Examination (USMLE), the Professional and Linguistic Assessments Board (PLAB), the Hong Kong Medical Licensing Examination (HKMLE) and the National Medical Licensing Examination (NMLE).
Methods: This study was conducted in June 2023. Four large language models (LLMs) (GPT-3.5, GPT-4, GPT-4o and Google Bard) were applied to four standardized medical tests (USMLE, PLAB, HKMLE and NMLE). All questions were multiple-choice and were sourced from the question banks of these examinations.
Results: On USMLE Step 1, Step 2CK and Step 3, the accuracy rates were 91.5%, 94.2% and 92.7% for GPT-4o; 93.2%, 95.0% and 92.0% for GPT-4; 65.6%, 71.6% and 68.5% for GPT-3.5; and 64.3%, 55.6% and 58.1% for Google Bard, respectively. On PLAB, HKMLE and NMLE, GPT-4o scored 93.3%, 91.7% and 84.9%; GPT-4 scored 86.7%, 89.6% and 69.8%; GPT-3.5 scored 80.0%, 68.1% and 60.4%; and Google Bard scored 54.2%, 71.7% and 61.3%, respectively. There were significant differences in the accuracy rates of the four LLMs on the four medical licensing examinations.
Conclusion: GPT-4o performed better in the medical licensing examinations than the other three LLMs. The performance of all four models on the NMLE needs further improvement.
Clinical Trial Number: Not applicable.
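As a reading aid, the accuracy rates reported in the Results can be tabulated in a few lines of Python. This is a minimal sketch and not part of the study: the function names are hypothetical, and the per-model mean is unweighted because the abstract does not give the number of questions per examination, so it should not be read as an overall accuracy.

```python
# Accuracy rates (%) copied from the Results section of the abstract.
ACCURACY = {
    "USMLE Step 1":   {"GPT-4o": 91.5, "GPT-4": 93.2, "GPT-3.5": 65.6, "Google Bard": 64.3},
    "USMLE Step 2CK": {"GPT-4o": 94.2, "GPT-4": 95.0, "GPT-3.5": 71.6, "Google Bard": 55.6},
    "USMLE Step 3":   {"GPT-4o": 92.7, "GPT-4": 92.0, "GPT-3.5": 68.5, "Google Bard": 58.1},
    "PLAB":           {"GPT-4o": 93.3, "GPT-4": 86.7, "GPT-3.5": 80.0, "Google Bard": 54.2},
    "HKMLE":          {"GPT-4o": 91.7, "GPT-4": 89.6, "GPT-3.5": 68.1, "Google Bard": 71.7},
    "NMLE":           {"GPT-4o": 84.9, "GPT-4": 69.8, "GPT-3.5": 60.4, "Google Bard": 61.3},
}


def best_model_per_exam(table):
    """Return the model with the highest reported accuracy for each exam."""
    return {exam: max(scores, key=scores.get) for exam, scores in table.items()}


def unweighted_mean_per_model(table):
    """Average each model's accuracy across exams, giving every exam equal weight.

    Assumption: question counts per exam are unknown, so no weighting is applied.
    """
    models = next(iter(table.values())).keys()
    return {
        model: round(sum(scores[model] for scores in table.values()) / len(table), 1)
        for model in models
    }


if __name__ == "__main__":
    for exam, model in best_model_per_exam(ACCURACY).items():
        print(f"{exam}: {model}")
    for model, mean in unweighted_mean_per_model(ACCURACY).items():
        print(f"{model}: {mean}")
```

In this tabulation, GPT-4 has the highest reported accuracy on USMLE Step 1 and Step 2CK, and GPT-4o has the highest on the other four examinations.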
Download full-text PDF | Source |
---|---|
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11590336 | PMC |
http://dx.doi.org/10.1186/s12909-024-06309-x | DOI Listing |
JMIR Med Inform
January 2025
Department of Science and Education, Shenzhen Baoan Women's and Children's Hospital, Shenzhen, China.
Background: Large language models (LLMs) have been proposed as valuable tools in medical education and practice. The Chinese National Nursing Licensing Examination (CNNLE) presents unique challenges for LLMs due to its requirement for both deep domain-specific nursing knowledge and the ability to make complex clinical decisions, which differentiates it from more general medical examinations. However, their potential application in the CNNLE remains unexplored.
Chin J Integr Med
January 2025
Department of Oriental Neuropsychiatry, Dong-Eui University College of Korean Medicine, Busan, Republic of Korea.
Objective: Traditional medicine (TM) has played a key role in the health care system of East Asian countries, including China, Japan and South Korea. This bibliometric study analyzes the recent research status of these three TMs, including traditional Chinese medicine (TCM), traditional Korean medicine (TKM), and Kampo medicine (KM).
Methods: Research topics of studies published in the most recent 10 years (2014 to 2023), identified through a search of MEDLINE via PubMed, were analyzed.
CNS Drugs
January 2025
School of Medicine and Dentistry, Gold Coast Campus, Griffith University, Southport, QLD, 4222, Australia.
Background: Epstein-Barr virus (EBV) is implicated as a necessary factor in the development of multiple sclerosis (MS) and may also be a driver of disease activity. Although it is not clear whether ongoing viral replication is the driver for MS pathology, MS researchers have considered the prospect of using drugs with potential efficacy against EBV in the treatment of MS. We have undertaken scientific and lived experience expert panel reviews to shortlist existing licensed therapies that could be used in later-stage clinical trials in MS.
Arch Dermatol Res
January 2025
Department of Dermatology, Brigham and Women's Hospital, 221 Longwood Avenue, Boston, MA, 02115, USA.
Cardiovasc Diagn Ther
December 2024
Department of Cardiovascular, University Hospital Basel, Basel, Switzerland.