Assessing knowledge about medical physics in language-generative AI with large language model: using the medical physicist exam.

Noriyuki Kadoya, Kazuhiro Arai, Shohei Tanaka, Yuto Kimura, Ryota Tozuka, Keisuke Yasui, Naoki Hayashi, Yoshiyuki Katsuta, Haruna Takahashi, Koki Inoue, Keiichi Jingu

Radiol Phys Technol

Department of Radiation Oncology, Tohoku University Graduate School of Medicine, 1-1 Seiryo-Machi, Aoba-Ku, Sendai, Miyagi, 980-8574, Japan.

Published: December 2024

This study aimed to evaluate how well language-generative AI based on large language models answers the Japanese medical physicist examination, and to provide a benchmark of its knowledge of medical physics. We used questions from Japan's 2018, 2019, 2020, 2021, and 2022 medical physicist board examinations, which covered various question types, including multiple-choice questions, and focused mainly on general medicine and medical physics. ChatGPT-3.5 and ChatGPT-4 (OpenAI) were used. We compared the AI-generated answers with the correct ones. The average accuracy rates were 42.2 ± 2.5% (ChatGPT-3.5) and 72.7 ± 2.6% (ChatGPT-4), showing that ChatGPT-4 was more accurate than ChatGPT-3.5 (p < 0.05 in all categories except radiation-related laws and recommendations/medical ethics). Even with ChatGPT-4, the more accurate model, accuracy was below 60% in two categories: radiation metrology (55.6%) and radiation-related laws and recommendations/medical ethics (40.0%). These data provide a benchmark for ChatGPT's knowledge of medical physics and can serve as basic data for developing medical physics tools that use ChatGPT (e.g., radiation therapy support tools with Japanese input).
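The scoring described above (comparing model answers with the answer key and averaging accuracy across examination years) might be sketched as follows. This is a minimal illustration, not the authors' procedure: the function names and the toy answer data are hypothetical, and it assumes that the reported "±" value is a sample standard deviation taken across the yearly accuracy rates, which the abstract does not state.

```python
from statistics import mean, stdev


def accuracy_percent(model_answers, answer_key):
    """Return the percentage of questions where the model matches the key."""
    if not answer_key or len(model_answers) != len(answer_key):
        raise ValueError("answers and key must be non-empty and equal in length")
    correct = sum(m == k for m, k in zip(model_answers, answer_key))
    return 100.0 * correct / len(answer_key)


def summarize_by_year(answers_by_year, keys_by_year):
    """Return per-year accuracy, plus mean and sample SD across years."""
    rates = {
        year: accuracy_percent(answers_by_year[year], key)
        for year, key in keys_by_year.items()
    }
    values = list(rates.values())
    # Assumption: "±" is a sample SD; it is undefined for a single year.
    spread = stdev(values) if len(values) > 1 else 0.0
    return rates, mean(values), spread


# Toy data for illustration only; these are NOT questions or results
# from the study.
keys = {2018: ["a", "c", "b", "d"], 2019: ["b", "b", "e", "a"]}
answers = {2018: ["a", "c", "d", "d"], 2019: ["b", "a", "e", "c"]}

rates, avg, sd = summarize_by_year(answers, keys)
print(f"per-year accuracy: {rates}")
print(f"mean ± SD: {avg:.1f} ± {sd:.1f}%")
```

Exact string matching works for multiple-choice items; other question types would need their own comparison rule, which is not specified here.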

DOI: http://dx.doi.org/10.1007/s12194-024-00838-2
