Introduction: This study assessed the potential of large language models (LLMs) as educational tools by evaluating their accuracy in answering questions across urological subtopics.
Methods: Three LLMs (ChatGPT-3.5, ChatGPT-4, and Bing AI) were examined in two testing rounds, separated by 48 h, using 100 multiple-choice questions (MCQs) from the 2022 European Board of Urology (EBU) In-Service Assessment (ISA), covering five subtopics. "Formal accuracy" (FA) was defined as selection of the designated single best answer (SBA) among four options. Alternative answers selected by the LLMs that were not the SBA but were still deemed correct were labeled "extended accuracy" (EA). Whether adding EA to FA increased the overall accuracy rate was then examined.
Results: Across the two testing rounds, FA scores were 58% and 62% for ChatGPT-3.5, 63% and 77% for ChatGPT-4, and 81% and 73% for Bing AI. Incorporating EA did not significantly improve overall performance: the gains were 7% and 5% for ChatGPT-3.5, 5% and 2% for ChatGPT-4, and 3% and 1% for Bing AI (p > 0.3). Across urological subtopics, the LLMs performed best in Pediatrics/Congenital and comparatively poorly in Functional/BPS/Incontinence.
Conclusion: LLMs exhibit suboptimal urology knowledge and unsatisfactory proficiency for educational purposes. Overall accuracy did not improve significantly when EA was combined with FA, and error rates remained high, ranging from 16% to 35%. Proficiency levels vary substantially across subtopics. Further development of medicine-specific LLMs is required before integration into urological training programs.
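The reported error-rate range can be checked against the FA scores and EA gains given in the Results. The sketch below is illustrative only: it assumes the EA gains are percentage points added directly to FA and that the error rate is 100 minus the combined accuracy. The abstract does not state this calculation, and the names `fa`, `ea_gain`, and `combined_error_rates` are hypothetical.

```python
# FA scores and EA gains in percent for testing rounds 1 and 2,
# copied from the Results section of the abstract.
fa = {"ChatGPT-3.5": (58, 62), "ChatGPT-4": (63, 77), "Bing AI": (81, 73)}
ea_gain = {"ChatGPT-3.5": (7, 5), "ChatGPT-4": (5, 2), "Bing AI": (3, 1)}


def combined_error_rates(fa, ea_gain):
    """Return {model: (round-1 error, round-2 error)} in percent.

    Assumption (not stated in the source): combined accuracy = FA + EA gain,
    and error rate = 100 - combined accuracy.
    """
    return {
        model: tuple(100 - (f + g) for f, g in zip(fa[model], ea_gain[model]))
        for model in fa
    }


errors = combined_error_rates(fa, ea_gain)
for model, (round1, round2) in errors.items():
    print(f"{model}: round 1 error {round1}%, round 2 error {round2}%")

# Flatten all six per-round error rates to find the overall range.
all_errors = [e for pair in errors.values() for e in pair]
print(f"range: {min(all_errors)}% to {max(all_errors)}%")
```

Under these assumptions, the lowest and highest combined error rates match the 16% to 35% range stated in the Conclusion.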
| Download full-text PDF | Source |
|---|---|
| http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11305516 | PMC |
| http://dx.doi.org/10.1159/000537854 | DOI Listing |
Int J Environ Res Public Health
December 2024
Institute of Integrated Atmospheric Environment, 1-2-8 Koraku, Bunkyo, Tokyo 112-0004, Japan.
Concerns regarding the health risks associated with employee exposure to volatile chemicals during gasoline refueling necessitate rigorous investigation and effective countermeasures. This study aims to evaluate the efficacy of vapor recovery systems in mitigating exposure risks during gasoline refueling. Employee exposure to volatile organic compounds, aldehydes, carbon monoxide, and fine particulate matter (PM) was assessed at gasoline stations with and without vapor recovery systems.
Healthcare (Basel)
January 2025
Department of Nursing Management and Epidemiological Nursing, Faculty of Health Sciences, Jagiellonian University Medical College, 31-008 Krakow, Poland.
Work is an essential aspect of human life. However, high expectations from employers and clients, combined with time pressure and chronic stress, can contribute to burnout among employees in service professions. This study aimed to compare the prevalence of burnout syndrome between two occupational groups (corporate office workers and active nurses) and to assess the influence of socio-demographic factors on the level of burnout in both groups.
BMC Nurs
January 2025
Department of Nursing, Affiliated Hospital of Zunyi Medical University, No. 149, Dalian Road, Huichuan District, Zunyi City, 563000, Guizhou Province, China.
Background: Every year, more than one-third of diabetes patients experience various acute and chronic complications, leading to the presence of diabetes patients in various departments of the hospital. High-quality nursing care can delay the progression of diabetes and effectively reduce the incidence of complications. Therefore, understanding the level of diabetes knowledge and training needs of clinical nurses is of great significance.
BMC Pregnancy Childbirth
January 2025
Non-communicable Diseases Research Center, Research Institute for Prevention Non-communicable Diseases, Qazvin University of Medical Sciences, Qazvin, Iran.
Background: Concerning maternity service, the mother's quality assessment is central because emotional, cultural, and respectful support is vital during labour and the delivery process. Studies concerning the perceived quality of maternity services from the perspective of mothers have rarely been carried out in Iranian hospital settings. Therefore, this study aimed to measure the gap between the expectations of patients with maternity services and their perceptions of the service and identify associated factors at a maternity hospital in northwest Iran using service quality (SERVQUAL) and health quality (HEALTHQUAL) questionnaires.
BMJ Mil Health
January 2025
Academic Department of Military Medicine, Royal Centre for Defence Medicine, Birmingham, UK
Introduction: Abnormal cardiorespiratory symptoms and investigative findings in service personnel typically result in prolonged investigation and occupational restriction. This analysis aimed to assess the impact of the Oxford Military Cardiopulmonary Exercise Testing Clinic (OMEC), which investigates such symptoms and findings, on occupational recommendations.
Methods: A service evaluation was conducted on all OMEC attendances over a 5-year period.