Background: The ability of experts' item difficulty ratings to predict test takers' actual performance is an important aspect of licensure examinations. Users rely on expert judgment as a primary source of information when deciding in advance on the pass rate of test takers. The characteristics of the raters who predict item difficulty are central to setting credible standards. Therefore, this study aimed to assess and compare raters' predicted difficulty and the actual difficulty of multiple-choice questions in the undergraduate medicine licensure examination (UGMLE) in Ethiopia.

Method: Responses from 815 examinees to 200 multiple-choice questions (MCQs) were used in this study. The study also included item difficulty ratings from seven physicians who participated in the standard setting of the UGMLE. The analysis examined how much the experts' ratings varied in predicting the actual item difficulty for examinees. Descriptive statistics were used to profile the mean rater-predicted and actual difficulty values for the MCQs, and ANOVA was used to compare mean differences between raters' predictions of item difficulty. Additionally, regression analysis was used to examine inter-rater variation in item difficulty predictions relative to the actual difficulty, and to compute the proportion of variance in actual difficulty explained by the raters' predictions.
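The regression step described above can be sketched as follows. This is a minimal illustration on synthetic data, not the authors' analysis: the function names, the simulated responses and ratings, and the use of proportion correct as the "actual difficulty" measure are all assumptions made here for the example.

```python
import numpy as np

def item_difficulty(responses):
    """Proportion of examinees answering each item correctly.

    responses: (n_examinees, n_items) array of 0/1 scores.
    Assumption: 'actual difficulty' is the classical proportion correct.
    """
    return np.asarray(responses, dtype=float).mean(axis=0)

def regress_actual_on_raters(actual, ratings):
    """Ordinary least squares of actual difficulty on rater predictions.

    actual:  (n_items,) actual difficulty values.
    ratings: (n_items, n_raters) predicted difficulty, one column per rater.
    Returns (r_squared, f_statistic, (df_model, df_residual)).
    """
    y = np.asarray(actual, dtype=float)
    X = np.asarray(ratings, dtype=float)
    n, k = X.shape
    design = np.column_stack([np.ones(n), X])  # add intercept column
    coef, *_ = np.linalg.lstsq(design, y, rcond=None)
    resid = y - design @ coef
    ss_res = float(resid @ resid)
    ss_tot = float(((y - y.mean()) ** 2).sum())
    r_squared = 1.0 - ss_res / ss_tot
    f_stat = (r_squared / k) / ((1.0 - r_squared) / (n - k - 1))
    return r_squared, f_stat, (k, n - k - 1)

# Synthetic stand-in data (hypothetical; sizes mirror the abstract).
rng = np.random.default_rng(0)
n_examinees, n_items, n_raters = 815, 200, 7
true_p = rng.uniform(0.3, 0.9, n_items)
responses = (rng.random((n_examinees, n_items)) < true_p).astype(int)
noise = rng.normal(0.0, 0.15, (n_items, n_raters))
ratings = np.clip(true_p[:, None] + noise, 0.0, 1.0)

actual = item_difficulty(responses)
r_squared, f_stat, df = regress_actual_on_raters(actual, ratings)
print(f"R^2 = {r_squared:.3f}, F{df} = {f_stat:.2f}")
```

With seven rater columns and 200 items, this sketch yields degrees of freedom (7, 192); the degrees of freedom reported in the abstract depend on how many predictors entered the published model.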

Results: The mean difference between raters' predictions and examinees' actual performance was inconsistent across the exam domains. There was a statistically significant, strong positive correlation between actual and predicted item difficulty in exam domains eight and eleven. In exam domains seven and twelve, however, the positive correlation was very weak and not statistically significant. The multiple comparison analysis showed significant differences in mean item difficulty ratings between raters. In the regression analysis, experts' item difficulty ratings of the UGMLE explained 33% of the variance in the actual difficulty level. The regression model also showed a moderate positive correlation (R = 0.57) that was statistically significant, F(6, 193) = 15.58, P = 0.001.
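As a rough consistency check on the reported regression figures, the proportion of variance explained can be recovered from the F statistic and its degrees of freedom using the standard identity for a linear model with an intercept. The symbols k (model degrees of freedom) and df_res (residual degrees of freedom) are notation introduced here, not taken from the source.

```latex
R^2 = \frac{k\,F}{k\,F + df_{\mathrm{res}}}
    = \frac{6 \times 15.58}{6 \times 15.58 + 193}
    = \frac{93.48}{286.48}
    \approx 0.326,
\qquad
R = \sqrt{0.326} \approx 0.57
```

This agrees with the reported R = 0.57 and with roughly 33% of variance explained after rounding.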

Conclusion: This study demonstrated how complex it is to assess the difficulty level of MCQs in the UGMLE and emphasized the benefits of obtaining experts' ratings in advance. To ensure that exam scores remain reliable and valid, raters' accuracy on the UGMLE must be improved. Achieving this requires developing techniques that align with evolving assessment methodologies.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11407004
DOI: http://dx.doi.org/10.1186/s12909-024-06012-x

Publication Analysis

Top Keywords

item difficulty: 36
actual difficulty: 20
difficulty: 16
difficulty ratings: 16
raters' prediction: 12
exam domains: 12
positive correlation: 12
item: 9
actual: 9
prediction item: 8

Similar Publications

Introduction: The prevalence of epilepsy in sub-Saharan Africa varies considerably, and the exact estimate for Ghana remains unclear, particularly in peri-urban areas where data are scarce. More community-based studies are required to understand better the actual burden of epilepsy in these areas and the difficulties in accessing healthcare.

Objective: To adapt and validate a household survey epilepsy-screening instrument in Shai-Osudoku and Ningo-Prampram District of Greater Accra Region, Ghana.

Background: The CAMDEX-DS is an instrument to diagnose Alzheimer's disease (AD) in Down syndrome consisting of an informant interview and a cognitive test battery (CAMCOG-DS). Measurement properties of the German CAMDEX-DS were investigated.

Method: Fifty-five adults with Down syndrome (19-58 years) participated in this observational study.

Introduction: The Friedreich Ataxia Rating Scale-Activities of Daily Living (FARS-ADL) is a validated and highly utilized measure for evaluating patients with Friedreich Ataxia. While construct validity of FARS-ADL has been shown for spinocerebellar ataxia (SCA), content validity has not been established.

Methods: Individuals with SCA1 or SCA3 (n = 7) and healthcare professionals (HCPs) with SCA expertise (n = 8) participated in qualitative interviews evaluating the relevance, clarity, and clinical meaningfulness of FARS-ADL for assessment of individuals with SCA.

The Importance of Frailty in Older Adults With Benign Paroxysmal Positioning Vertigo.

J Neurol Phys Ther

November 2024

Faculty of Rehabilitation Sciences, REVAL-Rehabilitation Research Centre, Hasselt University, Hasselt Diepenbeek, Limburg, Belgium (S.P., P.M., J.S.); Department of Otorhinolaryngology and Head & Neck Surgery, School for Mental Health and Neuroscience, Faculty of Health Medicine and Life Sciences, Maastricht University Medical Centre, The Netherlands (S.P., R.V.D.B); Department of Otorhinolaryngology, Head and Neck Surgery ZOL Hospital, Belgium (N.L., W.L.); and Department of Nutrition and Movement Sciences, NUTRIM Institute of Nutrition and Translational Research in Metabolism, Maastricht University, The Netherlands (K.M.).

Background and Purpose: Even though Benign Paroxysmal Positioning Vertigo (BPPV) is one of the most reported vestibular disorders, its interaction with frailty and postural control in older adults has been investigated little, if at all.

Methods: Thirty-seven older adults (≥65 years) with a diagnosis of BPPV (oaBPPV) (mean age 73.13 (4.

Background: Multiple sclerosis (MS) is a demyelinating disease characterized by balance and gait impairment, fatigue, anxiety, depression, and diminished quality of life. Transcranial direct current stimulation (tDCS) has emerged as an effective intervention for managing these symptoms.

Objective: This study aims to investigate the efficacy of remotely supervised tDCS (RS-tDCS) applied to the left dorsolateral prefrontal cortex, in conjunction with a telerehabilitation (TR) program, on motor (balance and gait), cognitive (executive functions), and participation outcomes (fatigue, anxiety, depression, and quality of life) in persons with MS (pwMS).
