A primer on classical test theory and item response theory for assessments in medical education.

Med Educ

National Board of Medical Examiners, Philadelphia, Pennsylvania 19104, USA.

Published: January 2010

Context: A test score is a number which purportedly reflects a candidate's proficiency in some clearly defined knowledge or skill domain. A test theory model is necessary to help us better understand the relationship that exists between the observed (or actual) score on an examination and the underlying proficiency in the domain, which is generally unobserved. Common test theory models include classical test theory (CTT) and item response theory (IRT). The widespread use of IRT models over the past several decades attests to their importance in the development and analysis of assessments in medical education. Item response theory models are used for a host of purposes, including item analysis, test form assembly and equating. Although helpful in many circumstances, IRT models make fairly strong assumptions and are mathematically much more complex than CTT models. Consequently, there are instances in which it might be more appropriate to use CTT, especially when common assumptions of IRT cannot be readily met, or in more local settings, such as those that may characterise many medical school examinations.
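As a minimal sketch of the two model families named above (not taken from the paper itself): CTT is conventionally written as observed score = true score + error, and one widely used IRT model, the two-parameter logistic (2PL), expresses the probability of a correct response as a logistic function of proficiency. The function names and the numeric values are assumptions chosen purely for illustration.

```python
import math


def ctt_observed_score(true_score: float, error: float) -> float:
    """CTT decomposition: observed score X = true score T + random error E."""
    return true_score + error


def irt_2pl_probability(theta: float, a: float, b: float) -> float:
    """2PL item response function.

    theta: candidate proficiency (unobserved)
    a:     item discrimination
    b:     item difficulty
    Returns P(correct) = 1 / (1 + exp(-a * (theta - b))).
    """
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))


if __name__ == "__main__":
    # Hypothetical candidate: true score 70, measurement error -2 on this sitting.
    print(ctt_observed_score(true_score=70.0, error=-2.0))

    # Hypothetical item with discrimination 1.2 and difficulty 0.0,
    # evaluated for three illustrative proficiency levels.
    for theta in (-1.0, 0.0, 1.0):
        print(theta, round(irt_2pl_probability(theta, a=1.2, b=0.0), 3))
```

In the CTT function the error term is attached to the whole test score, whereas the 2PL function models each item separately and ties the response probability to item parameters. That difference is one reason IRT requires stronger assumptions and more data.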

Objectives: The objective of this paper is to provide an overview of both CTT and IRT to the practitioner involved in the development and scoring of medical education assessments.

Methods: The tenets of CTT and IRT are initially described. Then, the main uses of both models in test development and psychometric activities are illustrated via several practical examples. Finally, general recommendations pertaining to the use of each model in practice are outlined.

Discussion: Classical test theory and IRT are widely used to address measurement-related issues that arise from commonly used assessments in medical education, including multiple-choice examinations, objective structured clinical examinations, ward ratings and workplace evaluations. The present paper provides an introduction to these models and how they can be applied to answer common assessment questions.
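One common assessment question in a local setting is how difficult each item was and how reliable the total score is. The sketch below shows a CTT-style answer, assuming a small, hypothetical matrix of right/wrong responses. Item difficulty is taken as the proportion of candidates answering correctly, and reliability is estimated with Cronbach's alpha. Neither the data nor the helper names come from the paper.

```python
from statistics import pvariance

# Hypothetical 0/1 response matrix: rows are candidates, columns are items.
responses = [
    [1, 1, 1, 0],
    [1, 1, 0, 0],
    [1, 0, 0, 0],
    [1, 1, 1, 1],
    [0, 0, 0, 0],
]


def item_difficulty(matrix):
    """Proportion of candidates answering each item correctly (CTT p-value)."""
    n_candidates = len(matrix)
    n_items = len(matrix[0])
    return [sum(row[j] for row in matrix) / n_candidates for j in range(n_items)]


def cronbach_alpha(matrix):
    """Cronbach's alpha: (k / (k - 1)) * (1 - sum of item variances / total-score variance)."""
    k = len(matrix[0])
    item_variances = [pvariance([row[j] for row in matrix]) for j in range(k)]
    total_variance = pvariance([sum(row) for row in matrix])
    return (k / (k - 1)) * (1 - sum(item_variances) / total_variance)


if __name__ == "__main__":
    print("difficulty per item:", item_difficulty(responses))
    print("Cronbach's alpha:", round(cronbach_alpha(responses), 3))
```

Both statistics depend on the particular group of candidates tested, which is a known limitation of CTT. With only five candidates, as here, the estimates are illustrative rather than stable.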


Source
http://dx.doi.org/10.1111/j.1365-2923.2009.03425.x


