Behav Res Methods
September 2023
Multidimensional forced-choice (MFC) testing has been proposed as a way of reducing response biases in noncognitive measurement. Although early item response theory (IRT) research focused on illustrating that person parameter estimates with normative properties could be obtained using various MFC models and formats, more recent attention has been devoted to exploring the processes involved in test construction and how they influence MFC scores. This research compared two approaches for estimating parameters of the multi-unidimensional pairwise preference model (MUPP; Stark et al., 2005).
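In the MUPP framework, the probability of preferring one statement over the other in a pair combines two unidimensional single-statement response functions. A minimal sketch, assuming dichotomous GGUM agree probabilities feed the Stark et al. (2005) preference equation; all parameter values below are illustrative, not from the study:

```python
import numpy as np

def ggum_agree(theta, alpha, delta, tau):
    """P(agree) with a single statement under a dichotomous GGUM
    (one observable response pair, so M = 3; tau[0] = 0 by convention)."""
    def term(z):
        s = sum(tau[:z + 1])
        return (np.exp(alpha * (z * (theta - delta) - s)) +
                np.exp(alpha * ((3 - z) * (theta - delta) - s)))
    return term(1) / (term(0) + term(1))

def mupp_prefer_s(theta_s, theta_t, params_s, params_t):
    """P(prefer statement s over t): agree with s and disagree with t,
    normalized over the two consistent response outcomes."""
    ps = ggum_agree(theta_s, *params_s)
    pt = ggum_agree(theta_t, *params_t)
    return ps * (1 - pt) / (ps * (1 - pt) + (1 - ps) * pt)
```

With identical statement parameters and equal trait levels, the two consistent outcomes are equally likely, so the preference probability is 0.5.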
Due to well-known problems with self-ratings of job performance (e.g., inflation, weak correlation with supervisor ratings) and the challenges of collecting supervisor ratings of job performance, researchers sometimes use supervisor-perspective ratings.
Differential item functioning (DIF) analysis is one of the most important applications of item response theory (IRT) in psychological assessment. This study examined the performance of two Bayesian DIF methods, Bayes factor (BF) and deviance information criterion (DIC), with the generalized graded unfolding model (GGUM). The Type I error and power were investigated in a Monte Carlo simulation that manipulated sample size, DIF source, DIF size, DIF location, subpopulation trait distribution, and type of baseline model.
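The DIC itself has a simple form once an MCMC run has produced deviance draws: DIC = D̄ + p_D, where the effective number of parameters is p_D = D̄ − D(θ̄). A minimal sketch of this generic model-comparison bookkeeping (not the GGUM-specific machinery used in the study):

```python
import numpy as np

def dic(deviance_draws, deviance_at_posterior_mean):
    """Deviance information criterion from MCMC output.
    deviance_draws: deviance evaluated at each posterior draw.
    deviance_at_posterior_mean: deviance at the posterior mean parameters."""
    d_bar = np.mean(deviance_draws)            # posterior mean deviance
    p_d = d_bar - deviance_at_posterior_mean   # effective number of parameters
    return d_bar + p_d                         # smaller DIC -> preferred model
```

In a DIF analysis, the DIC of a baseline (parameters constrained equal across groups) model is compared with that of a model freeing the studied item's parameters; a sufficiently lower DIC for the free model flags DIF.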
Collateral information has been used to address subpopulation heterogeneity and increase estimation accuracy in some large-scale cognitive assessments. However, such methodology has not been developed or explored in published research with models designed specifically for noncognitive measurement. Because accurate noncognitive measurement is becoming increasingly important, we sought to examine the benefits of using collateral information in latent trait estimation with an item response theory model that has proven valuable for noncognitive testing, namely, the generalized graded unfolding model (GGUM).
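The GGUM is an unfolding model: endorsement peaks when the person location θ is near the statement location δ and falls off on both sides. A sketch of the category response function under the standard parameterization (parameter values in the usage note are illustrative only):

```python
import numpy as np

def ggum_prob(theta, alpha, delta, tau, C):
    """Category response probabilities P(Z = z | theta), z = 0..C, for the
    generalized graded unfolding model.
    alpha: discrimination; delta: statement location;
    tau: thresholds tau_0..tau_C with tau_0 = 0 by convention;
    C: index of the highest observable response category."""
    M = 2 * C + 1
    def f(w):
        s = sum(tau[:w + 1])
        return (np.exp(alpha * (w * (theta - delta) - s)) +
                np.exp(alpha * ((M - w) * (theta - delta) - s)))
    terms = np.array([f(z) for z in range(C + 1)])
    return terms / terms.sum()
```

For example, `ggum_prob(0.0, 1.0, 0.0, [0.0, -1.0, -0.5], 2)` returns three probabilities summing to one, with the strongest-agreement category most likely when θ = δ.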
This research developed a new ideal point-based item response theory (IRT) model for multidimensional forced choice (MFC) measures. We adapted the Zinnes and Griggs (ZG; 1974) IRT model and the multi-unidimensional pairwise preference (MUPP; Stark et al., 2005) model, henceforth referred to as ZG-MUPP.
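Under the ZG ideal point process, a respondent endorses whichever statement is perceived as closer to their own standing on the trait. The closed-form ZG probability is omitted here; the following is a Monte Carlo sketch of that process, assuming unit-normal perceptual errors (an assumption of this illustration, not a claim about the study's estimation method):

```python
import numpy as np

def zg_prefer_s_mc(theta, mu_s, mu_t, n=200_000, seed=0):
    """Monte Carlo estimate of P(prefer statement s over t) under an
    ideal point process: the respondent's location and both statement
    locations are perceived with unit-normal error, and the statement
    perceived as closer to the self wins."""
    rng = np.random.default_rng(seed)
    p = rng.normal(theta, 1.0, n)   # perceived self location
    s = rng.normal(mu_s, 1.0, n)    # perceived location of statement s
    t = rng.normal(mu_t, 1.0, n)    # perceived location of statement t
    return np.mean(np.abs(p - s) < np.abs(p - t))
```

By symmetry, a person midway between two statements prefers each about half the time, while a person sitting on one statement's location strongly prefers it.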
There has been reemerging interest within psychology in the construct of character, yet assessing it can be difficult due to the social desirability of character traits. Forced-choice formats offer one way to address response bias, but traditional scoring methods can lead to ipsativity problems.
Likert-type measures have been criticized in psychological assessment because they are vulnerable to response biases, including central tendency, acquiescence, leniency, halo, and socially desirable responding. As an alternative, multidimensional forced choice (MFC) testing has been proposed to address these concerns. A number of researchers have developed item response theory (IRT) models for MFC data and have examined latent trait estimation with tests of different dimensionality and length.
Historically, multidimensional forced choice (MFC) measures have been criticized because conventional scoring methods can lead to ipsativity problems that render scores unsuitable for interindividual comparisons. However, with the recent advent of item response theory (IRT) scoring methods that yield normative information, MFC measures are surging in popularity and becoming important components in high-stakes evaluation settings. This article aims to add to burgeoning methodological advances in MFC measurement by focusing on statement and person parameter recovery for the GGUM-RANK (generalized graded unfolding-RANK) IRT model.
The Drinking Motives Questionnaire (DMQ; Cooper, 1994) has been a very useful measurement tool for understanding why people drink alcohol. Recent attempts to examine drinking motives have used the DMQ within a person-centered analysis framework. However, latent profiles identified in previous research largely presented level effects without strong shape effects, which consequently restricted meaningful interpretations and effective applications of drinking-motive profiles.
Forced-choice item response theory (IRT) models are being more widely used as a way of reducing response biases in noncognitive research and operational testing contexts. As applications have increased, there has been a growing need for methods to link parameters estimated in different examinee groups as a prelude to measurement equivalence testing. This study compared four linking methods for the Zinnes and Griggs (ZG) pairwise preference ideal point model.
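As a concrete illustration of what linking does, the classic mean/sigma method (one simple option; the source does not state that it is among the four methods compared) estimates a linear transformation A·θ + B that places the focal group's metric onto the reference group's, using statement parameters shared across groups:

```python
import numpy as np

def mean_sigma_link(anchor_ref, anchor_focal):
    """Mean/sigma linking: find A, B such that A * focal + B reproduces
    the reference-group metric of the shared anchor statement parameters."""
    ref = np.asarray(anchor_ref, dtype=float)
    foc = np.asarray(anchor_focal, dtype=float)
    A = ref.std(ddof=1) / foc.std(ddof=1)  # match anchor spread
    B = ref.mean() - A * foc.mean()        # match anchor center
    return A, B
```

If the focal estimates really are a linear rescaling of the reference estimates, the transformation is recovered exactly; with estimation error, A and B are approximate and other linking criteria can behave differently.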
View Article and Find Full Text PDFAppl Psychol Meas
March 2017
Concurrent calibration using anchor items has proven to be an effective alternative to separate calibration and linking for developing large item banks, which are needed to support continuous testing. In principle, anchor-item designs and estimation methods that have proven effective with dominance item response theory (IRT) models, such as the 3PL model, should also lead to accurate parameter recovery with ideal point IRT models, but surprisingly little research has been devoted to this issue. This study, therefore, had two purposes: (a) to develop software for concurrent calibration with what is now the most widely used ideal point model, the generalized graded unfolding model (GGUM); and (b) to compare the efficacy of different GGUM anchor-item designs and develop empirically based guidelines for practitioners.
View Article and Find Full Text PDFAppl Psychol Meas
October 2016
In recent years, there has been a surge of interest in measuring noncognitive constructs in educational and managerial/organizational settings. For the most part, these noncognitive constructs have been and continue to be measured using Likert-type (ordinal response) scales, which are susceptible to several types of response distortion. To deal with these response biases, researchers have proposed using forced-choice formats, which require respondents or raters to evaluate cognitive, affective, or behavioral descriptors presented in blocks of two or more.