A new method of standard setting, referred to as the response vector for mastery (RVM) method, is proposed. Under the RVM method, the task of panelists who participate in the standard-setting process does not involve conceptualization of a borderline examinee or probability judgments, as is the case with the Angoff and bookmark methods. Also, the RVM-based computation of a cut-score is not based on a single item (e.g., …
Educ Psychol Meas
February 2022
This study offers an approach to testing for differential item functioning (DIF) in a recently developed measurement framework, referred to as the D-scoring method (DSM). Under the proposed method of testing for DIF, the item response functions of two groups (reference and focal) are compared by transforming their probabilities of correct item response, estimated under the DSM, into Z-scale normal deviates. Using the linear relationship between such Z-deviates, testing for DIF is reduced to testing two basic statistical hypotheses: equal variances and equal means of the Z-deviates for the reference and focal groups.
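A minimal sketch of the Z-deviate comparison described above, assuming the probabilities of correct response for both groups have already been estimated under the DSM at a common set of score levels; the array values, and the choice of Levene's test and a paired t-test, are illustrative stand-ins rather than the article's exact procedure:

```python
import numpy as np
from scipy import stats

# DSM-estimated probabilities of correct response on one item for the
# reference and focal groups, evaluated at the same grid of D-score
# levels (illustrative numbers, not from the article).
p_ref = np.array([0.15, 0.30, 0.48, 0.66, 0.81, 0.91])
p_foc = np.array([0.10, 0.22, 0.39, 0.58, 0.75, 0.88])

# Transform the probabilities into Z-scale normal deviates.
z_ref = stats.norm.ppf(p_ref)
z_foc = stats.norm.ppf(p_foc)

# Under no DIF, the two sets of deviates should share the same spread
# and location; test equal variances and equal means.
var_test = stats.levene(z_ref, z_foc)      # H0: equal variances
mean_test = stats.ttest_rel(z_ref, z_foc)  # H0: equal means (paired by level)

print(f"variance test p = {var_test.pvalue:.3f}")
print(f"mean test     p = {mean_test.pvalue:.3f}")
```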
This study presents a latent (item response theory-like) framework for a recently developed classical approach to test scoring, equating, and item analysis, referred to as the D-scoring method. Specifically, (a) person and item parameters are estimated under an item response function model on the D-scale (from 0 to 1) using marginal maximum-likelihood estimation, and (b) analytic expressions are provided for the item information function, test information function, and standard error of estimation for D-scores obtained under the proposed latent treatment of the D-scoring method. The results from a simulation study reveal very good recovery of item and person parameters via marginal maximum-likelihood estimation.
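The information and standard error quantities mentioned in (b) follow the usual IRT relations; a sketch in standard notation, with the DSM-specific form of the item response function \(P_i(D)\) on the D-scale left unspecified:

\[
I_i(D) = \frac{[P_i'(D)]^2}{P_i(D)\,[1 - P_i(D)]},
\qquad
I(D) = \sum_{i=1}^{n} I_i(D),
\qquad
SE(\hat{D}) \approx \frac{1}{\sqrt{I(\hat{D})}}.
\]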
Building on prior research on the relationships between key concepts in item response theory and classical test theory, this note contributes to highlighting their important and useful links. A readily and widely applicable latent variable modeling procedure is discussed that can be used for point and interval estimation of the individual person true score on any item in a unidimensional multicomponent measuring instrument or item set under consideration. The method adds to the body of research on the connections between classical test theory and item response theory.
A procedure for evaluation of validity-related coefficients and their differences is discussed, which is applicable when one or more frequently used assumptions in empirical educational, behavioral, and social research are violated. The method is developed within the framework of the latent variable modeling methodology and accomplishes point and interval estimation of convergent and discriminant correlations, as well as differences between them, in cases of incomplete data sets with data not missing at random, nonnormality, and clustering effects. The procedure uses the full information maximum likelihood approach to model fitting and parameter estimation, does not assume availability of multiple indicators for underlying latent constructs, includes auxiliary variables, and accounts for within-group correlations on main response variables resulting from nesting effects involving studied respondents.
This study presents new models for item response functions (IRFs) in the framework of the D-scoring method (DSM), which is gaining attention in the field of educational and psychological measurement and large-scale assessments. In previous work on the DSM, the IRFs of binary items were estimated using a logistic regression model (LRM). However, the LRM underestimates the item true scores at the top end of the D-scale (ranging from 0 to 1), especially for relatively difficult items.
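As a sketch of the baseline LRM approach that the new models improve on, one can fit a logistic regression of item correctness on the D-score; the data here are synthetic and the two-parameter logistic form is an assumption, not the article's exact specification:

```python
import numpy as np
from sklearn.linear_model import LogisticRegression

# Synthetic illustration: D-scores on the 0-1 scale and binary
# responses to a single item generated from a known logistic IRF
# (values and setup are illustrative, not the article's data).
rng = np.random.default_rng(0)
d = rng.uniform(0.0, 1.0, size=500)
p_true = 1.0 / (1.0 + np.exp(-(8.0 * d - 4.0)))  # generating IRF
u = rng.binomial(1, p_true)

# Baseline LRM estimate of the IRF: P(u = 1 | D) = logistic(b0 + b1*D).
# (Default regularization is left in place; this is only a sketch.)
lrm = LogisticRegression().fit(d.reshape(-1, 1), u)
grid = np.linspace(0.0, 1.0, 11).reshape(-1, 1)
print(np.round(lrm.predict_proba(grid)[:, 1], 3))  # estimated IRF on a grid
```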
This note highlights and illustrates the links between item response theory and classical test theory in the context of polytomous items. An item response modeling procedure is discussed that can be used for point and interval estimation of the individual true score on any item in a measuring instrument or item set following the popular and widely applicable graded response model. The method contributes to the body of research on the relationships between classical test theory and item response theory and is illustrated on empirical data.
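As a sketch of the quantity being estimated (standard GRM notation; the article's estimation details are not reproduced here): for an item \(j\) with ordered categories \(k = 0, \dots, m\), discrimination \(a_j\), and thresholds \(b_{jk}\), the individual true score is the model-implied expected item score,

\[
P(X_j \ge k \mid \theta) = \frac{1}{1 + \exp[-a_j(\theta - b_{jk})]},
\qquad
T_j(\theta) = E(X_j \mid \theta) = \sum_{k=1}^{m} P(X_j \ge k \mid \theta).
\]

A point estimate follows by evaluating \(T_j\) at the estimated \(\theta\), and an interval estimate can be propagated from the standard error of \(\hat{\theta}\).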
The D-scoring method for scoring and equating tests with binary items, proposed by Dimitrov, offers some of the advantages of item response theory, such as item-level difficulty information and score computation that reflects the item difficulties, while retaining the merits of classical test theory, such as the simplicity of number-correct score computation and relaxed sample-size requirements. Because of this unique combination of merits, the D-scoring method has seen quick adoption in the educational and psychological measurement field. Because item-level difficulty information is available with the D-scoring method and item difficulties are reflected in test scores, it conceptually makes sense to use the D-scoring method with adaptive test designs such as multistage testing (MST).
An approach to scoring tests with binary items, referred to as the D-scoring method, was previously developed as a classical analog to basic models in item response theory (IRT) for binary items. As some tests include polytomous items, this study offers an approach to D-scoring of such items and parallels the results with those obtained under the graded response model (GRM) for ordered polytomous items in the framework of IRT. The proposed design of using D-scoring with "virtual" binary items generated from polytomous items provides (a) ability scores that are consistent with their GRM counterparts and (b) item category response functions analogous to those obtained under the GRM.
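A minimal sketch of the "virtual" binary-item design described above, assuming the natural cumulative coding (virtual item k is passed when the response reaches category k); the function name and coding details are illustrative, not the article's exact scheme:

```python
import numpy as np

def to_virtual_binary(poly_responses, n_categories):
    """Split a polytomous item scored 0..m into m 'virtual' binary
    items, where virtual item k is passed iff the response is at
    least k (illustrative reconstruction of the design above)."""
    m = n_categories - 1
    poly = np.asarray(poly_responses)
    return np.column_stack([(poly >= k).astype(int) for k in range(1, m + 1)])

# A 5-category item (scores 0-4) becomes 4 virtual binary items.
print(to_virtual_binary([0, 2, 4, 3, 1], n_categories=5))
```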
Plausible values can be used to either estimate population-level statistics or compute point estimates of latent variables. While it is well known that five plausible values are usually sufficient for accurate estimation of population-level statistics in large-scale surveys, the minimum number of plausible values needed to obtain accurate latent variable point estimates is unclear. This is especially relevant when an item response theory (IRT) model is estimated with MCMC (Markov chain Monte Carlo) methods in Mplus and point estimates of the IRT ability parameter are of interest, as Mplus only estimates the posterior distribution of each ability parameter.
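A minimal sketch of the two uses of plausible values contrasted above; `pv` is a hypothetical examinee-by-draw matrix of plausible values exported from the estimated posteriors, with made-up numbers:

```python
import numpy as np

# Hypothetical matrix of plausible values: one row per examinee, one
# column per independent draw from that examinee's posterior
# (illustrative numbers, not from the study).
pv = np.array([[ 0.42,  0.55,  0.38,  0.61,  0.47],
               [-0.90, -1.10, -0.85, -1.02, -0.97],
               [ 1.25,  1.40,  1.18,  1.33,  1.29]])

# Population-level statistic: compute it within each draw, then
# average across draws -- five draws are typically enough here.
pop_mean = pv.mean(axis=0).mean()

# Individual point estimates: the mean of each examinee's own draws,
# an approximation to the posterior mean (EAP) whose accuracy with few
# draws is exactly the question the study addresses.
theta_hat = pv.mean(axis=1)

print(round(pop_mean, 3), np.round(theta_hat, 3))
```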
Bacteria colonizing the surface of orthopedic implants are responsible for most postoperative periprosthetic joint infections. A possible alternative route for drug delivery is described in this study by utilizing the bulk of the implant itself as a reservoir. Drug release is enabled by manufacturing of integrated permeable structures possessing high porosity through application of selective laser melting technology.
Educ Psychol Meas
October 2018
This article presents some new developments in the methodology of an approach to scoring and equating of tests with binary items, referred to as delta scoring (D-scoring), which is under piloting with large-scale assessments at the National Center for Assessment in Saudi Arabia. This presentation builds on previous work on delta scoring and adds procedures for scaling and equating, item response functions, and estimation of true values and standard errors of D-scores. Also, unlike previous work on this topic, where D-scoring involved estimates of item and person parameters in the framework of item response theory, the approach presented here does not require item response theory calibration.
A latent variable modeling method is outlined for studying measurement invariance when evaluating latent constructs with multiple binary or binary-scored items with no guessing. The approach extends the continuous-indicator procedure described by Raykov and colleagues, similarly utilizes the false discovery rate approach to multiple testing, and permits one to locate violations of measurement invariance in loading or threshold parameters. The discussed method does not require selection of a reference observed variable and is directly applicable for studying differential item functioning with one- or two-parameter item response models.
This article extends the procedure outlined in the article by Raykov, Marcoulides, and Tong for testing congruence of latent constructs to the setting of binary items and clustering effects. In this widely used setting in contemporary educational and psychological research, the method can be used to examine if two or more homogeneous multicomponent instruments with distinct components measure the same construct. The approach is useful in scale construction and development research as well as in construct validation investigations.
Educ Psychol Meas
December 2016
This article describes an approach to test scoring, referred to as delta scoring (D-scoring), for tests with dichotomously scored items. D-scoring uses information from item response theory (IRT) calibration to facilitate computations and interpretations in the context of large-scale assessments. The D-score is computed from the examinee's response vector, weighted by the expected difficulties (not "easiness") of the test items.
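On one plausible reading of that weighting (an assumption for illustration; the article's exact formula is not reproduced in this abstract), the D-score normalizes difficulty-weighted correct responses so that harder items contribute more credit:

\[
D \;=\; \frac{\sum_{i=1}^{n} \delta_i\, u_i}{\sum_{i=1}^{n} \delta_i},
\]

where \(u_i \in \{0, 1\}\) is the scored response to item \(i\), \(\delta_i\) is its expected difficulty from the IRT calibration, and \(D\) consequently ranges from 0 to 1.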
This article is concerned with developing a measure of general academic ability (GAA) for high school graduates who apply to colleges, as well as with identifying optimal weights of the GAA indicators in a linear combination that yields a composite score with maximal reliability and maximal predictive validity, within the framework of the popular latent variable modeling methodology. The approach is illustrated with data for 6,640 students majoring in Science and 3,388 students majoring in Art at colleges in Saudi Arabia. The indicators (observed measures) of the targeted GAA construct were selected from assessments that include the students' high school grades and their scores on two standardized tests developed by the National Center for Assessment in Higher Education in Saudi Arabia: the General Aptitude Test (GAT) and the Standardized Achievement Admission Test (SAAT).
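For context, a standard latent variable modeling result on maximal reliability (stated here from the general literature, not quoted from the article): under a unidimensional model with loadings \(\lambda_i\), error variances \(\theta_i\), and factor variance fixed at 1, the reliability of a weighted composite is maximized by

\[
w_i \;\propto\; \frac{\lambda_i}{\theta_i},
\qquad
\rho_{\max} \;=\; \frac{\sum_i \lambda_i^2/\theta_i}{1 + \sum_i \lambda_i^2/\theta_i}.
\]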
The purpose of this study was to examine the effects of program interventions in a school-based teen pregnancy program on hypothesized constructs underlying teens' attitudes toward sexuality. An important task related to this purpose was the validation of the constructs and their stability from pre- to postintervention measures. Data from 1,136 middle-grade students were obtained from an earlier evaluation of an abstinence-based teen pregnancy prevention program (S. …
Structural equation modeling (SEM) provides a dependable framework for testing differences among groups on latent variables (constructs, factors). The purpose of this article is to illustrate SEM-based testing for group mean differences on latent variables. Related procedures of confirmatory factor analysis and testing for measurement invariance across compared groups are also presented in the context of rehabilitation research.
Two frequently used parametric person-fit statistics for the dichotomous Rasch model (RM) are adjusted and compared to each other, and to their original counterparts, in terms of power to detect aberrant response patterns in short tests (10, 20, and 30 items). Specifically, the cube-root transformation of the mean square for the unweighted person-fit statistic, t, and the standardized likelihood-based person-fit statistic, Z3, were adjusted by estimating the probability of a correct item response through the use of symmetric functions in the dichotomous Rasch model. The results for simulated unidimensional Rasch data indicate that t and Z3 are consistently, yet not greatly, outperformed by their adjusted counterparts, denoted t* and Z3*, respectively.
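The adjustment described above replaces model probabilities conditional on ability with probabilities conditional on the observed total score; under the Rasch model this conditional probability has a standard closed form via elementary symmetric functions (a general Rasch identity, not a formula quoted from the article):

\[
P(U_i = 1 \mid R = r) \;=\; \frac{\varepsilon_i\,\gamma_{r-1}\!\big(\boldsymbol{\varepsilon}^{(i)}\big)}{\gamma_r(\boldsymbol{\varepsilon})},
\qquad \varepsilon_j = \exp(-b_j),
\]

where \(\gamma_r\) denotes the elementary symmetric function of order \(r\) of the item parameters \(\boldsymbol{\varepsilon}\), and \(\boldsymbol{\varepsilon}^{(i)}\) is the parameter vector with item \(i\) removed.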
This article provides a psychometric analysis of the performance of nursing students on categories of client needs (CN) and nursing process (NP) measured by the NLN Diagnostic Readiness Test (NLN-DRT) for RN licensure. While analyses of items and number-right score performance with NLN tests are well documented, the analysis of proficiency on categories that organize items at the conceptual level is limited to reporting basic classical statistics (e.g., …
This article provides formulas for expected true-score measures and the reliability of binary items as a function of their Rasch difficulty when the trait (ability) distribution is normal or logistic. The proposed formulas have theoretical value and can be useful in test development, score analysis, and simulation studies. Once the items are calibrated with the dichotomous Rasch model, one can estimate (without further data collection) the expected values of true-score measures (e.g., …
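In standard notation, the kind of quantity these formulas deliver is the marginal expected proportion correct on a Rasch item of difficulty \(b_i\) (the article's closed-form results are not reproduced here):

\[
E(P_i) \;=\; \int_{-\infty}^{\infty} \frac{e^{\theta - b_i}}{1 + e^{\theta - b_i}}\; \phi(\theta;\mu,\sigma^2)\, d\theta,
\]

with \(\phi\) the normal trait density; replacing \(\phi\) with a logistic density yields the companion case treated in the article.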
The article examines issues involved in comparing groups and measuring change with pretest and posttest data. Different pretest-posttest designs are presented in a manner that can help rehabilitation professionals to better understand and determine effects resulting from selected interventions. The reliability of gain scores in pretest-posttest measurement is also discussed in the context of rehabilitation research and practice.
Multivariate Behav Res
January 2003
Determining sources of item difficulty and using them for the selection or development of test items is a task that bridges psychometrics and cognitive psychology. A key problem in this task is the validation of hypothesized cognitive operations required for correct solution of test items. In previous research, the problem has frequently been addressed via use of the linear logistic test model for prediction of item difficulties.
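For reference, the linear logistic test model (LLTM) decomposes each Rasch item difficulty into a weighted sum of the difficulties of the hypothesized cognitive operations:

\[
b_i \;=\; \sum_{k=1}^{K} q_{ik}\,\eta_k + c,
\]

where \(q_{ik}\) is the known weight of operation \(k\) in item \(i\), \(\eta_k\) is the estimated difficulty of operation \(k\), and \(c\) is a normalization constant; validating the hypothesized operations then amounts to checking how well these reconstructed difficulties track the unconstrained Rasch estimates.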