The article compares the trajectories of students' self-reported test-taking effort during a 120-minute low-stakes large-scale assessment of English comprehension between a paper-and-pencil assessment (PPA) and a computer-based assessment (CBA). Test-taking effort was measured four times during the test. Using a within-subject design, each of the N = 2,676 German ninth-grade students completed half of the test in PPA mode and half in CBA mode, with the sequence of modes balanced across students.
In high-stakes testing, often multiple test forms are used and a common time limit is enforced. Test fairness requires that ability estimates must not depend on the administration of a specific test form. Such a requirement may be violated if speededness differs between test forms.
In many disciplines of the social sciences, comparing a group mean with the total mean is a common but challenging task. As one solution to this statistical testing problem, we propose using linear regression with weighted effect coding. For random samples, this procedure is straightforward and easy to implement by means of standard statistical software.
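As a rough illustration of the idea, the following minimal sketch shows how weighted effect coding makes the regression intercept equal to the total (sample) mean and each slope equal to a group's deviation from it. The function name `weighted_effect_codes`, the toy data, and the choice of reference group are assumptions made for this example and are not taken from the article; the sketch covers point estimates only, not the standard errors or significance tests the article is concerned with.

```python
import numpy as np

def weighted_effect_codes(groups, reference):
    """Build weighted effect-coded columns (hypothetical helper).

    Column j is 1 for members of group j, -n_j / n_ref for members of
    the reference group, and 0 for everyone else.
    """
    groups = np.asarray(groups)
    labels = [g for g in np.unique(groups) if g != reference]
    n_ref = np.sum(groups == reference)
    X = np.zeros((groups.size, len(labels)))
    for j, label in enumerate(labels):
        n_j = np.sum(groups == label)
        X[groups == label, j] = 1.0
        X[groups == reference, j] = -n_j / n_ref
    return labels, X

# Toy data (made up): three groups of unequal size.
groups = np.array(["a"] * 2 + ["b"] * 3 + ["c"] * 5)
y = np.array([1.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0, 10.0, 11.0])

labels, X = weighted_effect_codes(groups, reference="c")
design = np.column_stack([np.ones(len(y)), X])  # intercept + coded columns

# Ordinary least squares fit.
coef, *_ = np.linalg.lstsq(design, y, rcond=None)

# coef[0] is the total mean; coef[1:] are group-minus-total-mean differences.
print("total mean:", coef[0])
for label, b in zip(labels, coef[1:]):
    print(f"group {label} minus total mean:", b)
```

With unequal group sizes, unweighted effect coding would instead put the unweighted average of the group means in the intercept, which is why the weights matter here.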
This article examines the interdependency of two context effects that are known to occur regularly in large-scale assessments: item position effects and effects of test-taking effort on the probability of correctly answering an item. A microlongitudinal design was used to measure test-taking effort over the course of a 60-minute large-scale assessment. Two components of test-taking effort were investigated: initial effort and change in effort.
Educ Psychol Meas
December 2015
The selection of an appropriate booklet design is an important element of large-scale assessments of student achievement. Two design properties that are typically optimized are the balance with respect to the positions in which the items are presented and the balance with respect to the mutual occurrence of pairs of items in the same booklet. The purpose of this study is to investigate the effects of these two design properties on bias and root mean square error of item parameter estimates from the Rasch model.
Multiple matrix designs are commonly used in large-scale assessments to distribute test items to students. These designs comprise several booklets, each containing a subset of the complete item pool. Besides reducing the test burden of individual students, using various booklets allows aligning the difficulty of the presented items to the assumed performance level of examined subgroups.
Objective: It has been suggested that women have better face recognition memory than men. Here we analyzed whether this advantage depends on better encoding or consolidation of information, and whether the advantage is visible only in short-term memory (STM) or also remains evident in long-term memory (LTM).
Method: We tested short- and long-term face recognition memory in 36 nonclinical participants (19 women).