Standardized testing is ubiquitous in educational assessment, but questions have been raised about the extent to which these test scores accurately reflect students' genuine knowledge and skills. To more rigorously investigate this issue, the current study employed a within-subject experimental design to examine item format effects on primary school students' standardized assessment results in literacy, reading comprehension, and numeracy. Eighty-nine Grade 3 students (ages 8-9 years) completed tests that varied only in item format: multiple choice; open-ended; error detection and correction; explain; and, for numeracy questions, low literacy. Analyses contrasted students' performance across these conditions, as well as item response theory-derived item difficulty and ability discrimination estimates. Findings revealed that difficulty increased and accuracy decreased from multiple-choice to open-response to error-correction and explain questions. However, the most difficult item formats tended to yield the greatest discrimination across student ability levels. Despite previous findings to the contrary, low-literacy numeracy questions did not improve student performance or reduce item difficulty. Overall, findings indicated the impact of differing methods of assessment on standardized test performance and highlighted the need for careful consideration of not only the content of assessments but also their approaches to assessment. (PsycINFO Database Record (c) 2020 APA, all rights reserved).
DOI: http://dx.doi.org/10.1037/spq0000340
J Clin Aesthet Dermatol
January 2025
Drs. O'Connor, De La Garza, and Vashi are with the Department of Dermatology at Boston University in Boston, Massachusetts.
Objective: The authors sought to examine the association between the increased use of videoconferencing and image-enhancing software filters during the COVID-19 pandemic and the corresponding rise in interest in cosmetic procedures. We aimed to discern if heightened exposure to one's digital self-image correlates with a greater inclination toward cosmetic interventions.
Methods: In this cross-sectional questionnaire survey study, adult participants living in the United States were recruited on Amazon's Mechanical Turk platform.
Br J Sports Med
January 2025
Department of Clinical Medicine, Aarhus University, Aarhus, Midtjylland, Denmark.
Objective: Validated patient-reported outcome measures for patients with hip-abductor tendon pathology are lacking. Recent studies indicate that the Copenhagen Hip and Groin Outcome Score (HAGOS), in the original or the revised scoring format, and the Oxford Hip Score (OHS) may be relevant for use in patients with hip-abductor tendon pathology. The objective of this study was to assess the validity of the Danish version of the HAGOS, the revised HAGOS, and the OHS psychometrically in patients with hip-abductor tendon pathology.
JMIR Form Res
January 2025
Department of Computer Science, University Hospital of Geneva, Geneva, Switzerland.
Background: Mobile health apps have shown promising results in improving self-management of several chronic diseases in patients. We have developed a mobile health app (Cardiomeds) dedicated to patients with heart failure (HF). This app includes an interactive medication list; daily self-monitoring of symptoms, weight, blood pressure, and heart rate; and educational information on HF delivered through various formats.
Behav Sci (Basel)
November 2024
Education & Training Evaluation Commission, Riyadh 12395, Saudi Arabia.
This study had two purposes: (1) to develop a forced-choice personality inventory to assess student personality characteristics based on the five-factor (FFM) personality model and (2) to examine its factor structure via the Thurstonian Item Response Theory (TIRT) approach based on Thurstone's law of comparative judgment. A total of 200 items were generated to represent the five dimensions, and through Principal Axis Factoring and the composite reliability index, a final pool of 75 items was selected. These items were then organized into 25 blocks, each containing three statements (triplets) designed to balance social desirability across the blocks.