In speaking assessment, many factors, such as test-taker characteristics, test tasks, and rating bias, may affect test-takers' speaking performance. Moreover, instability in raters' ratings of a speaking test might threaten its reliability, validity, and fairness, which calls for longitudinal construct validation of the test. This study explores the construct validity of PRETCO-Oral through analysis of data from several sources, including longitudinal ratings of test-taker performances across four occasions and raters' and test-takers' perceptions of the PRETCO-Oral construct. The results indicate that raters' ratings remain stable and that the PRETCO-Oral assessment demonstrates longitudinal reliability; the Interpretation and Presentation tasks account for a large share of the construct variance, whereas the Reading Aloud and Question and Answer tasks appear construct-underrepresented, as evidenced by analysis of raters' and test-takers' perceptions of the test construct. Finally, factors that threaten construct representation are also discussed.
Download full-text PDF | Source
---|---
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9322943 | PMC
http://dx.doi.org/10.3389/fpsyg.2022.896453 | DOI Listing
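Purely as an illustration of the kind of longitudinal rating-stability check described in the abstract above (not the method or data reported in the study), a two-way intraclass correlation across rating occasions could be sketched as follows; the score matrix, the 0-100 scale, and the choice of ICC(2,1) are assumptions made for this example.

```python
# Illustrative sketch only: ICC(2,1) as one possible index of rating stability
# across occasions. The ratings below are invented; the PRETCO-Oral study's
# actual data and analytic approach may differ.
import numpy as np

def icc_2_1(scores: np.ndarray) -> float:
    """ICC(2,1) for an n_subjects x n_occasions (or n_raters) score matrix."""
    n, k = scores.shape
    grand = scores.mean()
    row_means = scores.mean(axis=1)        # per test-taker
    col_means = scores.mean(axis=0)        # per occasion
    ss_rows = k * ((row_means - grand) ** 2).sum()
    ss_cols = n * ((col_means - grand) ** 2).sum()
    ss_total = ((scores - grand) ** 2).sum()
    ss_err = ss_total - ss_rows - ss_cols
    ms_rows = ss_rows / (n - 1)
    ms_cols = ss_cols / (k - 1)
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)

# Hypothetical ratings: 5 test-takers scored on 4 occasions (0-100 scale).
ratings = np.array([
    [72, 74, 71, 73],
    [65, 63, 66, 64],
    [88, 90, 87, 89],
    [55, 57, 56, 54],
    [79, 78, 80, 81],
], dtype=float)

print(f"ICC(2,1) across occasions: {icc_2_1(ratings):.3f}")
```

Values near 1 would indicate that test-takers are ordered consistently across the four occasions, i.e., stable ratings in the sense sketched here.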
BMC Med Educ
September 2024
Faculty of Educational Sciences, Faculty of Medicine, University of Oslo, Oslo, Norway.
Background: The ability of experts' item difficulty ratings to predict test-takers' actual performance is an important aspect of licensure examinations. Expert judgment serves as a primary source of information for making a priori decisions about the pass rate of test-takers. The nature of the raters involved in predicting item difficulty is central to setting credible standards.
GMS J Med Educ
August 2024
Acuity Insights, Toronto, Canada.
Objectives: Situational Judgement Tests (SJTs) are a cost-efficient method for assessing personal characteristics (e.g., empathy, professionalism, ethical thinking) in medical school admissions.
Front Psychol
July 2022
School of English Studies, Shanghai International Studies University, Shanghai, China.
In speaking assessment, many factors, such as test-taker characteristics, test tasks, and rating bias, may affect test-takers' speaking performance. Moreover, instability in raters' ratings of a speaking test might threaten its reliability, validity, and fairness, which calls for longitudinal construct validation of the test.
Heliyon
March 2020
Open Way Solutions LLC, 36 Fruit Street, Northampton MA, 01060, USA.
We describe research on the validity of a new theoretical framework and scoring methodology, called STAGES, for sentence completion tests of meaning-making maturity or complexity (also called ego development or perspective-taking capacity). STAGES builds on Jane Loevinger's substantially validated Washington University Sentence Completion Test, as updated by Susanne Cook-Greuter. STAGES proposes an underlying structural explanation for the Cook-Greuter system based on three dimensions.
PeerJ Comput Sci
August 2019
Faculty of Computers and Information, Computer Science Department, Cairo University, Cairo, Egypt.
Background: Written composition is a significant component of measuring test-takers' ability in any language exam. However, the assessment (scoring) of these compositions or essays is very challenging in terms of reliability and time. The demand for objective and rapid scoring has motivated the development of computer systems that can automatically grade essays written in response to specific prompts.
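As a minimal, purely illustrative sketch of prompt-specific automated essay scoring (not the system described in the article above), one could fit a regression model to human-scored essays for a given prompt and use it to score new responses; the example essays, scores, and the choice of TF-IDF features with ridge regression are assumptions made for this example.

```python
# Illustrative sketch only: fit a simple regression scorer on essays already
# scored by human raters for one prompt, then predict scores for new essays.
# All data here are invented.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

# Hypothetical training essays for one prompt, with human-assigned scores (0-10).
train_essays = [
    "The author argues that online learning widens access to education.",
    "Online learning good because is cheap and flexible for student.",
    "Distance education removes geographic barriers but demands self-discipline.",
    "I think school online is nice.",
]
train_scores = [8.0, 4.5, 9.0, 3.0]

scorer = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),  # word and bigram features
    Ridge(alpha=1.0),                               # regularized linear scorer
)
scorer.fit(train_essays, train_scores)

new_essay = ["Online courses broaden access, yet learners must manage their own time."]
print(f"Predicted score: {scorer.predict(new_essay)[0]:.1f}")
```

In practice, such prompt-specific models are typically trained on much larger rated samples and evaluated against human raters' scores, for example with agreement statistics such as quadratic weighted kappa.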