In speaking assessment, many factors, such as test-taker characteristics, test tasks, and rating bias, may affect test-takers' speaking performance. Moreover, instability in raters' ratings of a speaking test can threaten its reliability, validity, and fairness, which calls for longitudinal construct validation of the test. This study explores the construct validity of PRETCO-Oral through analysis of data retrieved from multiple sources, including longitudinal ratings of test-takers' performances across four occasions and both raters' and test-takers' perceptions of the PRETCO-Oral construct. The results indicate that raters' ratings remain stable and that the PRETCO-Oral assessment demonstrates longitudinal reliability; the Interpretation and Presentation tasks account for a large proportion of the variance in the construct, while the Reading Aloud and Question and Answer tasks appear to under-represent the construct, as evidenced by the rater and test-taker perception data. Finally, factors that threaten construct representation are also discussed.
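The abstract does not name the statistic used to gauge rating stability; a common choice for a fully crossed rater-by-test-taker design is a two-way intraclass correlation. Below is a minimal sketch of ICC(2,1) in Python, under that assumption, with purely illustrative data (not from the study):

```python
import numpy as np

def icc_2_1(x: np.ndarray) -> float:
    """Two-way random-effects ICC(2,1) for an n_targets x k_raters matrix."""
    n, k = x.shape
    grand = x.mean()
    ms_rows = k * ((x.mean(axis=1) - grand) ** 2).sum() / (n - 1)  # between test-takers
    ms_cols = n * ((x.mean(axis=0) - grand) ** 2).sum() / (k - 1)  # between raters
    ss_err = ((x - grand) ** 2).sum() - (n - 1) * ms_rows - (k - 1) * ms_cols
    ms_err = ss_err / ((n - 1) * (k - 1))
    return (ms_rows - ms_err) / (ms_rows + (k - 1) * ms_err + k * (ms_cols - ms_err) / n)

# Hypothetical scores: 5 test-takers rated by 3 raters on one occasion;
# repeating this per occasion would track stability across the four waves.
scores = np.array([
    [4.0, 4.5, 4.0],
    [3.0, 3.0, 3.5],
    [5.0, 4.5, 5.0],
    [2.5, 3.0, 2.5],
    [4.0, 4.0, 4.5],
])
print(f"ICC(2,1) = {icc_2_1(scores):.2f}")
```

Values near 1 on all four occasions would be consistent with the stability claim; an analogous variance decomposition over the four task types would speak to construct representation.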

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9322943
DOI: http://dx.doi.org/10.3389/fpsyg.2022.896453

Publication Analysis

Top Keywords

raters test-takers (12); construct pretco-oral (8); test-takers test (8); speaking test (8); test-takers (6); construct (6); interrogating construct (4); pretco-oral (4); longitudinal (4); pretco-oral longitudinal (4)

Similar Publications

Background: The ability of experts' item difficulty ratings to predict test-takers' actual performance is an important aspect of licensure examinations. Expert judgment is used as a primary source of information for making prior decisions that determine the pass rate of test-takers. The nature of the raters involved in predicting item difficulty is therefore central to setting credible standards.
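This snippet frames difficulty prediction as an empirical question; a simple way to check it (purely illustrative, not the publication's own method) is to correlate experts' predicted difficulty with each item's observed proportion-correct:

```python
import numpy as np

# Hypothetical data: expert-predicted difficulty (1 = easy .. 5 = hard)
# and observed proportion-correct for the same ten items.
predicted = np.array([1, 2, 2, 3, 3, 3, 4, 4, 5, 5], dtype=float)
p_correct = np.array([0.95, 0.88, 0.84, 0.71, 0.66, 0.69, 0.52, 0.47, 0.31, 0.28])

# Harder items should have lower proportion-correct, so a strong
# *negative* Pearson correlation indicates good expert prediction.
r = np.corrcoef(predicted, p_correct)[0, 1]
print(f"Pearson r between rated and observed difficulty: {r:.2f}")
```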

Objectives: Situational Judgement Tests (SJTs) are a cost-efficient method for the assessment of personal characteristics (e.g., empathy, professionalism, ethical thinking) in medical school admission.

We describe research on the validity of a new theoretical framework and scoring methodology, called STAGES, for sentence completion tests of meaning-making maturity or complexity (also called ego development or perspective-taking capacity). STAGES builds upon research on the substantially validated Washington University Sentence Completion Test of Jane Loevinger as updated by Susanne Cook-Greuter. STAGES proposes an underlying structural explanation for the Cook-Greuter system based on three dimensions.

Automated language essay scoring systems: a literature review.

PeerJ Comput Sci

August 2019

Faculty of Computers and Information, Computer Science Department, Cairo University, Cairo, Egypt.

Background: Writing composition is a significant factor in measuring test-takers' ability in any language exam. However, scoring these compositions or essays is a challenging process in terms of reliability and time. The demand for objective and fast scores has motivated computer systems that can automatically grade essays targeting specific prompts.
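To make the scoring task concrete, here is a minimal baseline sketch of the kind of system such reviews survey: TF-IDF features feeding a ridge regressor via scikit-learn. The essays, scores, and hyperparameters below are hypothetical, and real systems add far richer features.

```python
# A minimal automated-essay-scoring baseline: TF-IDF features with
# ridge regression. Illustrative only; essays and scores are made up.
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import Ridge
from sklearn.pipeline import make_pipeline

train_essays = [
    "The experiment shows that plants grow faster with more light.",
    "plant grow light good",
    "Light exposure increases growth rate, as the data demonstrate.",
]
train_scores = [4.0, 1.0, 5.0]  # human ratings on a 1-5 scale

model = make_pipeline(
    TfidfVectorizer(ngram_range=(1, 2), min_df=1),  # word + bigram features
    Ridge(alpha=1.0),                               # regularized linear scorer
)
model.fit(train_essays, train_scores)

print(model.predict(["More light makes plants grow faster."]))
```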
