Applying the Many-Facet Rasch Measurement Model to Explore Reviewer Ratings of Conference Proposals.

J Appl Meas

Michael R. Peabody, American Board of Family Medicine, 1648 McGrathiana Pkwy., Lexington, KY 40503, USA.

Published: February 2017

For academic conferences, submitted proposals are typically judged on identified criteria, using a rating scale, by reviewers who share interest and expertise in the area under consideration. Given the multiple and varied reviewers, an analysis of psychometric properties such as rater severity and consistency is important. However, many of the problems that plague the conference proposal selection process are the same issues that plague survey research: rater bias/severity, misuse of the rating scale, and the use of raw scores as measures. We propose the use of the many-facet Rasch measurement model (MFRM) to address these shortcomings and improve the quality of the conference proposal selection process. A set of American Educational Research Association (AERA) Special Interest Group (SIG) proposals is used as an example. The results identify proposals that were accepted based on the mean of summed raw scores, but when the MFRM is applied to adjust for judge severity, the rank order of the proposals is substantially altered.
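For context, a common rating scale formulation of the MFRM (the parameterization used in Linacre's Facets program; the paper's exact specification may differ) models the log-odds that judge j awards proposal n category k rather than k-1 on criterion i as:

\[
\log\left(\frac{P_{nijk}}{P_{nij(k-1)}}\right) = B_n - D_i - C_j - F_k
\]

where B_n is the quality measure of proposal n, D_i the difficulty of criterion i, C_j the severity of judge j, and F_k the Rasch-Andrich threshold between adjacent categories. Because judge severity C_j is an explicit parameter, the estimated proposal measures B_n are freed from how harsh or lenient each judge happened to be, which is why an MFRM-based ranking can diverge from a ranking of raw score means.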

Similar Publications

Introduction: Ensuring examiner equivalence across distributed assessment locations is a priority within distributed Objective Structured Clinical Exams (OSCEs) but is challenging as examiners are typically fully nested within locations (i.e. no overlap in performances seen by different groups of examiners).

In performance-based language assessment, a valid and reliable scoring rubric is crucial for minimizing the measurement error that threatens the rating process. Rubric validation based solely on qualitative data is unsatisfactory because it lacks empirical evidence. Thus, drawing on the Assessment Use Argument (AUA) Validation Framework, this study seeks evidence for the claim that Primary Trait Writing (PTW) rubrics for students' self-assessment activities are relevant to the construct being measured.

Introduction: The COVID-19 pandemic necessitated rapid adaptation of clinical competence assessments, including the transition of Objective Structured Clinical Examinations (OSCE) from in-person to virtual formats. This study investigates the construct equivalence of a high-stakes OSCE, originally designed for in-person delivery, when adapted for a virtual format.

Methods: A retrospective analysis was conducted using OSCE scores from the Internationally Educated Nurse Competency Assessment Program (IENCAP®).

Objective structured clinical examinations (OSCEs) are a widely used performance assessment for medical and dental students. A common limitation of OSCEs is that the evaluation results depend on the characteristics of the raters and the scoring rubric. To overcome this limitation, item response theory (IRT) models such as the many-facet Rasch model have been proposed to estimate examinee abilities while taking into account the characteristics of the raters and of the evaluation items in a rubric.
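As a rough illustration of what such models compute, the Python sketch below evaluates category probabilities for a single examinee-rater-item encounter under a rating scale many-facet Rasch model. All numbers are hypothetical and the function name is ours; real analyses estimate ability, severity, difficulty, and thresholds jointly from the full rating matrix (e.g., with the Facets program).

```python
import numpy as np

def mfrm_category_probs(ability, severity, difficulty, thresholds):
    """Category probabilities under a rating scale many-facet Rasch model.

    The model sets log(P_k / P_{k-1}) = ability - severity - difficulty - F_k,
    so each category's log-probability (up to a normalizing constant) is the
    cumulative sum of these adjacent-category logits.
    """
    adjacent_logits = ability - severity - difficulty - np.asarray(thresholds)
    log_p = np.concatenate(([0.0], np.cumsum(adjacent_logits)))
    p = np.exp(log_p - log_p.max())  # subtract max for numerical stability
    return p / p.sum()

# Hypothetical 0-4 rating scale with evenly spaced thresholds.
thresholds = [-1.5, -0.5, 0.5, 1.5]
lenient = mfrm_category_probs(1.0, severity=-0.5, difficulty=0.0, thresholds=thresholds)
severe = mfrm_category_probs(1.0, severity=0.5, difficulty=0.0, thresholds=thresholds)
print(lenient.round(3))  # mass shifts toward higher categories
print(severe.round(3))   # a severe rater pulls mass toward lower categories
```

The same examinee faces visibly different score distributions under the two raters, which is the distortion these models remove by estimating ability with rater severity as an explicit facet.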

Influence of pairing in examiner leniency and stringency ('hawk-dove effect') in part II of the European Diploma of Anaesthesiology and Intensive Care: A cohort study.

Eur J Anaesthesiol

December 2024

From the Department of Anaesthesia, ITU and Pain Management, Mater Dei Hospital, Msida, Malta (SS), Department of Anaesthesiology, Erasmus University Medical Centre, Rotterdam, the Netherlands (MK), European Society of Anaesthesiology and Intensive Care, Brussels, Belgium (MK, BA, HS, RDL, JBE), Department of Anaesthesia, University Hospital of Wales, Cardiff, UK (BA), Institute for Medical Education, University of Bern, Bern, Switzerland (JBE), CINTESIS@RISE - Centre for Health Technology and Services Research, Porto, Portugal (JBE) and Institute of Anaesthesiology and Intensive Care, Salemspital, Hirslanden Medical Group, Bern, Switzerland (JBE).

Article Synopsis
  • The study examines the impact of examiner pairing on grading variances in the EDAIC Part II examination, focusing on the leniency and strictness of different examiner pairs.
  • Utilizing data from 325 examiners over three years, the research reveals that most examiner pairs had only slight differences in scoring, indicating a general consistency in leniency.
  • The findings highlight the potential 'hawk-dove effect', suggesting that different examiner combinations can significantly affect candidate performance and outcomes in the exam.