Purpose: The forthcoming UK Medical Licensing Assessment will require all medical schools in the UK to ensure that their students pass an appropriately designed Clinical and Professional Skills Assessment (CPSA) prior to graduation and registration with a licence to practise medicine. The requirements for the CPSA will be set by the General Medical Council, but individual medical schools will be responsible for implementing their own assessments. It is therefore important that assessors from different medical schools across the UK agree on what standard of performance constitutes a fail, pass or good grade.

Methods: We used an experimental, video-based, single-blinded, randomised, internet-based design. We created videos of simulated student performances of a clinical examination at four scripted standards: clear fail (CF), borderline (BD), clear pass (CPX) and good (GD). Assessors from ten regions across the UK were randomly assigned to watch five videos in 12 different combinations and asked to give competence domain scores and an overall global grade for each simulated candidate. Inter-rater agreement for the total domain scores was measured by the intraclass correlation coefficient (ICC), calculated from a two-way random-effects model for absolute agreement.
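As an illustration of the statistic named above, the following is a minimal sketch of a single-measure ICC from a two-way random-effects model for absolute agreement, often written ICC(2,1). It assumes a complete matrix in which every assessor scores every video; the abstract does not state whether the single-measure or average-measure form was used, nor how the 12 video combinations were handled. The function name `icc_2_1` and the scores are hypothetical and are not taken from the study.

```python
import numpy as np


def icc_2_1(scores):
    """Single-measure ICC, two-way random effects, absolute agreement.

    scores: 2-D array-like, rows = rated performances (videos),
            columns = assessors. Assumes no missing cells.
    """
    x = np.asarray(scores, dtype=float)
    n, k = x.shape  # n performances, k assessors
    grand = x.mean()

    # Sums of squares from the two-way ANOVA decomposition
    ss_rows = k * ((x.mean(axis=1) - grand) ** 2).sum()  # between performances
    ss_cols = n * ((x.mean(axis=0) - grand) ** 2).sum()  # between assessors
    ss_err = ((x - grand) ** 2).sum() - ss_rows - ss_cols  # residual

    msr = ss_rows / (n - 1)
    msc = ss_cols / (k - 1)
    mse = ss_err / ((n - 1) * (k - 1))

    # Absolute agreement: systematic assessor differences (msc) count as error
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Hypothetical total domain scores: 5 videos rated by 3 assessors
scores = [
    [8, 9, 8],
    [14, 15, 13],
    [20, 19, 21],
    [25, 26, 25],
    [15, 14, 16],
]
print(round(icc_2_1(scores), 3))
```

Because the assessor mean square appears in the denominator, an assessor who is consistently harsher or more lenient than the others lowers this coefficient, which is what distinguishes absolute agreement from consistency.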

Results: 120 assessors enrolled in the study, with 98 eligible for analysis. The ICC was 0.93 (95% CI 0.81-0.99). The mean percentage agreement with the scripted global grade was 74.4% (range 40.8-96.9%).
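The percentage agreement reported above can be read as the share of assessors whose global grade matched the grade the video was scripted to show. The sketch below computes that share for one video; the function name `percent_agreement` and the grades are hypothetical, and the way the study aggregated agreement across videos to obtain its mean and range is an assumption not confirmed by the abstract.

```python
def percent_agreement(assigned, scripted):
    """Percentage of assessor grades that match the scripted grade.

    assigned: list of global grades given by assessors, e.g. "CF", "BD".
    scripted: the single grade the simulated performance was scripted at.
    """
    if not assigned:
        raise ValueError("at least one assessor grade is required")
    matches = sum(1 for grade in assigned if grade == scripted)
    return 100.0 * matches / len(assigned)


# Hypothetical grades from four assessors for a video scripted as borderline
grades = ["BD", "BD", "CPX", "BD"]
print(percent_agreement(grades, "BD"))
```

Averaging this value over all videos would give one mean figure, and the lowest and highest per-video values would give a range.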

Conclusions: The inter-rater agreement amongst assessors across the UK when rating simulated candidates performing at scripted levels is excellent. The level of agreement on the overall global performance level for simulated candidates is also high. These findings suggest that assessors from across the UK viewing the same simulated performances show high levels of agreement on the standards expected of students at a 'clear fail', 'borderline', 'clear pass' and 'good' level.

Source
http://dx.doi.org/10.1080/0142159X.2020.1845909
