Reliability and Task Effects in CAPE-V Auditory-Perceptual Voice Assessments: Insights From the PVQD Subset.

J Voice

École d'orthophonie et d'audiologie, Faculté de médecine, Université de Montréal, Montréal, Québec, Canada.

Published: March 2025

Objectives: This study aimed to evaluate the inter- and intra-rater reliability of Consensus Auditory-Perceptual Evaluation of Voice (CAPE-V) ratings and to explore task-specific differences (sustained vowels versus sentences) in ratings and reliability.

Study Design: Cross-sectional reliability study using a curated subset of dysphonic voice samples from the Perceptual Voice Qualities Database (PVQD).

Methods: Thirty voice samples representing varying dysphonia severities were selected from the Perceptual Voice Qualities Database. Eight Quebecois speech-language pathologists (SLPs) rated the samples using the CAPE-V protocol on the Bridge2Practice platform. Ratings included six vocal features on a visual analog scale (VAS) and binary consistency (C/I) judgments. Reliability was assessed using intra-class correlation coefficients (ICCs) for VAS ratings and Gwet's AC1 for C/I ratings. Task effects were analyzed using Wilcoxon signed-rank tests and Spearman correlations.
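The two reliability statistics named in the Methods can be illustrated with a minimal sketch. The abstract does not state which ICC form was used, so the two-way random-effects, absolute-agreement, single-rater form (often written ICC(2,1)) is an assumption here, as are the function names and the toy ratings. Gwet's AC1 is computed with the standard multi-rater formula for categorical ratings.

```python
from statistics import mean


def gwet_ac1(ratings, categories=("C", "I")):
    """Gwet's AC1 for categorical ratings.

    ratings: one list per voice sample, holding one label per rater
    (at least two raters per sample).
    """
    q = len(categories)
    n = len(ratings)
    agreement_sum = 0.0
    prevalence_sum = {c: 0.0 for c in categories}
    for row in ratings:
        r = len(row)
        counts = {c: row.count(c) for c in categories}
        # Share of rater pairs that agree on this sample.
        agreement_sum += sum(v * (v - 1) for v in counts.values()) / (r * (r - 1))
        for c in categories:
            prevalence_sum[c] += counts[c] / r
    pa = agreement_sum / n  # observed agreement
    # Chance agreement under Gwet's model.
    pe = sum((p / n) * (1 - p / n) for p in prevalence_sum.values()) / (q - 1)
    return (pa - pe) / (1 - pe)


def icc_2_1(scores):
    """ICC(2,1): two-way random effects, absolute agreement, single rater.

    scores: one list per voice sample, holding one VAS score per rater.
    The choice of ICC form is an assumption, not taken from the study.
    """
    n, k = len(scores), len(scores[0])
    grand = mean(x for row in scores for x in row)
    row_means = [mean(row) for row in scores]
    col_means = [mean(row[j] for row in scores) for j in range(k)]
    ss_rows = k * sum((m - grand) ** 2 for m in row_means)
    ss_cols = n * sum((m - grand) ** 2 for m in col_means)
    ss_total = sum((x - grand) ** 2 for row in scores for x in row)
    ss_error = ss_total - ss_rows - ss_cols
    msr = ss_rows / (n - 1)               # between-sample mean square
    msc = ss_cols / (k - 1)               # between-rater mean square
    mse = ss_error / ((n - 1) * (k - 1))  # residual mean square
    return (msr - mse) / (msr + (k - 1) * mse + k * (msc - mse) / n)


# Toy data for illustration only (not from the study).
consistency = [["C", "C"], ["C", "I"], ["I", "I"], ["C", "C"]]
vas_scores = [[12, 15, 10], [48, 55, 50], [80, 77, 85], [30, 41, 35]]
print(round(gwet_ac1(consistency), 3))
print(round(icc_2_1(vas_scores), 3))
```

AC1 is often preferred over Cohen's kappa for binary judgments such as C/I because it is less sensitive to skewed category prevalence.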

Results: Overall severity ratings demonstrated good inter-rater reliability for both vowels (ICC = 0.79) and sentences (ICC = 0.87). Pitch and loudness ratings showed low inter-rater reliability (ICCs < 0.5) across tasks. Vowels were rated as more impaired for most features, except strain, which showed higher impairment on sentences. Inter-rater reliability was higher for roughness and breathiness on vowels, whereas strain showed better reliability on sentences. Intra-rater reliability was consistently higher on sentences for all features (ICCs > 0.75 for most). Consistency ratings were more reliable on vowels than sentences for most features, except loudness.
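The vowel-versus-sentence comparisons reported above rest on paired tests. The following is a minimal sketch of how such a comparison could be computed, assuming one mean score per voice sample for each task. The function names and the scores are hypothetical, and the p-value uses a normal approximation without tie or continuity correction, which is only a rough guide for small samples.

```python
import math


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for t in range(i, j + 1):
            ranks[order[t]] = avg
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the average ranks."""
    rx, ry = average_ranks(x), average_ranks(y)
    mx, my = sum(rx) / len(rx), sum(ry) / len(ry)
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = math.sqrt(sum((a - mx) ** 2 for a in rx))
    sy = math.sqrt(sum((b - my) ** 2 for b in ry))
    return cov / (sx * sy)


def wilcoxon_signed_rank(x, y):
    """Return (W+, W-, approximate two-sided p) for paired samples.

    Zero differences are dropped before ranking.
    """
    diffs = [a - b for a, b in zip(x, y) if a != b]
    ranks = average_ranks([abs(d) for d in diffs])
    w_plus = sum(r for r, d in zip(ranks, diffs) if d > 0)
    w_minus = sum(r for r, d in zip(ranks, diffs) if d < 0)
    n = len(diffs)
    mu = n * (n + 1) / 4
    sigma = math.sqrt(n * (n + 1) * (2 * n + 1) / 24)
    z = (min(w_plus, w_minus) - mu) / sigma
    p = math.erfc(abs(z) / math.sqrt(2))  # two-sided normal approximation
    return w_plus, w_minus, p


# Hypothetical per-sample severity scores (0-100 VAS), not study data.
vowel = [40, 55, 62, 30, 71, 48]
sentence = [35, 50, 65, 22, 60, 47]
print(wilcoxon_signed_rank(vowel, sentence))
print(spearman_rho(vowel, sentence))
```

A larger W+ than W- in this layout would indicate that vowels tend to receive higher (more impaired) scores than sentences for the same voice.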

Conclusions: Task type significantly impacts CAPE-V ratings and their reliability. Vowels provided higher inter-rater reliability for roughness and breathiness, while sentences yielded better intra-rater consistency and strain reliability. These findings highlight the need for ongoing refinement of assessment tools and training protocols to ensure accurate and reliable voice evaluations.


Source
http://dx.doi.org/10.1016/j.jvoice.2025.02.020


