Background: Assessment of the quality of medical evidence available on the web is a critical step in the preparation of systematic reviews. Existing tools that automate parts of this task validate the quality of individual studies but not of entire bodies of evidence and focus on a restricted set of quality criteria.

Objective: We proposed a quality assessment task that provides an overall quality rating for each body of evidence (BoE), as well as finer-grained justification for different quality criteria according to the Grading of Recommendations, Assessment, Development, and Evaluation (GRADE) framework. For this purpose, we constructed a new data set and developed a machine learning baseline system (EvidenceGRADEr).

Methods: We algorithmically extracted quality-related data from all summaries of findings found in the Cochrane Database of Systematic Reviews. Each BoE was defined by a set of population, intervention, comparison, and outcome criteria and assigned a quality grade (high, moderate, low, or very low) together with quality criteria (justification) that influenced that decision. Different statistical data, metadata about the review, and parts of the review text were extracted as support for grading each BoE. After pruning the resulting data set with various quality checks, we used it to train several neural-model variants. The predictions were compared against the labels originally assigned by the authors of the systematic reviews.
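As an illustration of the record structure described in the Methods, the following is a minimal sketch of how one BoE could be represented. The class name, field names, and example values are hypothetical and are not taken from the EvidenceGRADEr code or data set; only the 4 quality grades and the 5 quality criteria come from the abstract.

```python
from dataclasses import dataclass
from typing import FrozenSet

# The 4 quality grades and 5 quality criteria named in the abstract.
GRADES = ("high", "moderate", "low", "very low")
CRITERIA = frozenset({
    "risk of bias",
    "imprecision",
    "inconsistency",
    "indirectness",
    "publication bias",
})


@dataclass(frozen=True)
class BodyOfEvidence:
    """Hypothetical container for one BoE (field names are assumptions)."""

    population: str
    intervention: str
    comparison: str
    outcome: str
    grade: str
    # Quality criteria given as justification for the assigned grade.
    justification: FrozenSet[str] = frozenset()

    def __post_init__(self) -> None:
        # Reject grades and criteria outside the sets named in the abstract.
        if self.grade not in GRADES:
            raise ValueError(f"unknown grade: {self.grade!r}")
        unknown = self.justification - CRITERIA
        if unknown:
            raise ValueError(f"unknown criteria: {sorted(unknown)}")


# Invented example values, for illustration only.
example = BodyOfEvidence(
    population="adults with condition X",
    intervention="drug A",
    comparison="placebo",
    outcome="all-cause mortality",
    grade="low",
    justification=frozenset({"risk of bias", "imprecision"}),
)
```

In this sketch the supporting inputs mentioned above (statistical data, review metadata, and review text) are left out; they would be additional fields on the same record.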

Results: Our quality assessment data set, Cochrane Database of Systematic Reviews Quality of Evidence, contains 13,440 instances, or BoEs labeled for quality, originating from 2252 systematic reviews published on the internet from 2002 to 2020. On the basis of 10-fold cross-validation, the best neural binary classifiers for quality criteria detected risk of bias at 0.78 F1 (precision P=0.68; recall R=0.92) and imprecision at 0.75 F1 (P=0.66; R=0.86), while performance on the inconsistency, indirectness, and publication bias criteria was lower (F1 in the range of 0.3-0.4). Predicting the overall quality grade as 1 of the 4 levels resulted in 0.5 F1. When casting the task as a binary problem by merging the Grading of Recommendations, Assessment, Development, and Evaluation classes (high+moderate vs low+very low-quality evidence), we attained 0.74 F1. We also found that the results varied depending on the supporting information provided as input to the models.
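The reported scores can be checked against each other with a short sketch, assuming that the F value is the standard F1 score (the harmonic mean of precision P and recall R). The function names below are hypothetical; the precision and recall values and the class merging (high+moderate vs low+very low) are taken from the Results.

```python
def f1_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall (standard F1 definition)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def binarize_grade(grade: str) -> str:
    """Merge the 4 quality grades into the 2 classes used in the binary task."""
    if grade in ("high", "moderate"):
        return "high+moderate"
    if grade in ("low", "very low"):
        return "low+very low"
    raise ValueError(f"unknown grade: {grade!r}")


# Risk of bias: P=0.68, R=0.92
risk_of_bias_f1 = round(f1_score(0.68, 0.92), 2)  # 0.78
# Imprecision: P=0.66, R=0.86
imprecision_f1 = round(f1_score(0.66, 0.86), 2)   # 0.75
```

Both rounded values agree with the figures in the Results, which supports reading F as F1 here.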

Conclusions: Different factors affect the quality of evidence in the context of systematic reviews of medical evidence. Some of these (risk of bias and imprecision) can be automated with reasonable accuracy. Other quality dimensions such as indirectness, inconsistency, and publication bias prove more challenging for machine learning, largely because they are much rarer. This technology could substantially reduce reviewer workload in the future and expedite quality assessment as part of evidence synthesis.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10131699
DOI: http://dx.doi.org/10.2196/35568

Publication Analysis

Top Keywords

systematic reviews: 24
quality: 17
quality assessment: 16
medical evidence: 12
quality criteria: 12
data set: 12
evidence: 9
set quality: 8
grading recommendation: 8
recommendation assessment: 8

Similar Publications

The benefits and drawbacks of music in the operating room.

J Visc Surg

January 2025

Digestive Surgery, groupe hospitalier Diaconesses Croix Saint-Simon, 125, rue d'Avron, 75020 Paris, France.

Introduction: A significant proportion of surgeons listen to music in the operating room (MOR) during a surgical procedure. Over the last twenty years, this practice has been widely studied. The aim of this article is to analyze the effects of MOR on surgical performance, postoperative history and, more generally, on communication in the operating theater.

Voice Quality as Digital Biomarker in Bipolar Disorder: A Systematic Review.

J Voice

January 2025

Department of Surgery, UMONS Research Institute for Health Sciences and Technology, University of Mons (UMons), Mons, Belgium; Division of Laryngology and Bronchoesophagology, Department of Otolaryngology Head Neck Surgery, EpiCURA Hospital, Baudour, Belgium; Department of Otolaryngology-Head and Neck Surgery, Foch Hospital, School of Medicine, UFR Simone Veil, Université Versailles Saint-Quentin-en-Yvelines (Paris Saclay University), Paris, France; Department of Otolaryngology, Elsan Hospital, Paris, France. Electronic address:

Background: Voice analysis has emerged as a potential biomarker for mood state detection and monitoring in bipolar disorder (BD). The systematic review aimed to summarize the evidence for voice analysis applications in BD, examining (1) the predictive validity of voice quality outcomes for mood state detection, and (2) the correlation between voice parameters and clinical symptom scales.

Methods: A PubMed, Scopus, and Cochrane Library search was carried out by two investigators for publications investigating voice quality in BD according to Preferred Reporting Items for Systematic Reviews and Meta-Analyses (PRISMA) statements.

Objectives: To investigate the rate of, and factors affecting, acceptance of pharmacists' recommendations by medical prescribers following medication reviews conducted in non-hospitalized older adults.

Design: A systematic review and meta-analysis with meta-regression.

Setting and Participants: Older adults (mean age ≥55 years) residing in the community or in aged care facilities (ie, non-hospitalized) who had received an individualized medication review by a pharmacist.

Behind Closed Eyes: Understanding Nightmares in Children and Adolescents with Autism Spectrum Disorder - A Systematic Review.

Neurosci Biobehav Rev

January 2025

Department of Psychology, Sapienza University of Rome, Rome, Italy; Body and Action Lab, IRCCS Fondazione Santa Lucia, Rome, Italy. Electronic address:

Introduction: Brain and sleep development in childhood shapes emotional and cognitive growth, including the ability to recall dreams. In line with the continuity hypothesis of dreaming, several findings suggest a link between clinical symptoms and nightmare frequency. Sleep disorders and anxiety are among the most frequently co-occurring conditions in children and adolescents with autism spectrum disorder (ASD).

Expanding the practice placement capacity in pre-registration midwifery education: A scoping review.

Midwifery

December 2024

Health Research Institute, Department of Nursing and Midwifery, University of Limerick, Limerick, Ireland. Electronic address:

Background: Practice placements are an essential component of midwifery education, enabling students to apply their theoretical knowledge in a real-world midwifery setting. Exposure to, and immersion in, practice is a core focus of midwifery education internationally. These placements are crucial for students to develop the skills and expertise needed to become safe, competent, and compassionate midwife practitioners.
