Machine learning to optimize literature screening in medical guideline development.

Syst Rev

Department of Methodology and Statistics, Faculty of Social and Behavioral Sciences, Utrecht University, Utrecht, The Netherlands.

Published: July 2024

Objectives: New evidence supporting clinical decision-making is growing exponentially, and selecting this evidence is labor-intensive, so methods are needed to speed up current processes and keep medical guidelines up to date. This study evaluated the performance and feasibility of active learning to support the selection of relevant publications within medical guideline development, and examined the role of noisy labels.

Design: We used a mixed-methods design. The manual literature selection of two independent clinicians was evaluated for 14 searches. This was followed by a series of simulations comparing the performance of random reading with that of screening prioritization based on active learning. We identified hard-to-find papers and checked their labels in a reflective dialogue.

Main Outcome Measures: Inter-rater reliability was assessed using Cohen's kappa (κ). To evaluate the performance of active learning, we used the Work Saved over Sampling at 95% recall (WSS@95) and the percentage of Relevant Records Found after reading only 10% of the total number of records (RRF@10). We used the average time to discovery (ATD) to detect records with potentially noisy labels. Finally, the accuracy of labeling was discussed in a reflective dialogue with guideline developers.
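The outcome measures above can be illustrated with a minimal Python sketch. It rests on assumptions that the abstract does not confirm: binary labels (1 = relevant, 0 = irrelevant), the commonly used definition WSS@r = (N - n_r)/N - (1 - r), where n_r is the number of records screened when recall r is reached, and ATD taken as the mean screening position of a record across simulation runs. All function names and example data are hypothetical and are not taken from the study or from any particular software package.

```python
import math


def cohen_kappa(rater_a, rater_b):
    """Cohen's kappa for two raters who gave binary (0/1) labels to the same records."""
    n = len(rater_a)
    observed = sum(a == b for a, b in zip(rater_a, rater_b)) / n
    p_a = sum(rater_a) / n  # share of records rater A marked relevant
    p_b = sum(rater_b) / n  # share of records rater B marked relevant
    expected = p_a * p_b + (1 - p_a) * (1 - p_b)  # agreement expected by chance
    # Undefined (division by zero) when chance agreement is exactly 1.
    return (observed - expected) / (1 - expected)


def wss(ranked_labels, recall=0.95):
    """Work Saved over Sampling at the given recall.

    ranked_labels holds the true labels in the order the records were screened.
    """
    n = len(ranked_labels)
    target = math.ceil(recall * sum(ranked_labels))  # relevant records needed
    found = 0
    for screened, label in enumerate(ranked_labels, start=1):
        found += label
        if found >= target:
            return (n - screened) / n - (1 - recall)
    raise ValueError("recall target not reachable: no relevant records")


def rrf(ranked_labels, fraction=0.10):
    """Share of relevant records found after screening the given fraction of records."""
    cutoff = math.ceil(fraction * len(ranked_labels))
    return sum(ranked_labels[:cutoff]) / sum(ranked_labels)


def average_time_to_discovery(runs):
    """Mean screening position per record across runs (assumed per-record ATD).

    Each run maps a record id to the position at which that record was screened.
    Records with a high mean position are the hard-to-find ones.
    """
    return {rid: sum(run[rid] for run in runs) / len(runs) for rid in runs[0]}


# Hypothetical toy data: 20 records in screening order, 4 of them relevant.
order = [1, 1, 0, 1, 0, 0, 0, 1] + [0] * 12
print(wss(order), rrf(order))
print(cohen_kappa([1, 1, 0, 0, 1, 0, 0, 0, 0, 0], [1, 0, 0, 0, 1, 0, 0, 1, 0, 0]))
print(average_time_to_discovery([{"r1": 1, "r2": 8}, {"r1": 3, "r2": 10}]))
```

In this sketch, a relevant record with a high average position is a candidate for a label check, which mirrors how the abstract describes using ATD to flag potentially noisy labels.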

Results: Mean κ for manual title-abstract selection by clinicians was 0.50 and varied between -0.01 and 0.87, based on 5,021 abstracts. WSS@95 ranged from 50.15% (SD = 17.7) based on the selection by clinicians, to 69.24% (SD = 11.5) based on the selection by a research methodologist, up to 75.76% (SD = 12.2) based on the final full-text inclusion. A similar pattern was seen for RRF@10, ranging from 48.31% (SD = 23.3) to 62.8% (SD = 21.20) and 65.58% (SD = 23.25). The performance of active learning deteriorated with increasing label noise. Compared with the final full-text selection, the selections made by clinicians and by research methodologists reduced WSS@95 by 25.61% and 6.25%, respectively.

Conclusion: While active machine learning tools can accelerate the process of literature screening within guideline development, they can only work as well as the input given by human raters. Noisy labels make noisy machine learning.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11238391
DOI: http://dx.doi.org/10.1186/s13643-024-02590-5
