PICO to PICOS: Weak Supervision to Extend Datasets with New Labels.

Stud Health Technol Inform

Informatics Institute, HES-SO Valais-Wallis, Switzerland.

Published: August 2024

Hand-labelling clinical corpora is costly and inflexible: every time a new class needs to be extracted, the corpus must be re-annotated. PICO (Participant, Intervention, Comparator, Outcome) information extraction can expedite systematic reviews that answer clinical questions. However, PICO is frequently extended with other entities, such as study type and design, trial context, and timeframe, and each extension requires manual re-annotation of existing corpora. In this paper, we adapt Snorkel's weak supervision methodology to extend clinical corpora to new entities without extensive hand-labelling. Specifically, we enrich the EBM-PICO corpus with a new entity, using "Study type and design" extraction as an example. Using weak supervision, we obtain programmatic labels for 4,081 EBM-PICO documents and achieve an F1-score of 85.02% on the test set.
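To make the idea of programmatic labelling concrete, the following is a minimal sketch, not the authors' pipeline. It assumes token-level labels and three hypothetical labelling functions built on small, made-up keyword lists; none of these names or term lists come from the paper. Snorkel normally combines labelling-function votes with a learned label model; for the sake of a dependency-free example, a simple majority vote is swapped in here.

```python
from collections import Counter

# Label scheme (an assumption for this sketch): abstain, not-study-design, study-design.
ABSTAIN, OTHER, STUDY = -1, 0, 1

# Hypothetical keyword lists, chosen only for illustration.
DESIGN_TERMS = {"randomized", "randomised", "controlled", "double-blind", "crossover", "cohort"}
STUDY_NOUNS = {"trial", "study"}
STOPWORDS = {"a", "of", "the", "in", "and"}


def lf_design_term(tokens, i):
    """Vote STUDY when the token itself is a known design term."""
    return STUDY if tokens[i].lower() in DESIGN_TERMS else ABSTAIN


def lf_study_noun_after_design(tokens, i):
    """Vote STUDY for 'trial'/'study' directly preceded by a design term."""
    if i > 0 and tokens[i].lower() in STUDY_NOUNS and tokens[i - 1].lower() in DESIGN_TERMS:
        return STUDY
    return ABSTAIN


def lf_stopword(tokens, i):
    """Vote OTHER for common function words."""
    return OTHER if tokens[i].lower() in STOPWORDS else ABSTAIN


LABELLING_FUNCTIONS = [lf_design_term, lf_study_noun_after_design, lf_stopword]


def majority_vote(votes):
    """Combine votes, ignoring abstentions; ties and all-abstain yield ABSTAIN."""
    counts = Counter(v for v in votes if v != ABSTAIN)
    if not counts:
        return ABSTAIN
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


def weak_label(tokens):
    """Return one weak label per token."""
    return [
        majority_vote([lf(tokens, i) for lf in LABELLING_FUNCTIONS])
        for i in range(len(tokens))
    ]


if __name__ == "__main__":
    sentence = "A randomized controlled trial of aspirin".split()
    print(list(zip(sentence, weak_label(sentence))))
```

Tokens on which every labelling function abstains stay unlabelled. In a weak supervision setup such tokens are typically left out of the training signal or resolved by the downstream model rather than forced into a class.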

Source: http://dx.doi.org/10.3233/SHTI240775
