We present a corpus of 5,000 richly annotated abstracts of medical articles describing clinical randomized controlled trials. Annotations include demarcations of text spans that describe the Patient population enrolled, the Interventions studied and to what they were Compared, and the Outcomes measured (the 'PICO' elements). These spans are further annotated at a more granular level, e.g., individual interventions within them are marked and mapped onto a structured medical vocabulary. We acquired annotations from a diverse set of workers with varying levels of expertise and cost. We describe our data collection process and the corpus itself in detail. We then outline a set of challenging NLP tasks that would aid searching of the medical literature and the practice of evidence-based medicine.

Source
http://www.ncbi.nlm.nih.gov/pmc/articles/PMC6174533

