Labeling training data is increasingly the largest bottleneck in deploying machine learning systems. We present Snorkel, a first-of-its-kind system that enables users to train state-of-the-art models without hand labeling any training data. Instead, users write labeling functions that express arbitrary heuristics, which can have unknown accuracies and correlations. Snorkel denoises their outputs without access to ground truth by incorporating the first end-to-end implementation of our recently proposed machine learning paradigm, data programming. We present a flexible interface layer for writing labeling functions based on our experience over the past year collaborating with companies, agencies, and research laboratories. In a user study, subject matter experts build models faster and, on average, achieve higher predictive performance than with seven hours of hand labeling. We study the modeling trade-offs in this new setting and propose an optimizer for automating trade-off decisions that speeds up each pipeline execution. In two collaborations, with the US Department of Veterans Affairs and the US Food and Drug Administration, and on four open-source text and image data sets representative of other deployments, Snorkel improves average predictive performance over prior heuristic approaches and, on average, comes close to the predictive performance of large hand-curated training sets.
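The idea of labeling functions can be illustrated with a minimal sketch. It assumes a toy binary task (does a sentence assert that a drug causes a side effect?), and every name in it (`lf_mentions_causes`, `apply_lfs`, `majority_vote`, and so on) is hypothetical rather than part of Snorkel's interface. The combiner is a plain majority vote, used here only as a simple stand-in: Snorkel itself denoises the votes by estimating the labeling functions' accuracies and correlations, which this sketch does not attempt.

```python
import re
from collections import Counter

# Label conventions assumed for this sketch only.
ABSTAIN, NEGATIVE, POSITIVE = -1, 0, 1


def tokens(text):
    """Lowercase the text and return its set of alphabetic words."""
    return set(re.findall(r"[a-z]+", text.lower()))


# Each labeling function encodes one noisy heuristic and may abstain.
def lf_mentions_causes(text):
    return POSITIVE if "causes" in tokens(text) else ABSTAIN


def lf_has_negation(text):
    return NEGATIVE if tokens(text) & {"not", "no"} else ABSTAIN


def lf_mentions_treats(text):
    return NEGATIVE if "treats" in tokens(text) else ABSTAIN


LABELING_FUNCTIONS = [lf_mentions_causes, lf_has_negation, lf_mentions_treats]


def apply_lfs(texts):
    """Return one row of votes per text, one column per labeling function."""
    return [[lf(text) for lf in LABELING_FUNCTIONS] for text in texts]


def majority_vote(votes):
    """Combine votes by unweighted majority; abstain on ties or no votes.

    This is a baseline stand-in, not the accuracy-aware model the
    abstract describes.
    """
    counts = Counter(v for v in votes if v != ABSTAIN)
    if not counts:
        return ABSTAIN
    ranked = counts.most_common()
    if len(ranked) > 1 and ranked[0][1] == ranked[1][1]:
        return ABSTAIN
    return ranked[0][0]


if __name__ == "__main__":
    docs = [
        "Aspirin causes stomach bleeding.",
        "Aspirin does not cause bleeding and treats headaches.",
        "Ibuprofen causes no known rash.",
    ]
    for doc, votes in zip(docs, apply_lfs(docs)):
        print(doc, votes, majority_vote(votes))
```

The third sentence shows why unweighted voting is limited: two heuristics disagree and the vote abstains, whereas a model that knows which heuristic is more reliable could still assign a label.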

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC7075849
DOI: http://dx.doi.org/10.1007/s00778-019-00552-1


