Vaccine
November 2022
The southern cattle fever tick (SCFT), Rhipicephalus (Boophilus) microplus, is considered the most important ectoparasite of livestock in the world because of the high financial losses associated with its direct feeding and its transmission of the hemoparasites Babesia bovis, B. bigemina, and Anaplasma marginale. Unfortunately, SCFT populations in many parts of the world have evolved resistance to all market-available pesticides, driving the development of new control technologies.
Proc ACM SIGMOD Int Conf Manag Data
January 2019
Labeling training data is one of the most costly bottlenecks in developing machine learning-based applications. We present a first-of-its-kind study showing how existing knowledge resources from across an organization can be used as weak supervision in order to bring development time and cost down by an order of magnitude, and introduce Snorkel DryBell, a new weak supervision management system for this setting. Snorkel DryBell builds on the Snorkel framework, extending it in three critical aspects: flexible, template-based ingestion of diverse organizational knowledge, cross-feature production serving, and scalable, sampling-free execution.
Proceedings VLDB Endowment
November 2017
Labeling training data is increasingly the largest bottleneck in deploying machine learning systems. We present Snorkel, a first-of-its-kind system that enables users to train state-of-the-art models without hand labeling any training data. Instead, users write labeling functions that express arbitrary heuristics, which can have unknown accuracies and correlations.
Curating labeled training data has become the primary bottleneck in machine learning. Recent frameworks address this bottleneck with generative models to synthesize labels at scale from weak supervision sources. The generative model's dependency structure directly affects the quality of the estimated labels, but selecting a structure automatically without any labeled data is a distinct challenge.
One of the most important obstacles to deploying predictive models is the fact that humans do not understand and trust them. Knowing which variables are important in a model's prediction and how they are combined can be very powerful in helping people understand and trust automatic decision-making systems. Here we propose interpretable decision sets, a framework for building predictive models that are highly accurate, yet also highly interpretable.
Unknown compounds in environmental samples are difficult to identify using standard mass spectrometric methods. Fourier transform mass spectrometry (FTMS) has revolutionized how environmental analyses are performed. With its unsurpassed mass accuracy, high resolution and sensitivity, researchers now have a tool for difficult and complex environmental analyses.
Bull World Health Organ
August 2004