Learning about individuals' health from aggregate data.

Annu Int Conf IEEE Eng Med Biol Soc

Published: July 2017

There is growing awareness that user-generated social media content contains valuable health-related information and is more convenient to collect than typical health data. For example, Twitter has been employed to predict aggregate-level outcomes, such as regional rates of diabetes and child poverty, and to identify individual cases of depression and food poisoning. Models which make aggregate-level inferences can be induced from aggregate data, and consequently are straightforward to build. In contrast, learning models that produce individual-level (IL) predictions, which are more informative, usually requires a large number of difficult-to-acquire labeled IL examples. This paper presents a new machine learning method which achieves the best of both worlds, enabling IL models to be learned from aggregate labels. The algorithm makes predictions by combining unsupervised feature extraction, aggregate-based modeling, and optimal integration of aggregate-level and IL information. Two case studies illustrate how to learn health-relevant IL prediction models using only aggregate labels, and show that these models perform as well as state-of-the-art models trained on hundreds or thousands of labeled individuals.
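
As a rough illustration of the general idea of learning from aggregate labels, the sketch below fits a one-feature linear model on region-level averages only and then applies it to individuals. It is not the algorithm described in the abstract: the unsupervised feature extraction and the integration of aggregate-level and IL information are not shown. The synthetic data, the single numeric feature, and the names `fit_on_aggregates` and `predict_individual` are all assumptions made for illustration.

```python
import random
from statistics import mean


def fit_on_aggregates(group_x, group_y):
    """Ordinary least squares with one feature, fit on group-level means only."""
    mx, my = mean(group_x), mean(group_y)
    sxx = sum((gx - mx) ** 2 for gx in group_x)
    sxy = sum((gx - mx) * (gy - my) for gx, gy in zip(group_x, group_y))
    slope = sxy / sxx
    return slope, my - slope * mx


def predict_individual(model, x):
    """Apply the aggregate-trained model to one individual's feature value."""
    slope, intercept = model
    return slope * x + intercept


# Synthetic data (an assumption, not from the paper): 40 regions of 200 people.
# Each person has one feature x and an outcome y = 2x - 1 + noise.
random.seed(0)
TRUE_SLOPE, TRUE_INTERCEPT = 2.0, -1.0
regions = []
for _ in range(40):
    center = random.gauss(0, 1)  # regions differ in their average feature value
    xs = [center + random.gauss(0, 1) for _ in range(200)]
    ys = [TRUE_SLOPE * x + TRUE_INTERCEPT + random.gauss(0, 0.5) for x in xs]
    regions.append((xs, ys))

# Training sees only one (mean feature, mean outcome) pair per region.
model = fit_on_aggregates(
    [mean(xs) for xs, _ in regions],
    [mean(ys) for _, ys in regions],
)

# Evaluation uses the individual outcomes, which were never used for fitting.
errors = [
    abs(predict_individual(model, x) - y)
    for xs, ys in regions
    for x, y in zip(xs, ys)
]
mean_abs_error = mean(errors)
print(f"slope={model[0]:.3f} intercept={model[1]:.3f} MAE={mean_abs_error:.3f}")
```

The sketch works only because the synthetic data share one linear relationship at the individual and regional levels. When that assumption fails, a model fit on aggregates can be misleading for individuals.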

DOI: http://dx.doi.org/10.1109/EMBC.2017.8037514
