Accurate inference of crowdsourcing properties when using efficient allocation strategies.

Sci Rep

Department of Mathematics and Statistics, University of Vermont, Burlington, VT, USA.

Published: April 2022

Allocation strategies improve the efficiency of crowdsourcing by decreasing the work needed to complete individual tasks accurately. However, these algorithms introduce bias by preferentially allocating workers onto easy tasks, leading to sets of completed tasks that are no longer representative of all tasks. This bias challenges inference of problem-wide properties such as typical task difficulty or crowd properties such as worker completion times, important information that goes beyond the crowd responses themselves. Here we study inference about problem properties when using an allocation algorithm to improve crowd efficiency. We introduce Decision-Explicit Probability Sampling (DEPS), a novel method to perform inference of problem properties while accounting for the potential bias introduced by an allocation strategy. Experiments on real and synthetic crowdsourcing data show that DEPS outperforms baseline inference methods while still leveraging the efficiency gains of the allocation method. The ability to perform accurate inference of general properties when using non-representative data allows crowdsourcers to extract more knowledge out of a given crowdsourced dataset.
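The bias described in the abstract can be illustrated with a small synthetic simulation. The sketch below is not DEPS, whose details are not given here. It uses generic inverse-probability weighting as a stand-in correction, and it assumes each task's completion probability is known exactly. The function name `simulate`, the uniform difficulty distribution, and the linear completion rule `p = 1 - 0.9 * d` are all hypothetical choices made for illustration only.

```python
import random


def simulate(n_tasks=20000, seed=0):
    """Compare naive and weighted estimates of mean task difficulty.

    Assumptions (illustrative, not from the paper):
    - difficulty d is uniform on [0, 1)
    - an allocation strategy completes a task with probability p = 1 - 0.9 * d,
      so easy tasks are completed far more often than hard ones
    - p is known to the analyst, which a real allocation method may not provide
    """
    rng = random.Random(seed)
    difficulties = [rng.random() for _ in range(n_tasks)]

    completed = []  # (difficulty, completion probability) for completed tasks
    for d in difficulties:
        p = 1.0 - 0.9 * d
        if rng.random() < p:
            completed.append((d, p))

    # Ground truth over all tasks, completed or not.
    true_mean = sum(difficulties) / n_tasks

    # Naive estimate: average over completed tasks only (biased toward easy tasks).
    naive = sum(d for d, _ in completed) / len(completed)

    # Inverse-probability-weighted estimate: each completed task is weighted by
    # 1 / p so that rarely completed (hard) tasks count for more.
    weighted = sum(d / p for d, p in completed) / sum(1.0 / p for _, p in completed)

    return true_mean, naive, weighted


if __name__ == "__main__":
    true_mean, naive, weighted = simulate()
    print(f"true mean difficulty:     {true_mean:.3f}")
    print(f"naive (completed only):   {naive:.3f}")
    print(f"inverse-prob. weighted:   {weighted:.3f}")
```

Under these assumptions the naive estimate falls well below the true mean difficulty, because hard tasks are under-represented among completed tasks, while the weighted estimate lands close to the truth. In practice the completion probabilities depend on the allocation algorithm's decisions and are not handed to the analyst, which is the gap a method such as DEPS is meant to address.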

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9046272
DOI: http://dx.doi.org/10.1038/s41598-022-10794-9

Similar Publications

Robust RNA secondary structure prediction with a mixture of deep learning and physics-based experts.

Biol Methods Protoc

January 2025

Department of Physics, George Washington University, Washington, DC 20052, United States.

A mixture-of-experts (MoE) approach has been developed to mitigate the poor out-of-distribution (OOD) generalization of deep learning (DL) models for single-sequence-based prediction of RNA secondary structure. The main idea behind this approach is to use DL models for in-distribution (ID) test sequences to leverage their superior ID performances, while relying on physics-based models for OOD sequences to ensure robust predictions. One key ingredient of the pipeline, named MoEFold2D, is automated ID/OOD detection via consensus analysis of an ensemble of DL model predictions without requiring access to training data during inference.

Although fractional polynomials (FPs) provide a concise and accurate formula for examining smooth relationships between a response and its predictors, modelling conditional mean functions captures only a partial view of the distribution of the response variable, because the distributions of many response variables, such as blood pressure (BP) measures, are typically skewed. Conditional quantile functions with FPs provide a more comprehensive relationship between the response variable and its predictors, covering quantities such as the median and extremely high BP measures that are often required in practical data analysis. To the best of our knowledge, this is new in the literature.

Accurate cancer risk estimation is crucial to clinical decision-making, such as identifying high-risk people for screening. However, most existing cancer risk models incorporate data from epidemiologic studies, which usually cannot represent the target population. While population-based health surveys are ideal for making inference to the target population, they typically do not collect time-to-cancer incidence data.

Hydrothermal vents are ecosystems inhabited by a highly specialized fauna. To date, more than 30 gastropod species have been recorded from vent fields along the Central and Southeast Indian Ridge and all of them are assumed to be vent-endemic. During the INDEX project, 701 representatives of the genus Anatoma (Mollusca: Vetigastropoda) were sampled from six abyssal hydrothermal vent fields.

JTF-SqueezeNet: A SqueezeNet network based on joint time-frequency data representation for egg-laying detection in individually caged ducks.

Poult Sci

January 2025

College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, China; Key Laboratory of Smart Agricultural Technology in Tropical South China, Ministry of Agriculture and Rural Affairs, Guangzhou 510642, China; Guangdong Engineering Research Center of Agricultural Big Data, Guangzhou 510642, China.

Accurate individual egg-laying detection is crucial for eliminating low-yielding breeder ducks and improving production efficiency. However, existing methods are often expensive and require strict environmental conditions. This study proposes a data processing method based on wearable sensors and joint time-frequency representation (TFR), aimed at accurately identifying egg-laying in ducks.
