Accurate inference of crowdsourcing properties when using efficient allocation strategies.

Sci Rep

Department of Mathematics and Statistics, University of Vermont, Burlington, VT, USA.

Published: April 2022

Allocation strategies improve the efficiency of crowdsourcing by decreasing the work needed to complete individual tasks accurately. However, these algorithms introduce bias by preferentially allocating workers onto easy tasks, leading to sets of completed tasks that are no longer representative of all tasks. This bias challenges inference of problem-wide properties such as typical task difficulty or crowd properties such as worker completion times, important information that goes beyond the crowd responses themselves. Here we study inference about problem properties when using an allocation algorithm to improve crowd efficiency. We introduce Decision-Explicit Probability Sampling (DEPS), a novel method to perform inference of problem properties while accounting for the potential bias introduced by an allocation strategy. Experiments on real and synthetic crowdsourcing data show that DEPS outperforms baseline inference methods while still leveraging the efficiency gains of the allocation method. The ability to perform accurate inference of general properties when using non-representative data allows crowdsourcers to extract more knowledge out of a given crowdsourced dataset.
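
The selection bias described above can be illustrated with a small simulation. The sketch below is not the DEPS algorithm, whose details the abstract does not give. It assumes a hypothetical allocation behaviour in which a task of difficulty d is completed with probability 1 - d, and it uses standard inverse-probability weighting as a stand-in correction. The function name, the difficulty range, and the completion model are all assumptions made for illustration.

```python
import random


def simulate(n_tasks=20000, seed=0):
    """Compare naive and inverse-probability-weighted estimates of mean difficulty."""
    rng = random.Random(seed)

    # Hypothetical task difficulties, uniform on [0.05, 0.95].
    difficulty = [rng.uniform(0.05, 0.95) for _ in range(n_tasks)]

    # Assumed allocation behaviour: easier tasks are more likely to be completed.
    p_complete = [1.0 - d for d in difficulty]
    completed = [i for i in range(n_tasks) if rng.random() < p_complete[i]]

    # Ground truth over all tasks (unknown in a real crowdsourcing run).
    true_mean = sum(difficulty) / n_tasks

    # Naive estimate: average over completed tasks only, biased toward easy tasks.
    naive_mean = sum(difficulty[i] for i in completed) / len(completed)

    # Weighted estimate: each completed task counts 1 / (its completion probability).
    weights = [1.0 / p_complete[i] for i in completed]
    ipw_mean = sum(w * difficulty[i] for w, i in zip(weights, completed)) / sum(weights)

    return true_mean, naive_mean, ipw_mean


if __name__ == "__main__":
    true_mean, naive_mean, ipw_mean = simulate()
    print(f"true={true_mean:.3f} naive={naive_mean:.3f} weighted={ipw_mean:.3f}")
```

Under these assumptions the naive average falls well below the true mean difficulty, while the weighted average lands close to it. The weighting only works here because the completion probabilities are known by construction. In a real deployment they would have to be derived from the allocation strategy's recorded decisions.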

Full text (PMC): http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9046272
DOI: http://dx.doi.org/10.1038/s41598-022-10794-9

Publication Analysis

Top Keywords: accurate inference; allocation strategies; inference problem; problem properties; properties; allocation; inference; inference crowdsourcing; crowdsourcing properties; properties efficient

Similar Publications

Robust RNA secondary structure prediction with a mixture of deep learning and physics-based experts.

Biol Methods Protoc

January 2025

Department of Physics, George Washington University, Washington, DC 20052, United States.

Xiangyun Qiu

A mixture-of-experts (MoE) approach has been developed to mitigate the poor out-of-distribution (OOD) generalization of deep learning (DL) models for single-sequence-based prediction of RNA secondary structure. The main idea behind this approach is to use DL models for in-distribution (ID) test sequences to leverage their superior ID performances, while relying on physics-based models for OOD sequences to ensure robust predictions. One key ingredient of the pipeline, named MoEFold2D, is automated ID/OOD detection via consensus analysis of an ensemble of DL model predictions without requiring access to training data during inference.

Bayesian fractional polynomial approach to quantile regression and variable selection with application in the analysis of blood pressure among US adults.

J Appl Stat

May 2024

Department of Mathematics, Brunel University London, Uxbridge, UK.

Sanna Soomro, Keming Yu

Although fractional polynomials (FPs) can act as a concise and accurate formula for examining smooth relationships between a response and its predictors, modelling conditional mean functions captures only a partial view of the response distribution, because the distributions of many response variables, such as blood pressure (BP) measures, are typically skewed. Conditional quantile functions with FPs describe the relationship between the response variable and its predictors more comprehensively, covering quantities such as the median and extremely high BP measures that are often required in practical data analysis. To the best of our knowledge, this is new in the literature.

Data-integration with pseudoweights and survey-calibration: application to developing US-representative lung cancer risk models for use in screening.

J R Stat Soc Ser A Stat Soc

January 2025

Division of Cancer Epidemiology & Genetics, National Cancer Institute, Biostatistics Branch, Rockville, USA.

Lingxiao Wang, Yan Li, Barry I Graubard, Hormuzd A Katki

Accurate cancer risk estimation is crucial to clinical decision-making, such as identifying high-risk people for screening. However, most existing cancer risk models incorporate data from epidemiologic studies, which usually cannot represent the target population. While population-based health surveys are ideal for making inference to the target population, they typically do not collect time-to-cancer incidence data.

High genomic connectivity within Anatoma at hydrothermal vents along the Central and Southeast Indian Ridge.

Sci Rep

January 2025

INES Integrated Environmental Solutions UG, Wilhelmshaven, Germany.

Katharina Kniesz, Leon Hoffman, Pedro Martínez Arbizu, Terue C Kihara

Hydrothermal vents are ecosystems inhabited by a highly specialized fauna. To date, more than 30 gastropod species have been recorded from vent fields along the Central and Southeast Indian Ridge and all of them are assumed to be vent-endemic. During the INDEX project, 701 representatives of the genus Anatoma (Mollusca: Vetigastropoda) were sampled from six abyssal hydrothermal vent fields.

JTF-SqueezeNet: A SqueezeNet network based on joint time-frequency data representation for egg-laying detection in individually caged ducks.

Poult Sci

January 2025

College of Mathematics and Informatics, South China Agricultural University, Guangzhou 510642, China; Key Laboratory of Smart Agricultural Technology in Tropical South China, Ministry of Agriculture and Rural Affairs, Guangzhou 510642, China; Guangdong Engineering Research Center of Agricultural Big Data, Guangzhou 510642, China.

Siting Lv, Yuanyang Mao, Youfu Liu, Yigui Huang, Dakang Guo

Accurate individual egg-laying detection is crucial for eliminating low-yielding breeder ducks and improving production efficiency. However, existing methods are often expensive and require strict environmental conditions. This study proposes a data processing method based on wearable sensors and joint time-frequency representation (TFR), aimed at accurately identifying egg-laying in ducks.
