Behavioral scientists have become increasingly reliant on online survey platforms such as Amazon's Mechanical Turk (MTurk). These platforms have many advantages: they provide easy access to difficult-to-sample populations, a large pool of participants, and an easy-to-use implementation. A major drawback is the existence of bots that are used to complete online surveys for financial gain. These bots contaminate data and need to be identified in order to draw valid conclusions from data obtained with these platforms. In this article, we will provide a Bayesian latent class joint modeling approach that can be routinely applied to identify bots and simultaneously estimate a model of interest. This method can be used to separate the bots' response patterns from real human responses that were provided in line with the item content. The model has the advantage that it is very flexible and is based on plausible assumptions that are met in most empirical settings. We will provide a simulation study that investigates the performance of the model under several relevant scenarios that vary sample size, proportion of bots, and model complexity. We will show that ignoring bots leads to severe parameter bias, whereas the Bayesian latent class model results in unbiased estimates and thus controls this source of bias. We will illustrate the model and its capabilities with data from an empirical political ideation survey with known bots. We will discuss the implications of the findings with regard to future data collection on online platforms.
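
As a rough illustration of the latent class idea described in the abstract, the sketch below computes the posterior probability that a single respondent belongs to a bot class in a two-class mixture. It is not the authors' joint model: it assumes, purely for illustration, that bots answer every item uniformly at random, that the human response probabilities per item are already known, and that the prior bot proportion is fixed. The function name `bot_posterior` and all numeric values are hypothetical.

```python
import math


def bot_posterior(responses, human_probs, n_categories, prior_bot):
    """Posterior probability that one respondent is a bot.

    Illustrative assumptions (not taken from the article):
    - bots pick each of the n_categories options uniformly at random;
    - human_probs[j][r] is the known probability that a human picks
      category r on item j;
    - prior_bot is a fixed prior proportion of bots in the sample.
    """
    # Bot class: responses are independent of item content.
    log_bot = len(responses) * math.log(1.0 / n_categories)
    # Human class: responses follow the item-specific probabilities.
    log_human = sum(math.log(human_probs[j][r]) for j, r in enumerate(responses))
    # Bayes' rule, computed on the log scale for numerical stability.
    a = math.log(prior_bot) + log_bot
    b = math.log(1.0 - prior_bot) + log_human
    m = max(a, b)
    return math.exp(a - m) / (math.exp(a - m) + math.exp(b - m))


# Hypothetical 4-item survey with 5 response categories (indices 0..4);
# humans are assumed to favor category index 3 on every item.
item_probs = [[0.05, 0.05, 0.10, 0.60, 0.20]] * 4

p_content_consistent = bot_posterior([3, 3, 4, 3], item_probs, 5, prior_bot=0.2)
p_content_unrelated = bot_posterior([0, 1, 0, 1], item_probs, 5, prior_bot=0.2)
print(p_content_consistent, p_content_unrelated)
```

In a full Bayesian treatment, the human response model and the bot proportion would be estimated jointly with the class memberships rather than fixed in advance as they are here.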

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC9093679
DOI: http://dx.doi.org/10.3389/fpsyg.2022.789223

Similar Publications

Bayesian method for comparing F1 scores in the absence of a gold standard.

J Biopharm Stat

January 2025

Department of Biostatistics, School of Medicine, Yokohama City University, Yokohama, Japan.

In the field of medicine, evaluating the diagnostic performance of new diagnostic methods can be challenging, especially in the absence of a gold standard. This study proposes a methodology for assessing the performance of diagnostic tests by estimating the posterior distribution of the score using latent class analysis, without relying on a gold standard. The proposed method utilizes Markov Chain Monte Carlo sampling to estimate the posterior distribution of the score, enabling a comprehensive evaluation of diagnostic test methods.

Human schistosomiasis is a chronic neglected tropical disease caused by blood flukes of the genus Schistosoma, infecting 250 million people worldwide, mostly in sub-Saharan Africa. Recently, thousands of cases have been reported in immigrants to non-endemic countries, including Italy. Serological screening is recommended but so far, no accurate point-of-care (POC) and lab-free test is available.

The generation time, representing the interval between infections in primary and secondary cases, is essential for understanding and predicting the transmission dynamics of seasonal influenza, including the real-time effective reproduction number (Rt). However, comprehensive generation time estimates for seasonal influenza, especially since the 2009 influenza pandemic, are lacking. We estimated the generation time utilizing data from a 7-site case-ascertained household study in the United States over two influenza seasons, 2021/2022 and 2022/2023.

Estimation of Diagnostic Test Accuracy Without Gold Standards.

Stat Med

February 2025

Department of Biostatistics and Beijing International Center for Mathematical Research, Peking University, Beijing, China.

The ideal evaluation of diagnostic test performance requires a reference test that is free of errors. However, for many diseases, obtaining such a "gold standard" reference is either impossible or prohibitively expensive. Estimating test accuracy in the absence of a gold standard is therefore a significant challenge.

Data-driven discovery and parameter estimation of mathematical models in biological pattern formation.

PLoS Comput Biol

January 2025

Department of Anatomy and Cell Biology, Graduate School of Medical Sciences, Kyushu University, Fukuoka, Fukuoka, Japan.

Mathematical modeling has been utilized to explain biological pattern formation, but the selections of models and parameters have been made empirically. In the present study, we propose a data-driven approach to validate the applicability of mathematical models. Specifically, we developed methods to automatically select the appropriate mathematical models based on the patterns of interest and to estimate the model parameters.
