Surveys are a crucial tool for understanding public opinion and behaviour, and their accuracy depends on maintaining statistical representativeness of their target populations by minimizing biases from all sources. Increasing data size shrinks confidence intervals but magnifies the effect of survey bias: an instance of the Big Data Paradox. Here we demonstrate this paradox in estimates of first-dose COVID-19 vaccine uptake in US adults from 9 January to 19 May 2021 from two large surveys: Delphi-Facebook (about 250,000 responses per week) and Census Household Pulse (about 75,000 every two weeks).
View Article and Find Full Text PDFWe propose a nonparametric test of independence, termed optHSIC, between a covariate and a right-censored lifetime. Because the presence of censoring creates a challenge in applying the standard permutation-based testing approaches, we use optimal transport to transform the censored dataset into an uncensored one, while preserving the relevant dependencies. We then apply a permutation test using the kernel-based dependence measure as a statistic to the transformed dataset.
View Article and Find Full Text PDFIdentifying causal relationships and quantifying their strength from observational time series data are key problems in disciplines dealing with complex dynamical systems such as the Earth system or the human body. Data-driven causal inference in such systems is challenging since datasets are often high dimensional and nonlinear with limited sample sizes. Here, we introduce a novel method that flexibly combines linear or nonlinear conditional independence tests with a causal discovery algorithm to estimate causal networks from large-scale time series datasets.
View Article and Find Full Text PDFThe use of covariance kernels is ubiquitous in the field of spatial statistics. Kernels allow data to be mapped into high-dimensional feature spaces and can thus extend simple linear additive methods to nonlinear methods with higher order interactions. However, until recently, there has been a strong reliance on a limited class of stationary kernels such as the Matérn or squared exponential, limiting the expressiveness of these modelling approaches.
View Article and Find Full Text PDFElectrophysiological data disclose rich dynamics in patterns of neural activity evoked by sensory objects. Retrieving objects from memory reinstates components of this activity. In humans, the temporal structure of this retrieved activity remains largely unexplored, and here we address this gap using the spatiotemporal precision of magnetoencephalography (MEG).
View Article and Find Full Text PDF