Informative cluster size in cluster-randomised trials: A case study from the TRIGGER trial.

Clin Trials

Department of Biostatistics, Epidemiology & Informatics, Perelman School of Medicine, University of Pennsylvania, Philadelphia, PA, USA.

Published: December 2023

Background: Recent work has shown that cluster-randomised trials can estimate two distinct estimands: the participant-average and cluster-average treatment effects. These can differ when participant outcomes or the treatment effect depends on the cluster size (termed informative cluster size). In this case, estimators that target one estimand (such as the analysis of unweighted cluster-level summaries, which targets the cluster-average effect) may be biased for the other. Furthermore, commonly used estimators such as mixed-effects models or generalised estimating equations with an exchangeable correlation structure can be biased for both estimands. However, there has been little empirical research into whether informative cluster size is likely to occur in practice.
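
For a difference-in-means effect measure, the two estimands can be written out as below. This is a sketch under assumed notation that does not appear in the abstract: cluster j contains n_j participants, Y_ij(1) and Y_ij(0) denote the potential outcomes of participant i in cluster j under intervention and control, and expectations are taken over clusters.

```latex
% Assumed notation: n_j = size of cluster j; Y_{ij}(a) = potential outcome under arm a.
\begin{align*}
\Delta_{\text{participant}} &=
  \frac{E\left[\sum_{i=1}^{n_j} \{Y_{ij}(1) - Y_{ij}(0)\}\right]}{E\left[n_j\right]}, \\
\Delta_{\text{cluster}} &=
  E\left[\frac{1}{n_j}\sum_{i=1}^{n_j} \{Y_{ij}(1) - Y_{ij}(0)\}\right].
\end{align*}
```

The first expression gives every participant equal weight and the second gives every cluster equal weight. If the mean within-cluster effect does not vary with n_j, the two expressions coincide; they can differ when the effect depends on cluster size.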

Method: We re-analysed a cluster-randomised trial comparing two thresholds for red blood cell transfusion in patients with acute upper gastrointestinal bleeding. The aim was to explore whether estimates of the participant- and cluster-average effects differed, and so to provide empirical evidence on whether informative cluster size may be present. For each outcome, we first estimated a participant-average effect using independence estimating equations, which are unbiased under informative cluster size. We then compared this to two further methods: (1) a cluster-average effect estimated using either weighted independence estimating equations or unweighted cluster-level summaries, and (2) estimates from a mixed-effects model or generalised estimating equations with an exchangeable correlation structure. Finally, we performed a small simulation study to evaluate whether the observed differences between cluster- and participant-average estimates were likely to occur even if no informative cluster size was present.
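
The contrast between the two weighting schemes can be illustrated with a minimal Python sketch for a continuous outcome. It is not the authors' analysis code: the function names and the toy data are hypothetical, only point estimates are computed, and a real analysis would also need cluster-robust standard errors. For a difference in means, equal participant weights correspond to independence estimating equations with an identity link, and equal cluster weights correspond to an unweighted analysis of cluster-level means.

```python
import numpy as np

def participant_average_diff(y, arm):
    """Difference in arm means, giving every participant equal weight."""
    y, arm = np.asarray(y, dtype=float), np.asarray(arm)
    return y[arm == 1].mean() - y[arm == 0].mean()

def cluster_average_diff(y, arm, cluster):
    """Difference in arm means of cluster-level means (every cluster equal weight)."""
    y, arm, cluster = np.asarray(y, dtype=float), np.asarray(arm), np.asarray(cluster)
    cluster_means = {0: [], 1: []}
    for c in np.unique(cluster):
        members = cluster == c
        # Arm is constant within a cluster in a cluster-randomised trial.
        cluster_means[int(arm[members][0])].append(y[members].mean())
    return np.mean(cluster_means[1]) - np.mean(cluster_means[0])

# Hypothetical toy data: two control and two intervention clusters.
# The treatment effect is 1 in the small clusters and 3 in the large ones,
# so cluster size is informative by construction.
sizes = np.array([2, 8, 2, 8])
arms = np.array([0, 0, 1, 1])
cluster = np.repeat(np.arange(len(sizes)), sizes)
arm = np.repeat(arms, sizes)
size_of_own_cluster = np.repeat(sizes, sizes)
y = arm * np.where(size_of_own_cluster == 8, 3.0, 1.0)

print(f"participant-average: {participant_average_diff(y, arm):.2f}")      # 2.60
print(f"cluster-average:     {cluster_average_diff(y, arm, cluster):.2f}")  # 2.00
```

In this toy example the participant-average estimate is pulled towards the effect in the large clusters, whereas the cluster-average estimate sits midway between the two cluster-size-specific effects.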

Results: For most outcomes, treatment effect estimates from different methods were similar. However, differences of >10% occurred between participant- and cluster-average estimates for 5 of 17 outcomes (29%). We also observed several notable differences between estimates from mixed-effects models or generalised estimating equations with an exchangeable correlation structure and those based on independence estimating equations. For example, for the EQ-5D VAS score, the independence estimating equation estimate of the participant-average difference was 4.15 (95% confidence interval: -3.37 to 11.66), compared with 2.84 (95% confidence interval: -7.37 to 13.04) for the cluster-average independence estimating equation estimate, and 3.23 (95% confidence interval: -6.70 to 13.16) from a mixed-effects model. Similarly, for thromboembolic/ischaemic events, the independence estimating equation estimate for the participant-average odds ratio was 0.43 (95% confidence interval: 0.07 to 2.48), compared with 0.33 (95% confidence interval: 0.06 to 1.77) from the cluster-average estimator.

Conclusion: In this re-analysis, we found that estimates from the various approaches could differ, which may be due to the presence of informative cluster size. Careful consideration of the estimand and of the plausibility of the assumptions underpinning each estimator can help ensure that appropriate analysis methods are used. Independence estimating equations and the analysis of cluster-level summaries (each appropriately weighted to correspond to either the participant-average or the cluster-average treatment effect) are desirable choices when informative cluster size is deemed possible, because they are unbiased in this setting.

Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC10638852
DOI: http://dx.doi.org/10.1177/17407745231186094
