AI Article Synopsis

  • Missing data are common in observational studies and often affect the variables needed to estimate a causal effect (exposure, outcome and confounders); this study reviews how such data are managed and reported in studies that use multiple imputation (MI).
  • Of 130 observational studies published between January 2019 and December 2021 in five top general epidemiology journals, 83% derived an analysis sample by excluding individuals with missing data in specific variables, and 88% had multivariable missingness within the analysis sample.
  • The number of imputations, MI method and MI software were generally well reported, but only 34% of studies stated their missing data assumptions, and only 11 of those 44 justified them; 68 of the 69 studies with a secondary analysis gave no clear justification for it.

Article Abstract

Background: Missing data are common in observational studies and often occur in several of the variables required when estimating a causal effect, i.e. the exposure, outcome and/or variables used to control for confounding. Analyses involving multiple incomplete variables are not as straightforward as analyses with a single incomplete variable. For example, in the context of multivariable missingness, the standard missing data assumptions ("missing completely at random", "missing at random" [MAR], "missing not at random") are difficult to interpret and assess. It is not clear how the complexities that arise due to multivariable missingness are being addressed in practice. The aim of this study was to review how missing data are managed and reported in observational studies that use multiple imputation (MI) for causal effect estimation, with a particular focus on missing data summaries, missing data assumptions, primary and sensitivity analyses, and MI implementation.
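The kind of missing data summary the review looks for can be sketched in a few lines. The example below is a minimal illustration only: the records, variable names and the `missing_summary` helper are hypothetical and do not come from the reviewed studies. It reports the proportion missing per variable, whether more than one variable is incomplete (multivariable missingness), and how many complete cases remain.

```python
# Hypothetical toy data: None marks a missing value.
records = [
    {"exposure": 1, "outcome": 0.5, "confounder": 30},
    {"exposure": None, "outcome": 1.2, "confounder": 41},
    {"exposure": 0, "outcome": None, "confounder": None},
    {"exposure": 1, "outcome": 0.9, "confounder": 27},
]


def missing_summary(rows, variables):
    """Summarise missingness across the analysis variables (illustrative helper)."""
    n = len(rows)
    # Proportion of records with a missing value, per variable.
    proportion_missing = {
        v: sum(r[v] is None for r in rows) / n for v in variables
    }
    # Variables with at least one missing value.
    incomplete = [v for v, p in proportion_missing.items() if p > 0]
    # Records observed on every analysis variable.
    complete_cases = sum(
        all(r[v] is not None for v in variables) for r in rows
    )
    return {
        "proportion_missing": proportion_missing,
        "incomplete_variables": incomplete,
        "multivariable_missingness": len(incomplete) > 1,
        "complete_cases": complete_cases,
    }


summary = missing_summary(records, ["exposure", "outcome", "confounder"])
print(summary)
```

Reporting these quantities for the full set of analysis variables, rather than for one variable at a time, makes it visible when missingness spans the exposure, outcome and confounders together.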

Methods: We searched five top general epidemiology journals for observational studies that aimed to answer a causal research question and used MI, published between January 2019 and December 2021. Article screening and data extraction were performed systematically.

Results: Of the 130 studies included in this review, 108 (83%) derived an analysis sample by excluding individuals with missing data in specific variables (e.g., outcome) and 114 (88%) had multivariable missingness within the analysis sample. Forty-four (34%) studies provided a statement about missing data assumptions, 35 of which stated the MAR assumption, but only 11/44 (25%) studies provided a justification for these assumptions. The number of imputations, MI method and MI software were generally well-reported (71%, 75% and 88% of studies, respectively), while aspects of the imputation model specification were not clear for more than half of the studies. A secondary analysis that used a different approach to handle the missing data was conducted in 69/130 (53%) studies. Of these 69 studies, 68 (99%) lacked a clear justification for the secondary analysis.
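For readers unfamiliar with why the number of imputations is a reporting item, the sketch below shows the standard Rubin's rules for pooling results across m imputed datasets. The abstract does not describe any pooling step, so this is background illustration rather than the authors' method; the function name `pool_rubin` and the input numbers are hypothetical.

```python
from statistics import mean, variance


def pool_rubin(estimates, variances):
    """Pool per-imputation estimates and their variances with Rubin's rules."""
    m = len(estimates)
    if m < 2 or m != len(variances):
        raise ValueError("need matching lists from at least two imputations")
    pooled_estimate = mean(estimates)       # average of the m point estimates
    within = mean(variances)                # average within-imputation variance
    between = variance(estimates)           # sample variance across imputations (m - 1 denominator)
    total = within + (1 + 1 / m) * between  # total variance of the pooled estimate
    return pooled_estimate, total


# Hypothetical estimates and squared standard errors from m = 3 imputed datasets.
pooled, total_var = pool_rubin([1.0, 2.0, 3.0], [0.5, 0.5, 0.5])
print(pooled, total_var)
```

The `(1 + 1 / m)` factor inflates the between-imputation variance more when m is small, which is one reason the number of imputations matters for interpreting the pooled standard error.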

Conclusion: Effort is needed to clarify the rationale for and improve the reporting of MI for estimation of causal effects from observational data. We encourage greater transparency in making and reporting analytical decisions related to missing data.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC11373423
DOI: http://dx.doi.org/10.1186/s12874-024-02302-6

Publication Analysis

Top Keywords

missing data: 36
observational studies: 16
data: 12
multivariable missingness: 12
data assumptions: 12
studies: 11
missing: 9
multiple imputation: 8
"missing random": 8
analysis sample: 8
