Missing data are much studied in epidemiology and statistics. Theoretical development and application of methods for handling missing data have mostly been conducted in the context of prospective research data and with a goal of description or causal explanation. However, it is now common to build predictive models using routinely collected data, where missing patterns may convey important information, and one might take a pragmatic approach to optimizing prediction. Therefore, different methods to handle missing data may be preferred. Furthermore, an underappreciated issue in prediction modeling is that the missing data method used in model development may not match the method used when a model is deployed. This may lead to overoptimistic assessments of model performance. For prediction, particularly with routinely collected data, methods for handling missing data that incorporate information within the missingness pattern should be explored and further developed. Where missing data methods differ between model development and model deployment, the implications of this must be explicitly evaluated. The trade-off between building a prediction model that is causally principled, and building a prediction model that maximizes the use of all available information, should be carefully considered and will depend on the intended use of the model.
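One way to incorporate information from the missingness pattern is to pair each predictor with a missing indicator. The sketch below is illustrative only and is not taken from the article: the feature names, the mean-fill choice, and the helper functions `fit_fill_values` and `encode` are all assumptions. It also shows the development/deployment issue raised above, since the fill values are learned once on development data and then reused unchanged when a new record arrives.

```python
from statistics import mean


def fit_fill_values(rows, features):
    """Learn one fill value per feature (mean of observed values) on development data."""
    fills = {}
    for f in features:
        observed = [r[f] for r in rows if r.get(f) is not None]
        # Fall back to 0.0 if a feature is never observed (an arbitrary choice).
        fills[f] = mean(observed) if observed else 0.0
    return fills


def encode(row, features, fills):
    """Return a flat list of [value_or_fill, missing_flag] for each feature."""
    encoded = []
    for f in features:
        value = row.get(f)
        is_missing = value is None
        encoded.append(fills[f] if is_missing else value)
        encoded.append(1 if is_missing else 0)
    return encoded


# Hypothetical development data; None marks a value that was not recorded.
development = [
    {"lactate": 1.0, "age": 60},
    {"lactate": None, "age": 70},
    {"lactate": 3.0, "age": None},
]
features = ["lactate", "age"]

# Fill values are fixed at development time ...
fills = fit_fill_values(development, features)

# ... and the same encoding is applied to a record seen at deployment.
deployment_row = {"lactate": None, "age": 50}
x = encode(deployment_row, features, fills)
```

If the deployed system handled a missing `lactate` differently (for example, by refusing to predict or by substituting another value), the performance estimated during development would not describe the deployed model, which is the mismatch the abstract warns about.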

Source
http://dx.doi.org/10.1016/j.jclinepi.2020.03.028
