A nonparametric multiple imputation approach for missing categorical data.

BMC Med Res Methodol

Department of Epidemiology and Biostatistics, Mel and Enid Zuckerman College of Public Health, University of Arizona, 1295 N. Martin Ave., Tucson, 85724, USA.

Published: June 2017

Background: Incomplete categorical variables with more than two categories are common in public health data. However, most of the existing missing-data methods do not use the information from nonresponse (missingness) probabilities.

Methods: We propose a nearest-neighbour multiple imputation approach to impute a missing-at-random categorical outcome and to estimate the proportion of each category. The donor set for imputation is formed by measuring the distance between each missing value and the non-missing values. The distance function is based on a predictive score derived from two working models: one fits a multinomial logistic regression for predicting the missing categorical outcome (the outcome model), and the other fits a logistic regression for predicting missingness probabilities (the missingness model). A weighting scheme balances the contributions of the two working models when generating the predictive score. A missing value is imputed by randomly selecting one of the non-missing values with the smallest distances. We conduct a simulation to evaluate the performance of the proposed method and compare it with several alternative methods. A real-data application is also presented.
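The donor-selection step described above can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions that the abstract does not state: the two working models have already been fitted elsewhere and summarised as one standardised score per observation (`outcome_score` and `miss_score`, both hypothetical names), the distance is a weighted Euclidean distance with weights summing to one, and the donor set has a fixed size `n_donors`. A multinomial outcome model would normally yield more than one score component; a single component is used here only to keep the example short.

```python
import random
from collections import Counter


def nn_impute_once(y, outcome_score, miss_score,
                   w_outcome=0.8, n_donors=3, rng=None):
    """Return one completed copy of y, where None marks a missing outcome.

    Assumption: the distance is a weighted Euclidean distance on the two
    scores, with weights w_outcome and 1 - w_outcome.
    """
    rng = rng or random.Random()
    w_miss = 1.0 - w_outcome
    observed = [i for i, v in enumerate(y) if v is not None]
    completed = list(y)
    for i, value in enumerate(y):
        if value is not None:
            continue

        def dist(j):
            # Weighted distance between missing case i and observed case j.
            return (w_outcome * (outcome_score[i] - outcome_score[j]) ** 2
                    + w_miss * (miss_score[i] - miss_score[j]) ** 2) ** 0.5

        # Donor set: the n_donors observed cases closest to case i.
        donors = sorted(observed, key=dist)[:n_donors]
        # Impute by drawing one donor at random and copying its outcome.
        completed[i] = y[rng.choice(donors)]
    return completed


def estimate_proportions(y, outcome_score, miss_score,
                         n_imputations=5, seed=0, **kwargs):
    """Average the category proportions over several imputed data sets."""
    rng = random.Random(seed)
    categories = sorted({v for v in y if v is not None})
    totals = Counter()
    for _ in range(n_imputations):
        completed = nn_impute_once(y, outcome_score, miss_score,
                                   rng=rng, **kwargs)
        counts = Counter(completed)
        for c in categories:
            totals[c] += counts[c] / len(y)
    return {c: totals[c] / n_imputations for c in categories}
```

Calling `estimate_proportions(y, outcome_score, miss_score, n_donors=3)` returns a dictionary mapping each observed category to its averaged proportion. The sketch covers point estimates only; variance estimation across imputations is omitted.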

Results: The simulation study suggests that the proposed method performs well, even under some misspecifications of the working models, when missingness probabilities are not extreme. In contrast, the calibration estimator, which is also based on two working models, can be highly unstable when missingness probabilities for some observations are extremely high. In this scenario, the proposed method produces more stable and more accurate estimates. In addition, the weights must be chosen carefully to balance the contributions of the two working models and to achieve optimal results with the proposed method.

Conclusions: We conclude that the proposed multiple imputation method is a reasonable approach for estimating the distribution of a categorical outcome with more than two levels when some outcome values are missing. For the working models, we suggest a multinomial logistic regression for predicting the missing outcome and a binary logistic regression for predicting the missingness probability.


Source
PMC: http://www.ncbi.nlm.nih.gov/pmc/articles/PMC5461637
DOI: http://dx.doi.org/10.1186/s12874-017-0360-2
